Blame SOURCES/kvm-s390x-tod-Properly-stop-the-KVM-TOD-while-the-guest-.patch

From 9552c25dc788925f211daaa55518e5144f8f0cc7 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@redhat.com>
Date: Fri, 21 Dec 2018 15:36:13 +0000
Subject: [PATCH 11/22] s390x/tod: Properly stop the KVM TOD while the guest is
 not running

RH-Author: David Hildenbrand <david@redhat.com>
Message-id: <20181221153614.27961-12-david@redhat.com>
Patchwork-id: 83755
O-Subject: [RHEL-8.0 qemu-kvm v2 PATCH 11/12] s390x/tod: Properly stop the KVM TOD while the guest is not running
Bugzilla: 1653569
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>

Just like on other architectures, we should stop the clock while the guest
is not running. This is already properly done for TCG. Right now, doing an
offline migration (stop, migrate, cont) can easily trigger stalls in the
guest.

Even doing a
    (hmp) stop
    ... wait 2 minutes ...
    (hmp) cont
will already trigger stalls.

So whenever the guest stops, back up the KVM TOD. When continuing to run
the guest, restore the KVM TOD.
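
The backup/restore idea can be illustrated with a small standalone model.
This is only a sketch and not the QEMU code: FakeKvmClock, FakeTodState,
fake_kvm_get(), fake_kvm_set(), vm_state_change() and guest_tod() are
hypothetical names, and the "KVM" clock is assumed to simply follow a
simulated host clock plus an offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the clock kept in KVM: it follows host time. */
typedef struct {
    uint64_t host_now;  /* simulated host clock, advances even while stopped */
    uint64_t offset;    /* guest TOD = host_now + offset (modulo 2^64) */
} FakeKvmClock;

/* Hypothetical model of the device state: saved value plus a flag. */
typedef struct {
    uint64_t base;      /* backed-up TOD, only valid while stopped */
    bool stopped;
} FakeTodState;

uint64_t fake_kvm_get(const FakeKvmClock *k)
{
    return k->host_now + k->offset;
}

void fake_kvm_set(FakeKvmClock *k, uint64_t tod)
{
    /* Unsigned wraparound is well defined, so a "negative" offset works. */
    k->offset = tod - k->host_now;
}

/* Called on every run state change; repeated stops or conts are no-ops. */
void vm_state_change(FakeTodState *td, FakeKvmClock *k, bool running)
{
    if (running && td->stopped) {
        fake_kvm_set(k, td->base);      /* restore: the clock starts again */
        td->stopped = false;
    } else if (!running && !td->stopped) {
        td->base = fake_kvm_get(k);     /* back up: the clock is frozen */
        td->stopped = true;
    }
}

/* What the guest would observe as its TOD. */
uint64_t guest_tod(const FakeTodState *td, const FakeKvmClock *k)
{
    return td->stopped ? td->base : fake_kvm_get(k);
}
```

In this model, a stop followed by two minutes of host time and a cont leaves
the guest-visible value exactly where it was at the stop, which is the
behavior the (hmp) stop / cont sequence above is meant to get.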

One special case is starting a simple VM: Reading the TOD from KVM to
stop it right away until the guest is actually started means that the
time of any simple VM will already differ from the host time. We can
simply leave the TOD running and the guest won't be able to recognize
it.

For migration, we actually want to keep the TOD stopped until really
starting the guest. To be able to catch most errors, we should however
try to set the TOD in addition to simply storing it. So we can still
catch basic migration problems.

If anything goes wrong while backing up/restoring the TOD, we have to
ignore it (but print a warning). This is then basically a fallback to
old behavior (TOD remains running).
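
The fallback can be sketched in isolation as follows. This is a hedged
illustration under assumptions, not the patch itself: FlakyClock, TodBackup,
flaky_get(), flaky_set(), on_stop() and on_cont() are hypothetical names, and
failures are injected through plain flags instead of QEMU's Error objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical clock whose accessors can be made to fail on demand. */
typedef struct {
    uint64_t value;
    bool fail_get;
    bool fail_set;
} FlakyClock;

/* Hypothetical backup state: saved value plus a "stopped" flag. */
typedef struct {
    uint64_t base;
    bool stopped;
} TodBackup;

int flaky_get(const FlakyClock *c, uint64_t *out)
{
    if (c->fail_get) {
        return -1;
    }
    *out = c->value;
    return 0;
}

int flaky_set(FlakyClock *c, uint64_t v)
{
    if (c->fail_set) {
        return -1;
    }
    c->value = v;
    return 0;
}

/* Guest stops: only mark the clock stopped if the backup succeeded. */
void on_stop(TodBackup *td, const FlakyClock *c)
{
    if (td->stopped) {
        return;
    }
    if (flaky_get(c, &td->base) < 0) {
        fprintf(stderr, "warning: could not back up the TOD\n");
        return;                 /* old behavior: clock keeps running */
    }
    td->stopped = true;
}

/* Guest continues: warn on failure, but always leave the stopped state. */
void on_cont(TodBackup *td, FlakyClock *c)
{
    if (!td->stopped) {
        return;
    }
    if (flaky_set(c, td->base) < 0) {
        fprintf(stderr, "warning: could not restore the TOD\n");
    }
    td->stopped = false;        /* treat it as if the clock never stopped */
}
```

Either failure leaves the model in the "running" state, so the worst case is
the pre-patch behavior rather than a guest that cannot be started.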

I tested this very basically with an initrd:
    1. Start a simple VM. Observed that the TOD is kept running. Old
       behavior.
    2. Ordinary live migration. Observed that the TOD is temporarily
       stopped on the destination when setting the new value and
       correctly started when finally starting the guest.
    3. Offline migration (stop, migrate, cont). Observed that the
       TOD will be stopped on the source with the "stop" command. On the
       destination, the TOD is temporarily stopped when setting the new
       value and correctly started when finally starting the guest via
       "cont".
    4. Simple stop/cont correctly stops/starts the TOD. (multiple stops
       or conts in a row have no effect, so works as expected)

In the future, we might want to send the guest a special kind of time sync
interrupt under some conditions, so it can synchronize its TOD to the
host TOD. This is interesting for migration scenarios but also when we
get time sync interrupts ourselves. This however will most probably have
to be handled in KVM (e.g. when the TODs differ too much) and is not
desired e.g. when debugging the guest (single stepping should not
result in permanent time syncs). I consider something like that an add-on
on top of this basic "don't break the guest" handling.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181130094957.4121-1-david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 9bc9d3d1ae3bcd1caaad1946494726b52f58b291)
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
 hw/s390x/tod-kvm.c     | 102 ++++++++++++++++++++++++++++++++++++++++++++++++-
 include/hw/s390x/tod.h |   8 +++-
 2 files changed, 107 insertions(+), 3 deletions(-)

diff --git a/hw/s390x/tod-kvm.c b/hw/s390x/tod-kvm.c
index df564ab..2456bf7 100644
--- a/hw/s390x/tod-kvm.c
+++ b/hw/s390x/tod-kvm.c
@@ -10,10 +10,11 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "sysemu/sysemu.h"
 #include "hw/s390x/tod.h"
 #include "kvm_s390x.h"
 
-static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
+static void kvm_s390_get_tod_raw(S390TOD *tod, Error **errp)
 {
     int r;
 
@@ -27,7 +28,17 @@ static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
     }
 }
 
-static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
+static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
+{
+    if (td->stopped) {
+        *tod = td->base;
+        return;
+    }
+
+    kvm_s390_get_tod_raw(tod, errp);
+}
+
+static void kvm_s390_set_tod_raw(const S390TOD *tod, Error **errp)
 {
     int r;
 
@@ -41,18 +52,105 @@ static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
     }
 }
 
+static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
+{
+    Error *local_err = NULL;
+
+    /*
+     * Somebody (e.g. migration) set the TOD. We'll store it into KVM to
+     * properly detect errors now but take a look at the runstate to decide
+     * whether really to keep the tod running. E.g. during migration, this
+     * is the point where we want to stop the initially running TOD to fire
+     * it back up when actually starting the migrated guest.
+     */
+    kvm_s390_set_tod_raw(tod, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    if (runstate_is_running()) {
+        td->stopped = false;
+    } else {
+        td->stopped = true;
+        td->base = *tod;
+    }
+}
+
+static void kvm_s390_tod_vm_state_change(void *opaque, int running,
+                                         RunState state)
+{
+    S390TODState *td = opaque;
+    Error *local_err = NULL;
+
+    if (running && td->stopped) {
+        /* Set the old TOD when running the VM - start the TOD clock. */
+        kvm_s390_set_tod_raw(&td->base, &local_err);
+        if (local_err) {
+            warn_report_err(local_err);
+        }
+        /* Treat errors like the TOD was running all the time. */
+        td->stopped = false;
+    } else if (!running && !td->stopped) {
+        /* Store the TOD when stopping the VM - stop the TOD clock. */
+        kvm_s390_get_tod_raw(&td->base, &local_err);
+        if (local_err) {
+            /* Keep the TOD running in case we could not back it up. */
+            warn_report_err(local_err);
+        } else {
+            td->stopped = true;
+        }
+    }
+}
+
+static void kvm_s390_tod_realize(DeviceState *dev, Error **errp)
+{
+    S390TODState *td = S390_TOD(dev);
+    S390TODClass *tdc = S390_TOD_GET_CLASS(td);
+    Error *local_err = NULL;
+
+    tdc->parent_realize(dev, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    /*
+     * We need to know when the VM gets started/stopped to start/stop the TOD.
+     * As we can never have more than one TOD instance (and that will never be
+     * removed), registering here and never unregistering is good enough.
+     */
+    qemu_add_vm_change_state_handler(kvm_s390_tod_vm_state_change, td);
+}
+
 static void kvm_s390_tod_class_init(ObjectClass *oc, void *data)
 {
     S390TODClass *tdc = S390_TOD_CLASS(oc);
 
+    device_class_set_parent_realize(DEVICE_CLASS(oc), kvm_s390_tod_realize,
+                                    &tdc->parent_realize);
     tdc->get = kvm_s390_tod_get;
     tdc->set = kvm_s390_tod_set;
 }
 
+static void kvm_s390_tod_init(Object *obj)
+{
+    S390TODState *td = S390_TOD(obj);
+
+    /*
+     * The TOD is initially running (value stored in KVM). Avoid needless
+     * loading/storing of the TOD when starting a simple VM, so let it
+     * run although the (never started) VM is stopped. For migration, we
+     * will properly set the TOD later.
+     */
+    td->stopped = false;
+}
+
 static TypeInfo kvm_s390_tod_info = {
     .name = TYPE_KVM_S390_TOD,
     .parent = TYPE_S390_TOD,
     .instance_size = sizeof(S390TODState),
+    .instance_init = kvm_s390_tod_init,
     .class_init = kvm_s390_tod_class_init,
     .class_size = sizeof(S390TODClass),
 };
diff --git a/include/hw/s390x/tod.h b/include/hw/s390x/tod.h
index 413c0d7..cbd7552 100644
--- a/include/hw/s390x/tod.h
+++ b/include/hw/s390x/tod.h
@@ -31,13 +31,19 @@ typedef struct S390TODState {
     /* private */
     DeviceState parent_obj;
 
-    /* unused by KVM implementation */
+    /*
+     * Used by TCG to remember the time base. Used by KVM to backup the TOD
+     * while the TOD is stopped.
+     */
     S390TOD base;
+    /* Used by KVM to remember if the TOD is stopped and base is valid. */
+    bool stopped;
 } S390TODState;
 
 typedef struct S390TODClass {
     /* private */
     DeviceClass parent_class;
+    void (*parent_realize)(DeviceState *dev, Error **errp);
 
     /* public */
     void (*get)(const S390TODState *td, S390TOD *tod, Error **errp);
-- 
1.8.3.1