Blob Blame History Raw
From 9ebed8090b88282f9b7432258df9182b9d3944ee Mon Sep 17 00:00:00 2001
From: Greg Kurz <gkurz@redhat.com>
Date: Tue, 19 Jan 2021 15:09:52 -0500
Subject: [PATCH 4/9] spapr: Fix handling of unplugged devices during CAS and
 migration

RH-Author: Greg Kurz <gkurz@redhat.com>
Message-id: <20210119150954.1017058-5-gkurz@redhat.com>
Patchwork-id: 100685
O-Subject: [RHEL-8.4.0 qemu-kvm PATCH v2 4/6] spapr: Fix handling of unplugged devices during CAS and migration
Bugzilla: 1901837
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: David Gibson <dgibson@redhat.com>

From: Greg Kurz <groug@kaod.org>

We already detect if a device is being hot plugged before CAS to trigger
a CAS reboot and during migration to migrate the state of the associated
DRC. But hot unplugging a device is also an asynchronous operation that
requires the guest to take action. This means that if the guest is migrated
after the hot unplug event was sent but before it could release the device
with RTAS, the destination QEMU doesn't know about the pending unplug
operation and doesn't actually remove the device when the guest finally
releases it.

Similarly, if the unplug request is fired before CAS, the guest isn't
notified of the change, just like with hotplug. It ends up booting with
the device still present in the DT and configures it, just like it was
never removed. Even weirder, since the event is still queued, it will
be eventually processed when some other unrelated event is posted to
the guest.

Enhance spapr_drc_transient() to also return true if an unplug request is
pending. This fixes the issue at CAS with a CAS reboot request and
causes the DRC state to be migrated. Some extra care is still needed to
inform the destination that an unplug request is pending : migrate the
unplug_requested field of the DRC in an optional subsection. This might
break backwards migration, but this is still better than ending with
an inconsistent guest.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158169248798.3465937.1108351365840514270.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit ab8584349c476f9818dc6403359c85f9ab0ad5eb)
Signed-off-by: Greg Kurz <gkurz@redhat.com>
Signed-off-by: Jon Maloy <jmaloy.redhat.com>
---
 hw/ppc/spapr_drc.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 9b498d429e..897bb7aae0 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -455,6 +455,22 @@ void spapr_drc_reset(SpaprDrc *drc)
     }
 }
 
+static bool spapr_drc_unplug_requested_needed(void *opaque)
+{
+    return spapr_drc_unplug_requested(opaque);
+}
+
+static const VMStateDescription vmstate_spapr_drc_unplug_requested = {
+    .name = "spapr_drc/unplug_requested",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_drc_unplug_requested_needed,
+    .fields  = (VMStateField []) {
+        VMSTATE_BOOL(unplug_requested, SpaprDrc),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 bool spapr_drc_transient(SpaprDrc *drc)
 {
     SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -470,9 +486,10 @@ bool spapr_drc_transient(SpaprDrc *drc)
     /*
      * We need to reset the DRC at CAS or to migrate the DRC state if it's
      * not equal to the expected long-term state, which is the same as the
-     * coldplugged initial state.
+     * coldplugged initial state, or if an unplug request is pending.
      */
-    return (drc->state != drck->ready_state);
+    return drc->state != drck->ready_state ||
+        spapr_drc_unplug_requested(drc);
 }
 
 static bool spapr_drc_needed(void *opaque)
@@ -488,6 +505,10 @@ static const VMStateDescription vmstate_spapr_drc = {
     .fields  = (VMStateField []) {
         VMSTATE_UINT32(state, SpaprDrc),
         VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription * []) {
+        &vmstate_spapr_drc_unplug_requested,
+        NULL
     }
 };
 
-- 
2.18.2