Blame SOURCES/kvm-intel_iommu-better-handling-of-dmar-state-switch.patch

26ba25
From 467b5e3127faf168fba6ce61a602f82d34a699c7 Mon Sep 17 00:00:00 2001
26ba25
From: Peter Xu <peterx@redhat.com>
26ba25
Date: Thu, 8 Nov 2018 06:29:36 +0000
26ba25
Subject: [PATCH 08/35] intel_iommu: better handling of dmar state switch
26ba25
26ba25
RH-Author: Peter Xu <peterx@redhat.com>
26ba25
Message-id: <20181108062938.21143-6-peterx@redhat.com>
26ba25
Patchwork-id: 82964
26ba25
O-Subject: [RHEL-8 qemu-kvm PATCH 5/7] intel_iommu: better handling of dmar state switch
26ba25
Bugzilla: 1625173
26ba25
RH-Acked-by: Auger Eric <eric.auger@redhat.com>
26ba25
RH-Acked-by: Michael S. Tsirkin <mst@redhat.com>
26ba25
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
26ba25
26ba25
Bugzilla: 1625173
26ba25
26ba25
QEMU is not handling the global DMAR switch well, especially when from
26ba25
"on" to "off".
26ba25
26ba25
Let's first take the example of system reset.
26ba25
26ba25
Assuming that a guest has IOMMU enabled.  When it reboots, we will drop
26ba25
all the existing DMAR mappings to handle the system reset, however we'll
26ba25
still keep the existing memory layouts which has the IOMMU memory region
26ba25
enabled.  So after the reboot and before the kernel reloads again, there
26ba25
will be no mapping at all for the host device.  That's problematic since
26ba25
any software (for example, SeaBIOS) that runs earlier than the kernel
26ba25
after the reboot will assume the IOMMU is disabled, so any DMA from the
26ba25
software will fail.
26ba25
26ba25
For example, a guest that boots on an assigned NVMe device might fail to
26ba25
find the boot device after a system reboot/reset and we'll be able to
26ba25
observe SeaBIOS errors if we capture the debugging log:
26ba25
26ba25
  WARNING - Timeout at nvme_wait:144!
26ba25
26ba25
Meanwhile, we should see DMAR errors on the host of that NVMe device.
26ba25
It's the DMA fault that caused a NVMe driver timeout.
26ba25
26ba25
The correct fix should be that we do proper switching of device DMA
26ba25
address spaces when system resets, which will setup correct memory
26ba25
regions and notify the backend of the devices.  This might not affect
26ba25
much on non-assigned devices since QEMU VT-d emulation will assume a
26ba25
default passthrough mapping if DMAR is not enabled in the GCMD
26ba25
register (please refer to vtd_iommu_translate).  However that's required
26ba25
for an assigned devices, since that'll rebuild the correct GPA to HPA
26ba25
mapping that is needed for any DMA operation during guest bootstrap.
26ba25
26ba25
Besides the system reset, we have some other places that might change
26ba25
the global DMAR status and we'd better do the same thing there.  For
26ba25
example, when we change the state of GCMD register, or the DMAR root
26ba25
pointer.  Do the same refresh for all these places.  For these two
26ba25
places we'll also need to explicitly invalidate the context entry cache
26ba25
and iotlb cache.
26ba25
26ba25
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173
26ba25
CC: QEMU Stable <qemu-stable@nongnu.org>
26ba25
Reported-by: Cong Li <coli@redhat.com>
26ba25
Signed-off-by: Peter Xu <peterx@redhat.com>
26ba25
--
26ba25
v2:
26ba25
- do the same for GCMD write, or root pointer update [Alex]
26ba25
- test is carried out by me this time, by observing the
26ba25
  vtd_switch_address_space tracepoint after system reboot
26ba25
v3:
26ba25
- rewrite commit message as suggested by Alex
26ba25
Signed-off-by: Peter Xu <peterx@redhat.com>
26ba25
Reviewed-by: Eric Auger <eric.auger@redhat.com>
26ba25
Reviewed-by: Jason Wang <jasowang@redhat.com>
26ba25
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
26ba25
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
26ba25
(cherry picked from commit 2cc9ddccebcaa48b3debfc279a83761fcbb7616c)
26ba25
Signed-off-by: Peter Xu <peterx@redhat.com>
26ba25
26ba25
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
26ba25
---
26ba25
 hw/i386/intel_iommu.c | 21 ++++++++++++++-------
26ba25
 1 file changed, 14 insertions(+), 7 deletions(-)
26ba25
26ba25
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
26ba25
index 48d0ba3..a6e87a9 100644
26ba25
--- a/hw/i386/intel_iommu.c
26ba25
+++ b/hw/i386/intel_iommu.c
26ba25
@@ -37,6 +37,8 @@
26ba25
 #include "kvm_i386.h"
26ba25
 #include "trace.h"
26ba25
 
26ba25
+static void vtd_address_space_refresh_all(IntelIOMMUState *s);
26ba25
+
26ba25
 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
26ba25
                             uint64_t wmask, uint64_t w1cmask)
26ba25
 {
26ba25
@@ -1436,7 +1438,7 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
26ba25
         vtd_reset_context_cache_locked(s);
26ba25
     }
26ba25
     vtd_iommu_unlock(s);
26ba25
-    vtd_switch_address_space_all(s);
26ba25
+    vtd_address_space_refresh_all(s);
26ba25
     /*
26ba25
      * From VT-d spec 6.5.2.1, a global context entry invalidation
26ba25
      * should be followed by a IOTLB global invalidation, so we should
26ba25
@@ -1727,6 +1729,8 @@ static void vtd_handle_gcmd_srtp(IntelIOMMUState *s)
26ba25
     vtd_root_table_setup(s);
26ba25
     /* Ok - report back to driver */
26ba25
     vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_RTPS);
26ba25
+    vtd_reset_caches(s);
26ba25
+    vtd_address_space_refresh_all(s);
26ba25
 }
26ba25
 
26ba25
 /* Set Interrupt Remap Table Pointer */
26ba25
@@ -1759,7 +1763,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
26ba25
         vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
26ba25
     }
26ba25
 
26ba25
-    vtd_switch_address_space_all(s);
26ba25
+    vtd_reset_caches(s);
26ba25
+    vtd_address_space_refresh_all(s);
26ba25
 }
26ba25
 
26ba25
 /* Handle Interrupt Remap Enable/Disable */
26ba25
@@ -3059,6 +3064,12 @@ static void vtd_address_space_unmap_all(IntelIOMMUState *s)
26ba25
     }
26ba25
 }
26ba25
 
26ba25
+static void vtd_address_space_refresh_all(IntelIOMMUState *s)
26ba25
+{
26ba25
+    vtd_address_space_unmap_all(s);
26ba25
+    vtd_switch_address_space_all(s);
26ba25
+}
26ba25
+
26ba25
 static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
26ba25
 {
26ba25
     memory_region_notify_one((IOMMUNotifier *)private, entry);
26ba25
@@ -3231,11 +3242,7 @@ static void vtd_reset(DeviceState *dev)
26ba25
     IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
26ba25
 
26ba25
     vtd_init(s);
26ba25
-
26ba25
-    /*
26ba25
-     * When device reset, throw away all mappings and external caches
26ba25
-     */
26ba25
-    vtd_address_space_unmap_all(s);
26ba25
+    vtd_address_space_refresh_all(s);
26ba25
 }
26ba25
 
26ba25
 static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
26ba25
-- 
26ba25
1.8.3.1
26ba25