Blob Blame History Raw
From 08a95c3541cbe2b3a1c671fa683bd6214ad996f0 Mon Sep 17 00:00:00 2001
From: Laszlo Ersek <lersek@redhat.com>
Date: Thu, 27 Aug 2020 00:21:29 +0200
Subject: [PATCH 3/5] OvmfPkg/CpuHotplugSmm: fix CPU hotplug race just after
 SMI broadcast
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

RH-Author: Laszlo Ersek (lersek)
RH-MergeRequest: 1: [RHEL-8.4.0] complete the "VCPU hotplug with SMI" OVMF feature
RH-Commit: [3/3] 40521ea89725b8b0ff8ca3f0a610ff45431e610e (lersek/edk2)
RH-Bugzilla: 1849177

The "virsh setvcpus" (plural) command may hot-plug several VCPUs in quick
succession -- it means a series of "device_add" QEMU monitor commands,
back-to-back.

If a "device_add" occurs *just after* ACPI raises the broadcast SMI, then:

- the CPU_FOREACH() loop in QEMU's ich9_apm_ctrl_changed() cannot make the
  SMI pending for the new CPU -- at that time, the new CPU doesn't even
  exist yet,

- OVMF will find the new CPU however (in the CPU hotplug register block),
  in QemuCpuhpCollectApicIds().

As a result, when the firmware sends an INIT-SIPI-SIPI to the new CPU in
SmbaseRelocate(), expecting it to boot into SMM (due to the pending SMI),
the new CPU instead boots straight into the post-RSM (normal mode) "pen",
skipping its initial SMI handler.

The CPU halts nicely in the pen, but its SMBASE is never relocated, and
the SMRAM message exchange with the BSP falls apart -- the BSP gets stuck
in the following loop:

  //
  // Wait until the hot-added CPU is just about to execute RSM.
  //
  while (Context->AboutToLeaveSmm == 0) {
    CpuPause ();
  }

because the new CPU's initial SMI handler never sets the flag to nonzero.

Fix this by sending a directed SMI to the new CPU just before sending it
the INIT-SIPI-SIPI. The various scenarios are documented in the code --
the cases affected by the patch are documented under point (2).

Note that this is not considered a security patch, as for a malicious
guest OS, the issue is not exploitable -- the symptom is a hang on the
BSP, in the above-noted loop in SmbaseRelocate(). Instead, the patch fixes
behavior for a benign guest OS.

Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Fixes: 51a6fb41181529e4b50ea13377425bda6bb69ba6
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=2929
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20200826222129.25798-3-lersek@redhat.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
(cherry picked from commit cbccf995920a28071f5403b847f29ebf8b732fa9)
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 OvmfPkg/CpuHotplugSmm/Smbase.c | 35 ++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/Smbase.c b/OvmfPkg/CpuHotplugSmm/Smbase.c
index 170571221d..d8f45c4313 100644
--- a/OvmfPkg/CpuHotplugSmm/Smbase.c
+++ b/OvmfPkg/CpuHotplugSmm/Smbase.c
@@ -220,14 +220,37 @@ SmbaseRelocate (
   //
   // Boot the hot-added CPU.
   //
-  // If the OS is benign, and so the hot-added CPU is still in RESET state,
-  // then the broadcast SMI is still pending for it; it will now launch
-  // directly into SMM.
+  // There are 2*2 cases to consider:
   //
-  // If the OS is malicious, the hot-added CPU has been booted already, and so
-  // it is already spinning on the APIC ID gate. In that case, the
-  // INIT-SIPI-SIPI below will be ignored.
+  // (1) The CPU was hot-added before the SMI was broadcast.
   //
+  // (1.1) The OS is benign.
+  //
+  //       The hot-added CPU is in RESET state, with the broadcast SMI pending
+  //       for it. The directed SMI below will be ignored (it's idempotent),
+  //       and the INIT-SIPI-SIPI will launch the CPU directly into SMM.
+  //
+  // (1.2) The OS is malicious.
+  //
+  //       The hot-added CPU has been booted, by the OS. Thus, the hot-added
+  //       CPU is spinning on the APIC ID gate. In that case, both the SMI and
+  //       the INIT-SIPI-SIPI below will be ignored.
+  //
+  // (2) The CPU was hot-added after the SMI was broadcast.
+  //
+  // (2.1) The OS is benign.
+  //
+  //       The hot-added CPU is in RESET state, with no SMI pending for it. The
+  //       directed SMI will latch the SMI for the CPU. Then the INIT-SIPI-SIPI
+  //       will launch the CPU into SMM.
+  //
+  // (2.2) The OS is malicious.
+  //
+  //       The hot-added CPU is executing OS code. The directed SMI will pull
+  //       the hot-added CPU into SMM, where it will start spinning on the APIC
+  //       ID gate. The INIT-SIPI-SIPI will be ignored.
+  //
+  SendSmiIpi (ApicId);
   SendInitSipiSipi (ApicId, PenAddress);
 
   //
-- 
2.27.0