From bb19cc0cdd43619ccf830e1e608f79e46f8ddf86 Mon Sep 17 00:00:00 2001
From: "Richard W.M. Jones" <rjones@redhat.com>
Date: Thu, 12 May 2022 08:36:37 +0100
Subject: [PATCH] lib: Disable 5-level page tables when using -cpu max

In https://bugzilla.redhat.com/show_bug.cgi?id=2082806 we've been
tracking an insidious qemu bug which intermittently prevents the
libguestfs appliance from starting.  The symptoms are that SeaBIOS
starts and displays its messages, but the kernel isn't reached.  We
found that the kernel does in fact start, but when it tries to set up
page tables and jump to protected mode it gets a triple fault which
causes the emulated CPU in qemu to reset (qemu exits).

This seems to only affect TCG (not KVM).

Yesterday I found that this is caused by using -cpu max which enables
the "la57" feature (5-level page tables[0]), and that we can make the
problem go away using -cpu max,la57=off.  Note that I still don't
fully understand the qemu bug, so this is only a workaround.

I chose to disable 5-level page tables for both TCG and KVM, partly to
make the patch simpler, and partly because I guess it's not a feature
(i.e. 57-bit linear addresses) that is useful for the libguestfs
appliance case, where we have limited physical memory and no need to
run any programs with huge address spaces.

I tested this by running both the direct & libvirt paths overnight.  I
expect that this patch will fail with old qemu/libvirt which doesn't
understand the "la57" feature, but this is only intended as a
temporary workaround.

[0] Article about 5-level page tables as background:
https://lwn.net/Articles/717293/

Thanks: Laszlo Ersek
Fixes: https://answers.launchpad.net/ubuntu/+source/libguestfs/+question/701625

[RHEL 8.7: Patch is not upstream.  This is the initial patch as posted
to the mailing list here:
https://listman.redhat.com/archives/libguestfs/2022-May/028853.html]
---
 lib/launch-direct.c  | 15 +++++++++++++--
 lib/launch-libvirt.c |  7 +++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/launch-direct.c b/lib/launch-direct.c
index de17d2167..6b28e4724 100644
--- a/lib/launch-direct.c
+++ b/lib/launch-direct.c
@@ -534,8 +534,19 @@ launch_direct (guestfs_h *g, void *datav, const char *arg)
   } end_list ();
 
   cpu_model = guestfs_int_get_cpu_model (has_kvm && !force_tcg);
-  if (cpu_model)
-    arg ("-cpu", cpu_model);
+  if (cpu_model) {
+#if defined(__x86_64__)
+    /* Temporary workaround for RHBZ#2082806 */
+    if (STREQ (cpu_model, "max")) {
+      start_list ("-cpu") {
+        append_list (cpu_model);
+        append_list ("la57=off");
+      } end_list ();
+    }
+    else
+#endif
+      arg ("-cpu", cpu_model);
+  }
 
   if (g->smp > 1)
     arg_format ("-smp", "%d", g->smp);
diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c
index db619910f..bad4a54ea 100644
--- a/lib/launch-libvirt.c
+++ b/lib/launch-libvirt.c
@@ -1172,6 +1172,13 @@ construct_libvirt_xml_cpu (guestfs_h *g,
       else if (STREQ (cpu_model, "max")) {
         /* https://bugzilla.redhat.com/show_bug.cgi?id=1935572#c11 */
         attribute ("mode", "maximum");
+#if defined(__x86_64__)
+        /* Temporary workaround for RHBZ#2082806 */
+        start_element ("feature") {
+          attribute ("policy", "disable");
+          attribute ("name", "la57");
+        } end_element ();
+#endif
       }
       else
         single_element ("model", cpu_model);
-- 
2.31.1