Pablo Greco e6a3ae
From 7ab2261eebf90ea8a3cf5701fa177d181fe665d1 Mon Sep 17 00:00:00 2001
From: Laurent Vivier <lvivier@redhat.com>
Date: Thu, 10 Oct 2019 07:34:38 +0100
Subject: [PATCH 22/22] pseries: do not allow memory-less/cpu-less NUMA node
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

RH-Author: Laurent Vivier <lvivier@redhat.com>
Message-id: <20191010073438.16478-1-lvivier@redhat.com>
Patchwork-id: 91379
O-Subject: [RHEL-8.2.0 qemu-kvm PATCH] pseries: do not allow memory-less/cpu-less NUMA node
Bugzilla: 1651474
RH-Acked-by: David Gibson <dgibson@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Philippe Mathieu-Daudé <philmd@redhat.com>

When we hotplug a CPU on a memory-less/cpu-less node, the Linux kernel
crashes.

This happens because the Linux kernel needs to know the NUMA topology at
start to be able to initialize the distance lookup table.

On pseries, the topology is provided by the firmware via the existing
CPUs and memory information. Thus a node without memory and CPUs cannot
be discovered by the kernel.

To avoid the kernel crash, do not allow pseries to start with empty
nodes.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20190830161345.22436-1-lvivier@redhat.com>
[dwg: Rework to cope with movement of numa state from globals to MachineState]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 58c46efa451caa3935224223f950216872e2eee3)
Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Conflicts in the context:
	hw/ppc/spapr.c
because of missing downstream commits:
  0550b1206a91 ("spapr: don't advertise radix GTSE if max-compat-cpu < power9")
  ad99d04c76de ("target/ppc: Allow cpu compatiblity checks based on type, not instance")

Because of the missing downstream commit:

  7e721e7b10e1 ("numa: move numa global variable numa_info into MachineState")

replaced numa_state with numa_info (reverting the dwg rework), going
back to the original patch I sent:

  https://patchew.org/QEMU/20190830161345.22436-1-lvivier@redhat.com/

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1651474
BRANCH: rhel-8.2.0
UPSTREAM: merged
BREW: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=23924908
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
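For reference, the check this patch adds to spapr_machine_init() can be
read as the standalone sketch below. It is only an illustration and is
not code from the patch: struct node_desc, find_empty_node() and the
cpu_node_ids array are hypothetical stand-ins for numa_info[], the
CPU_FOREACH() loop and cpu->node_id, so that the logic can be compiled
outside of QEMU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one numa_info[] entry. */
struct node_desc {
    uint64_t node_mem;      /* bytes of memory assigned to the node */
};

/*
 * Return the index of the first node that has neither memory nor a CPU,
 * or -1 if every node has at least one of the two.
 *
 * cpu_node_ids[c] is the NUMA node of CPU c (a stand-in for walking the
 * CPU list and reading cpu->node_id).
 */
static int find_empty_node(const struct node_desc *nodes, size_t nb_nodes,
                           const int *cpu_node_ids, size_t nb_cpus)
{
    for (size_t i = 0; i < nb_nodes; i++) {
        bool has_cpu = false;

        if (nodes[i].node_mem != 0) {
            continue;       /* node has memory, nothing more to check */
        }
        /* memory-less node: look for at least one CPU assigned to it */
        for (size_t c = 0; c < nb_cpus; c++) {
            if (cpu_node_ids[c] == (int)i) {
                has_cpu = true;
                break;
            }
        }
        if (!has_cpu) {
            return (int)i;  /* memory-less and cpu-less node */
        }
    }
    return -1;
}
```

In the patch itself the equivalent of a non-negative return value is
handled by calling error_report() and exit(1) during machine init, so
the guest never starts with a node the kernel cannot discover.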
 hw/ppc/spapr.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1a2f0d9..b4c9993 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2527,6 +2527,39 @@ static void spapr_machine_init(MachineState *machine)
     /* init CPUs */
     spapr_init_cpus(spapr);
 
+    /*
+     * check we don't have a memory-less/cpu-less NUMA node
+     * Firmware relies on the existing memory/cpu topology to provide the
+     * NUMA topology to the kernel.
+     * And the linux kernel needs to know the NUMA topology at start
+     * to be able to hotplug CPUs later.
+     */
+    if (nb_numa_nodes) {
+        for (i = 0; i < nb_numa_nodes; ++i) {
+            /* check for memory-less node */
+            if (numa_info[i].node_mem == 0) {
+                CPUState *cs;
+                int found = 0;
+                /* check for cpu-less node */
+                CPU_FOREACH(cs) {
+                    PowerPCCPU *cpu = POWERPC_CPU(cs);
+                    if (cpu->node_id == i) {
+                        found = 1;
+                        break;
+                    }
+                }
+                /* memory-less and cpu-less node */
+                if (!found) {
+                    error_report(
+                       "Memory-less/cpu-less nodes are not supported (node %d)",
+                                 i);
+                    exit(1);
+                }
+            }
+        }
+
+    }
+
     if (kvm_enabled()) {
         /* Enable H_LOGICAL_CI_* so SLOF can talk to in-kernel devices */
         kvmppc_enable_logical_ci_hcalls();
-- 
1.8.3.1