Blame SOURCES/0313-ieee1275-request-memory-with-ibm-client-architecture.patch

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Daniel Axtens <dja@axtens.net>
Date: Mon, 6 Feb 2023 10:03:20 -0500
Subject: [PATCH] ieee1275: request memory with ibm,
 client-architecture-support

On PowerVM, the first time we boot a Linux partition, we may only get
256MB of real memory area, even if the partition has more memory.

This isn't enough to reliably verify a kernel. Fortunately, the Power
Architecture Platform Reference (PAPR) defines a method we can call to ask
for more memory: the broad and powerful ibm,client-architecture-support
(CAS) method.

CAS can do an enormous number of things on a PAPR platform: as well as
asking for memory, you can set the supported processor level, the interrupt
controller, hash vs radix mmu, and so on.

If:

 - we are running under what we think is PowerVM (compatible property of /
   begins with "IBM"), and

 - the full amount of RMA is less than 512MB (as determined by the reg
   property of /memory)

then call CAS as follows (refer to the Linux on Power Architecture
Reference, LoPAR, which is public, at B.5.2.3):

 - Use the "any" PVR value and supply 4 option vectors.

 - Set option vector 1 (PowerPC Server Processor Architecture Level)
   to "ignore".

 - Set option vector 2 with default or Linux-like options, including a
   min-rma-size of 512MB.

 - Set option vector 3 to request Floating Point, VMX and Decimal Floating
   Point, but don't abort the boot if we can't get them.

 - Set option vector 4 to request a minimum VP percentage of 1%, which is
   what Linux requests, and is below the default of 10%. Without this,
   some systems with very large or very small configurations fail to boot.

This will cause a CAS reboot and the partition will restart with 512MB
of RMA. Importantly, grub will notice the 512MB and not call CAS again.
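
The two conditions above can be sketched in plain C. This is a minimal
sketch rather than the code in this patch: should_try_cas and
CAS_MIN_RMA_BYTES are hypothetical names, and the root compatible string
and RMA size are passed in directly instead of being read from the
Open Firmware device tree.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constant: the 512MB threshold named in the text. */
#define CAS_MIN_RMA_BYTES (512ULL * 1024 * 1024)

/*
 * Hypothetical helper: returns true when both conditions hold, i.e. the
 * root compatible string starts with "IBM," and the RMA is below 512MB.
 */
static bool
should_try_cas (const char *root_compatible, uint64_t rma_bytes)
{
  if (strncmp (root_compatible, "IBM,", 4) != 0)
    return false;
  return rma_bytes < CAS_MIN_RMA_BYTES;
}
```

Under these assumptions, an "IBM," machine reporting 256MB of RMA would
trigger the CAS call, while the same machine at 512MB would not.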

Notes about the choices of parameters:

 - A partition can be configured with only 256MB of memory, which would
   mean this request couldn't be satisfied, but PFW refuses to load with
   only 256MB of memory, so it's a bit moot. (SLOF will run fine with 256MB,
   but we will never call CAS under qemu/SLOF because /compatible won't
   begin with "IBM".)

 - Unspecified CAS vectors take on default values. Some of these values
   might restrict the ability of certain hardware configurations to boot.
   This is why we need to specify the VP percentage in vector 4, which is
   in turn why we need to specify vector 3.
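
Option vectors are positional, which is why asking for vector 4 means
also supplying vectors 1 to 3. Below is a minimal sketch of that
encoding, assuming the layout used by struct cas_vector in this patch:
a count byte holding the number of vectors minus one, then each vector
prefixed by a byte holding its length minus one. append_vector and
build_vectors are hypothetical helpers, not part of grub.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: append one option vector at offset off.
 * The prefix byte stores (len - 1). Returns the new write offset.
 */
static size_t
append_vector (uint8_t *buf, size_t off, const uint8_t *data, uint8_t len)
{
  buf[off++] = (uint8_t) (len - 1);
  memcpy (buf + off, data, len);
  return off + len;
}

/*
 * Hypothetical helper: emit the count byte (count - 1) followed by the
 * vectors in order. Vector N can only be present if 1..N-1 are too.
 */
static size_t
build_vectors (uint8_t *buf, const uint8_t *const vecs[],
               const uint8_t lens[], uint8_t count)
{
  size_t off = 0;
  uint8_t i;

  buf[off++] = (uint8_t) (count - 1);
  for (i = 0; i < count; i++)
    off = append_vector (buf, off, vecs[i], lens[i]);
  return off;
}
```

Because there is no per-vector index in this layout, a caller that only
cares about vector 4 still has to pass some value for vectors 1 to 3.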

Finally, we should have enough memory to verify a kernel, and we will
reach Linux. One of the first things Linux does while still running under
OpenFirmware is to call CAS with a much fuller set of options (including
asking for 512MB of memory). Linux includes a much more restrictive set of
PVR values and processor support levels, and this CAS invocation will likely
induce another reboot. On this reboot grub will again notice the higher RMA,
and not call CAS. We will get to Linux again, Linux will call CAS again, but
because the values are now set for Linux this will not induce another CAS
reboot and we will finally boot all the way to userspace.
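
The reboot sequence described above can be modelled as a small state
machine. This is a toy model under stated assumptions, not firmware
behaviour: it assumes the first Linux CAS call always forces one reboot
(the text only says it likely will), and struct partition, boot_once and
boots_until_userspace are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Whose CAS options the platform applied most recently. */
enum cas_owner { CAS_NONE, CAS_GRUB, CAS_LINUX };

/* Hypothetical model of the state that persists across reboots. */
struct partition
{
  uint32_t rma_mb;
  enum cas_owner settings;
};

/* Simulate one boot. Returns true if it reaches userspace. */
static bool
boot_once (struct partition *p)
{
  if (p->rma_mb < 512)
    {
      /* grub sees a small RMA and calls CAS, causing a CAS reboot. */
      p->rma_mb = 512;
      p->settings = CAS_GRUB;
      return false;
    }
  if (p->settings != CAS_LINUX)
    {
      /* Linux calls CAS with its stricter options (assumed to reboot). */
      p->settings = CAS_LINUX;
      return false;
    }
  /* Values are already set for Linux, so no further CAS reboot. */
  return true;
}

/* Count the boots needed to reach userspace, including the last one. */
static int
boots_until_userspace (struct partition *p)
{
  int boots = 1;

  while (!boot_once (p))
    boots++;
  return boots;
}
```

Under these assumptions a fresh partition needs three boots, and every
later boot goes straight through.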

On all subsequent boots, everything will be configured with 512MB of RMA,
so there will be no further CAS reboots from grub. (phyp is super sticky
with the RMA size - it persists even on cold boots. So if you've ever booted
Linux in a partition, you'll probably never have grub call CAS. It'll only
ever fire the first time a partition loads grub, or if you deliberately lower
the amount of memory your partition has below 512MB.)

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
(cherry picked from commit d5571590b7de61887efac1c298901455697ba307)
---
 grub-core/kern/ieee1275/cmain.c  |   5 ++
 grub-core/kern/ieee1275/init.c   | 167 ++++++++++++++++++++++++++++++++++++++-
 include/grub/ieee1275/ieee1275.h |  12 ++-
 3 files changed, 182 insertions(+), 2 deletions(-)

diff --git a/grub-core/kern/ieee1275/cmain.c b/grub-core/kern/ieee1275/cmain.c
index 04df9d2c66..dce7b84922 100644
--- a/grub-core/kern/ieee1275/cmain.c
+++ b/grub-core/kern/ieee1275/cmain.c
@@ -127,6 +127,11 @@ grub_ieee1275_find_options (void)
 	      break;
 	    }
 	}
+
+#if defined(__powerpc__)
+      if (grub_strncmp (tmp, "IBM,", 4) == 0)
+	grub_ieee1275_set_flag (GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY);
+#endif
     }
 
   if (is_smartfirmware)
diff --git a/grub-core/kern/ieee1275/init.c b/grub-core/kern/ieee1275/init.c
index 6581c2c996..8ae405bc79 100644
--- a/grub-core/kern/ieee1275/init.c
+++ b/grub-core/kern/ieee1275/init.c
@@ -202,11 +202,176 @@ heap_init (grub_uint64_t addr, grub_uint64_t len, grub_memory_type_t type,
   return 0;
 }
 
-static void 
+/*
+ * How much memory does OF believe it has? (regardless of whether
+ * it's accessible or not)
+ */
+static grub_err_t
+grub_ieee1275_total_mem (grub_uint64_t *total)
+{
+  grub_ieee1275_phandle_t root;
+  grub_ieee1275_phandle_t memory;
+  grub_uint32_t reg[4];
+  grub_ssize_t reg_size;
+  grub_uint32_t address_cells = 1;
+  grub_uint32_t size_cells = 1;
+  grub_uint64_t size;
+
+  /* If we fail to get to the end, report 0. */
+  *total = 0;
+
+  /* Determine the format of each entry in `reg'.  */
+  if (grub_ieee1275_finddevice ("/", &root))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't find / node");
+  if (grub_ieee1275_get_integer_property (root, "#address-cells", &address_cells,
+					  sizeof (address_cells), 0))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't examine #address-cells");
+  if (grub_ieee1275_get_integer_property (root, "#size-cells", &size_cells,
+					  sizeof (size_cells), 0))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't examine #size-cells");
+
+  if (size_cells > address_cells)
+    address_cells = size_cells;
+
+  /* Load `/memory/reg'.  */
+  if (grub_ieee1275_finddevice ("/memory", &memory))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't find /memory node");
+  if (grub_ieee1275_get_integer_property (memory, "reg", reg,
+					  sizeof (reg), &reg_size))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "couldn't examine /memory/reg property");
+  if (reg_size < 0 || (grub_size_t) reg_size > sizeof (reg))
+    return grub_error (GRUB_ERR_UNKNOWN_DEVICE, "/memory response buffer exceeded");
+
+  if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_BROKEN_ADDRESS_CELLS))
+    {
+      address_cells = 1;
+      size_cells = 1;
+    }
+
+  /* Decode only the size */
+  size = reg[address_cells];
+  if (size_cells == 2)
+    size = (size << 32) | reg[address_cells + 1];
+
+  *total = size;
+
+  return grub_errno;
+}
+
+#if defined(__powerpc__)
+
+/* See PAPR or arch/powerpc/kernel/prom_init.c */
+struct option_vector2
+{
+  grub_uint8_t byte1;
+  grub_uint16_t reserved;
+  grub_uint32_t real_base;
+  grub_uint32_t real_size;
+  grub_uint32_t virt_base;
+  grub_uint32_t virt_size;
+  grub_uint32_t load_base;
+  grub_uint32_t min_rma;
+  grub_uint32_t min_load;
+  grub_uint8_t min_rma_percent;
+  grub_uint8_t max_pft_size;
+} GRUB_PACKED;
+
+struct pvr_entry
+{
+  grub_uint32_t mask;
+  grub_uint32_t entry;
+};
+
+struct cas_vector
+{
+  struct
+  {
+    struct pvr_entry terminal;
+  } pvr_list;
+  grub_uint8_t num_vecs;
+  grub_uint8_t vec1_size;
+  grub_uint8_t vec1;
+  grub_uint8_t vec2_size;
+  struct option_vector2 vec2;
+  grub_uint8_t vec3_size;
+  grub_uint16_t vec3;
+  grub_uint8_t vec4_size;
+  grub_uint16_t vec4;
+} GRUB_PACKED;
+
+/*
+ * Call ibm,client-architecture-support to try to get more RMA.
+ * We ask for 512MB which should be enough to verify a distro kernel.
+ * We ignore most errors: if we don't succeed we'll proceed with whatever
+ * memory we have.
+ */
+static void
+grub_ieee1275_ibm_cas (void)
+{
+  int rc;
+  grub_ieee1275_ihandle_t root;
+  struct cas_args
+  {
+    struct grub_ieee1275_common_hdr common;
+    grub_ieee1275_cell_t method;
+    grub_ieee1275_ihandle_t ihandle;
+    grub_ieee1275_cell_t cas_addr;
+    grub_ieee1275_cell_t result;
+  } args;
+  struct cas_vector vector =
+  {
+    .pvr_list = { { 0x00000000, 0xffffffff } }, /* any processor */
+    .num_vecs = 4 - 1,
+    .vec1_size = 0,
+    .vec1 = 0x80, /* ignore */
+    .vec2_size = 1 + sizeof (struct option_vector2) - 2,
+    .vec2 = {
+      0, 0, -1, -1, -1, -1, -1, 512, -1, 0, 48
+    },
+    .vec3_size = 2 - 1,
+    .vec3 = 0x00e0, /* ask for FP + VMX + DFP but don't halt if unsatisfied */
+    .vec4_size = 2 - 1,
+    .vec4 = 0x0001, /* set required minimum capacity % to the lowest value */
+  };
+
+  INIT_IEEE1275_COMMON (&args.common, "call-method", 3, 2);
+  args.method = (grub_ieee1275_cell_t) "ibm,client-architecture-support";
+  rc = grub_ieee1275_open ("/", &root);
+  if (rc)
+    {
+      grub_error (GRUB_ERR_IO, "could not open root when trying to call CAS");
+      return;
+    }
+  args.ihandle = root;
+  args.cas_addr = (grub_ieee1275_cell_t) &vector;
+
+  grub_printf ("Calling ibm,client-architecture-support from grub...");
+  IEEE1275_CALL_ENTRY_FN (&args);
+  grub_printf ("done\n");
+
+  grub_ieee1275_close (root);
+}
+
+#endif /* __powerpc__ */
+
+static void
 grub_claim_heap (void)
 {
   unsigned long total = 0;
 
+#if defined(__powerpc__)
+  if (grub_ieee1275_test_flag (GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY))
+    {
+      grub_uint64_t rma_size;
+      grub_err_t err;
+
+      err = grub_ieee1275_total_mem (&rma_size);
+      /* if we have an error, don't call CAS, just hope for the best */
+      if (err == GRUB_ERR_NONE && rma_size < (512 * 1024 * 1024))
+	grub_ieee1275_ibm_cas ();
+    }
+#endif
+
   grub_machine_mmap_iterate (heap_init, &total);
 }
 #endif
diff --git a/include/grub/ieee1275/ieee1275.h b/include/grub/ieee1275/ieee1275.h
index 6a1d3e5d70..560c968460 100644
--- a/include/grub/ieee1275/ieee1275.h
+++ b/include/grub/ieee1275/ieee1275.h
@@ -138,7 +138,17 @@ enum grub_ieee1275_flag
 
   GRUB_IEEE1275_FLAG_RAW_DEVNAMES,
   
-  GRUB_IEEE1275_FLAG_DISABLE_VIDEO_SUPPORT
+  GRUB_IEEE1275_FLAG_DISABLE_VIDEO_SUPPORT,
+
+#if defined(__powerpc__)
+  /*
+   * On PFW, the first time we boot a Linux partition, we may only get 256MB of
+   * real memory area, even if the partition has more memory. Set this flag if
+   * we think we're running under PFW. Then, if this flag is set, and the RMA is
+   * only 256MB in size, try asking for more with CAS.
+   */
+  GRUB_IEEE1275_FLAG_CAN_TRY_CAS_FOR_MORE_MEMORY,
+#endif
 };
 
 extern int EXPORT_FUNC(grub_ieee1275_test_flag) (enum grub_ieee1275_flag flag);