From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jeremy Cline <jcline@redhat.com>
Date: Tue, 23 Jul 2019 15:24:30 +0000
Subject: [PATCH] kdump: add support for crashkernel=auto

Rebased for v5.3-rc1 because the documentation has moved.

    Message-id: <20180604013831.574215750@redhat.com>
    Patchwork-id: 8166
    O-Subject: [kernel team] [PATCH RHEL8.0 V2 2/2] kdump: add support for crashkernel=auto
    Bugzilla: 1507353
    RH-Acked-by: Don Zickus <dzickus@redhat.com>
    RH-Acked-by: Baoquan He <bhe@redhat.com>
    RH-Acked-by: Pingfan Liu <piliu@redhat.com>

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1507353
    Build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16534135
    Tested: ppc64le, x86_64 with several memory sizes.
            kdump qe tested 160M on various x86 machines in lab.

    We continue to provide crashkernel=auto like we did in RHEL6
    and RHEL7. This simplifies kdump deployment for the common use
    cases, where kdump just works with the automatically reserved values.
    This is still a best-effort estimate: we cannot know the exact
    memory requirement because it depends on many different factors.

    The implementation of crashkernel=auto is simplified to a wrapper
    around the following kernel command lines:
    x86_64: crashkernel=1G-64G:160M,64G-1T:256M,1T-:512M
    s390x:  crashkernel=4G-64G:160M,64G-1T:256M,1T-:512M
    arm64:  crashkernel=2G-:512M
    ppc64:  crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G

    The difference between this approach and the old implementation in
    RHEL6/7 is that we no longer scale the crash reserved memory size
    linearly with the system memory size.

    The latest effort to move this upstream is in the thread below:
    https://lkml.org/lkml/2018/5/20/262
    Unfortunately it is still unlikely to be accepted, so we will
    continue to carry a RHEL-only patch in RHEL8.

    The old patch description is copied below for historical context:
    '''
        Non-upstream explanations:
        Besides the "crashkernel=X@Y" format, upstream also has the advanced
        "crashkernel=range1:size1[,range2:size2,...][@offset]" and
        "crashkernel=X,high{low}" formats, but they need more careful
        manual configuration, and have different values for different
        architectures.

        Most distributions use the standard "crashkernel=X@Y"
        upstream format, and use the crashkernel range format for advanced
        scenarios, heavily relying on the user's involvement.

        "crashkernel=auto", by contrast, is redhat's special feature. It exists
        and has been used as the default boot cmdline since 2008 rhel6.
        It does not require users to figure out how much crash memory
        their systems need, and it has proved to work
        pretty well for common scenarios.

        "crashkernel=auto" was tested/based on rhel-related products, as
        we have stable kernel configurations, which means more or less
        stable memory consumption. In 2014 we tried to post the patches
        upstream again, but they were NACKed because people considered them
        not general and unnecessary: users can specify their own values or
        do that with scripts. However, our customers insist on having it in rhel.

        Also see one previous discussion related to this backport to Pegas:
        On 10/17/2016 at 10:15 PM, Don Zickus wrote:
        > On Fri, Oct 14, 2016 at 10:57:41AM +0800, Dave Young wrote:
        >> Don, agree with you we should evaluate them instead of just inherit
        >> them blindly. Below is what I think about kdump auto memory:
        >> There are two issues for crashkernel=auto in upstream:
        >> 1) It will be seen as a policy which should not go to kernel
        >> 2) It is hard to get a good number for the crash reserved size,
        >> considering various different kernel config options one can setups.
        >> In RHEL we are easier because our supported Kconfig is limited.
        >> I digged the upstream mail archive, but I'm not sure I got all the
        >> information, at least Michael Ellerman was objecting the series for
        >> 1).
        > Yes, I know.  Vivek and I have argued about this for years.  :-)
        >
        > I had hoped all the changes internally to the makedumpfile would allow
        > the memory configuration to stabilize at a number like 192M or 128M and
        > only in the rare cases extend beyond that.
        >
        > So I always treated that as a temporary hack until things were better.
        > With the hope of every new RHEL release we get smarter and better. :-)
        > Ideally it would be great if we could get the number down to 64M for most
        > cases and just turn it on in Fedora.  Maybe someday.... ;-)
        >
        > We can have this conversation when the patch gets reposted/refreshed
        > for upstream on rhkl?
        >
        > Cheers,
        > Don

        We had proposed to drop the historic crashkernel=auto code and move
        to the crashkernel=range:size format, passing the values in anaconda.

        The initial reasoning was that crashkernel=range:size works just fine,
        because we no longer need a complex algorithm to scale the crashkernel
        reserved size.  The old linear scaling was mainly for old makedumpfile
        requirements and is no longer necessary.

        But with the new approach, backward compatibility is potentially at risk.
        For example, consider the following cases:
        1) When we upgrade from an older distribution like rhel-alt-7.4 (which
        uses crashkernel=auto) to rhel-alt-7.5 (which uses the crashkernel=xY
        format).
        In this case we can use anaconda scripts to check for
        'crashkernel=auto' in the kernel spec and update it to the new
        'crashkernel=range:size' format.
        2) When we upgrade from rhel-alt-7.5 (which uses crashkernel=xY format)
        to rhel-alt-7.6 (which uses crashkernel=xY format), but the x and/or Y
        values are changed in rhel-alt-7.6.
        For example, from crashkernel=2G-:160M to crashkernel=2G-:192M. Then we
        have no way to determine whether the X and/or Y values were
        distribution-provided or user-specified.
        Since it is recommended to give precedence to user-specified values,
        we cannot do an upgrade in such a case.

        Thus we turn back to resolving it in the kernel and add a simpler version
        that simply uses the range:size style in code, which makes the
        rhel-only code easy to maintain.
    '''

    Signed-off-by: Dave Young <dyoung@redhat.com>
    Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Upstream Status: RHEL only
Signed-off-by: Jeremy Cline <jcline@redhat.com>
---
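Reader note, not part of the commit: the range:size strings that
crashkernel=auto expands to are handled by the existing extended-syntax
parser. The userspace sketch below models only the selection step, assuming
a range matches when start <= RAM < end and that an omitted end means
"unbounded". The helper names parse_size() and pick_crash_size() are
hypothetical and are not kernel functions; the kernel's own parser does
more validation than this.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal number with an optional K/M/G/T suffix. */
static unsigned long long parse_size(const char *s, const char **end)
{
	char *p;
	unsigned long long v = strtoull(s, &p, 10);

	switch (*p) {
	case 'T': v <<= 10; /* fall through */
	case 'G': v <<= 10; /* fall through */
	case 'M': v <<= 10; /* fall through */
	case 'K': v <<= 10; p++; break;
	default: break;
	}
	*end = p;
	return v;
}

/*
 * Hypothetical model of range:size selection. Returns the size of the
 * first "start-[end]:size" entry whose range contains system_ram, or 0
 * when no range matches or the string is malformed.
 */
static unsigned long long pick_crash_size(const char *spec,
					  unsigned long long system_ram)
{
	const char *cur = spec;

	while (*cur) {
		unsigned long long start, end = ULLONG_MAX, size;

		start = parse_size(cur, &cur);
		if (*cur != '-')
			return 0;
		cur++;
		if (*cur != ':')		/* an empty end means "no upper bound" */
			end = parse_size(cur, &cur);
		if (*cur != ':')
			return 0;
		cur++;
		size = parse_size(cur, &cur);

		if (system_ram >= start && system_ram < end)
			return size;

		if (*cur == ',')
			cur++;
		else if (*cur)
			return 0;
	}
	return 0;
}
```

With the x86_64 string, a machine with 8G of RAM falls in the 1G-64G range
and gets 160M, while a machine below 1G matches no range and gets no
reservation at all.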
 Documentation/admin-guide/kdump/kdump.rst | 11 +++++++++++
 kernel/crash_core.c                       | 14 ++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 2da65fef2a1c..d53a524f80f0 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -285,6 +285,17 @@ This would mean:
     2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
     3) if the RAM size is larger than 2G, then reserve 128M

+Or you can use crashkernel=auto if you have enough memory.  The threshold
+is 1G on x86_64, 2G on arm64, ppc64 and ppc64le, and 4G on s390x.
+If your system memory is less than the threshold, crashkernel=auto will not
+reserve memory.
+
+The automatically reserved memory size varies by architecture and
+changes with the system memory size as follows:
+    x86_64: 1G-64G:160M,64G-1T:256M,1T-:512M
+    s390x:  4G-64G:160M,64G-1T:256M,1T-:512M
+    arm64:  2G-:512M
+    ppc64:  2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G


 Boot into System Kernel
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index e4dfe2a05a31..8c6f59932247 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -258,6 +258,20 @@ static int __init __parse_crashkernel(char *cmdline,
 	if (suffix)
 		return parse_crashkernel_suffix(ck_cmdline, crash_size,
 				suffix);
+
+	if (strncmp(ck_cmdline, "auto", 4) == 0) {
+#ifdef CONFIG_X86_64
+		ck_cmdline = "1G-64G:160M,64G-1T:256M,1T-:512M";
+#elif defined(CONFIG_S390)
+		ck_cmdline = "4G-64G:160M,64G-1T:256M,1T-:512M";
+#elif defined(CONFIG_ARM64)
+		ck_cmdline = "2G-:512M";
+#elif defined(CONFIG_PPC64)
+		ck_cmdline = "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G";
+#endif
+		pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
+	}
+
 	/*
 	 * if the commandline contains a ':', then that's the extended
 	 * syntax -- if not, it must be the classic syntax
-- 
2.28.0