|
Justin Vreeland |
794d92 |
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jeremy Cline <jcline@redhat.com>
Date: Tue, 23 Jul 2019 15:24:30 +0000
Subject: [PATCH] kdump: add support for crashkernel=auto

Rebased for v5.3-rc1 because the documentation has moved.

Message-id: <20180604013831.574215750@redhat.com>
Patchwork-id: 8166
O-Subject: [kernel team] [PATCH RHEL8.0 V2 2/2] kdump: add support for crashkernel=auto
Bugzilla: 1507353
RH-Acked-by: Don Zickus <dzickus@redhat.com>
RH-Acked-by: Baoquan He <bhe@redhat.com>
RH-Acked-by: Pingfan Liu <piliu@redhat.com>

Bugzilla: https:
Build: https:
Tested: ppc64le, x86_64 with several memory sizes.
kdump QE tested 160M on various x86 machines in the lab.

We continue to provide crashkernel=auto as we did in RHEL6 and
RHEL7. This simplifies kdump deployment for common use cases, where
kdump just works with the automatically reserved values. It is still
a best-effort estimate: we cannot know the exact memory requirement
because it depends on many different factors.

The implementation of crashkernel=auto is simplified to a wrapper
around the following kernel command lines:
x86_64: crashkernel=1G-64G:160M,64G-1T:256M,1T-:512M
s390x: crashkernel=4G-64G:160M,64G-1T:256M,1T-:512M
arm64: crashkernel=2G-:512M
ppc64: crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G
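
As a rough illustration of how a range list like the ones above resolves
to a reservation, here is a minimal userspace sketch in C. It is not the
kernel's parser: parse_size() and pick_crash_size() are hypothetical
helper names, and the matching rule (start inclusive, end exclusive, an
empty end meaning no upper bound) is assumed from the documented
range:size syntax.

```c
#include <stdlib.h>

/* Parse "<number>[K|M|G|T]" into bytes; *end is set past the parsed text. */
static unsigned long long parse_size(const char *s, const char **end)
{
	char *e;
	unsigned long long v = strtoull(s, &e, 0);

	switch (*e) {
	case 'T': case 't':
		v <<= 10; /* fall through */
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		e++;
		break;
	default:
		break;
	}
	*end = e;
	return v;
}

/*
 * Return the reservation in bytes for a machine with ram_bytes of memory,
 * given a list such as "1G-64G:160M,64G-1T:256M,1T-:512M".
 * Returns 0 if no range matches or the list is malformed.
 */
unsigned long long pick_crash_size(const char *list,
				   unsigned long long ram_bytes)
{
	const char *p = list;

	while (*p) {
		unsigned long long start, end = ~0ULL, size;

		start = parse_size(p, &p);
		if (*p != '-')
			return 0;
		p++;
		if (*p != ':')		/* empty end: no upper bound */
			end = parse_size(p, &p);
		if (*p != ':')
			return 0;
		p++;
		size = parse_size(p, &p);

		if (ram_bytes >= start && ram_bytes < end)
			return size;
		if (*p == ',')
			p++;
		else if (*p)
			return 0;	/* unexpected trailing text */
	}
	return 0;
}
```

With the x86_64 list, pick_crash_size(list, 8ULL << 30) yields 160M in
bytes, while a 512M machine matches no range and yields 0, which
corresponds to crashkernel=auto reserving nothing below the threshold.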

The difference from the old RHEL6/7 implementation is that the crash
reserved memory size is no longer scaled linearly with the system memory
size; it is taken from the fixed ranges listed above.

The latest effort to move this upstream is in the thread below:
https://lkml.org/lkml/2018/5/20/262
Unfortunately it is still unlikely to be accepted, so we will continue
to carry a RHEL-only patch in RHEL8.

The old patch description, with the historical reasons, is copied below:
'''
Non-upstream explanations:
Besides the "crashkernel=X@Y" format, upstream also has the advanced
"crashkernel=range1:size1[,range2:size2,...][@offset]" and
"crashkernel=X,high{low}" formats, but they need more careful manual
configuration and different values on different architectures.

Most distributions use the standard upstream "crashkernel=X@Y" format
and use the crashkernel range format for advanced scenarios, relying
heavily on the user's involvement.

"crashkernel=auto", by contrast, is a Red Hat-specific feature. It has
existed and been used as the default boot cmdline since 2008 (RHEL6).
Users do not have to work out how much crash memory their systems need,
and it has proved to work well for common scenarios.

"crashkernel=auto" was tested on and tuned for RHEL-related products,
where stable kernel configurations mean more or less stable memory
consumption. In 2014 we tried to post it upstream again, but it was
NACKed because people considered it not general and unnecessary: users
can specify their own values or do so with scripts. However, our
customers insist on having it in RHEL.

Also see one previous discussion related to this backport to Pegas:
On 10/17/2016 at 10:15 PM, Don Zickus wrote:
> On Fri, Oct 14, 2016 at 10:57:41AM +0800, Dave Young wrote:
>> Don, agree with you we should evaluate them instead of just inherit
>> them blindly. Below is what I think about kdump auto memory:
>> There are two issues for crashkernel=auto in upstream:
>> 1) It will be seen as a policy which should not go to kernel
>> 2) It is hard to get a good number for the crash reserved size,
>> considering various different kernel config options one can setups.
>> In RHEL we are easier because our supported Kconfig is limited.
>> I digged the upstream mail archive, but I'm not sure I got all the
>> information, at least Michael Ellerman was objecting the series for
>> 1).
> Yes, I know. Vivek and I have argued about this for years. :-)
>
> I had hoped all the changes internally to the makedumpfile would allow
> the memory configuration to stabilize at a number like 192M or 128M and
> only in the rare cases extend beyond that.
>
> So I always treated that as a temporary hack until things were better.
> With the hope of every new RHEL release we get smarter and better. :-)
> Ideally it would be great if we could get the number down to 64M for most
> cases and just turn it on in Fedora. Maybe someday.... ;-)
>
> We can have this conversation when the patch gets reposted/refreshed
> for upstream on rhkl?
>
> Cheers,
> Don

We had proposed dropping the historic crashkernel=auto code and moving
to the crashkernel=range:size format, passed in by anaconda.

The initial reasoning was that crashkernel=range:size works just fine,
because we no longer need a complex algorithm to scale the crashkernel
reserved size. The old linear scaling was mainly for old makedumpfile
requirements and is no longer necessary.

But with the new approach, backward compatibility is potentially at risk.
For example, consider the following cases:
1) When we upgrade from an older distribution like rhel-alt-7.4 (which
uses crashkernel=auto) to rhel-alt-7.5 (which uses the crashkernel=xY
format).
In this case we can use anaconda scripts to check for
'crashkernel=auto' in the kernel spec and update it to the new
'crashkernel=range:size' format.
2) When we upgrade from rhel-alt-7.5 (which uses the crashkernel=xY
format) to rhel-alt-7.6 (which uses the crashkernel=xY format), but the
x and/or Y values have changed in rhel-alt-7.6.
For example, if the value changes from crashkernel=2G-:160M to
crashkernel=2G-:192M, we have no way to determine whether the x and/or Y
values were distribution-provided or user-specified.
Since it is recommended to give precedence to user-specified values,
we cannot do an upgrade in such a case.

Thus we turn back to resolving it in the kernel and add a simpler
version that just hacks in the range:size style in code, which makes the
RHEL-only code easy to maintain.
'''

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Upstream Status: RHEL only
Signed-off-by: Jeremy Cline <jcline@redhat.com>

 Documentation/admin-guide/kdump/kdump.rst | 11 +++++++++++
 kernel/crash_core.c                       | 14 ++++++++++++++
 2 files changed, 25 insertions(+)

diff
index 2da65fef2a1c..d53a524f80f0 100644


@@ -285,6 +285,17 @@ This would mean:
     2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
     3) if the RAM size is larger than 2G, then reserve 128M
 
+Or you can use crashkernel=auto if you have enough memory. The threshold
+is 1G on x86_64, 2G on arm64, ppc64 and ppc64le, and 4G on s390x.
+If your system memory is below the threshold, crashkernel=auto will not
+reserve any memory.
+
+The automatically reserved memory size varies by architecture.
+The size changes with the system memory size as shown below:
+    x86_64: 1G-64G:160M,64G-1T:256M,1T-:512M
+    s390x:  4G-64G:160M,64G-1T:256M,1T-:512M
+    arm64:  2G-:512M
+    ppc64:  2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G
 
 
 Boot into System Kernel
diff
index e4dfe2a05a31..8c6f59932247 100644


@@ -258,6 +258,20 @@ static int __init __parse_crashkernel(char *cmdline,
 	if (suffix)
 		return parse_crashkernel_suffix(ck_cmdline, crash_size,
 				suffix);
+
+	if (strncmp(ck_cmdline, "auto", 4) == 0) {
+#ifdef CONFIG_X86_64
+		ck_cmdline = "1G-64G:160M,64G-1T:256M,1T-:512M";
+#elif defined(CONFIG_S390)
+		ck_cmdline = "4G-64G:160M,64G-1T:256M,1T-:512M";
+#elif defined(CONFIG_ARM64)
+		ck_cmdline = "2G-:512M";
+#elif defined(CONFIG_PPC64)
+		ck_cmdline = "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G";
+#endif
+		pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
+	}
+
 	/*
 	 * if the commandline contains a ':', then that's the extended
 	 * syntax -- if not, it must be the classic syntax
--
2.28.0