=================
Kexec/Kdump HOWTO
=================


Introduction
============

Kexec and kdump are features introduced in the 2.6 mainline kernel and have
been included since Red Hat Enterprise Linux 5. Their purpose is to provide
faster reboots and reliable kernel vmcores for diagnostic purposes.


Overview
========

Kexec
-----

Kexec is a fast-boot mechanism that boots a Linux kernel from the context of
an already running kernel without going through the BIOS. BIOS initialization
can be very time consuming, especially on big servers with many peripherals,
so kexec can save a lot of time for developers who reboot a machine numerous
times.

Kdump
-----

Kdump is a kernel crash dumping mechanism. It is very reliable because the
crash dump is captured from the context of a freshly booted kernel and not
from the context of the crashed kernel. Kdump uses kexec to boot into a
second kernel whenever the system crashes. This second kernel, often called
the capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses
to boot. Kexec boots the capture kernel without going through the BIOS, so
the contents of the first kernel's memory are preserved. That preserved
memory is essentially the kernel crash dump.

Kdump is supported on the i686, x86_64, ia64 and ppc64 platforms. The
standard kernel and the capture kernel are one and the same on all of these
platforms.

If you're reading this document, you should already have kexec-tools
installed. If not, install it with the following command:

    # dnf install kexec-tools

Now load a kernel with kexec:

    # kver=`uname -r`
    # kexec -l /boot/vmlinuz-$kver \
        --initrd=/boot/initramfs-$kver.img \
        --command-line="`cat /proc/cmdline`"

NOTE: The above will boot you back into the kernel you're currently running.
If you want to load a different kernel, substitute its version in place of
`uname -r`.

Now reboot your system, taking note that it should bypass the BIOS:

    # reboot

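If you want to double-check the load command before running it, a small
helper along the following lines can print it for any installed kernel. This
is only a sketch: the function name kexec_load_cmd is made up for this
example, and it assumes the initramfs is named /boot/initramfs-<version>.img,
which may differ on your system.

```shell
# Print (do not run) the kexec load command for a given kernel version.
# Hypothetical helper; the initramfs naming below is an assumption.
# Usage: kexec_load_cmd <kernel-version> <kernel-command-line>
kexec_load_cmd() (
    kver=$1
    cmdline=$2
    printf 'kexec -l /boot/vmlinuz-%s --initrd=/boot/initramfs-%s.img --command-line="%s"\n' \
        "$kver" "$kver" "$cmdline"
)

# Example: show the command for the running kernel.
# kexec_load_cmd "$(uname -r)" "$(cat /proc/cmdline)"
```

After a kernel has been loaded, /sys/kernel/kexec_loaded reads 1, and
'kexec -u' unloads it again if you change your mind.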

How to configure kdump
======================

Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, install it with the following command:

    # dnf install kexec-tools

To do much of anything interesting in the way of debug analysis, you'll also
need to install the kernel-debuginfo package, for the same architecture as
your running kernel, and the crash utility:

    # dnf --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we need to reserve a chunk of memory for the capture kernel. To use
the default crashkernel value, you can use kdumpctl:

    # kdumpctl reset-crashkernel --kernel=/boot/vmlinuz-`uname -r`

If the default value does not work for your setup, you can use

    # grubby --args="crashkernel=256M" --update-kernel=/boot/vmlinuz-`uname -r`

to specify a larger value, in this case 256M. You need to experiment to
find the best value for your setup. As a starting point,

    # kdumpctl estimate

gives you an estimate of the crashkernel value based on the currently
running kernel. For more details, please refer to the "Estimate crashkernel"
section in /usr/share/doc/kexec-tools/crashkernel-howto.txt.

Note that there is an alternative form for specifying the crashkernel
memory reservation, in the event that more control is needed over the size and
placement of the reserved memory. The format is:

    crashkernel=range1:size1[,range2:size2,...][@offset]

Here range<n> specifies a range of values that is matched against the amount
of physical RAM present in the system, and the corresponding size<n> value
specifies the amount of memory to reserve for the capture kernel. For example:

    crashkernel=512M-2G:64M,2G-:128M

This line tells the kernel to reserve 64M of RAM if the system contains
between 512M and 2G of physical memory. If the system contains 2G or more of
physical memory, 128M is reserved. With less than 512M, nothing is reserved.

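To see which size a range-based value would select on a given machine, the
matching rule can be sketched in shell as follows. The function names are
made up for this example, only the M and G suffixes are handled, and the
kernel's own parser remains the authority. A range is treated as matching
when start <= RAM < end.

```shell
# Convert a size such as 512M or 2G to megabytes (only M and G are handled).
to_mb() (
    case $1 in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  echo "unsupported size: $1" >&2; exit 1 ;;
    esac
)

# Print the size that a range-based crashkernel value selects for a machine
# with the given amount of RAM in megabytes. Prints nothing if no range matches.
# Usage: crashkernel_size <spec> <ram-in-MB>
crashkernel_size() (
    spec=${1%%@*}          # drop an optional @offset
    ram_mb=$2
    IFS=,                  # split the spec into range:size entries
    for entry in $spec; do
        range=${entry%%:*}
        size=${entry#*:}
        start=${range%%-*}
        end=${range#*-}
        start_mb=$(to_mb "$start") || exit 1
        if [ "$ram_mb" -lt "$start_mb" ]; then
            continue
        fi
        # An empty end means the range is open-ended.
        if [ -z "$end" ] || [ "$ram_mb" -lt "$(to_mb "$end")" ]; then
            echo "$size"
            exit 0
        fi
    done
)

# Example: crashkernel_size '512M-2G:64M,2G-:128M' 1024
```

With the example value from above, a machine with 1024 MB of RAM falls into
the first range and one with 8192 MB into the second.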

In addition, if KASLR is enabled, kdump needs to read /proc/kallsyms while
loading the capture kernel. Check /proc/sys/kernel/kptr_restrict to make sure
that the contents of /proc/kallsyms are exposed correctly. We recommend
setting kptr_restrict to '1'; otherwise loading the capture kernel could fail.

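A quick way to check the current setting is sketched below. The function
name check_kptr_restrict is made up for this example, and the optional file
argument exists only so the check can be tried against a test file.

```shell
# Report whether kptr_restrict has the value recommended in this document.
# The optional argument overrides the file to read, which helps with testing.
check_kptr_restrict() (
    file=${1:-/proc/sys/kernel/kptr_restrict}
    value=$(cat "$file") || exit 2
    if [ "$value" = "1" ]; then
        echo "kptr_restrict=$value (recommended)"
    else
        echo "kptr_restrict=$value (recommended value is 1)"
        exit 1
    fi
)

# Example: check_kptr_restrict
```

If the check fails, 'sysctl -w kernel.kptr_restrict=1' (as root) changes the
value for the running system.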

After making these changes, reboot your system so that the X MB of memory is
left untouched by the normal system and reserved for the capture kernel. Note
that the output of 'free -m' will show X MB less memory than without this
parameter, which is expected. You may be able to get by with less than 128M,
but testing with only 64M has proven unreliable of late. On ia64, as much as
512M may be required.

Now that you've got the reserved memory region set up, enable the kdump
service so that it starts on every boot:

    # systemctl enable kdump.service

Then, start up kdump as well:

    # systemctl start kdump.service

This should load the capture kernel via kexec, leaving the system ready
to capture a vmcore upon crashing. To test this out, you can force-crash
your system by echoing a 'c' into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system restarting into
the kdump kernel. When the boot process gets to the point where it starts
the kdump service, your vmcore should be copied out to disk (by default,
in /var/crash/<YYYY-MM-DD-HH:MM>/vmcore), and then the system reboots back
into your normal kernel.

Once back in your normal kernel, you can use the previously installed crash
utility together with the previously installed kernel-debuginfo to perform
postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
        /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...

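The paths in the example above are specific to one old kernel and one crash.
The sketch below finds the newest vmcore under a crash directory and prints
the matching crash invocation. Both function names are made up for this
example, and it assumes the vmcore sits exactly one directory below the crash
directory and that the debuginfo for the crashed kernel version is installed.

```shell
# Print the most recently modified vmcore found one level below a crash
# directory (default: /var/crash). Prints nothing if none exists.
latest_vmcore() (
    dir=${1:-/var/crash}
    newest=
    for core in "$dir"/*/vmcore; do
        [ -e "$core" ] || continue
        if [ -z "$newest" ] || [ "$core" -nt "$newest" ]; then
            newest=$core
        fi
    done
    [ -n "$newest" ] && echo "$newest"
)

# Print the crash invocation for a vmcore written by the given kernel version.
# Usage: crash_cmd <kernel-version> <vmcore>
crash_cmd() (
    echo "crash /usr/lib/debug/lib/modules/$1/vmlinux $2"
)

# Example, if the crashed kernel is the one running now:
# crash_cmd "$(uname -r)" "$(latest_vmcore)"
```

Remember that the vmlinux passed to crash must belong to the kernel that
crashed, which is not necessarily the kernel you are running now.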

Notes on kdump
==============

When kdump starts, the kdump kernel is loaded together with the kdump
initramfs. To save memory and disk space, the kdump initramfs is generated
strictly for the system it will run on, and contains the minimum set of
kernel modules and utilities needed to boot the machine to a stage where the
dump target can be mounted.

With the kdump service enabled, kdumpctl will try to detect system changes
and rebuild the kdump initramfs if needed, but it cannot guarantee to cover
every possible case. So after a hardware change, disk migration, storage
setup update or any similar system-level change, it is highly recommended to
rebuild the initramfs manually with the following command:

    # kdumpctl rebuild


Saving vmcore-dmesg.txt
=======================

The kernel log buffer is one of the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffer is
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt.
The vmcore itself is saved afterwards. The destination disk and directory
for vmcore-dmesg.txt are the same as for the vmcore. Note that the kernel
log buffer will not be available if the dump target is a raw device.


Dump Triggering methods
=======================

This section describes the various ways, other than a kernel panic, in which
Kdump can be triggered. The following methods assume that Kdump is configured
on your system, with the service enabled as described in the sections above.

1) AltSysRq C

Kdump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

http://kbase.redhat.com/faq/FAQ_43_5559.shtm

In addition, on PowerPC boxes, Kdump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.

2) NMI_WATCHDOG

If a machine has a hard hang, it is quite possible that it does not respond
to keyboard interrupts, so the 'Alt-SysRq' keys will not help trigger a dump.
In such scenarios the NMI watchdog feature can prove useful. The following
link has more details on configuring the NMI watchdog option:

http://kbase.redhat.com/faq/FAQ_85_9129.shtm

Once this feature has been enabled in the kernel, any lockup will result in
an OOPs message being generated, followed by Kdump being triggered.

3) Kernel OOPs

If we want to generate a dump every time the kernel OOPses, we can achieve
this by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

This is enabled by default on RHEL5.

4) NMI (Non-maskable interrupt) button

In cases where the system is hung and is not accepting keyboard interrupts,
using the NMI button to trigger Kdump can be very useful. An NMI button is
present on most newer x86 and x86_64 machines. Please refer to the user
guides or manuals to locate the button, though it is often not very well
documented. In most cases it is hidden behind a small hole on the front or
back panel of the machine. You could use a toothpick or some other
non-conducting probe to press the button.

For example, on the IBM X series 366 machine, the NMI button is located behind
a small hole on the bottom center of the rear panel.

To enable this method of dump triggering, you will need to set the
'unknown_nmi_panic' option as follows:

    # echo 1 > /proc/sys/kernel/unknown_nmi_panic

5) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with the
CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON
and boot the kernel with the xmon=on option.

The following are ways to remotely issue a soft reset on PowerPC boxes, which
drops you into XMON. Pressing 'X' (capital letter X) followed by 'Enter' there
will trigger the dump.

5.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard input.

Once you have the HMC configured, the following steps will enable you to
trigger Kdump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

5.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under "Blade
Tasks" in the Blade Management Console. Select the blade for which you want
to initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.

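The echo commands shown for methods 3 and 4 above take effect immediately but
do not survive a reboot. One way to keep both settings is a sysctl.d
fragment, sketched here. The function name and the file name in the usage
comment are made up for this example; the sysctl keys simply mirror the
/proc/sys/kernel paths used above.

```shell
# Print a sysctl.d fragment that keeps both dump triggers enabled across
# reboots. The key names mirror the /proc/sys/kernel paths used above.
kdump_trigger_sysctls() {
    printf '%s\n' \
        'kernel.panic_on_oops = 1' \
        'kernel.unknown_nmi_panic = 1'
}

# Example (as root; the file name is an assumption):
# kdump_trigger_sysctls > /etc/sysctl.d/90-kdump-triggers.conf && sysctl --system
```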

Dump targets
============

In addition to being able to capture a vmcore to your system's local file
system, kdump can be configured to capture a vmcore to a number of other
locations, including a raw disk partition, a dedicated file system, an NFS
mounted file system, or a remote system via ssh/scp. Additional options
exist for specifying the relative path under which the dump is captured,
what to do if the capture fails, and for compressing and filtering the dump
to produce smaller, more manageable vmcore files (see "Advanced Setups"
for more detail on these options).

In theory, dumping to a location other than the local file system should be
safer than kdump's default setup, as it's possible the default setup will try
dumping to a file system that has become corrupted. The raw disk partition and
dedicated file system options allow you to still dump to the local system,
but without having to remount your possibly corrupted file system(s),
thereby decreasing the chance that a vmcore won't be captured. Dumping to an
NFS server or remote system via ssh/scp also has this advantage, and also
allows you to centralize vmcore files if you have several systems from which
you'd like to obtain them. Note, however, that these configurations can
present problems if your network is unreliable.

Kdump targets and advanced setups are configured by editing /etc/kdump.conf,
which is fairly well documented out of the box. Any change to
/etc/kdump.conf should be followed by a restart of the kdump service, so that
the change can be incorporated into the kdump initrd. Restarting the kdump
service is as simple as '/sbin/systemctl restart kdump.service'.

There are two ways to configure the dump target: using only "path", or
specifying the dump target explicitly. The interpretation of "path" differs
between the two styles.

Config dump target only using "path"
------------------------------------

You can change the dump target by setting "path" to a mount point where the
dump target is mounted. When there is no explicitly configured dump target,
"path" in kdump.conf represents the current file system path in which the
vmcore will be saved. Kdump will automatically detect the underlying device
of "path" and use that as the dump target.

In fact, upon dump, kdump creates a directory $hostip-$date within "path"
and saves the vmcore there. So in practice the dump is saved in
$path/$hostip-$date/.

Kdump only checks the current mount status for the mount entry corresponding
to "path". So please ensure the dump target is mounted on "path" before the
kdump service starts.

NOTES:

- It's strongly recommended to put a mount entry for "path" in /etc/fstab
  and have it auto mounted on boot. This makes sure the dump target is
  reachable from the machine and kdump's configuration is stable.

EXAMPLES:

- path /var/crash/

  This is the default configuration. Assuming there is no disk mounted
  on /var/ or on /var/crash, the dump will be saved on the disk backing the
  root file system, in the directory /var/crash.

- path /var/crash/ (A separate disk mounted on /var/crash)

  Say a disk /dev/sdb is mounted on /var/crash. In this case the dump target
  will become /dev/sdb, path will become "/", and the dump will be saved in
  the "sdb:/" directory.

- path /var/crash/ (NFS mounted on /var)

  Say foo.com:/export/tmp is mounted on /var. In this case the dump target is
  the NFS server, path will be adjusted to "/crash", and the dump will be
  saved to the foo.com:/export/tmp/crash/ directory.

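The path adjustment in the NFS example above amounts to stripping the mount
point from the front of "path". A minimal sketch of that rule follows. The
function name path_within_mount is made up for this example, and the real
logic in kdump may handle more cases than this.

```shell
# Print "path" relative to the mount point that contains it, the way the
# NFS example above turns /var/crash into /crash. Trailing slashes are ignored.
# Usage: path_within_mount <path> <mountpoint>
path_within_mount() (
    path=${1%/}
    mnt=${2%/}
    case $path in
        "$mnt")   echo "/" ;;
        "$mnt"/*) echo "${path#"$mnt"}" ;;
        *)        echo "error: $1 is not below $2" >&2; exit 1 ;;
    esac
)

# Example: path_within_mount /var/crash/ /var
```

When nothing separate is mounted below the root file system, as in the
default example, the mount point is "/" and the path comes back unchanged.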

Config dump target explicitly
-----------------------------

You can set the dump target explicitly in kdump.conf, in which case "path"
is interpreted relative to the specified dump target. For example, if the
dump target is "ext4 /dev/sda", then the dump will be saved in the "path"
directory on /dev/sda.

The same applies to NFS dumps. If the user specifies
"nfs foo.com:/export/tmp/" as the dump target, then with the default "path"
the dump will effectively be saved in the "foo.com:/export/tmp/var/crash/"
directory.

If the dump target is "raw", then "path" is ignored.

If it's a file system target, kdump needs to know the right mount options.
Kdump checks the current mount status, and then /etc/fstab, for mount options
corresponding to the specified dump target and uses them. If the dump target
requires special mount options, they can be set by putting an entry in
/etc/fstab.

If there is no related mount entry, the mount option is set to "defaults".

NOTES:

- It's recommended to put an entry for the dump target in /etc/fstab
  and have it auto mounted on boot. This makes sure the dump target is
  reachable from the machine and kdump won't fail.

- Kdump ignores some mount options, including "noauto" and "ro". This
  makes it possible to keep the dump target unmounted or read-only
  when not in use.

EXAMPLES:

- ext4 /dev/sda (mounted)
  path /var/crash/

  In this case the dump target is set to /dev/sda, path is "/var/crash"
  relative to the root of /dev/sda, and the vmcore will be saved in the
  "sda:/var/crash" directory.

- nfs foo.com:/export/tmp (mounted)
  path /var/crash/

  In this case the dump target is the NFS server, path is "/var/crash"
  relative to the export, and the vmcore will be saved in the
  "foo.com:/export/tmp/var/crash/" directory.

- nfs foo.com:/export/tmp (not mounted)
  path /var/crash/

  Same as the case above; kdump will use "defaults" as the mount option
  for the dump target.

- nfs foo.com:/export/tmp (not mounted, entry with option "noauto,nolock" exists in /etc/fstab)
  path /var/crash/

  In this case the dump target is the NFS server, the vmcore will be saved in
  the "foo.com:/export/tmp/var/crash/" directory, and kdump will inherit the
  "nolock" option.

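The option inheritance in the last example can be pictured as a simple
filter over the fstab option list. This is only a sketch: the function name
is made up, it drops just the two options named in the notes above (the
notes say "including", so the real list may be longer), and the fallback to
"defaults" when nothing is left is an assumption of this example.

```shell
# Drop the options this document says kdump ignores ("noauto" and "ro") from
# a comma-separated mount option list. Falls back to "defaults" when nothing
# is left; that fallback is an assumption of this sketch.
# Usage: kdump_mount_opts <comma-separated-options>
kdump_mount_opts() (
    kept=
    IFS=,
    for opt in $1; do
        case $opt in
            noauto|ro) ;;
            *) kept=${kept:+$kept,}$opt ;;
        esac
    done
    echo "${kept:-defaults}"
)

# Example: kdump_mount_opts noauto,nolock
```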

Dump target and mkdumprd
------------------------

Mkdumprd is the tool used to create the kdump initramfs, and it may change
the mount status of the dump target under some conditions.

Usually the dump target should be used only for kdump. If you are worried
about someone using the file system for something other than dumping vmcores,
you can mount it read-only or make it a noauto mount. Mkdumprd will
mount or remount it read-write to create the dump directory and will move it
back to its original state afterwards.

Supported dump target types and requirements
--------------------------------------------

1) Raw partition

Raw partition dumping requires that a disk partition in the system, at least
as large as the amount of memory in the system, be left unformatted. Assuming
/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with
'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly
onto the partition /dev/vg/lv_kdump. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd. The dump target should be a persistent device name, such as an LVM or
device mapper canonical name.

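The size requirement above is easy to check before committing to a
partition. The following sketch compares a target size with the MemTotal
value from /proc/meminfo. Both function names are made up for this example,
and sizes are compared in kB.

```shell
# Read MemTotal (in kB) from a meminfo-style file (default: /proc/meminfo).
mem_total_kb() {
    awk '$1 == "MemTotal:" { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed if a target of the given size (kB) can hold a dump of all of RAM (kB).
# Usage: target_fits_memory <target-kB> <memory-kB>
target_fits_memory() {
    [ "$1" -ge "$2" ]
}

# Example (as root), using the device name from the text above:
# size_kb=$(( $(blockdev --getsize64 /dev/vg/lv_kdump) / 1024 ))
# target_fits_memory "$size_kb" "$(mem_total_kb)" && echo "large enough"
```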

2) Dedicated file system

Similar to raw partition dumping, you can format a partition with the file
system of your choice. Again, it should be at least as large as the amount
of memory in the system. Assuming /dev/vg/lv_kdump has been formatted as
ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a vmcore file will
be copied onto the file system after it has been mounted. Dumping to a
dedicated partition has the advantage that you can dump multiple vmcores to
the file system, space permitting, without overwriting previous ones, as
would be the case in a raw partition setup. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd.

Note that for local file systems, ext4 and ext2 are supported as dump
targets. Kdump will not prevent you from specifying other file systems, and
they will most likely work, but their operation cannot be guaranteed. For
instance, specifying a vfat or msdos file system will result in a successful
load of the kdump service, but during crash recovery the dump will fail if
the system has more than 2GB of memory (since vfat and msdos file systems do
not support files larger than 2GB). Choose your file system carefully when
using this target.

It is recommended to use persistent device names or UUID/LABEL for file system
dumps. One example of a persistent device name is /dev/vg/<devname>.

3) NFS mount

Dumping over NFS requires an NFS server configured to export a file system
with full read/write access for the root user. All operations within the
kdump initial ramdisk are done as root, and to write out a vmcore file, we
must be able to write to the NFS mount. Configuring an NFS server is outside
the scope of this document, but the no_root_squash or anonuid options on the
NFS server side are likely of interest for permitting the kdump initrd to
write to the NFS mount as root.

Assuming you're exporting /dump on the machine nfs-server.example.com,
once the mount is properly configured, specify it in kdump.conf via
'nfs nfs-server.example.com:/dump'. The server portion can be specified either
by host name or IP address. Following a system crash, the kdump initrd will
mount the NFS export and copy the vmcore out to your NFS server. Restart the
kdump service via '/sbin/systemctl restart kdump.service' to commit this change
to your kdump initrd.

4) Special mount via "dracut_args"

You can use "dracut_args" to pass "--mount" to kdump; see the dracut manpage
for details on the format of "--mount". If any "--mount" is specified via
"dracut_args", kdump will build it in as the mount target without doing any
validation (such as mounting it or checking mount options, file system size,
save path, etc.), so you must test it yourself to ensure it is correct. You
cannot use other targets in /etc/kdump.conf if you use "--mount" in
"dracut_args". You also cannot specify multiple "--mount" targets via
"dracut_args".

One use case for "--mount" in "dracut_args" is when you do not want to mount
the dump target before the kdump service starts, for example to reduce the
load on a shared NFS server, as in the example below:

    dracut_args --mount "192.168.1.1:/share /mnt/test nfs4 defaults"

NOTE:
- <mountpoint> must be specified as an absolute path.

5) Remote system via ssh/scp

Dumping over ssh/scp requires setting up passwordless ssh keys for every
machine you wish to have dump via this method. First, configure kdump.conf
for ssh/scp dumping by adding a config line of 'ssh user@server', where 'user'
can be any user on the target system you choose, and 'server' is the host
name or IP address of the target system. Using a dedicated, restricted user
account on the target system is recommended, as there will be keyless ssh
access to this account.

Once kdump.conf is appropriately configured, issue the command
'kdumpctl propagate' to automatically set up the ssh host keys and transmit
the necessary bits to the target server. You'll have to type 'yes' to accept
the host key of your target server if this is the first time you've connected
to it, and then enter the target system user's password to send over the
necessary ssh key file. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd.

Advanced Setups
===============

About /etc/sysconfig/kdump
--------------------------

There are a few options in /etc/sysconfig/kdump that control the behavior of
the kdump kernel. All of these options have default values, which usually do
not need to be changed, but sometimes you may want to modify them to better
control the behavior of the kdump kernel, for example for debugging.

DistroBaker 624a64
-KDUMP_BOOTDIR

Usually the kdump kernel is the same as the first kernel, so kdump tries to
find the kdump kernel under /boot according to /proc/cmdline. For example,
suppose the following command produces this output:

	cat /proc/cmdline
	BOOT_IMAGE=/xxx/vmlinuz-3.yyy.zzz  root=xxxx .....

Then the kdump kernel will be /boot/xxx/vmlinuz-3.yyy.zzz. Set this option
if the kdump kernel is located in a different directory.

-KDUMP_IMG

This represents the image type used for kdump. The default value is "vmlinuz".

-KDUMP_IMG_EXT

This represents the image's extension. Relocatable kernels don't have one.
Currently, it is an empty string by default.

-KEXEC_ARGS

Any additional kexec arguments required. For example:
KEXEC_ARGS="--elf32-core-headers".

In most situations, this should be left empty. To get additional debugging
information about kexec loading, add the '-d' option.

-KDUMP_KERNELVER

This is a kernel version string for the kdump kernel. If the version is not
specified, the init script will try to find a kdump kernel with the same
version number as the running kernel.

-KDUMP_COMMANDLINE

The value of 'KDUMP_COMMANDLINE' is passed to the kdump kernel as its command
line parameters. It will likely match the contents of the grub kernel line.

If no command line is specified, that is, if the value is an empty string
such as KDUMP_COMMANDLINE="", the default is taken automatically from
'/proc/cmdline'.

-KDUMP_COMMANDLINE_REMOVE

This option removes arguments from the current kdump command line. If no
value is specified for KDUMP_COMMANDLINE, the kdump kernel inherits all
values from '/proc/cmdline', which is usually not what is wanted: some
default kernel parameters can affect kdump, or even cause the kdump kernel
to fail to boot.

The option is also helpful for debugging the kdump kernel, since it can be
used to change the kdump kernel command line.

For more kernel parameters, please refer to the kernel documentation.

-KDUMP_COMMANDLINE_APPEND

This option appends arguments to the current kdump command line after it has
been processed by KDUMP_COMMANDLINE_REMOVE. For the kdump kernel, some
features need to be disabled with parameters such as mce, cgroup, numa and
hest_disable. These features may waste memory, or the kdump kernel does not
need them, and they may even affect the kdump kernel boot.

Like the option above, it can be used to disable or enable kernel features
in order to rule them out as the cause of kdump kernel errors, which is
very useful for debugging.

-KDUMP_STDLOGLVL | KDUMP_SYSLOGLVL | KDUMP_KMSGLOGLVL

These variables control the kdump log level in the first kernel. In the
second kernel, kdump uses the rd.kdumploglvl option, set in
KDUMP_COMMANDLINE_APPEND above, to set the log level.

Logging levels: no logging(0), error(1), warn(2), info(3), debug(4)

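The way KDUMP_COMMANDLINE, KDUMP_COMMANDLINE_REMOVE and
KDUMP_COMMANDLINE_APPEND combine can be illustrated with a small shell
sketch. This is not the code kdumpctl uses; the function name
build_kdump_cmdline and its three positional arguments are assumptions made
for illustration. It drops every base argument whose name matches an entry
in the remove list (both "name" and "name=value" forms), then appends the
extra arguments.

```shell
# Illustrative sketch only, not the actual kdumpctl implementation.
# Usage: build_kdump_cmdline BASE REMOVE APPEND
#   BASE   - starting command line (e.g. the contents of /proc/cmdline)
#   REMOVE - space-separated argument names to drop
#   APPEND - arguments to add at the end
build_kdump_cmdline() {
    base=$1
    remove=$2
    append=$3
    out=
    for arg in $base; do
        keep=1
        for r in $remove; do
            case $arg in
                # Match both the bare "name" and the "name=value" form.
                "$r"|"$r"=*) keep=0 ;;
            esac
        done
        if [ "$keep" -eq 1 ]; then
            out="${out:+$out }$arg"
        fi
    done
    if [ -n "$append" ]; then
        out="${out:+$out }$append"
    fi
    printf '%s\n' "$out"
}
```

For example, calling it with the contents of /proc/cmdline as BASE,
"quiet crashkernel" as REMOVE and "nr_cpus=1 novmcoredd" as APPEND prints
the resulting command line without the "quiet" and "crashkernel=..."
arguments and with the two appended options at the end.
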
Kdump Post-Capture Executable
-----------------------------

It is possible to specify a custom script or binary you wish to run following
an attempt to capture a vmcore. The executable is passed an exit code from
the capture process, which can be used to trigger different actions from
within your post-capture executable.
If the /etc/kdump/post.d directory exists, all files in the directory are
collectively sorted and executed in lexical order, before the binary or
script specified by the kdump_post parameter is executed.

In these scripts, references to storage or network devices should adhere
to the section 'Supported dump target types and requirements'.

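As a minimal sketch of a post-capture executable, the script below branches
on the exit code it receives as its first argument. The function name
capture_summary and the message texts are made up for illustration; a real
script would typically notify someone or copy files instead of printing.

```shell
#!/bin/sh
# Hypothetical kdump_post script (illustrative only).
# $1 is the exit code of the vmcore capture process.

# Print a one-line summary for the given capture exit code.
capture_summary() {
    case $1 in
        0) echo "kdump: vmcore captured successfully" ;;
        *) echo "kdump: vmcore capture failed (status $1)" ;;
    esac
}

# Only act when an exit code was actually passed in.
if [ "$#" -gt 0 ]; then
    capture_summary "$1"
fi
```

Such a script would be referenced from kdump.conf with the kdump_post
parameter and the full path to the script.
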
Kdump Pre-Capture Executable
----------------------------

It is possible to specify a custom script or binary you wish to run before
capturing a vmcore. The exit status of this binary is interpreted as follows:
0 - continue with the dump process as usual
non 0 - run the final action (reboot/poweroff/halt)
If the /etc/kdump/pre.d directory exists, all files in the directory are
collectively sorted and executed in lexical order, after the binary or script
specified by the kdump_pre parameter is executed.
Even if a binary or script in the /etc/kdump/pre.d directory returns a non 0
exit status, processing continues.

In these scripts, references to storage or network devices should adhere
to the section 'Supported dump target types and requirements'.

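The ordering and error handling described above for the pre.d directory can
be sketched in shell as follows. This is an illustration under stated
assumptions, not the code kdump runs: run_hook_dir is a hypothetical helper,
and it relies on the shell expanding the "*" glob in sorted order.

```shell
# Illustrative sketch only: run every executable file in a hook directory
# in lexical order, and keep going when one of them fails.
# Usage: run_hook_dir DIRECTORY
run_hook_dir() {
    dir=$1
    # A missing directory is not an error; there is simply nothing to run.
    [ -d "$dir" ] || return 0
    # Pathname expansion yields the entries in sorted order.
    for hook in "$dir"/*; do
        if [ -f "$hook" ] && [ -x "$hook" ]; then
            "$hook" || echo "hook $hook returned non-zero, continuing" >&2
        fi
    done
    return 0
}
```

Prefixing file names with numbers, such as 10-first.sh and 20-second.sh,
is a common way to make the lexical order explicit.
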
Extra Binaries
--------------

If you have specific binaries or scripts you want to have made available
within your kdump initrd, you can specify them by their full path, and they
will be included in your kdump initrd, along with all dependent libraries.
This may be particularly useful for those running post-capture scripts that
rely on other binaries.

Extra Modules
-------------

By default, only the bare minimum of kernel modules will be included in your
kdump initrd. Should you wish to capture your vmcore files to a non-boot-path
storage device, such as an iscsi target disk or clustered file system, you may
need to manually specify additional kernel modules to load into your kdump
initrd.

Failure action
--------------

The failure action specifies what to do when dumping to the configured dump
target fails. By default, the failure action is "reboot", which means the
system reboots if the attempt to save the dump to the dump target fails.

Other failure actions are available:

- dump_to_rootfs
  This option tries to mount the root filesystem and save the dump there, in
  the path specified by "path". It generally makes sense when the dump target
  is not the root filesystem. For example, if the dump is being saved over
  the network using "ssh", one can set the failure action to
  "dump_to_rootfs" to try saving the dump to the root filesystem if the dump
  over the network fails.

- shell
  Drop into a shell session inside the initramfs.

- halt
  Halt the system after failure.

- poweroff
  Power off the system after failure.

Compression and filtering
-------------------------

The 'core_collector' parameter in kdump.conf allows you to specify a custom
dump capture method. The most common alternate method is makedumpfile, which
is a dump filtering and compression utility provided with kexec-tools. On
some architectures, it can drastically reduce the size of your vmcore files,
which becomes very useful on systems with large amounts of memory.

A typical setup is 'core_collector makedumpfile -F -l --message-level 7 -d 31',
but check the output of '/sbin/makedumpfile --help' for a list of all available
options (-i and -g don't need to be specified, they're automatically taken care
of). Note that use of makedumpfile requires that the kernel-debuginfo package
corresponding with your running kernel be installed.

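The "-d 31" in the typical setup above is a dump level, a bit mask that
selects which kinds of pages makedumpfile leaves out of the vmcore. The
sketch below decodes such a value. The bit meanings are taken from the
makedumpfile man page as best recalled here, so verify them against the
version installed on your system; decode_dump_level is a hypothetical
helper name.

```shell
# Illustrative sketch only: list the page types excluded by a given
# makedumpfile dump level (the value passed to -d).
# Usage: decode_dump_level LEVEL
decode_dump_level() {
    level=$1
    excluded=
    for entry in "1:zero pages" "2:non-private cache pages" \
                 "4:private cache pages" "8:user data pages" \
                 "16:free pages"; do
        bit=${entry%%:*}    # numeric bit value before the colon
        name=${entry#*:}    # description after the colon
        if [ $((level & bit)) -ne 0 ]; then
            excluded="${excluded:+$excluded, }$name"
        fi
    done
    printf '%s\n' "${excluded:-nothing}"
}
```

With this reading, a dump level of 31 excludes all five page types, which
is why it gives the smallest vmcore, while a dump level of 0 excludes
nothing.
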
The core collector command format depends on the dump target type. Typically,
for a filesystem target (local or remote), core_collector should accept two
arguments: the first is the source file and the second is the target file.
For example:

- ex1.

  core_collector "cp --sparse=always"

  The above will effectively be translated to:

  cp --sparse=always /proc/vmcore <dest-path>/vmcore

- ex2.

  core_collector "makedumpfile -l --message-level 7 -d 31"

  The above will effectively be translated to:

  makedumpfile -l --message-level 7 -d 31 /proc/vmcore <dest-path>/vmcore

For dump targets like raw and ssh, the core collector should in general
expect one argument (the source file) and write the processed core to
standard output (there is one exception, "scp", discussed later). This
standard output is then saved to the destination using appropriate commands.

raw dumps core_collector examples:

- ex3.

  core_collector "cat"

  The above will effectively be translated to:

  cat /proc/vmcore | dd of=<target-device>

- ex4.

  core_collector "makedumpfile -F -l --message-level 7 -d 31"

  The above will effectively be translated to:

  makedumpfile -F -l --message-level 7 -d 31 /proc/vmcore | dd of=<target-device>

ssh dumps core_collector examples:

- ex5.

  core_collector "cat"

  The above will effectively be translated to:

  cat /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"

- ex6.

  core_collector "makedumpfile -F -l --message-level 7 -d 31"

  The above will effectively be translated to:

  makedumpfile -F -l --message-level 7 -d 31 /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"

There is one exception to the standard output rule for ssh dumps: scp. As
scp can handle ssh destinations for file transfers, one can specify "scp"
as the core collector for ssh targets (no output on stdout).

- ex7.

  core_collector "scp"

  The above will effectively be translated to:

  scp /proc/vmcore <user@host>:path/vmcore

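The translation rules shown in ex1 to ex7 can be summarized in one shell
sketch. It only prints the command line that each rule would produce and
runs nothing. The function name expand_core_collector and the target type
labels "fs", "raw" and "ssh" are assumptions made for illustration, and the
literal "path" and "<options>" placeholders are kept exactly as in the
examples above.

```shell
# Illustrative sketch only: print the effective command for a given
# target type and core_collector value. Nothing is executed.
# Usage: expand_core_collector TYPE COLLECTOR DEST
#   TYPE      - fs, raw or ssh (labels chosen for this sketch)
#   COLLECTOR - value of the core_collector option
#   DEST      - destination directory, device, or user@host
expand_core_collector() {
    target=$1
    collector=$2
    dest=$3
    case $target in
        fs)
            # Two arguments: source file and target file.
            printf '%s /proc/vmcore %s/vmcore\n' "$collector" "$dest"
            ;;
        raw)
            # One argument; standard output is written to the device.
            printf '%s /proc/vmcore | dd of=%s\n' "$collector" "$dest"
            ;;
        ssh)
            case $collector in
                scp|scp\ *)
                    # scp copies the file itself, no pipe involved.
                    printf '%s /proc/vmcore %s:path/vmcore\n' "$collector" "$dest"
                    ;;
                *)
                    printf '%s /proc/vmcore | ssh <options> %s "dd of=path/vmcore"\n' "$collector" "$dest"
                    ;;
            esac
            ;;
        *)
            echo "unknown target type: $target" >&2
            return 1
            ;;
    esac
}
```
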
About default core collector
----------------------------

The default core_collector for ssh/raw dumps is:
"makedumpfile -F -l --message-level 7 -d 31".
The default core_collector for other targets is:
"makedumpfile -l --message-level 7 -d 31".

Even if the core_collector option is commented out in kdump.conf,
makedumpfile is the default core collector and kdump uses it internally.

To use a core collector other than makedumpfile, specify one explicitly
with the core_collector option.

Note: If "makedumpfile -F" is used, you will get a vmcore in flattened
format (vmcore.flat). You will need to use "makedumpfile -R" to rearrange
the dump data from standard input into a normal dumpfile (readable with
analysis tools).
For example: "makedumpfile -R vmcore < vmcore.flat"


Caveats
=======

Console frame-buffers and X are not properly supported. If you typically run
with something along the lines of "vga=791" in your kernel config line or
have X running, console video will be garbled when a kernel is booted via
kexec. Note that the kdump kernel should still be able to create a dump,
and when the system reboots, video should be restored to normal.


Notes
=====

Notes on resetting video:
-------------------------

Video is a notoriously difficult issue with kexec.  Video cards contain ROM code
that controls their initial configuration and setup.  This code is nominally
accessed and executed from the BIOS, and otherwise not safely executable. Since
the purpose of kexec is to reboot the system without re-executing the BIOS, it
is rather difficult if not impossible to reset video cards with kexec.  The
result is that if a system crashes while running in a graphical mode (e.g.
running X), the screen may appear to become 'frozen' while the dump capture is
taking place.  A serial console will of course reveal that the system is
operating and capturing a vmcore image, but a casual observer will see the
system as hung until the dump completes and a true reboot is executed.

There are two possibilities to work around this issue.  One is to add
--reset-vga to the kexec command line options in /etc/sysconfig/kdump.  This
tells kdump to write some reasonable default values to the video card register
file, in the hopes of returning it to a text mode such that boot messages are
visible on the screen.  It does not work with all video cards, however.
The other is to try adding vga16fb.ko to the extra_modules list in
/etc/kdump.conf.  This will attempt to use the video card in framebuffer mode,
which can blank the screen prior to the start of a dump capture.

Notes on rootfs mount
---------------------

Dracut is designed to mount the rootfs by default. If mounting the rootfs
fails, it will refuse to go on. So kdump currently leaves rootfs mounting to
dracut. For the time being, we make the assumption that a proper root=
cmdline is being passed to the dracut initramfs. If you need to modify
"KDUMP_COMMANDLINE=" in /etc/sysconfig/kdump, you will need to make sure that
the appropriate root= options are copied from /proc/cmdline. In general it is
best to append command line options using "KDUMP_COMMANDLINE_APPEND=" instead
of replacing the original command line completely.

Notes on watchdog module handling
---------------------------------

If a watchdog is active in the first kernel, its module must be loaded in
the crash kernel, so that the watchdog is either deactivated or kicked in
the second kernel. Otherwise, a watchdog reboot might occur while the vmcore
is being saved. When the dracut watchdog module is enabled, it installs the
kernel watchdog module of the active watchdog device in the initrd.
kexec-tools always adds "-a watchdog" to the dracut_args if there is at
least one active watchdog and the user has not explicitly added
"-o watchdog" to dracut_args in kdump.conf. If a watchdog module (such as
hp_wdt) has not been written using the watchdog-core framework, this option
has no effect and the module will not be added. Please note that only the
systemd watchdog daemon is supported as the watchdog kick application.

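To see whether any watchdog is active in the first kernel, one can inspect
the watchdog class directory in sysfs. The sketch below assumes that the
kernel exposes a "state" attribute per device under /sys/class/watchdog
(this depends on the kernel's watchdog sysfs support being enabled), and
active_watchdogs is a hypothetical helper name. The directory is a
parameter so the function can be tried against a test tree.

```shell
# Illustrative sketch only: print the names of watchdog devices whose
# sysfs "state" attribute reads "active".
# Usage: active_watchdogs [SYSFS_DIR]
active_watchdogs() {
    root=${1:-/sys/class/watchdog}
    for state in "$root"/*/state; do
        # Skip when the glob matched nothing or the file is unreadable.
        [ -r "$state" ] || continue
        if [ "$(cat "$state")" = "active" ]; then
            basename "$(dirname "$state")"
        fi
    done
}
```

An empty result means no active watchdog was found through this interface,
which includes the case where the kernel does not provide the attribute.
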
Notes for disk images
---------------------

The kdump initramfs is a critical component for capturing the crash dump.
But it is generated strictly for the machine it will run on, and is not
generic. If you install a new machine from a previous disk image (e.g. VMs
created from a disk image or snapshot), kdump can easily be broken by
hardware changes or disk ID changes. So it is strongly recommended not to
include the kdump initramfs in the disk image in the first place. This also
helps to save space, and kdumpctl will build the initramfs automatically if
it is missing. If you have already installed a machine from a disk image
that has a kdump initramfs embedded, you should rebuild the initramfs
manually using the "kdumpctl rebuild" command, or else kdump may not work as
expected.

Notes on encrypted dump target
------------------------------

Currently, kdump does not work well with an encrypted dump target.
First, the user has to enter the password manually in the capture kernel,
so a working interactive terminal is required in the capture kernel.
Another major issue is that an OOM problem will occur with certain
encryption setups. For example, the default setup for LUKS2 uses a
memory-hard key derivation function to mitigate brute force attacks, so it
is impossible to reduce the memory usage for mounting the encrypted
target. In such a case, you have to either reserve enough memory for the
crash kernel accordingly, or update your encryption setup.
It is recommended to use a non-encrypted target (e.g. a remote target)
instead.

Notes on device dump
--------------------

Device dump allows drivers to append dump data to the vmcore, so you can
collect driver-specific debug info. Drivers can append data without any
limit, and the data is stored in memory, which may cause significant memory
pressure. Device dump is therefore disabled by default by passing the
"novmcoredd" command line option to the kdump capture kernel.
If you want to collect debug data with device dump, you need to modify the
"KDUMP_COMMANDLINE_APPEND=" value in /etc/sysconfig/kdump and remove the
"novmcoredd" option. You also need to increase the "crashkernel=" value
accordingly in case of OOM issues.
In addition, the kdump initramfs won't automatically include the device
drivers that support device dump; only device drivers that are required for
the dump target setup will be included. To ensure the device dump data is
included in the vmcore, you need to force the inclusion of the related
device drivers by using the "extra_modules" option in /etc/kdump.conf.


Parallel Dumping Operation
==========================

Kexec allows kdump to use multiple cpus, so the parallel feature can
accelerate dumping substantially, especially during compression and
filtering.
For example:

	1."makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile"
	2."makedumpfile -c /proc/vmcore dumpfile",

	1 has better performance than 2, if THREAD_NUM is larger than two
	and the number of usable cpus is larger than THREAD_NUM.

Notes on how to use multiple cpus on a capture kernel on an x86 system:

Make sure that you are using a kernel that supports the disable_cpu_apicid
kernel option as a capture kernel, which is needed to avoid an x86-specific
hardware issue (*). The disable_cpu_apicid kernel option is automatically
appended by the kdumpctl script and is ignored if the kernel doesn't support it.

You need to specify how many cpus are to be used in a capture kernel by
setting the nr_cpus kernel option in /etc/sysconfig/kdump. nr_cpus is 1 by
default.

You should use a necessary and sufficient number of cpus on a capture kernel.
Warning: Don't use too many cpus on a capture kernel, or the capture kernel
may panic due to Out Of Memory.

(*) Without the disable_cpu_apicid kernel option, the capture kernel may
hang, reset or power off at boot, depending on your system and runtime
situation at the time of crash.


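Changing nr_cpus means editing the value inside the
KDUMP_COMMANDLINE_APPEND string in /etc/sysconfig/kdump. The following
sketch shows that substitution on a plain string. It does not touch any
file, and set_nr_cpus is a hypothetical helper name used only for
illustration.

```shell
# Illustrative sketch only: return a kernel command line string with
# nr_cpus set to the requested value, adding the option if it is absent.
# Usage: set_nr_cpus "CMDLINE" COUNT
set_nr_cpus() {
    found=0
    out=
    for arg in $1; do
        case $arg in
            nr_cpus=*)
                arg="nr_cpus=$2"
                found=1
                ;;
        esac
        out="${out:+$out }$arg"
    done
    if [ "$found" -eq 0 ]; then
        out="${out:+$out }nr_cpus=$2"
    fi
    printf '%s\n' "$out"
}
```

After raising nr_cpus, the --num-threads value given to makedumpfile in
core_collector should stay below the number of cpus, as noted above, and
the crashkernel reservation may need to grow to avoid running out of
memory.
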
Debugging Tips
==============

- One can drop into a shell before/after saving the vmcore with the help of
  the kdump_pre/kdump_post hooks. Use the following in one of the pre/post
  scripts to drop into a shell.

  #!/bin/bash
  _ctty=/dev/ttyS0
  setsid /bin/sh -i -l 0<>$_ctty 1<>$_ctty 2<>$_ctty

  One might have to change the terminal depending on what they are using.

- Serial console logging for virtual machines

  I generally use "virsh console <domain-name>" to get to the serial console.
  I noticed that after the dump is saved and the system reboots, some of the
  previously logged messages are gone by the time the grub menu shows up.
  That means any important debugging info at the end will be lost.

  One can log the serial console as follows to make sure messages are not
  lost.

  virsh ttyconsole <domain-name>
  ln -s <name-of-tty> /dev/modem
  minicom -C /tmp/console-logs

  Now minicom should be logging the serial console to the file console-logs.

- Using the logger to output kdump log messages

  You can configure the kdump log level for the first kernel in
  /etc/sysconfig/kdump. For example:

  KDUMP_STDLOGLVL=3
  KDUMP_SYSLOGLVL=0
  KDUMP_KMSGLOGLVL=0

  The above configuration means that kdump messages are printed to the
  console at level 3 (info), while KDUMP_SYSLOGLVL and KDUMP_KMSGLOGLVL are
  set to 0 (no logging). These are also the current default log levels in
  the first kernel.

  For the second kernel, you can add the 'rd.kdumploglvl=X' option to
  KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump to set the log level,
  where 'X' is the logging level. The default log level in the second kernel
  is 3 (info). For example:

  # cat /etc/sysconfig/kdump |grep rd.kdumploglvl
  KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd rd.kdumploglvl=3"

  Logging levels: no logging(0), error(1), warn(2), info(3), debug(4)

  The ERROR level designates error events that might still allow the application
  to continue running.

  The WARN level designates potentially harmful situations.

  The INFO level designates informational messages that highlight the progress
  of the application at a coarse-grained level.

  The DEBUG level designates fine-grained informational events that are most
  useful to debug an application.

  Note: setting one of these log levels to 0 disables the corresponding log,
  so it produces no output.

  At present, the logger works in both the first kernel (kdump service
  debugging) and the second kernel.

  In the first kernel, you can find the historical logs with the journalctl
  command and check the kdump service debugging information. In addition, the
  'kexec -d' debugging messages are also saved to /var/log/kdump.log in the
  first kernel. For example:

  [root@ibm-z-109 ~]# ls -al /var/log/kdump.log
  -rw-r--r--. 1 root root 63238 Oct 28 06:40 /var/log/kdump.log

  If you want debugging information about building the kdump initramfs, you
  can add the '--debug' option to dracut_args in /etc/kdump.conf, and then
  rebuild the kdump initramfs as below:

  # systemctl restart kdump.service

  That will rebuild the kdump initramfs and generate some logs in journald.
  You can find the dracut logs with the journalctl command.

  In the second kernel, kdump automatically puts kexec-dmesg.log in the same
  directory as the vmcore. The log file includes debugging messages such as
  dmesg and journald logs. For example:

  [root@ibm-z-109 ~]# ls -al /var/crash/127.0.0.1-2020-10-28-02\:01\:23/
  drwxr-xr-x. 2 root root       67 Oct 28 02:02 .
  drwxr-xr-x. 6 root root      154 Oct 28 02:01 ..
  -rw-r--r--. 1 root root    21164 Oct 28 02:01 kexec-dmesg.log
  -rw-------. 1 root root 74238698 Oct 28 02:01 vmcore
  -rw-r--r--. 1 root root    17532 Oct 28 02:01 vmcore-dmesg.txt

  If you want more debugging information in the second kernel, you can add
  the 'rd.debug' option to KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump,
  and then restart the kdump service for the change to take effect.

  In addition, you can add the 'rd.memdebug=X' option to
  KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump to output additional
  information about kernel module memory consumption during loading.

  For more details, please refer to /etc/sysconfig/kdump, or the man pages of
  dracut.cmdline and kdump.conf.