Kexec/Kdump HOWTO

Introduction

Kexec and kdump are features of the mainline 2.6 kernel and are included in
Red Hat Enterprise Linux 5. Kexec provides a faster reboot, and kdump
creates reliable kernel vmcores for diagnostic purposes.
ab224c
ab224c
Overview

Kexec

Kexec is a fastboot mechanism that boots a Linux kernel from the context of
an already running kernel without going through the BIOS. BIOS initialization
can be very time consuming, especially on big servers with many peripherals.
Skipping it saves a lot of time for developers who reboot a machine many
times.

Kdump

Kdump is a kernel crash dumping mechanism. It is very reliable because the
crash dump is captured from the context of a freshly booted kernel and not
from the context of the crashed kernel. Kdump uses kexec to boot into a
second kernel whenever the system crashes. This second kernel, often called
the capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses
to boot. Kexec boots the capture kernel without going through the BIOS, so
the contents of the first kernel's memory are preserved. That preserved
memory is essentially the kernel crash dump.

Kdump is supported on the i686, x86_64, ia64 and ppc64 platforms. The
standard kernel and capture kernel are one and the same on i686, x86_64,
ia64 and ppc64.
ab224c
ab224c
If you're reading this document, you should already have kexec-tools
installed. If not, you can install it with the following command:
ab224c
ab224c
    # yum install kexec-tools
ab224c
ab224c
Now load a kernel with kexec:
ab224c
ab224c
    # kver=`uname -r`
    # kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
        --command-line="`cat /proc/cmdline`"
ab224c
ab224c
NOTE: The above will boot you back into the kernel you're currently running.
If you want to load a different kernel, substitute it in place of `uname -r`.
ab224c
ab224c
Now reboot your system, taking note that it should bypass the BIOS:
ab224c
ab224c
    # reboot
ab224c
ab224c
ab224c
How to configure kdump:
ab224c
ab224c
Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it with the following command:
ab224c
ab224c
    # yum install kexec-tools
ab224c
ab224c
To be able to do much of anything interesting in the way of debug analysis,
ab224c
you'll also need to install the kernel-debuginfo package, of the same arch
ab224c
as your running kernel, and the crash utility:
ab224c
ab224c
    # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash
ab224c
ab224c
Next, modify the boot parameters to reserve a chunk of memory for the
capture kernel. With the help of grubby, it's very easy to append
"crashkernel=128M" to the end of your kernel boot parameters. In
"crashkernel=X", X is the amount of memory to reserve for the capture kernel.
Depending on the architecture and system configuration, more than 128M may be
required. Test kdump, and if 128M is not sufficient, try reserving more
memory.
ab224c
ab224c
   # grubby --args="crashkernel=128M" --update-kernel=/boot/vmlinuz-`uname -r`
ab224c
ab224c
Note that there is an alternative form in which to specify a crashkernel
ab224c
memory reservation, in the event that more control is needed over the size and
ab224c
placement of the reserved memory.  The format is:
ab224c
ab224c
crashkernel=range1:size1[,range2:size2,...][@offset]
ab224c
ab224c
Where range<n> specifies a range of values that are matched against the amount
ab224c
of physical RAM present in the system, and the corresponding size<n> value
ab224c
specifies the amount of kexec memory to reserve.  For example:
ab224c
ab224c
crashkernel=512M-2G:64M,2G-:128M
ab224c
ab224c
This line tells the kernel to reserve 64M of RAM if the system contains
between 512M and 2G of physical memory. If the system contains 2G or more of
physical memory, 128M should be reserved.
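
The range matching described above can be illustrated with a small shell
sketch. It is not the kernel's parser: the function names to_mb and
crashkernel_for are hypothetical, only the K, M and G suffixes are handled,
and the upper bound of each range is assumed to be exclusive, which matches
the "2G or more" wording of the example.

```shell
# to_mb SIZE: convert a size such as 512M or 2G to megabytes.
to_mb() {
    case "$1" in
        *K) echo $(( ${1%K} / 1024 )) ;;
        *M) echo "${1%M}" ;;
        *G) echo $(( ${1%G} * 1024 )) ;;
        *)  echo $(( $1 / 1048576 )) ;;   # no suffix: treat as bytes
    esac
}

# crashkernel_for SPEC RAM_MB: print the size selected by the first range
# that contains RAM_MB; return non-zero if no range matches.
crashkernel_for() {
    spec=${1%%@*}            # drop the optional @offset
    ram_mb=$2
    old_ifs=$IFS
    IFS=,
    set -- $spec             # split the spec into range:size entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=${range%%-*}
        end=${range#*-}      # empty for an open-ended range such as 2G-
        if [ "$ram_mb" -lt "$(to_mb "$start")" ]; then
            continue
        fi
        if [ -n "$end" ] && [ "$ram_mb" -ge "$(to_mb "$end")" ]; then
            continue
        fi
        echo "$size"
        return 0
    done
    return 1
}
```

For instance, 'crashkernel_for "512M-2G:64M,2G-:128M" 4096' prints 128M,
while a machine with 256 MB of RAM matches no range and the function returns
non-zero.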
ab224c
766e0d
You can also use the default crashkernel=auto to let the kernel set the
crashkernel size.

crashkernel=auto gives a best-effort estimate for usual use cases; however,
you still need to run a test to ensure that the reserved memory size is
sufficient.
766e0d
766e0d
NOTE:
766e0d
When a debug variant kernel is used as the capture kernel and the
766e0d
primary kernel was booted with 'crashkernel=auto' set in the bootargs,
766e0d
the capture kernel boot can fail.
766e0d
766e0d
A debug variant kernel usually is the same stable kernel with some
766e0d
debug options enabled which uses much more memory in the kdump kernel.
766e0d
Thus when you use 'crashkernel=auto', kdump kernel will likely run out
766e0d
of memory.
766e0d
766e0d
So it is not advisable to use a debug variant kernel as the capture
766e0d
kernel when primary kernel is booted with 'crashkernel=auto' set in
766e0d
bootargs.
766e0d
bedde7
In addition, if KASLR is enabled, kdump needs to access /proc/kallsyms while
loading the capture kernel. Check /proc/sys/kernel/kptr_restrict to make sure
that the content of /proc/kallsyms is exposed correctly. We recommend setting
kptr_restrict to '1'; otherwise, loading the capture kernel could fail.
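
The kptr_restrict check can be scripted. The following is a minimal sketch;
the function name check_kptr_restrict and its optional file argument are
hypothetical and exist only so the check can be tried against a sample file
instead of the live sysctl.

```shell
# check_kptr_restrict [FILE]: report whether kptr_restrict holds the value
# recommended in this document. FILE defaults to the real sysctl path.
check_kptr_restrict() {
    file=${1:-/proc/sys/kernel/kptr_restrict}
    value=$(cat "$file" 2>/dev/null) || { echo "cannot read $file"; return 2; }
    if [ "$value" = "1" ]; then
        echo "kptr_restrict=1 (recommended)"
    else
        echo "kptr_restrict=$value (this document recommends 1)"
        return 1
    fi
}
```

Running 'check_kptr_restrict' with no argument reads the live value; a
non-zero return means the setting differs from the recommendation or could
not be read.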
bedde7
ab224c
After making said changes, reboot your system, so that the X MB of memory is
ab224c
left untouched by the normal system, reserved for the capture kernel. Take note
ab224c
that the output of 'free -m' will show X MB less memory than without this
ab224c
parameter, which is expected. You may be able to get by with less than 128M, but
ab224c
testing with only 64M has proven unreliable of late. On ia64, as much as 512M
ab224c
may be required.
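
After the reboot, the reservation can also be checked directly. The sketch
below assumes the kernel exposes the reserved size in bytes through
/sys/kernel/kexec_crash_size; the function name crash_reserved_mb and its
optional file argument are hypothetical.

```shell
# crash_reserved_mb [FILE]: print the crashkernel reservation in megabytes.
# FILE defaults to the sysfs entry and is assumed to hold a size in bytes.
crash_reserved_mb() {
    file=${1:-/sys/kernel/kexec_crash_size}
    bytes=$(cat "$file") || return 1
    echo $(( bytes / 1048576 ))
}
```

A result of 0 would suggest that no memory was reserved, in which case the
crashkernel= parameter in /proc/cmdline is the first thing to check.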
ab224c
ab224c
Now that you've got that reserved memory region set up, enable the kdump
service so that it starts at boot:

    # systemctl enable kdump.service
ab224c
ab224c
Then, start up kdump as well:
ab224c
ab224c
    # systemctl start kdump.service
ab224c
ab224c
This should load your kernel-kdump image via kexec, leaving the system ready
ab224c
to capture a vmcore upon crashing. To test this out, you can force-crash
ab224c
your system by echo'ing a c into /proc/sysrq-trigger:
ab224c
ab224c
    # echo c > /proc/sysrq-trigger
ab224c
ab224c
You should see some panic output, followed by the system restarting into
ab224c
the kdump kernel. When the boot process gets to the point where it starts
ab224c
the kdump service, your vmcore should be copied out to disk (by default,
ab224c
in /var/crash/<YYYY-MM-DD-HH:MM>/vmcore), then the system rebooted back into
ab224c
your normal kernel.
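
To locate the most recent dump after the system comes back, a helper along
these lines may be handy. It is only a sketch: the function name
latest_vmcore is hypothetical, and it assumes the date-stamped directory
names sort chronologically, as the default layout shown above does.

```shell
# latest_vmcore [DIR]: print the path of the newest vmcore under DIR,
# relying on the date-stamped directory names sorting chronologically.
latest_vmcore() {
    dir=${1:-/var/crash}
    find "$dir" -type f -name vmcore 2>/dev/null | LC_ALL=C sort | tail -n 1
}
```

With no argument the function searches /var/crash; it prints nothing if no
vmcore is found.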
ab224c
ab224c
Once back to your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:
ab224c
ab224c
    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
        /var/crash/2006-08-23-15:34/vmcore
ab224c
ab224c
    crash> bt
ab224c
ab224c
and so on...
ab224c
0c9820
Notes:
0c9820
0c9820
When kdump starts, the kdump kernel is loaded together with the kdump
0c9820
initramfs. To save memory usage and disk space, the kdump initramfs is
0c9820
generated strictly against the system it will run on, and contains the
0c9820
minimum set of kernel modules and utilities to boot the machine to a stage
0c9820
where the dump target could be mounted.
0c9820
0c9820
With the kdump service enabled, kdumpctl will try to detect system changes
and rebuild the kdump initramfs if needed, but it cannot cover every possible
case. So after a hardware change, disk migration, storage setup update or
any similar system-level change, it is highly recommended to rebuild the
initramfs manually with the following command:
0c9820
0c9820
    # kdumpctl rebuild
0c9820
ab224c
Saving vmcore-dmesg.txt
ab224c
----------------------
ab224c
The kernel log buffer is one of the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffer is
extracted from /proc/vmcore and saved to a file named vmcore-dmesg.txt. The
vmcore is saved after vmcore-dmesg.txt. The destination disk and directory
for vmcore-dmesg.txt are the same as for the vmcore. Note that the kernel log
buffer will not be available if the dump target is a raw device.
ab224c
ab224c
Dump Triggering methods:
ab224c
ab224c
This section talks about the various ways, other than a Kernel Panic, in which
ab224c
Kdump can be triggered. The following methods assume that Kdump is configured
ab224c
on your system, with the scripts enabled as described in the section above.
ab224c
ab224c
1) AltSysRq C
ab224c
ab224c
Kdump can be triggered with the combination of the 'Alt','SysRq' and 'C'
ab224c
keyboard keys. Please refer to the following link for more details:
ab224c
ab224c
http://kbase.redhat.com/faq/FAQ_43_5559.shtm
ab224c
ab224c
In addition, on PowerPC boxes, Kdump can also be triggered via Hardware
ab224c
Management Console(HMC) using 'Ctrl', 'O' and 'C' keyboard keys.
ab224c
ab224c
2) NMI_WATCHDOG
ab224c
ab224c
If a machine has a hard hang, it is quite possible that it does not respond
to keyboard interrupts, so the 'Alt-SysRq' keys will not help trigger a dump.
In such scenarios the NMI watchdog feature can prove useful. The following
link has more details on configuring the NMI watchdog option.
ab224c
ab224c
http://kbase.redhat.com/faq/FAQ_85_9129.shtm
ab224c
ab224c
Once this feature has been enabled in the kernel, any lockup will result in
an OOPs message being generated, followed by Kdump being triggered.
ab224c
ab224c
3) Kernel OOPs
ab224c
ab224c
To generate a dump every time the kernel OOPSes, set the 'Panic On OOPs'
option as follows:
ab224c
ab224c
    # echo 1 > /proc/sys/kernel/panic_on_oops
ab224c
ab224c
This is enabled by default on RHEL5.
ab224c
ab224c
4) NMI(Non maskable interrupt) button
ab224c
ab224c
When the system is hung and is not accepting keyboard interrupts, using the
NMI button to trigger Kdump can be very useful. An NMI button is present on
most of the newer x86 and x86_64 machines. Please refer to the user guides or
manuals to locate the button, though in most cases it is not very well
documented. It is usually hidden behind a small hole on the front or back
panel of the machine. You can use a toothpick or some other non-conducting
probe to press the button.
ab224c
ab224c
For example, on the IBM X series 366 machine, the NMI button is located behind
ab224c
a small hole on the bottom center of the rear panel.
ab224c
ab224c
To enable this method of dump triggering using NMI button, you will need to set
ab224c
the 'unknown_nmi_panic' option as follows:
ab224c
ab224c
   # echo 1 > /proc/sys/kernel/unknown_nmi_panic
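
The two sysctl switches used in methods 3 and 4 can be inspected together.
This is a small sketch; the function name dump_trigger_status and its
optional directory argument are hypothetical, the latter existing only so the
function can be pointed at a sample directory.

```shell
# dump_trigger_status [DIR]: print the current value of the panic_on_oops
# and unknown_nmi_panic switches found under DIR.
dump_trigger_status() {
    dir=${1:-/proc/sys/kernel}
    for name in panic_on_oops unknown_nmi_panic; do
        if [ -r "$dir/$name" ]; then
            echo "$name=$(cat "$dir/$name")"
        else
            echo "$name=unavailable"   # not present on this kernel or arch
        fi
    done
}
```

A value of 1 means the corresponding event will panic the kernel and
therefore trigger Kdump.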
ab224c
ab224c
5) PowerPC specific methods:
ab224c
ab224c
On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with the
CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON
and boot the kernel with the xmon=on option.

The following are ways to remotely issue a soft reset on PowerPC boxes, which
drops you into XMON. Pressing 'X' (capital letter X) followed by 'Enter' at
that point will trigger the dump.
ab224c
ab224c
5.1) HMC
ab224c
ab224c
The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard input.
ab224c
ab224c
Once you have HMC configured, the following steps will enable you to trigger
ab224c
Kdump via a soft reset:
ab224c
ab224c
On Power4
ab224c
  Using GUI
ab224c
ab224c
    * In the right pane, right click on the partition you wish to dump.
ab224c
    * Select "Operating System->Reset".
ab224c
    * Select "Soft Reset".
ab224c
    * Select "Yes".
ab224c
ab224c
  Using HMC Commandline
ab224c
ab224c
    # reset_partition -m <machine> -p <partition> -t soft
ab224c
ab224c
On Power5
ab224c
  Using GUI
ab224c
ab224c
    * In the right pane, right click on the partition you wish to dump.
ab224c
    * Select "Restart Partition".
ab224c
    * Select "Dump".
ab224c
    * Select "OK".
ab224c
ab224c
  Using HMC Commandline
ab224c
ab224c
    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar
ab224c
ab224c
5.2) Blade Management Console for Blade Center
ab224c
ab224c
To initiate a dump operation, go to the Power/Restart option under "Blade
Tasks" in the Blade Management Console. Select the blade for which you want
to initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.
ab224c
ab224c
ab224c
Advanced Setups:
ab224c
ab224c
In addition to being able to capture a vmcore to your system's local file
ab224c
system, kdump can be configured to capture a vmcore to a number of other
ab224c
locations, including a raw disk partition, a dedicated file system, an NFS
ab224c
mounted file system, or a remote system via ssh/scp. Additional options
ab224c
exist for specifying the relative path under which the dump is captured,
ab224c
what to do if the capture fails, and for compressing and filtering the dump
ab224c
(so as to produce smaller, more manageable, vmcore files).
ab224c
ab224c
In theory, dumping to a location other than the local file system should be
safer than kdump's default setup, as it's possible the default setup will try
dumping to a file system that has become corrupted. The raw disk partition and
dedicated file system options allow you to still dump to the local system,
but without having to remount your possibly corrupted file system(s),
thereby decreasing the chance a vmcore won't be captured. Dumping to an
NFS server or remote system via ssh/scp also has this advantage, and allows
vmcore files to be centralized, should you have several systems from which
you'd like to obtain vmcore files. Of course, note that these configurations
could present problems if your network is unreliable.
ab224c
ab224c
Advanced setups are configured via modifications to /etc/kdump.conf, which
is fairly well documented out of the box. Any alteration to /etc/kdump.conf
should be followed by a restart of the kdump service, so that the changes
are incorporated into the kdump initrd. Restarting the kdump service is as
simple as '/sbin/systemctl restart kdump.service'.
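
Before restarting the service, it can help to see which dump target lines
are active in the configuration. The following sketch lists them; the
function name list_dump_targets is hypothetical, and the keyword list covers
only the raw, ext2/ext3/ext4, nfs and ssh targets, so other file system types
would need to be added by hand.

```shell
# list_dump_targets [FILE]: print the uncommented dump target lines in a
# kdump.conf-style file. Commented lines start with '#' and never match.
list_dump_targets() {
    conf=${1:-/etc/kdump.conf}
    grep -E '^(raw|ext[234]|nfs|ssh)[[:space:]]' "$conf"
}
```

If the function prints nothing, no explicit target is configured and the
default local file system setup applies.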
ab224c
ab224c
ab224c
Note that kdump.conf configures how dump files are captured from the
initramfs (in the interests of safety). The root file system is mounted, and
the init process started, only as a last resort if the initramfs fails to
capture the vmcore. As such, configuration made in /etc/kdump.conf only
applies to captures performed in the initramfs. If for any reason the init
process is started on the root file system, only a simple copy of the vmcore
from /proc/vmcore to /var/crash/$DATE/vmcore will be performed.
ab224c
ab224c
For both local file system and NFS dumps, the dump target must be mounted
before the kdump initramfs is built. That means you need an entry for the
dump file system in /etc/fstab so that, after a reboot, the kdump service can
find the dump target and build the initramfs instead of failing.
Usually the dump target should be used only for kdump. If you are worried
about someone using the file system for something other than dumping vmcores,
you can mount it read-only. Mkdumprd will still remount it read-write to
create the dump directory and will move it back to read-only afterwards.
ab224c
ab224c
Raw partition
ab224c
ab224c
Raw partition dumping requires that a disk partition in the system, at least
as large as the amount of memory in the system, be left unformatted. Assuming
/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with
'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly
onto partition /dev/vg/lv_kdump. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd. The dump target should be a persistent device name, such as an LVM or
device mapper canonical name.
ab224c
ab224c
Dedicated file system
ab224c
ab224c
Similar to raw partition dumping, you can format a partition with the file
system of your choice. Again, it should be at least as large as the amount
of memory in the system. Assuming /dev/vg/lv_kdump has been formatted ext4,
specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a vmcore file will be
copied onto the file system after it has been mounted. Dumping to a dedicated
partition has the advantage that you can dump multiple vmcores to the file
system, space permitting, without overwriting previous ones, as would be the
case in a raw partition setup. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd. Note that for local file systems, ext4 and ext2 are supported as
dumpable targets. Kdump will not prevent you from specifying other file
systems, and they will most likely work, but their operation cannot be
guaranteed. For instance, specifying a vfat or msdos file system will result
in a successful load of the kdump service, but during crash recovery the dump
will fail if the system has more than 2GB of memory (since vfat and msdos
file systems do not support files larger than 2GB). Be careful with your
file system selection when using this target.
ab224c
ab224c
It is recommended to use persistent device names or UUID/LABEL for file system
ab224c
dumps. One example of persistent device is /dev/vg/<devname>.
ab224c
ab224c
NFS mount
ab224c
ab224c
Dumping over NFS requires an NFS server configured to export a file system
with full read/write access for the root user. All operations done within
the kdump initial ramdisk are done as root, and to write out a vmcore file,
we obviously must be able to write to the NFS mount. Configuring an NFS
server is outside the scope of this document, but either the no_root_squash
or anonuid options on the NFS server side are likely of interest to permit
the kdump initrd operations to write to the NFS mount as root.
ab224c
ab224c
Assuming you're exporting /dump on the machine nfs-server.example.com,
once the mount is properly configured, specify it in kdump.conf via
'nfs nfs-server.example.com:/dump'. The server portion can be specified either
by host name or IP address. Following a system crash, the kdump initrd will
mount the NFS mount and copy out the vmcore to your NFS server. Restart the
kdump service via '/sbin/systemctl restart kdump.service' to commit this change
to your kdump initrd.
ab224c
e35838
Special mount via "dracut_args"
e35838
e35838
You can use "dracut_args" to pass "--mount" to kdump; see the dracut manpage
for details on the format of "--mount". If any "--mount" is specified via
"dracut_args", kdump will build it as the mount target without doing any
validation (mounting or checking mount options, fs size, save path, etc.),
so you must test it yourself to make sure it works. You cannot use other
targets in /etc/kdump.conf if you use "--mount" in "dracut_args". You also
cannot specify multiple "--mount" targets via "dracut_args".
e35838
e35838
One use case of "--mount" in "dracut_args" is when you do not want to mount
the dump target before the kdump service starts, for example to reduce the
load on a shared NFS server:

dracut_args --mount "192.168.1.1:/share /mnt/test nfs4 defaults"
e35838
e35838
NOTE:
e35838
- <mountpoint> must be specified as an absolute path.
e35838
ab224c
Remote system via ssh/scp
ab224c
ab224c
Dumping over ssh/scp requires setting up passwordless ssh keys for every
ab224c
machine you wish to have dump via this method. First up, configure kdump.conf
ab224c
for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user'
ab224c
can be any user on the target system you choose, and 'server' is the host
ab224c
name or IP address of the target system. Using a dedicated, restricted user
ab224c
account on the target system is recommended, as there will be keyless ssh
ab224c
access to this account.
ab224c
ab224c
Once kdump.conf is appropriately configured, issue the command
'kdumpctl propagate' to automatically set up the ssh host keys and transmit
the necessary bits to the target server. You'll have to type in 'yes'
to accept the host key for your target server if this is the first time
you've connected to it, and then input the target system user's password
to send over the necessary ssh key file. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd.
ab224c
ab224c
Path
====
"path" represents the file system path in which the vmcore will be saved.
Kdump actually creates a directory $hostip-$date within "path" and saves the
vmcore there, so in practice the dump is saved in $path/$hostip-$date/. To
simplify the discussion below, when we say the dump will be saved in $path,
it is implied that kdump creates another directory inside path and saves the
vmcore there.
1b417c
1b417c
If a dump target is specified in kdump.conf, then "path" is relative to the
specified dump target. For example, if the dump target is "ext4 /dev/sda",
then the dump will be saved in the "$path" directory on /dev/sda.

The same applies to NFS dumps. If the user specifies "nfs foo.com:/export/tmp/"
as the dump target and "path" is /var/crash, then the dump will effectively
be saved in the "foo.com:/export/tmp/var/crash/" directory.
1b417c
1b417c
The interpretation of path changes a bit if the user has not specified a dump
target explicitly in kdump.conf. In this case, "path" represents the
absolute path from root, and the dump target and adjusted path are derived
automatically depending on what is mounted in the current system.

The following are a few examples.
1b417c
1b417c
path /var/crash/
1b417c
----------------
1b417c
Assuming there is no disk mounted on /var/ or on /var/crash, dump will
1b417c
be saved on disk backing rootfs in directory /var/crash.
1b417c
1b417c
path /var/crash/ (A separate disk mounted on /var)
1b417c
--------------------------------------------------
1b417c
Say a disk /dev/sdb is mounted on /var. In this case the dump target will
become /dev/sdb, the path will become "/crash", and the dump will be saved
in the "sdb:/crash/" directory.
1b417c
1b417c
path /var/crash/ (NFS mounted on /var)
1b417c
-------------------------------------
1b417c
Say foo.com:/export/tmp is mounted on /var. In this case dump target is
1b417c
nfs server and path will be adjusted to "/crash" and dump will be saved to
1b417c
foo.com:/export/tmp/crash/ directory.
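
The path adjustment in the examples above amounts to stripping the mount
point from the front of "path". The following sketch illustrates the idea
only; the function name adjust_path is hypothetical and this is not the code
kdump itself uses.

```shell
# adjust_path PATH MOUNTPOINT: print PATH relative to the file system
# mounted on MOUNTPOINT; return non-zero if PATH is not below MOUNTPOINT.
adjust_path() {
    path=${1%/}              # drop a trailing slash, /var/crash/ -> /var/crash
    mnt=${2%/}               # "/" becomes the empty string
    case "$path/" in
        "$mnt"/*)
            rel=${path#"$mnt"}
            echo "${rel:-/}"
            ;;
        *)
            return 1
            ;;
    esac
}
```

With /var as the mount point, 'adjust_path /var/crash/ /var' prints /crash;
with only the root file system mounted, 'adjust_path /var/crash/ /' prints
/var/crash.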
1b417c
1b417c
Kdump boot directory
1b417c
====================
1b417c
Usually the kdump kernel is the same as the first kernel, so kdump tries to
find the kdump kernel under /boot according to /proc/cmdline. For example,
suppose the following command produces this output:
	cat /proc/cmdline
	BOOT_IMAGE=/xxx/vmlinuz-3.yyy.zzz  root=xxxx .....
Then the kdump kernel will be /boot/xxx/vmlinuz-3.yyy.zzz.
If the kdump kernel is located in a different directory, set the variable
KDUMP_BOOTDIR in /etc/sysconfig/kdump.
ab224c
ab224c
Kdump Post-Capture Executable
ab224c
ab224c
It is possible to specify a custom script or binary you wish to run following
ab224c
an attempt to capture a vmcore. The executable is passed an exit code from
ab224c
the capture process, which can be used to trigger different actions from
ab224c
within your post-capture executable.
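
A post-capture script might branch on that exit code roughly as follows. This
is a sketch under assumptions: the function name post_capture is hypothetical,
and the exit code is assumed to arrive as the first argument of the script.

```shell
# post_capture STATUS: report the outcome of the capture attempt.
# STATUS is the exit code of the capture process (0 means success).
post_capture() {
    status=${1:-1}           # treat a missing status as a failure
    if [ "$status" -eq 0 ]; then
        echo "vmcore captured successfully"
    else
        echo "vmcore capture failed with status $status"
    fi
    return 0
}
```

A standalone script would end with a call such as 'post_capture "$1"', and
could replace the echo lines with whatever action suits the site, such as
writing a marker file next to the dump.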
ab224c
ab224c
Kdump Pre-Capture Executable
ab224c
ab224c
It is possible to specify a custom script or binary you wish to run before
capturing a vmcore. The exit status of this binary is interpreted as follows:
0 - continue with the dump process as usual
non-zero - reboot the system
ab224c
ab224c
Extra Binaries
ab224c
ab224c
If you have specific binaries or scripts you want to have made available
ab224c
within your kdump initrd, you can specify them by their full path, and they
ab224c
will be included in your kdump initrd, along with all dependent libraries.
ab224c
This may be particularly useful for those running post-capture scripts that
ab224c
rely on other binaries.
ab224c
ab224c
Extra Modules
ab224c
ab224c
By default, only the bare minimum of kernel modules will be included in your
ab224c
kdump initrd. Should you wish to capture your vmcore files to a non-boot-path
ab224c
storage device, such as an iscsi target disk or clustered file system, you may
ab224c
need to manually specify additional kernel modules to load into your kdump
ab224c
initrd.
ab224c
ab224c
Default action
ab224c
==============
ab224c
The default action specifies what to do when dumping to the configured dump
target fails. By default, the default action is "reboot", that is, the system
reboots if the attempt to save the dump to the dump target fails.
ab224c
ab224c
There are other default actions available though.
ab224c
ab224c
- dump_to_rootfs
ab224c
	This option tries to mount root and save dump on root filesystem
ab224c
	in a path specified by "path". This option will generally make
ab224c
	sense when dump target is not root filesystem. For example, if
ab224c
	dump is being saved over network using "ssh" then one can specify
ab224c
	default to "dump_to_rootfs" to try saving dump to root filesystem
ab224c
	if dump over network fails.
ab224c
ab224c
- shell
	Drop into a shell session inside initramfs.

- halt
	Halt system after failure.

- poweroff
	Poweroff system after failure.
ab224c
ab224c
Compression and filtering
ab224c
ab224c
The 'core_collector' parameter in kdump.conf allows you to specify a custom
ab224c
dump capture method. The most common alternate method is makedumpfile, which
ab224c
is a dump filtering and compression utility provided with kexec-tools. On
ab224c
some architectures, it can drastically reduce the size of your vmcore files,
ab224c
which becomes very useful on systems with large amounts of memory.
ab224c
765b01
A typical setup is 'core_collector makedumpfile -F -l --message-level 1 -d 31',
ab224c
but check the output of '/sbin/makedumpfile --help' for a list of all available
ab224c
options (-i and -g don't need to be specified, they're automatically taken care
ab224c
of). Note that use of makedumpfile requires that the kernel-debuginfo package
ab224c
corresponding with your running kernel be installed.
ab224c
ab224c
The core collector command format depends on the dump target type. Typically
for file system targets (local or remote), core_collector should accept two
arguments: the first is the source file and the second is the target file.
For example:
ab224c
ab224c
ex1.
ab224c
---
ab224c
core_collector "cp --sparse=always"
ab224c
ab224c
Above will effectively be translated to:
ab224c
ab224c
cp --sparse=always /proc/vmcore <dest-path>/vmcore
ab224c
ab224c
ex2.
ab224c
---
765b01
core_collector "makedumpfile -l --message-level 1 -d 31"
ab224c
ab224c
Above will effectively be translated to:
ab224c
765b01
makedumpfile -l --message-level 1 -d 31 /proc/vmcore <dest-path>/vmcore
ab224c
ab224c
ab224c
For dump targets like raw and ssh, in general, core collector should expect
ab224c
one argument (source file) and should output the processed core on standard
ab224c
output (There is one exception of "scp", discussed later). This standard
ab224c
output will be saved to destination using appropriate commands.
ab224c
ab224c
raw dumps core_collector examples:
ab224c
---------
ab224c
ex3.
ab224c
---
ab224c
core_collector "cat"
ab224c
ab224c
Above will effectively be translated to.
ab224c
ab224c
cat /proc/vmcore | dd of=<target-device>
ab224c
ab224c
ex4.
ab224c
---
765b01
core_collector "makedumpfile -F -l --message-level 1 -d 31"
ab224c
ab224c
Above will effectively be translated to.
ab224c
765b01
makedumpfile -F -l --message-level 1 -d 31 /proc/vmcore | dd of=<target-device>
ab224c
ab224c
ssh dumps core_collector examples:
ab224c
---------
ab224c
ex5.
ab224c
---
ab224c
core_collector "cat"
ab224c
ab224c
Above will effectively be translated to.
ab224c
ab224c
cat /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"
ab224c
ab224c
ex6.
ab224c
---
765b01
core_collector "makedumpfile -F -l --message-level 1 -d 31"
ab224c
ab224c
Above will effectively be translated to.
ab224c
765b01
makedumpfile -F -l --message-level 1 -d 31 /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"
ab224c
ab224c
There is one exception to the standard output rule for ssh dumps: scp. As
scp can handle ssh destinations for file transfers, you can specify "scp" as
the core collector for ssh targets (no output on stdout).
ab224c
ab224c
ex7.
ab224c
----
ab224c
core_collector "scp"
ab224c
ab224c
Above will effectively be translated to.
ab224c
ab224c
scp /proc/vmcore <user@host>:path/vmcore
ab224c
ab224c
About default core collector
ab224c
----------------------------
ab224c
Default core_collector for ssh/raw dump is:
765b01
"makedumpfile -F -l --message-level 1 -d 31".
ab224c
Default core_collector for other targets is:
765b01
"makedumpfile -l --message-level 1 -d 31".
ab224c
ab224c
Even if core_collector option is commented out in kdump.conf, makedumpfile
ab224c
is default core collector and kdump uses it internally.
ab224c
ab224c
If one does not want makedumpfile as default core_collector, then they
ab224c
need to specify one using core_collector option to change the behavior.
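
The choice of default described in this section can be summarized in a few
lines of shell. This is an illustration of the rule only, not the code kdump
uses; the function name default_core_collector is hypothetical.

```shell
# default_core_collector TARGET: print the default core_collector for a dump
# target type. ssh and raw targets read from standard output, so they get
# the flattened format (-F); every other target writes a file directly.
default_core_collector() {
    case "$1" in
        ssh|raw) echo "makedumpfile -F -l --message-level 1 -d 31" ;;
        *)       echo "makedumpfile -l --message-level 1 -d 31" ;;
    esac
}
```

For example, 'default_core_collector nfs' prints the variant without -F.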
ab224c
ab224c
Note: If "makedumpfile -F" is used, you will get a vmcore in flattened
format (vmcore.flat). You will need to use "makedumpfile -R" to rearrange
the dump data from standard input into a normal dumpfile (readable with
analysis tools).
For example: "makedumpfile -R vmcore < vmcore.flat"
ab224c
ab224c
Caveats:
ab224c
ab224c
Console frame-buffers and X are not properly supported. If you typically run
ab224c
with something along the lines of "vga=791" in your kernel config line or
ab224c
have X running, console video will be garbled when a kernel is booted via
ab224c
kexec. Note that the kdump kernel should still be able to create a dump,
ab224c
and when the system reboots, video should be restored to normal.
ab224c
ab224c
ab224c
Notes on resetting video:
ab224c
ab224c
Video is a notoriously difficult issue with kexec.  Video cards contain ROM code
that controls their initial configuration and setup.  This code is nominally
accessed and executed from the BIOS, and otherwise not safely executable. Since
the purpose of kexec is to reboot the system without re-executing the BIOS, it
is rather difficult if not impossible to reset video cards with kexec.  The
result is that if a system crashes while running in a graphical mode (i.e.
running X), the screen may appear to become 'frozen' while the dump capture is
taking place.  A serial console will of course reveal that the system is
operating and capturing a vmcore image, but a casual observer will see the
system as hung until the dump completes and a true reboot is executed.
ab224c
ab224c
There are two possibilities to work around this issue.  One is to add
--reset-vga to the kexec command line options in /etc/sysconfig/kdump.  This
tells kdump to write some reasonable default values to the video card register
file, in the hope of returning it to a text mode such that boot messages are
visible on the screen.  It does not work with all video cards, however.
Secondly, it may be worth trying to add vga16fb.ko to the extra_modules list in
/etc/kdump.conf.  This will attempt to use the video card in framebuffer mode,
which can blank the screen prior to the start of a dump capture.

Notes on rootfs mount:
Dracut is designed to mount the rootfs by default. If mounting the rootfs
fails, it will refuse to go on. So kdump currently leaves rootfs mounting to
dracut. For the time being, we assume that a proper root= option is passed to
the dracut initramfs on the command line. If you need to modify
"KDUMP_COMMANDLINE=" in /etc/sysconfig/kdump, make sure that the appropriate
root= options are copied from /proc/cmdline. In general it is best to append
command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing
the original command line completely.

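As a minimal sketch of the advice above, the following /etc/sysconfig/kdump
fragment leaves KDUMP_COMMANDLINE empty and only appends options. It assumes
that an empty KDUMP_COMMANDLINE makes kdump take the command line (including
root=) from /proc/cmdline. The appended options are illustrative values only,
not a recommendation for your system.

```shell
# /etc/sysconfig/kdump (fragment)

# Assumption: when this is left empty, the capture kernel command line is
# taken from /proc/cmdline, so the existing root= options are preserved.
KDUMP_COMMANDLINE=""

# Options appended to that command line. These values are only examples.
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices"
```

Changes to this file usually need a kdump service restart to take effect.
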
Notes on watchdog module handling:

If a watchdog is active in the first kernel, its module must be loaded in
the crash kernel so that the watchdog is either deactivated or kicked by the
second kernel. Otherwise, the watchdog might reboot the system while the
vmcore is being saved. When the dracut watchdog module is enabled, it
installs the kernel watchdog module of the active watchdog device in the
initrd. kexec-tools always adds "-a watchdog" to dracut_args if at least one
watchdog is active and the user has not explicitly added "-o watchdog" to
dracut_args in kdump.conf. If a watchdog module (such as hp_wdt) has not been
written for the watchdog-core framework, this option has no effect and the
module will not be added. Please note that only the systemd watchdog daemon
is supported as the watchdog kick application.

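To see whether a watchdog is active in the first kernel, something like the
sketch below may help. It assumes the running kernel exposes a "state"
attribute for each device under /sys/class/watchdog (this depends on the
kernel configuration). The function name list_watchdog_states is hypothetical
and not part of kexec-tools.

```shell
#!/bin/sh
# Print "<device>: <state>" for each watchdog found under the given sysfs
# directory (default: /sys/class/watchdog). Devices without a readable
# "state" attribute are skipped. The helper name is made up for this example.
list_watchdog_states() {
    base=${1:-/sys/class/watchdog}
    for dev in "$base"/*; do
        # If the glob matched nothing, "$dev" is the literal pattern and
        # this test simply fails.
        [ -r "$dev/state" ] || continue
        printf '%s: %s\n' "${dev##*/}" "$(cat "$dev/state")"
    done
}

list_watchdog_states
```

If a device is reported as active, its module is the one that needs to end up
in the kdump initrd.
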
Notes for disk images:

The kdump initramfs is a critical component for capturing the crash dump.
However, it is generated strictly for the machine it will run on and is not
portable to other machines. If you install a new machine from an existing
disk image (e.g. VMs created from a disk image or snapshot), kdump can easily
break because of hardware changes or disk ID changes. It is therefore
strongly recommended not to include the kdump initramfs in the disk image in
the first place. This also saves space, and kdumpctl will build the initramfs
automatically if it is missing. If you have already installed a machine from
a disk image that has the kdump initramfs embedded, you should rebuild the
initramfs manually with the "kdumpctl rebuild" command, or else kdump may
not work as expected.

Notes on device dump:

Device dump allows drivers to append dump data to the vmcore, so you can
collect driver-specific debug info. Since drivers can append data without
any limit and the data is stored in memory, this may put significant
pressure on memory. Device dump is therefore presently disabled by default
by passing the "novmcoredd" command line option to the kdump capture kernel.
If you want to collect debug data with device dump, you need to modify the
"KDUMP_COMMANDLINE_APPEND=" value in /etc/sysconfig/kdump and remove the
"novmcoredd" option.
You may also need to increase the "crashkernel=" value accordingly in case
of OOM issues in the kdump kernel. Also, the kdump initramfs won't
automatically include the device drivers that support device dump
(it includes only the device drivers required for dump target setup).
So, to ensure the device dump data is included in the vmcore, you need to
explicitly add the related device drivers with the "extra_modules" option
in /etc/kdump.conf.

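The two configuration changes described for device dump could look roughly
like the fragment below. The option list is illustrative and should be based
on your existing KDUMP_COMMANDLINE_APPEND value, and "example_drv" is a
hypothetical driver name standing in for the driver that supports device dump
on your system.

```shell
# /etc/sysconfig/kdump (fragment)
# "novmcoredd" has been removed from the appended options so that device
# dump is no longer disabled. The remaining options are only examples.
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices"

# /etc/kdump.conf (fragment), shown as a comment because it is not shell
# syntax. "example_drv" is a placeholder, not a real module name:
#   extra_modules example_drv
```
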
Parallel Dumping Operation
==========================
Kexec allows kdump to use multiple cpus, so the parallel feature of
makedumpfile can accelerate dumping substantially, especially compression
and filtering.
For example:

	1. "makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile"
	2. "makedumpfile -c /proc/vmcore dumpfile"

	Command 1 performs better than command 2 if THREAD_NUM is larger than
	two and the number of usable cpus is larger than THREAD_NUM.

Notes on how to use multiple cpus on a capture kernel on x86 systems:

Make sure that the capture kernel supports the disable_cpu_apicid kernel
option, which is needed to avoid an x86-specific hardware issue (*). The
disable_cpu_apicid kernel option is automatically appended by the kdumpctl
script and is ignored if the kernel doesn't support it.

You need to specify how many cpus are used in the capture kernel by setting
the nr_cpus kernel option in /etc/sysconfig/kdump (**).
nr_cpus is 1 by default.

Use only as many cpus in the capture kernel as you actually need.
Warning: Don't use too many cpus in the capture kernel, or the capture kernel
may panic because it runs out of memory.

(*) Without the disable_cpu_apicid kernel option, the capture kernel may
hang, reset the system or power off at boot, depending on your system and the
runtime situation at the time of the crash.

(**) On Hyper-V systems, only nr_cpus=1 is supported. If the nr_cpus value is
larger than 1, the capture kernel may hang at boot.

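Putting the notes above together, a parallel dumping setup might be sketched
as follows. The numbers are assumptions chosen to satisfy the condition given
earlier (THREAD_NUM larger than two, and more usable cpus than threads). The
other appended options are illustrative and should be based on your existing
configuration.

```shell
# /etc/sysconfig/kdump (fragment)
# Boot the capture kernel with 4 cpus instead of the default of 1.
# More cpus need more memory, so "crashkernel=" may have to be increased.
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices"

# /etc/kdump.conf (fragment), shown as a comment because it is not shell
# syntax. 3 threads is larger than two and smaller than the 4 usable cpus:
#   core_collector makedumpfile -c --message-level 1 -d 31 --num-threads 3
```
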
Debugging Tips
--------------
- You can drop into a shell before/after saving the vmcore with the help of
  the kdump_pre/kdump_post hooks. Use the following in one of the pre/post
  scripts to drop into a shell.

  #!/bin/bash
  _ctty=/dev/ttyS0
  setsid /bin/sh -i -l 0<>$_ctty 1<>$_ctty 2<>$_ctty

  You might have to change the terminal depending on what you are using.

- Serial console logging for virtual machines

  I generally use "virsh console <domain-name>" to get to the serial console.
  I noticed that after the dump is saved and the system reboots, some of the
  previously logged messages are no longer there once the grub menu shows
  up. That means any important debugging info at the end will be lost.

  You can log the serial console as follows to make sure messages are not
  lost.

  virsh ttyconsole <domain-name>
  ln -s <name-of-tty> /dev/modem
  minicom -C /tmp/console-logs

  Now minicom should be logging the serial console to the file
  /tmp/console-logs.
ab224c
ab224c