Kexec/Kdump HOWTO

Introduction

Kexec and kdump are new features in the 2.6 mainstream kernel. These features
are included in Red Hat Enterprise Linux 5. The purpose of these features
is to ensure faster boot up and creation of reliable kernel vmcores for
diagnostic purposes.

Overview

Kexec

Kexec is a fastboot mechanism which allows booting a Linux kernel from the
context of an already running kernel without going through the BIOS. BIOS
initialization can be very time consuming, especially on big servers with lots
of peripherals. This can save a lot of time for developers who end up booting
a machine numerous times.

Kdump

Kdump is a new kernel crash dumping mechanism and is very reliable because
the crash dump is captured from the context of a freshly booted kernel and
not from the context of the crashed kernel. Kdump uses kexec to boot into
a second kernel whenever the system crashes. This second kernel, often called
a capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses
to boot. Kexec enables booting the capture kernel without going through the
BIOS, hence the contents of the first kernel's memory are preserved, which is
essentially the kernel crash dump.

Kdump is supported on the i686, x86_64, ia64 and ppc64 platforms. The
standard kernel and capture kernel are one and the same on i686, x86_64,
ia64 and ppc64.

If you're reading this document, you should already have kexec-tools
installed. If not, you can install it via the following command:

    # yum install kexec-tools

Now load a kernel with kexec:

    # kver=`uname -r`
    # kexec -l /boot/vmlinuz-$kver \
        --initrd=/boot/initrd-$kver.img \
        --command-line="`cat /proc/cmdline`"

NOTE: The above will boot you back into the kernel you're currently running.
If you want to load a different kernel, substitute its version in place of
`uname -r`.

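To make the substitution concrete, here is a minimal sketch that only prints
the kexec command for a given kernel version instead of running it. The helper
name kexec_load_cmd and the sample version and command line are made up for
illustration; the file naming simply follows the example above.

```shell
# kexec_load_cmd KVER CMDLINE: print (do not run) the kexec load command
# for kernel version KVER. Hypothetical helper; it assumes the
# /boot/vmlinuz-KVER and /boot/initrd-KVER.img naming used above.
kexec_load_cmd() {
    printf 'kexec -l /boot/vmlinuz-%s --initrd=/boot/initrd-%s.img --command-line="%s"\n' \
        "$1" "$1" "$2"
}

# Sample values only; substitute the version you actually want to boot.
kexec_load_cmd 2.6.18-8.el5 "ro root=/dev/sda1"
```

Printing the command first makes it easy to check the paths before loading a
kernel as root.
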
Now reboot your system, taking note that it should bypass the BIOS:

    # reboot

How to configure kdump:

Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it via the following command:

    # yum install kexec-tools

To be able to do much of anything interesting in the way of debug analysis,
you'll also need to install the kernel-debuginfo package, of the same arch
as your running kernel, and the crash utility:

    # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we need to modify some boot parameters to reserve a chunk of memory for
the capture kernel. With the help of grubby, it's very easy to append
"crashkernel=128M" to the end of your kernel boot parameters. In the general
form "crashkernel=X", X is the amount of memory to reserve for the capture
kernel. Depending on arch and system configuration, more than 128M may need to
be reserved for kdump. You need to experiment and test kdump; if 128M is not
sufficient, try reserving more memory.

    # grubby --args="crashkernel=128M" --update-kernel=/boot/vmlinuz-`uname -r`

Note that there is an alternative form in which to specify a crashkernel
memory reservation, in the event that more control is needed over the size and
placement of the reserved memory. The format is:

    crashkernel=range1:size1[,range2:size2,...][@offset]

Where range<n> specifies a range of values that are matched against the amount
of physical RAM present in the system, and the corresponding size<n> value
specifies the amount of kexec memory to reserve. For example:

    crashkernel=512M-2G:64M,2G-:128M

This line tells the kernel to reserve 64M of RAM if the system contains
between 512M and 2G of physical memory. If the system contains 2G or more of
physical memory, 128M should be reserved.

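The range matching can be tried out without rebooting. The following is a
rough sketch, not the kernel's own parser: the helper names to_mb and
crashkernel_size are made up for illustration, only the range form is
handled, and each range is treated as including its start and excluding its
end, which matches the description above.

```shell
# to_mb SIZE: convert a size such as 512M or 2G to megabytes.
to_mb() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *K) echo $(( ${1%K} / 1024 )) ;;
        *)  echo $(( $1 / 1048576 )) ;;      # plain byte count
    esac
}

# crashkernel_size SPEC RAM_MB: print the size<n> whose range<n> matches
# RAM_MB, or nothing if no range matches. Hypothetical helper; it handles
# only the range form and ignores an optional @offset.
crashkernel_size() (
    spec=${1%%@*}
    ram_mb=$2
    IFS=,                                    # split the spec on commas
    for entry in $spec; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mb "${range%%-*}")
        end=${range#*-}                      # empty means "no upper limit"
        if [ "$ram_mb" -ge "$start" ] &&
           { [ -z "$end" ] || [ "$ram_mb" -lt "$(to_mb "$end")" ]; }; then
            echo "$size"
            exit 0
        fi
    done
)

# What would the example line reserve on a machine with 1G of RAM?
crashkernel_size "512M-2G:64M,2G-:128M" 1024
```
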
After making said changes, reboot your system, so that the reserved memory
(X MB) is left untouched by the normal system and kept for the capture kernel.
Take note that the output of 'free -m' will show X MB less memory than without
this parameter, which is expected. You may be able to get by with less than
128M, but testing with only 64M has proven unreliable of late. On ia64, as much
as 512M may be required.

Now that you've got that reserved memory region set up, you want to enable
the kdump service so that it starts at boot:

    # systemctl enable kdump.service

Then, start up kdump as well:

    # systemctl start kdump.service

This should load your kdump kernel image via kexec, leaving the system ready
to capture a vmcore upon crashing. To test this out, you can force-crash
your system by echoing a c into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system restarting into
the kdump kernel. When the boot process gets to the point where it starts
the kdump service, your vmcore should be copied out to disk (by default,
in /var/crash/<YYYY-MM-DD-HH:MM>/vmcore), then the system rebooted back into
your normal kernel.

Once back to your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
        /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...

Saving vmcore-dmesg.txt
-----------------------
The kernel log buffer is one of the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffer is
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt. The
vmcore itself is saved after vmcore-dmesg.txt. The destination disk and
directory for vmcore-dmesg.txt are the same as for the vmcore. Note that the
kernel log buffer will not be available if the dump target is a raw device.

Dump Triggering methods:

This section talks about the various ways, other than a Kernel Panic, in which
Kdump can be triggered. The following methods assume that Kdump is configured
on your system, with the scripts enabled as described in the section above.

1) AltSysRq C

Kdump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

http://kbase.redhat.com/faq/FAQ_43_5559.shtm

In addition, on PowerPC boxes, Kdump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.

2) NMI_WATCHDOG

In case a machine has a hard hang, it is quite possible that it does not
respond to keyboard interrupts. As a result, the 'Alt-SysRq' keys will not help
trigger a dump. In such scenarios, the NMI watchdog feature can prove useful.
The following link has more details on configuring the NMI watchdog option:

http://kbase.redhat.com/faq/FAQ_85_9129.shtm

Once this feature has been enabled in the kernel, any lockup will result in an
OOPs message being generated, followed by Kdump being triggered.

3) Kernel OOPs

If we want to generate a dump every time the kernel oopses, we can achieve this
by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

This is enabled by default on RHEL5.

4) NMI (Non-Maskable Interrupt) button

In cases where the system is in a hung state and is not accepting keyboard
interrupts, using the NMI button to trigger Kdump can be very useful. An NMI
button is present on most of the newer x86 and x86_64 machines. Please refer
to the user guides/manuals to locate the button, though in most cases it
is not very well documented. It is usually hidden behind a small hole
on the front or back panel of the machine. You could use a toothpick or some
other non-conducting probe to press the button.

For example, on the IBM X series 366 machine, the NMI button is located behind
a small hole on the bottom center of the rear panel.

To enable this method of dump triggering using the NMI button, you will need
to set the 'unknown_nmi_panic' option as follows:

    # echo 1 > /proc/sys/kernel/unknown_nmi_panic

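The /proc/sys files used by the trigger methods above can also be addressed
as sysctl keys, since a key is simply the path below /proc/sys with dots in
place of slashes. A minimal sketch of that mapping follows; the helper name
sysctl_path is made up for illustration and it only rewrites the name.

```shell
# sysctl_path KEY: print the /proc/sys file behind a sysctl key.
# Hypothetical helper; it rewrites the name and changes no setting.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

sysctl_path kernel.panic_on_oops
sysctl_path kernel.unknown_nmi_panic
```

A line such as 'kernel.unknown_nmi_panic = 1' in /etc/sysctl.conf is the usual
way to keep a setting like this across reboots.
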
5) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with
the CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with
CONFIG_XMON and boot the kernel with the xmon=on option.

Following are the ways to remotely issue a soft reset on PowerPC boxes, which
would drop you to XMON. Pressing 'X' (capital letter X) followed by
'Enter' here will trigger the dump.

5.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard inputs.

Once you have HMC configured, the following steps will enable you to trigger
Kdump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

5.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under "Blade
Tasks" in the Blade Management Console. Select the blade for which you want
to initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.

Advanced Setups:

In addition to being able to capture a vmcore to your system's local file
system, kdump can be configured to capture a vmcore to a number of other
locations, including a raw disk partition, a dedicated file system, an NFS
mounted file system, or a remote system via ssh/scp. Additional options
exist for specifying the relative path under which the dump is captured,
what to do if the capture fails, and for compressing and filtering the dump
(so as to produce smaller, more manageable, vmcore files).

In theory, dumping to a location other than the local file system should be
safer than kdump's default setup, as it's possible the default setup will try
dumping to a file system that has become corrupted. The raw disk partition and
dedicated file system options allow you to still dump to the local system,
but without having to remount your possibly corrupted file system(s),
thereby decreasing the chance a vmcore won't be captured. Dumping to an
NFS server or remote system via ssh/scp also has this advantage, as well
as allowing for the centralization of vmcore files, should you have several
systems from which you'd like to obtain vmcore files. Of course, note that
these configurations could present problems if your network is unreliable.

Advanced setups are configured via modifications to /etc/kdump.conf,
which, out of the box, is fairly well documented itself. Any alterations to
/etc/kdump.conf should be followed by a restart of the kdump service, so
the changes can be incorporated in the kdump initrd. Restarting the kdump
service is as simple as '/sbin/systemctl restart kdump.service'.

Note that kdump.conf is used as a configuration mechanism for capturing dump
files from the initramfs (in the interests of safety). The root file system is
mounted, and the init process is started, only as a last resort if the
initramfs fails to capture the vmcore. As such, configuration made in
/etc/kdump.conf only applies to captures performed in the initramfs. If
for any reason the init process is started on the root file system, only a
simple copying of the vmcore from /proc/vmcore to /var/crash/$DATE/vmcore will
be performed.

For both local file system and NFS dumps, the dump target must be mounted
before the kdump initramfs is built. That means you need to put an entry for
the dump file system in /etc/fstab so that after a reboot, when the kdump
service starts, it can find the dump target and build the initramfs instead of
failing. Usually the dump target should be used only for kdump. If you are
worried about someone using the file system for something other than dumping
vmcores, you can mount it read-only. Mkdumprd will still remount it read-write
to create the dump directory and will move it back to read-only afterwards.

Raw partition

Raw partition dumping requires that a disk partition in the system, at least
as large as the amount of memory in the system, be left unformatted. Assuming
/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with
'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly
onto partition /dev/vg/lv_kdump. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd. The dump target should be a persistent device name, such as an LVM or
device mapper canonical name.

Dedicated file system

Similar to raw partition dumping, you can format a partition with the file
system of your choice. Again, it should be at least as large as the amount
of memory in the system. Assuming /dev/vg/lv_kdump has been
formatted ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a
vmcore file will be copied onto the file system after it has been mounted.
Dumping to a dedicated partition has the advantage that you can dump multiple
vmcores to the file system, space permitting, without overwriting previous ones,
as would be the case in a raw partition setup. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to
your kdump initrd. Note that for local file systems ext4 and ext2 are
supported as dumpable targets. Kdump will not prevent you from specifying
other filesystems, and they will most likely work, but their operation
cannot be guaranteed. For instance, specifying a vfat filesystem or msdos
filesystem will result in a successful load of the kdump service, but during
crash recovery, the dump will fail if the system has more than 2GB of memory
(since vfat and msdos filesystems do not support more than 2GB files).
Be careful of your filesystem selection when using this target.

It is recommended to use persistent device names or UUID/LABEL for file system
dumps. One example of a persistent device name is /dev/vg/<devname>.

NFS mount

Dumping over NFS requires an NFS server configured to export a file system
with full read/write access for the root user. All operations done within
the kdump initial ramdisk are done as root, and to write out a vmcore file,
we obviously must be able to write to the NFS mount. Configuring an NFS
server is outside the scope of this document, but either the no_root_squash
or anonuid options on the NFS server side are likely of interest to permit
the kdump initrd operations to write to the NFS mount as root.

Assuming you're exporting /dump on the machine nfs-server.example.com,
once the mount is properly configured, specify it in kdump.conf, via
'nfs nfs-server.example.com:/dump'. The server portion can be specified either
by host name or IP address. Following a system crash, the kdump initrd will
mount the NFS mount and copy out the vmcore to your NFS server. Restart the
kdump service via '/sbin/systemctl restart kdump.service' to commit this change
to your kdump initrd.

Special mount via "dracut_args"

You can use "dracut_args" to pass "--mount" to kdump; see the dracut manpage
for details on the format of "--mount". If any "--mount" is specified
via "dracut_args", kdump will build it in as the mount target without doing any
validation (such as mounting it or checking mount options, fs size, save path,
etc.), so you must test it yourself to make sure it is correct. You cannot use
other targets in /etc/kdump.conf if you use "--mount" in "dracut_args". You
also cannot specify multiple "--mount" targets via "dracut_args".

One use case of "--mount" in "dracut_args" is when you do not want to mount the
dump target before the kdump service starts, for example, to reduce the load on
a shared NFS server. For example:

    dracut_args --mount "192.168.1.1:/share /mnt/test nfs4 defaults"

NOTE:
- <mountpoint> must be specified as an absolute path.

Remote system via ssh/scp

Dumping over ssh/scp requires setting up passwordless ssh keys for every
machine you wish to have dump via this method. First up, configure kdump.conf
for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user'
can be any user on the target system you choose, and 'server' is the host
name or IP address of the target system. Using a dedicated, restricted user
account on the target system is recommended, as there will be keyless ssh
access to this account.

Once kdump.conf is appropriately configured, issue the command
'kdumpctl propagate' to automatically set up the ssh host keys and transmit
the necessary bits to the target server. You'll have to type in 'yes'
to accept the host key for your target server if this is the first time
you've connected to it, and then input the target system user's password
to send over the necessary ssh key file. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd.

Path
====
"path" represents the file system path in which the vmcore will be saved. In
fact, kdump creates a directory $hostip-$date within "path" and saves the
vmcore there. So in practice the dump is saved in $path/$hostip-$date/. To
simplify further discussion, if we say the dump will be saved in $path, it
is implied that kdump will create another directory inside path and
save the vmcore there.

If a dump target is specified in kdump.conf, then "path" is relative to the
specified dump target. For example, if the dump target is "ext4 /dev/sda", then
the dump will be saved in the "$path" directory on /dev/sda.

The same is the case for an NFS dump. If the user specified
"nfs foo.com:/export/tmp/" as the dump target and "path" is /var/crash,
then the dump will effectively be saved in the
"foo.com:/export/tmp/var/crash/" directory.

The interpretation of path changes a bit if the user has not specified a dump
target explicitly in kdump.conf. In this case, "path" represents the
absolute path from root, and the dump target and adjusted path are derived
automatically depending on what's mounted in the current system.

Following are a few examples.

path /var/crash/
----------------
Assuming there is no disk mounted on /var/ or on /var/crash, the dump will
be saved on the disk backing rootfs in the directory /var/crash.

path /var/crash/ (A separate disk mounted on /var)
--------------------------------------------------
Say a disk /dev/sdb is mounted on /var. In this case the dump target will
become /dev/sdb, path will become "/crash" and the dump will be saved
in the "sdb:/crash/" directory.

path /var/crash/ (NFS mounted on /var)
--------------------------------------
Say foo.com:/export/tmp is mounted on /var. In this case the dump target is
the NFS server, path will be adjusted to "/crash" and the dump will be saved
to the foo.com:/export/tmp/crash/ directory.

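The path adjustment in the examples above amounts to stripping the mount point
from the front of "path". The sketch below illustrates the idea only and is
not kdump's own code; the helper names adjusted_path and dump_dir, and the
sample host address and date, are assumptions made for illustration.

```shell
# adjusted_path PATH MOUNTPOINT: print PATH relative to the file system
# mounted on MOUNTPOINT. Hypothetical helper.
adjusted_path() {
    path=${1%/}
    mnt=${2%/}
    case "$path/" in
        "$mnt"/*) rel=${path#"$mnt"}; echo "${rel:-/}" ;;
        *)        echo "$path" ;;            # PATH is not below MOUNTPOINT
    esac
}

# dump_dir PATH HOSTIP DATE: print the directory created inside PATH,
# following the $path/$hostip-$date/ layout described above.
dump_dir() {
    printf '%s/%s-%s/\n' "${1%/}" "$2" "$3"
}

adjusted_path /var/crash/ /var    # a disk or NFS export mounted on /var
adjusted_path /var/crash/ /       # nothing mounted on /var or /var/crash
dump_dir /crash 192.168.0.10 2006-08-23-15:34
```
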
Kdump boot directory
====================
Usually the kdump kernel is the same as the 1st kernel, so kdump will try to
find the kdump kernel under /boot according to /proc/cmdline. For example, if
we execute the command below and get the following output:
    cat /proc/cmdline
    BOOT_IMAGE=/xxx/vmlinuz-3.yyy.zzz root=xxxx .....
then the kdump kernel will be /boot/xxx/vmlinuz-3.yyy.zzz.
However, the variable KDUMP_BOOTDIR in /etc/sysconfig/kdump can be set if
the kdump kernel is kept in a different directory.

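The lookup described above can be sketched as follows. This is a simplified
illustration rather than kdump's actual implementation: the helper name
kdump_kernel_path is made up, and it assumes BOOT_IMAGE holds a plain path
with no boot loader device prefix.

```shell
# kdump_kernel_path CMDLINE [BOOTDIR]: print the kernel path derived from
# the BOOT_IMAGE= word of CMDLINE, looked up under BOOTDIR (default /boot).
# Hypothetical helper; exits non-zero if CMDLINE has no BOOT_IMAGE= word.
kdump_kernel_path() (
    set -f                          # no globbing while splitting CMDLINE
    bootdir=${2:-/boot}
    for word in $1; do
        case "$word" in
            BOOT_IMAGE=*)
                echo "$bootdir${word#BOOT_IMAGE=}"
                exit 0
                ;;
        esac
    done
    exit 1
)

# The second argument plays the role that KDUMP_BOOTDIR plays for kdump.
kdump_kernel_path "BOOT_IMAGE=/xxx/vmlinuz-3.yyy.zzz root=xxxx ro"
```

On a running system the first argument would be the contents of /proc/cmdline.
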
Kdump Post-Capture Executable

It is possible to specify a custom script or binary you wish to run following
an attempt to capture a vmcore. The executable is passed an exit code from
the capture process, which can be used to trigger different actions from
within your post-capture executable.

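A post-capture script might branch on that exit code along these lines. This
is only a sketch: the function name post_capture and the messages are made up,
and it assumes the exit code arrives as the first argument of the script.

```shell
#!/bin/sh
# Hypothetical post-capture handler. Assumption: the exit code of the
# capture process is passed as the first argument.
post_capture() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "kdump-post: vmcore captured"
    else
        echo "kdump-post: capture failed with exit code $status"
    fi
}

# Treat a missing argument as success so the sketch can be run by hand.
post_capture "${1:-0}"
```

Any binaries such a script relies on have to be available in the kdump initrd.
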
Kdump Pre-Capture Executable

It is possible to specify a custom script or binary you wish to run before
capturing a vmcore. The exit status of this binary is interpreted as follows:
0 - continue with dump process as usual
non 0 - reboot the system

Extra Binaries

If you have specific binaries or scripts you want to have made available
within your kdump initrd, you can specify them by their full path, and they
will be included in your kdump initrd, along with all dependent libraries.
This may be particularly useful for those running post-capture scripts that
rely on other binaries.

Extra Modules

By default, only the bare minimum of kernel modules will be included in your
kdump initrd. Should you wish to capture your vmcore files to a non-boot-path
storage device, such as an iscsi target disk or clustered file system, you may
need to manually specify additional kernel modules to load into your kdump
initrd.

Default action
==============
The default action specifies what to do when dumping to the configured dump
target fails. By default, the default action is "reboot", that is, the system
reboots if the attempt to save the dump to the dump target fails.

There are other default actions available though.

- dump_to_rootfs
  This option tries to mount root and save the dump on the root filesystem
  in the path specified by "path". This option will generally make
  sense when the dump target is not the root filesystem. For example, if
  the dump is being saved over the network using "ssh", then one can specify
  default to "dump_to_rootfs" to try saving the dump to the root filesystem
  if the dump over the network fails.

- shell
  Drop into a shell session inside initramfs.
- halt
  Halt system after failure.
- poweroff
  Poweroff system after failure.

Compression and filtering

The 'core_collector' parameter in kdump.conf allows you to specify a custom
dump capture method. The most common alternate method is makedumpfile, which
is a dump filtering and compression utility provided with kexec-tools. On
some architectures, it can drastically reduce the size of your vmcore files,
which becomes very useful on systems with large amounts of memory.

A typical setup is 'core_collector makedumpfile -F -l --message-level 1 -d 31',
but check the output of '/sbin/makedumpfile --help' for a list of all available
options (-i and -g don't need to be specified, they're automatically taken care
of). Note that use of makedumpfile requires that the kernel-debuginfo package
corresponding with your running kernel be installed.

The core collector command format depends on the dump target type. Typically,
for a file system target (local or remote), core_collector should accept two
arguments: the first is the source file and the second is the target file. For
example:

ex1.
----
core_collector "cp --sparse=always"

Above will effectively be translated to:

cp --sparse=always /proc/vmcore <dest-path>/vmcore

ex2.
----
core_collector "makedumpfile -l --message-level 1 -d 31"

Above will effectively be translated to:

makedumpfile -l --message-level 1 -d 31 /proc/vmcore <dest-path>/vmcore

For dump targets like raw and ssh, in general, the core collector should
expect one argument (the source file) and should write the processed core to
standard output (there is one exception, "scp", discussed later). This
standard output will be saved to the destination using appropriate commands.

raw dumps core_collector examples:
----------------------------------
ex3.
----
core_collector "cat"

Above will effectively be translated to:

cat /proc/vmcore | dd of=<target-device>

ex4.
----
core_collector "makedumpfile -F -l --message-level 1 -d 31"

Above will effectively be translated to:

makedumpfile -F -l --message-level 1 -d 31 /proc/vmcore | dd of=<target-device>

ssh dumps core_collector examples:
|
|
|
ab224c |
---------
|
|
|
ab224c |
ex5.
|
|
|
ab224c |
---
|
|
|
ab224c |
core_collector "cat"
|
|
|
ab224c |
|
|
|
ab224c |
Above will effectively be translated to.
|
|
|
ab224c |
|
|
|
ab224c |
cat /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"
|
|
|
ab224c |
|
|
|
ab224c |
ex6.
|
|
|
ab224c |
---
|
|
|
765b01 |
core_collector "makedumpfile -F -l --message-level 1 -d 31"
|
|
|
ab224c |
|
|
|
ab224c |
Above will effectively be translated to.
|
|
|
ab224c |
|
|
|
765b01 |
makedumpfile -F -l --message-level 1 -d 31 | ssh <options> <remote-location> "dd of=path/vmcore"
|
|
|
ab224c |
|
|
|
ab224c |
There is one exception to standard output rule for ssh dumps. And that is
|
|
|
ab224c |
scp. As scp can handle ssh destinations for file transfers, one can
|
|
|
ab224c |
specify "scp" as the core collector for ssh targets (no output on stdout).
|
|
|
ab224c |
|
|
|
ab224c |
ex7.
|
|
|
ab224c |
----
|
|
|
ab224c |
core_collector "scp"
|
|
|
ab224c |
|
|
|
ab224c |
Above will effectively be translated to:
|
|
|
ab224c |
|
|
|
ab224c |
scp /proc/vmcore <user@host>:path/vmcore
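
The translations in the examples above follow a simple pattern, which can
be sketched as a small shell function. This is only an illustration:
show_dump_cmd is a hypothetical helper, not part of kexec-tools, the ssh
<options> are left out for brevity, and the ssh destination is assumed to
be given as user@host:path.

```shell
#!/bin/bash
# Hypothetical helper (not part of kexec-tools): print the command line that
# a core_collector setting is effectively translated to for a dump target.
# usage: show_dump_cmd <target-type> <core_collector> <destination>
show_dump_cmd() {
    local target="$1" collector="$2" dest="$3"
    case "$target" in
        raw)
            # raw: the collector writes to stdout, dd saves it to the device
            printf '%s\n' "$collector /proc/vmcore | dd of=$dest"
            ;;
        ssh)
            if [ "$collector" = "scp" ]; then
                # scp copies the file itself; nothing goes through stdout
                printf '%s\n' "scp /proc/vmcore $dest/vmcore"
            else
                # ${dest%%:*} is user@host, ${dest#*:} is the remote path
                printf '%s\n' "$collector /proc/vmcore | ssh ${dest%%:*} \"dd of=${dest#*:}/vmcore\""
            fi
            ;;
        *)
            # other targets: the collector writes the vmcore file directly
            printf '%s\n' "$collector /proc/vmcore $dest/vmcore"
            ;;
    esac
}

show_dump_cmd raw "cat" /dev/sdb1
show_dump_cmd ssh "scp" user@host:/var/crash
show_dump_cmd fs "makedumpfile -l --message-level 1 -d 31" /var/crash
```

The device name, host and paths in the three calls are placeholders. The
function only prints the command; it does not run it.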
|
|
|
ab224c |
|
|
|
ab224c |
About default core collector
|
|
|
ab224c |
----------------------------
|
|
|
ab224c |
The default core_collector for ssh/raw dumps is:
|
|
|
765b01 |
"makedumpfile -F -l --message-level 1 -d 31".
|
|
|
ab224c |
The default core_collector for other targets is:
|
|
|
765b01 |
"makedumpfile -l --message-level 1 -d 31".
|
|
|
ab224c |
|
|
|
ab224c |
Even if the core_collector option is commented out in kdump.conf, makedumpfile
|
|
|
ab224c |
is the default core collector and kdump uses it internally.
|
|
|
ab224c |
|
|
|
ab224c |
If one does not want makedumpfile as the default core_collector, then they
|
|
|
ab224c |
need to specify one using the core_collector option to change the behavior.
|
|
|
ab224c |
|
|
|
ab224c |
Note: If "makedumpfile -F" is used then you will get a flattened format
|
|
|
ab224c |
vmcore.flat. You will need to use "makedumpfile -R" to rearrange the
|
|
|
ab224c |
dump data from standard input into a normal dumpfile (readable with analysis
|
|
|
ab224c |
tools).
|
|
|
ab224c |
For example: "makedumpfile -R vmcore < vmcore.flat"
|
|
|
ab224c |
|
|
|
ab224c |
Caveats:
|
|
|
ab224c |
|
|
|
ab224c |
Console frame-buffers and X are not properly supported. If you typically run
|
|
|
ab224c |
with something along the lines of "vga=791" on your kernel command line or
|
|
|
ab224c |
have X running, console video will be garbled when a kernel is booted via
|
|
|
ab224c |
kexec. Note that the kdump kernel should still be able to create a dump,
|
|
|
ab224c |
and when the system reboots, video should be restored to normal.
|
|
|
ab224c |
|
|
|
ab224c |
|
|
|
ab224c |
Notes on resetting video:
|
|
|
ab224c |
|
|
|
ab224c |
Video is a notoriously difficult issue with kexec. Video cards contain ROM code
|
|
|
ab224c |
that controls their initial configuration and setup. This code is nominally
|
|
|
ab224c |
accessed and executed from the BIOS, and otherwise not safely executable. Since
|
|
|
ab224c |
the purpose of kexec is to reboot the system without re-executing the BIOS, it
|
|
|
ab224c |
is rather difficult if not impossible to reset video cards with kexec. The
|
|
|
ab224c |
result is that if a system crashes while running in a graphical mode (e.g.
|
|
|
ab224c |
running X), the screen may appear to become 'frozen' while the dump capture is
|
|
|
ab224c |
taking place. A serial console will of course reveal that the system is
|
|
|
ab224c |
operating and capturing a vmcore image, but a casual observer will see the
|
|
|
ab224c |
system as hung until the dump completes and a true reboot is executed.
|
|
|
ab224c |
|
|
|
ab224c |
There are two possible ways to work around this issue. One is by adding
|
|
|
ab224c |
--reset-vga to the kexec command line options in /etc/sysconfig/kdump. This
|
|
|
ab224c |
tells kdump to write some reasonable default values to the video card register
|
|
|
ab224c |
file, in the hopes of returning it to a text mode such that boot messages are
|
|
|
ab224c |
visible on the screen. It does not work with all video cards however.
|
|
|
ab224c |
Secondly, it may be worth trying to add vga16fb.ko to the extra_modules list in
|
|
|
ab224c |
/etc/kdump.conf. This will attempt to use the video card in framebuffer mode,
|
|
|
ab224c |
which can blank the screen prior to the start of a dump capture.
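
As a minimal sketch of the first workaround: this assumes that your
/etc/sysconfig/kdump passes extra kexec options through a variable named
KEXEC_ARGS. Check the comments in your own copy of the file, since the
variable name may differ between releases.

```shell
# /etc/sysconfig/kdump (fragment)
# Assumption: KEXEC_ARGS holds additional arguments handed to kexec.
# Keep any arguments already listed there and add --reset-vga to them.
KEXEC_ARGS="--reset-vga"
```

The kdump service has to be restarted for the change to take effect.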
|
|
|
ab224c |
|
|
|
ab224c |
Notes on rootfs mount:
|
|
|
ab224c |
Dracut is designed to mount rootfs by default. If rootfs mounting fails it
|
|
|
ab224c |
will refuse to go on. So kdump leaves rootfs mounting to dracut currently.
|
|
|
ab224c |
We make the assumption that a proper root= cmdline is being passed to dracut
|
|
|
ab224c |
initramfs for the time being. If you need to modify "KDUMP_COMMANDLINE=" in
|
|
|
ab224c |
/etc/sysconfig/kdump, you will need to make sure that appropriate root=
|
|
|
ab224c |
options are copied from /proc/cmdline. In general it is best to append
|
|
|
ab224c |
command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing
|
|
|
ab224c |
the original command line completely.
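
If you do replace the command line, the root= option can be picked out of
the running kernel's command line with a few lines of shell. The function
below is a sketch only: get_root_opt is a hypothetical name, the sample
command line is made up, and options are assumed to be separated by plain
spaces with no quoting.

```shell
#!/bin/bash
# Hypothetical helper: print every root= option found in a kernel command
# line string passed as the first argument.
get_root_opt() {
    local opt
    # Unquoted $1 is deliberate: split the command line into single options.
    for opt in $1; do
        case "$opt" in
            root=*) printf '%s\n' "$opt" ;;
        esac
    done
}

# Made-up sample; on a real system pass "$(cat /proc/cmdline)" instead.
get_root_opt "BOOT_IMAGE=/vmlinuz ro root=/dev/mapper/vg-root rhgb quiet"
```

The printed option can then be included in the value given to
"KDUMP_COMMANDLINE=".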
|
|
|
ab224c |
|
|
|
e35838 |
Notes on watchdog module handling:
|
|
|
e35838 |
|
|
|
e35838 |
If a watchdog is active in the first kernel, then we must have its module
|
|
|
e35838 |
loaded in the crash kernel, so that the watchdog is either deactivated or starts
|
|
|
e35838 |
being kicked in the second kernel. Otherwise, we might face a watchdog reboot
|
|
|
e35838 |
when the vmcore is being saved. When the dracut watchdog module is enabled, it
|
|
|
e35838 |
installs the kernel watchdog module of the active watchdog device in the initrd.
|
|
|
e35838 |
kexec-tools always adds "-a watchdog" to the dracut_args if there exists at
|
|
|
e35838 |
least one active watchdog and the user has not explicitly added "-o watchdog"
|
|
|
e35838 |
in dracut_args of kdump.conf. If a watchdog module (such as hp_wdt) has
|
|
|
e35838 |
not been written using the watchdog-core framework then this option will not have
|
|
|
e35838 |
any effect and the module will not be added. Please note that only the systemd
|
|
|
e35838 |
watchdog daemon is supported as the watchdog kick application.
|
|
|
e35838 |
|
|
|
e35838 |
Parallel Dumping Operation
|
|
|
e35838 |
==========================
|
|
|
e35838 |
Kexec allows kdump to use multiple cpus, so the parallel feature can accelerate
|
|
|
e35838 |
dumping substantially, especially when performing compression and filtering.
|
|
|
e35838 |
For example:
|
|
|
e35838 |
|
|
|
e35838 |
1."makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile"
|
|
|
e35838 |
2."makedumpfile -c /proc/vmcore dumpfile"
|
|
|
e35838 |
|
|
|
e35838 |
1 has better performance than 2, if THREAD_NUM is larger than two
|
|
|
e35838 |
and the number of usable cpus is larger than THREAD_NUM.
|
|
|
e35838 |
|
|
|
e35838 |
Notes on how to use multiple cpus on a capture kernel on an x86 system:
|
|
|
e35838 |
|
|
|
e35838 |
Make sure that you are using a kernel that supports the disable_cpu_apicid
|
|
|
e35838 |
kernel option as a capture kernel, which is needed to avoid an x86 specific
|
|
|
e35838 |
hardware issue (*). The disable_cpu_apicid kernel option is automatically
|
|
|
e35838 |
appended by the kdumpctl script and is ignored if the kernel doesn't support it.
|
|
|
e35838 |
|
|
|
e35838 |
You need to specify how many cpus are to be used in a capture kernel by specifying
|
|
|
e35838 |
the number of cpus in the nr_cpus kernel option in /etc/sysconfig/kdump. nr_cpus
|
|
|
e35838 |
is 1 by default.
|
|
|
e35838 |
|
|
|
e35838 |
You should use a necessary and sufficient number of cpus on a capture kernel.
|
|
|
e35838 |
Warning: Don't use too many cpus on a capture kernel, or the capture kernel
|
|
|
e35838 |
may panic due to Out Of Memory.
|
|
|
e35838 |
|
|
|
e35838 |
(*) Without the disable_cpu_apicid kernel option, the capture kernel may lead to a
|
|
|
e35838 |
hang, system reset or power-off at boot, depending on your system and runtime
|
|
|
e35838 |
situation at the time of crash.
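
Putting the two settings together, a configuration for parallel dumping
might look like the sketch below. The values are illustrative assumptions:
4 cpus and 3 threads are chosen only so that THREAD_NUM is larger than two
and smaller than the number of usable cpus, and the other options in
KDUMP_COMMANDLINE_APPEND stand in for whatever your file already lists.

```shell
# /etc/sysconfig/kdump (fragment)
# Change only nr_cpus; keep the rest of the existing option list as it is.
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 reset_devices"

# Matching /etc/kdump.conf line, shown as a comment because kdump.conf is
# not shell syntax:
#   core_collector "makedumpfile -c --message-level 1 -d 31 --num-threads 3"
```

Remember that more cpus in the capture kernel need more reserved memory, so
raise these numbers with care.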
|
|
|
e35838 |
|
|
|
ab224c |
Debugging Tips
|
|
|
ab224c |
--------------
|
|
|
ab224c |
- One can drop into a shell before/after saving the vmcore with the help of
|
|
|
ab224c |
the kdump_pre/kdump_post hooks. Use the following in one of the pre/post
|
|
|
ab224c |
scripts to drop into a shell.
|
|
|
ab224c |
|
|
|
ab224c |
#!/bin/bash
|
|
|
ab224c |
_ctty=/dev/ttyS0
|
|
|
ab224c |
setsid /bin/sh -i -l 0<>$_ctty 1<>$_ctty 2<>$_ctty
|
|
|
ab224c |
|
|
|
ab224c |
One might have to change the terminal depending on what they are using.
|
|
|
ab224c |
|
|
|
ab224c |
- Serial console logging for virtual machines
|
|
|
ab224c |
|
|
|
ab224c |
I generally use "virsh console <domain-name>" to get to serial console.
|
|
|
ab224c |
I noticed that after the dump is saved the system reboots, and when the grub menu shows up
|
|
|
ab224c |
some of the previously logged messages are no longer there. That means
|
|
|
ab224c |
any important debugging info at the end will be lost.
|
|
|
ab224c |
|
|
|
ab224c |
One can log the serial console as follows to make sure messages are not lost.
|
|
|
ab224c |
|
|
|
ab224c |
virsh ttyconsole <domain-name>
|
|
|
ab224c |
ln -s <name-of-tty> /dev/modem
|
|
|
ab224c |
minicom -C /tmp/console-logs
|
|
|
ab224c |
|
|
|
ab224c |
Now minicom should be logging the serial console to the file /tmp/console-logs.
|
|
|
ab224c |
|
|
|
ab224c |
|