Kexec/Kdump HOWTO

Introduction

Kexec and kdump are new features in the 2.6 mainstream kernel. These features
are included in Red Hat Enterprise Linux 5. The purpose of these features
is to ensure faster boot up and creation of reliable kernel vmcores for
diagnostic purposes.

Overview

Kexec

Kexec is a fastboot mechanism which allows booting a Linux kernel from the
context of an already running kernel without going through the BIOS. The BIOS
can be very time consuming, especially on big servers with lots of peripherals.
This can save a lot of time for developers who end up booting a machine
numerous times.

Kdump

Kdump is a new kernel crash dumping mechanism and is very reliable because
the crash dump is captured from the context of a freshly booted kernel and
not from the context of the crashed kernel. Kdump uses kexec to boot into
a second kernel whenever the system crashes. This second kernel, often called
a capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses
to boot. Kexec enables booting the capture kernel without going through the
BIOS, hence the contents of the first kernel's memory are preserved, which is
essentially the kernel crash dump.

Kdump is supported on the i686, x86_64, ia64 and ppc64 platforms. The
standard kernel and capture kernel are one and the same on i686, x86_64,
ia64 and ppc64.

If you're reading this document, you should already have kexec-tools
installed. If not, you can install it via the following command:

    # yum install kexec-tools

Now load a kernel with kexec:

    # kver=`uname -r`
    # kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
        --command-line="`cat /proc/cmdline`"

NOTE: The above will boot you back into the kernel you're currently running.
If you want to load a different kernel, substitute its version in place of
`uname -r`.

Now reboot your system, taking note that it should bypass the BIOS:

    # reboot

How to configure kdump:

Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it via the following command:

    # yum install kexec-tools

To be able to do much of anything interesting in the way of debug analysis,
you'll also need to install the kernel-debuginfo package, of the same arch
as your running kernel, and the crash utility:

    # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we need to modify some boot parameters to reserve a chunk of memory for
the capture kernel. With the help of grubby, it's very easy to append
"crashkernel=128M" to the end of your kernel boot parameters. The value given
(128M here) is the amount of memory to reserve for the capture kernel.
Depending on arch and system configuration, more than 128M may need to be
reserved for kdump. You need to experiment and test kdump; if 128M is not
sufficient, try reserving more memory.

   # grubby --args="crashkernel=128M" --update-kernel=/boot/vmlinuz-`uname -r`

Note that there is an alternative form in which to specify a crashkernel
memory reservation, in the event that more control is needed over the size and
placement of the reserved memory.  The format is:

crashkernel=range1:size1[,range2:size2,...][@offset]

Where range<n> specifies a range of values that are matched against the amount
of physical RAM present in the system, and the corresponding size<n> value
specifies the amount of memory to reserve for the capture kernel.  For example:

crashkernel=512M-2G:64M,2G-:128M

This line tells the kernel to reserve 64M of RAM if the system contains between
512M and 2G of physical memory.  If the system contains 2G or more of physical
memory, 128M should be reserved.

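As a rough illustration of how the range form is evaluated, the following bash
sketch picks the reservation for a given amount of RAM. The function names
(to_mb, crashkernel_size) are made up for this example and are not part of the
kernel or kexec-tools. The sketch handles only the M and G suffixes, ignores
any @offset, and assumes each range includes its lower bound and excludes its
upper bound.

```bash
#!/bin/bash
# Hypothetical helper: convert a size such as 512M or 2G to megabytes.
to_mb() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Hypothetical helper: print the size selected by a range-style crashkernel
# value ($1) for a machine with $2 megabytes of RAM. Returns 1 if no range
# matches.
crashkernel_size() {
    local spec="${1%%@*}" ram_mb="$2" entry range size start end
    local IFS=','
    for entry in $spec; do
        range="${entry%%:*}"
        size="${entry#*:}"
        start="$(to_mb "${range%%-*}")" || return 1
        end="${range#*-}"
        if [ -z "$end" ]; then
            # Open-ended range such as "2G-"
            if [ "$ram_mb" -ge "$start" ]; then
                echo "$size"
                return 0
            fi
        else
            end="$(to_mb "$end")" || return 1
            if [ "$ram_mb" -ge "$start" ] && [ "$ram_mb" -lt "$end" ]; then
                echo "$size"
                return 0
            fi
        fi
    done
    return 1
}

crashkernel_size '512M-2G:64M,2G-:128M' 1024
```

With the example value from this document, a machine with 1024 MB of RAM
selects 64M and a machine with 4096 MB selects 128M.
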
After making said changes, reboot your system, so that the reserved memory is
left untouched by the normal system and kept for the capture kernel. Take note
that the output of 'free -m' will show correspondingly less memory than without
this parameter, which is expected. You may be able to get by with less than
128M, but testing with only 64M has proven unreliable of late. On ia64, as much
as 512M may be required.

Now that you've got that reserved memory region set up, you want to enable
the kdump service so that it starts at boot:

    # systemctl enable kdump.service

Then, start up kdump as well:

    # systemctl start kdump.service

This should load your capture kernel image via kexec, leaving the system ready
to capture a vmcore upon crashing. To test this out, you can force-crash
your system by echo'ing a c into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

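Before forcing a crash, it can be worth confirming that a capture kernel is
actually loaded. The sketch below reads /sys/kernel/kexec_crash_loaded, which
contains 1 when a crash kernel has been loaded. The function name kdump_loaded
is invented for this example, and the optional argument exists only so the
check can be pointed at a different file for testing.

```bash
#!/bin/bash
# Hypothetical helper: succeed if the given flag file contains "1".
# Defaults to the sysfs file that reports whether a crash kernel is loaded.
kdump_loaded() {
    local flag="${1:-/sys/kernel/kexec_crash_loaded}"
    [ -r "$flag" ] && [ "$(cat "$flag")" = "1" ]
}

if kdump_loaded; then
    echo "capture kernel is loaded"
else
    echo "capture kernel is NOT loaded"
fi
```

If this reports that no capture kernel is loaded, triggering a crash will not
produce a vmcore, so check the crashkernel reservation and the kdump service
first.
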
You should see some panic output, followed by the system restarting into
the kdump kernel. When the boot process gets to the point where it starts
the kdump service, your vmcore should be copied out to disk (by default,
in /var/crash/<YYYY-MM-DD-HH:MM>/vmcore), and the system then reboots back into
your normal kernel.

Once back to your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
        /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...

Saving vmcore-dmesg.txt
-----------------------
The kernel log buffers are one of the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffers are
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt. After
vmcore-dmesg.txt has been written, the vmcore is saved. The destination disk
and directory for vmcore-dmesg.txt are the same as for the vmcore. Note that
the kernel log buffers will not be available if the dump target is a raw device.

Dump Triggering methods:

This section talks about the various ways, other than a Kernel Panic, in which
Kdump can be triggered. The following methods assume that Kdump is configured
on your system, with the scripts enabled as described in the section above.

1) AltSysRq C

Kdump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

http://kbase.redhat.com/faq/FAQ_43_5559.shtm

In addition, on PowerPC boxes, Kdump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.

2) NMI_WATCHDOG

In case a machine has a hard hang, it is quite possible that it does not
respond to keyboard interrupts. As a result, the 'Alt-SysRq' keys will not help
trigger a dump. In such scenarios the NMI watchdog feature can prove useful.
The following link has more details on configuring the NMI watchdog option:

http://kbase.redhat.com/faq/FAQ_85_9129.shtm

Once this feature has been enabled in the kernel, any lockup will result in an
OOPs message being generated, followed by Kdump being triggered.

3) Kernel OOPs

If we want to generate a dump every time the Kernel OOPSes, we can achieve this
by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

This is enabled by default on RHEL5.

4) NMI (Non-Maskable Interrupt) button

In cases where the system is in a hung state and is not accepting keyboard
interrupts, using the NMI button to trigger Kdump can be very useful. An NMI
button is present on most of the newer x86 and x86_64 machines. Please refer
to the user guides/manuals to locate the button, though it is often not very
well documented. In most cases it is hidden behind a small hole on the front
or back panel of the machine. You could use a toothpick or some other
non-conducting probe to press the button.

For example, on the IBM X series 366 machine, the NMI button is located behind
a small hole on the bottom center of the rear panel.

To enable this method of dump triggering using the NMI button, you will need to
set the 'unknown_nmi_panic' option as follows:

   # echo 1 > /proc/sys/kernel/unknown_nmi_panic

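The two /proc settings used by methods 3 and 4 (panic_on_oops and
unknown_nmi_panic) can be inspected together with a short helper. This is only
a sketch: panic_triggers is a made-up name, and the optional directory argument
exists so the function can be tried against a scratch directory instead of
/proc/sys/kernel.

```bash
#!/bin/bash
# Hypothetical helper: print the current value of each panic-related setting
# found in the given directory (default /proc/sys/kernel).
panic_triggers() {
    local dir="${1:-/proc/sys/kernel}" name
    for name in panic_on_oops unknown_nmi_panic; do
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        else
            # Not every architecture or kernel provides every file
            printf '%s=unavailable\n' "$name"
        fi
    done
}

panic_triggers
```

A value of 1 means the corresponding event leads to a panic, and therefore to a
dump when kdump is set up.
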
5) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with both
the CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with
CONFIG_XMON and boot the kernel with the xmon=on option.

Following are the ways to remotely issue a soft reset on PowerPC boxes, which
would drop you to XMON. Pressing an 'X' (capital letter X) followed by
'Enter' there will trigger the dump.

5.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard inputs.

Once you have HMC configured, the following steps will enable you to trigger
Kdump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

5.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under "Blade Tasks"
in the Blade Management Console. Select the blade for which you want to
initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.

Advanced Setups:

In addition to being able to capture a vmcore to your system's local file
system, kdump can be configured to capture a vmcore to a number of other
locations, including a raw disk partition, a dedicated file system, an NFS
mounted file system, or a remote system via ssh/scp. Additional options
exist for specifying the relative path under which the dump is captured,
what to do if the capture fails, and for compressing and filtering the dump
(so as to produce smaller, more manageable, vmcore files).

In theory, dumping to a location other than the local file system should be
safer than kdump's default setup, as it's possible the default setup will try
dumping to a file system that has become corrupted. The raw disk partition and
dedicated file system options allow you to still dump to the local system,
but without having to remount your possibly corrupted file system(s),
thereby decreasing the chance a vmcore won't be captured. Dumping to an
NFS server or remote system via ssh/scp also has this advantage, as well
as allowing for the centralization of vmcore files, should you have several
systems from which you'd like to obtain vmcore files. Of course, note that
these configurations could present problems if your network is unreliable.

Advanced setups are configured via modifications to /etc/kdump.conf,
which, out of the box, is fairly well documented itself. Any alterations to
/etc/kdump.conf should be followed by a restart of the kdump service, so
the changes can be incorporated in the kdump initrd. Restarting the kdump
service is as simple as '/sbin/systemctl restart kdump.service'.

Note that kdump.conf is used as a configuration mechanism for capturing dump
files from the initramfs (in the interests of safety). The root file system is
mounted, and the init process is started, only as a last resort if the
initramfs fails to capture the vmcore.  As such, configuration made in
/etc/kdump.conf is only applicable to captures performed in the initramfs.  If
for any reason the init process is started on the root file system, only a
simple copying of the vmcore from /proc/vmcore to /var/crash/$DATE/vmcore will
be performed.

For both local filesystem and nfs dumps, the dump target must be mounted before
the kdump initramfs is built. That means one needs to put an entry for the dump
file system in /etc/fstab so that, after a reboot, when the kdump service
starts it can find the dump target and build the initramfs instead of failing.
Usually the dump target should be used only for kdump. If you worry about
someone using the filesystem for something other than dumping vmcores,
you can mount it read-only. Mkdumprd will still remount it read-write
to create the dump directory and will move it back to read-only afterwards.

Raw partition

Raw partition dumping requires that a disk partition in the system, at least
as large as the amount of memory in the system, be left unformatted. Assuming
/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with
'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly
onto partition /dev/vg/lv_kdump. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd. The dump target should be a persistent device name, such as an lvm or
device mapper canonical name.

Dedicated file system

Similar to raw partition dumping, you can format a partition with the file
system of your choice. Again, it should be at least as large as the amount
of memory in the system. Assuming /dev/vg/lv_kdump has been
formatted ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a
vmcore file will be copied onto the file system after it has been mounted.
Dumping to a dedicated partition has the advantage that you can dump multiple
vmcores to the file system, space permitting, without overwriting previous ones,
as would be the case in a raw partition setup. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to
your kdump initrd.  Note that for local file systems ext4 and ext2 are
supported as dumpable targets.  Kdump will not prevent you from specifying
other filesystems, and they will most likely work, but their operation
cannot be guaranteed.  For instance, specifying a vfat filesystem or msdos
filesystem will result in a successful load of the kdump service, but during
crash recovery, the dump will fail if the system has more than 2GB of memory
(since vfat and msdos filesystems do not support more than 2GB files).
Be careful of your filesystem selection when using this target.

It is recommended to use persistent device names or UUID/LABEL for file system
dumps. One example of a persistent device name is /dev/vg/<devname>.

NFS mount

Dumping over NFS requires an NFS server configured to export a file system
with full read/write access for the root user. All operations done within
the kdump initial ramdisk are done as root, and to write out a vmcore file,
we obviously must be able to write to the NFS mount. Configuring an NFS
server is outside the scope of this document, but either the no_root_squash
or anonuid options on the NFS server side are likely of interest to permit
the kdump initrd operations to write to the NFS mount as root.

Assuming you're exporting /dump on the machine nfs-server.example.com,
once the mount is properly configured, specify it in kdump.conf, via
'nfs nfs-server.example.com:/dump'. The server portion can be specified either
by host name or IP address. Following a system crash, the kdump initrd will
mount the NFS mount and copy out the vmcore to your NFS server. Restart the
kdump service via '/sbin/systemctl restart kdump.service' to commit this change
to your kdump initrd.

Remote system via ssh/scp

Dumping over ssh/scp requires setting up passwordless ssh keys for every
machine you wish to have dump via this method. First up, configure kdump.conf
for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user'
can be any user on the target system you choose, and 'server' is the host
name or IP address of the target system. Using a dedicated, restricted user
account on the target system is recommended, as there will be keyless ssh
access to this account.

Once kdump.conf is appropriately configured, issue the command
'kdumpctl propagate' to automatically set up the ssh host keys and transmit
the necessary bits to the target server. You'll have to type in 'yes'
to accept the host key for your target server if this is the first time
you've connected to it, and then input the target system user's password
to send over the necessary ssh key file. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd.

Path
====
"path" represents the file system path in which the vmcore will be saved. In
fact kdump creates a directory $hostip-$date within "path" and saves the
vmcore there. So in practice the dump is saved in $path/$hostip-$date/. To
simplify further discussion, when we say the dump will be saved in $path, it
is implied that kdump will create another directory inside path and
save the vmcore there.

If a dump target is specified in kdump.conf, then "path" is relative to the
specified dump target. For example, if the dump target is "ext4 /dev/sda", then
the dump will be saved in the "$path" directory on /dev/sda.

The same is the case for nfs dumps. If the user specified
"nfs foo.com:/export/tmp/" as the dump target, then with the default path of
/var/crash the dump will effectively be saved in the
"foo.com:/export/tmp/var/crash/" directory.

The interpretation of path changes a bit if the user has not specified a dump
target explicitly in kdump.conf. In this case, "path" represents the
absolute path from root, and the dump target and adjusted path are arrived
at automatically depending on what's mounted in the current system.

Following are a few examples.

path /var/crash/
----------------
Assuming there is no disk mounted on /var/ or on /var/crash, the dump will
be saved on the disk backing rootfs in the directory /var/crash.

path /var/crash/ (A separate disk mounted on /var)
--------------------------------------------------
Say a disk /dev/sdb is mounted on /var. In this case the dump target will
become /dev/sdb, the path will become "/crash", and the dump will be saved
in the "sdb:/crash/" directory.

path /var/crash/ (NFS mounted on /var)
--------------------------------------
Say foo.com:/export/tmp is mounted on /var. In this case the dump target is
the nfs server, the path will be adjusted to "/crash", and the dump will be
saved to the foo.com:/export/tmp/crash/ directory.

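The path adjustment in the examples above amounts to removing the mount point
from the front of "path". The bash sketch below shows that idea only; it is not
kdump's own code, the function name adjust_path is invented, and the mount
point is passed in by hand rather than looked up on the running system.

```bash
#!/bin/bash
# Hypothetical helper: given an absolute path ($1) and the mount point that
# contains it ($2), print the path relative to the mounted file system.
adjust_path() {
    local path="$1" mnt="${2%/}"    # "/" becomes "", "/var/" becomes "/var"
    case "$path" in
        "$mnt")   printf '/\n' ;;                      # path is the mount point itself
        "$mnt"/*) printf '%s\n' "${path#"$mnt"}" ;;    # strip the mount point prefix
        *)        printf '%s\n' "$path" ;;             # path is not under this mount
    esac
}

adjust_path /var/crash/ /       # nothing mounted on /var: path is unchanged
adjust_path /var/crash/ /var    # something mounted on /var: path becomes /crash/
```

The first call corresponds to the rootfs example and the second to the separate
disk and NFS examples.
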
Kdump Post-Capture Executable

It is possible to specify a custom script or binary you wish to run following
an attempt to capture a vmcore. The executable is passed an exit code from
the capture process, which can be used to trigger different actions from
within your post-capture executable.

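A post-capture executable might branch on the exit code it receives. The
sketch below assumes the code arrives as the first argument, which this
document does not spell out, so check it against your kdump version. The
function name and the messages are illustrative only.

```bash
#!/bin/bash
# Hypothetical post-capture logic: report on the capture exit code ($1).
# Assumption: 0 means the vmcore was captured, anything else means failure.
handle_capture_status() {
    local status="${1:-1}"
    if [ "$status" -eq 0 ]; then
        echo "vmcore captured"
    else
        echo "vmcore capture failed (exit code $status)"
    fi
}

handle_capture_status "$@"
```

A real script would replace the echo lines with whatever action is wanted in
each case, such as sending a notification.
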
Kdump Pre-Capture Executable

It is possible to specify a custom script or binary you wish to run before
capturing a vmcore. The exit status of this binary is interpreted as follows:
0 - continue with dump process as usual
non 0 - reboot the system

Extra Binaries

If you have specific binaries or scripts you want to have made available
within your kdump initrd, you can specify them by their full path, and they
will be included in your kdump initrd, along with all dependent libraries.
This may be particularly useful for those running post-capture scripts that
rely on other binaries.

Extra Modules

By default, only the bare minimum of kernel modules will be included in your
kdump initrd. Should you wish to capture your vmcore files to a non-boot-path
storage device, such as an iscsi target disk or clustered file system, you may
need to manually specify additional kernel modules to load into your kdump
initrd.

Default action
==============
The default action specifies what to do when dumping to the configured dump
target fails. By default, the default action is "reboot", that is, the system
reboots if the attempt to save the dump to the dump target fails.

There are other default actions available though.

- dump_to_rootfs
	This option tries to mount root and save the dump on the root filesystem
	in a path specified by "path". This option will generally make
	sense when the dump target is not the root filesystem. For example, if
	the dump is being saved over the network using "ssh" then one can set
	default to "dump_to_rootfs" to try saving the dump to the root filesystem
	if the dump over the network fails.

- shell
	Drop into a shell session inside initramfs.
- halt
	Halt system after failure.
- poweroff
	Poweroff system after failure.

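The action names listed above can be checked before one is written into
kdump.conf. This is a small sketch: valid_default_action is a made-up helper
name, not part of kexec-tools, and it knows only the five actions this document
mentions.

```bash
#!/bin/bash
# Hypothetical helper: succeed only for the default actions named in this HOWTO.
valid_default_action() {
    case "$1" in
        reboot|dump_to_rootfs|shell|halt|poweroff) return 0 ;;
        *) return 1 ;;
    esac
}

if valid_default_action dump_to_rootfs; then
    echo "dump_to_rootfs is a known default action"
fi
```
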
Compression and filtering

The 'core_collector' parameter in kdump.conf allows you to specify a custom
dump capture method. The most common alternate method is makedumpfile, which
is a dump filtering and compression utility provided with kexec-tools. On
some architectures, it can drastically reduce the size of your vmcore files,
which becomes very useful on systems with large amounts of memory.

A typical setup is 'core_collector makedumpfile -F -l --message-level 1 -d 31',
but check the output of '/sbin/makedumpfile --help' for a list of all available
options (-i and -g don't need to be specified; they're automatically taken care
of). Note that use of makedumpfile requires that the kernel-debuginfo package
corresponding with your running kernel be installed.

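The value given to -d is a bit mask of page types that makedumpfile leaves out
of the dump, so 31 excludes everything that can be excluded. The bash sketch
below decodes a dump level into those page types. The function name is
invented for this example, and the bit meanings follow the makedumpfile manual
page as I understand it, so verify them against the version installed on your
system.

```bash
#!/bin/bash
# Hypothetical helper: list the page types excluded by a makedumpfile dump
# level (the argument of -d). Accepts values from 0 to 31.
dump_level_excludes() {
    local level="$1"
    if [ "$level" -lt 0 ] || [ "$level" -gt 31 ]; then
        return 1
    fi
    if [ $(( level & 1 ))  -ne 0 ]; then echo "zero pages"; fi
    if [ $(( level & 2 ))  -ne 0 ]; then echo "cache pages"; fi
    if [ $(( level & 4 ))  -ne 0 ]; then echo "private cache pages"; fi
    if [ $(( level & 8 ))  -ne 0 ]; then echo "user data pages"; fi
    if [ $(( level & 16 )) -ne 0 ]; then echo "free pages"; fi
    return 0
}

dump_level_excludes 31
```

A lower level keeps more pages and therefore produces a larger vmcore; level 0
filters nothing.
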
Core collector command format depends on the dump target type. Typically, for
a filesystem (local/remote), core_collector should accept two arguments.
The first one is the source file and the second one is the target file. For
example:

ex1.
---
core_collector "cp --sparse=always"

The above will effectively be translated to:

cp --sparse=always /proc/vmcore <dest-path>/vmcore

ex2.
---
core_collector "makedumpfile -l --message-level 1 -d 31"

The above will effectively be translated to:

makedumpfile -l --message-level 1 -d 31 /proc/vmcore <dest-path>/vmcore

For dump targets like raw and ssh, in general, the core collector should expect
one argument (the source file) and should output the processed core on standard
output (there is one exception, "scp", discussed later). This standard
output will be saved to the destination using appropriate commands.

raw dumps core_collector examples:
----------------------------------
ex3.
---
core_collector "cat"

The above will effectively be translated to:

cat /proc/vmcore | dd of=<target-device>

ex4.
---
core_collector "makedumpfile -F -l --message-level 1 -d 31"

The above will effectively be translated to:

makedumpfile -F -l --message-level 1 -d 31 /proc/vmcore | dd of=<target-device>

ssh dumps core_collector examples:
----------------------------------
ex5.
---
core_collector "cat"

The above will effectively be translated to:

cat /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"

ex6.
---
core_collector "makedumpfile -F -l --message-level 1 -d 31"

The above will effectively be translated to:

makedumpfile -F -l --message-level 1 -d 31 /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"

There is one exception to the standard output rule for ssh dumps, and that is
scp. As scp can handle ssh destinations for file transfers, one can
specify "scp" as the core collector for ssh targets (no output on stdout).

ex7.
----
core_collector "scp"

The above will effectively be translated to:

scp /proc/vmcore <user@host>:path/vmcore

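The translation rules in the examples above can be summarized in one small
function. This bash sketch only prints the command line that this document
describes for each kind of target; it is not how kdump builds the command
internally. The function name effective_command and the target labels (fs,
nfs, raw, ssh) are assumptions made for illustration, and the ssh options are
left out.

```bash
#!/bin/bash
# Hypothetical helper: print the command that a core_collector setting is
# effectively translated to.
#   $1  kind of dump target: fs, nfs, raw or ssh
#   $2  the core_collector string
#   $3  destination: directory (fs, nfs), device (raw) or user@host (ssh)
effective_command() {
    local target="$1" collector="$2" dest="$3"
    case "$target" in
        fs|nfs)
            # Two arguments: source file and target file
            printf '%s /proc/vmcore %s/vmcore\n' "$collector" "$dest" ;;
        raw)
            # One argument; standard output is written to the device by dd
            printf '%s /proc/vmcore | dd of=%s\n' "$collector" "$dest" ;;
        ssh)
            if [ "$collector" = "scp" ]; then
                # scp is the exception: it copies the file itself
                printf 'scp /proc/vmcore %s:path/vmcore\n' "$dest"
            else
                printf '%s /proc/vmcore | ssh %s "dd of=path/vmcore"\n' \
                    "$collector" "$dest"
            fi ;;
        *)
            return 1 ;;
    esac
}

effective_command raw "cat" "/dev/vg/lv_kdump"
```
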
About default core collector
----------------------------
The default core_collector for ssh/raw dumps is:
"makedumpfile -F -l --message-level 1 -d 31".
The default core_collector for other targets is:
"makedumpfile -l --message-level 1 -d 31".

Even if the core_collector option is commented out in kdump.conf, makedumpfile
is the default core collector and kdump uses it internally.

If one does not want makedumpfile as the default core_collector, then one
needs to specify another using the core_collector option to change the behavior.

Note: If "makedumpfile -F" is used then you will get a flattened format
vmcore.flat. You will need to use "makedumpfile -R" to rearrange the
dump data from standard input into a normal dumpfile (readable with analysis
tools).
For example: "makedumpfile -R vmcore < vmcore.flat"

Caveats:

Console frame-buffers and X are not properly supported. If you typically run
with something along the lines of "vga=791" in your kernel config line or
have X running, console video will be garbled when a kernel is booted via
kexec. Note that the kdump kernel should still be able to create a dump,
and when the system reboots, video should be restored to normal.

Notes on resetting video:

Video is a notoriously difficult issue with kexec.  Video cards contain ROM code
that controls their initial configuration and setup.  This code is nominally
accessed and executed from the BIOS, and otherwise not safely executable. Since
the purpose of kexec is to reboot the system without re-executing the BIOS, it
is rather difficult if not impossible to reset video cards with kexec.  The
result is that if a system crashes while running in a graphical mode (i.e.
running X), the screen may appear to become 'frozen' while the dump capture is
taking place.  A serial console will of course reveal that the system is
operating and capturing a vmcore image, but a casual observer will see the
system as hung until the dump completes and a true reboot is executed.

There are two possibilities to work around this issue.  One is by adding
--reset-vga to the kexec command line options in /etc/sysconfig/kdump.  This
tells kdump to write some reasonable default values to the video card register
file, in the hopes of returning it to a text mode such that boot messages are
visible on the screen.  It does not work with all video cards however.
Secondly, it may be worth trying to add vga16fb.ko to the extra_modules list in
/etc/kdump.conf.  This will attempt to use the video card in framebuffer mode,
which can blank the screen prior to the start of a dump capture.

Notes on rootfs mount:
Dracut is designed to mount rootfs by default. If rootfs mounting fails it
will refuse to go on. So kdump currently leaves rootfs mounting to dracut.
We make the assumption that a proper root= cmdline is being passed to the
dracut initramfs for the time being. If you need to modify
"KDUMP_COMMANDLINE=" in /etc/sysconfig/kdump, you will need to make sure that
the appropriate root= options are copied from /proc/cmdline. In general it is
best to append command line options using "KDUMP_COMMANDLINE_APPEND=" instead
of replacing the original command line completely.

Debugging Tips
--------------
- One can drop into a shell before/after saving the vmcore with the help of
  the kdump_pre/kdump_post hooks. Use the following in one of the pre/post
  scripts to drop into a shell.

  #!/bin/bash
  _ctty=/dev/ttyS0
  setsid /bin/sh -i -l 0<>$_ctty 1<>$_ctty 2<>$_ctty

  One might have to change the terminal depending on what they are using.

- Serial console logging for virtual machines

  I generally use "virsh console <domain-name>" to get to the serial console.
  I noticed that after the dump is saved, the system reboots, and when the grub
  menu shows up some of the previously logged messages are no longer there.
  That means any important debugging info at the end will be lost.

  One can log the serial console as follows to make sure messages are not lost.

  virsh ttyconsole <domain-name>
  ln -s <name-of-tty> /dev/modem
  minicom -C /tmp/console-logs

  Now minicom should be logging the serial console in the file
  /tmp/console-logs.