399b37
Firmware assisted dump (fadump) HOWTO
399b37
399b37
Introduction
399b37
399b37
Firmware-assisted dump is a feature introduced in the 3.4 mainline kernel and
supported only on the powerpc architecture. Its goal is to enable the dump of a
crashed system from a fully reset system, and to minimize the total elapsed
time until the system is back in production use. Complete documentation of the
implementation can be found at Documentation/powerpc/firmware-assisted-dump.txt
in the upstream Linux kernel tree, from version 3.4 onwards.
399b37
399b37
Please note that the firmware-assisted dump feature is only available on Power6
399b37
and above systems with recent firmware versions.
399b37
399b37
Overview
399b37
399b37
Fadump
399b37
399b37
Fadump is a robust kernel crash dumping mechanism that obtains a reliable
kernel crash dump with assistance from firmware. This approach does not use
kexec; instead, firmware assists in booting the kdump kernel while preserving
memory contents. Unlike kdump, the system is fully reset and loaded with a
fresh copy of the kernel. In particular, PCI and I/O devices are reinitialized
and are in a clean, consistent state. This second kernel, often called a
capture kernel, boots with very little memory and captures the dump image.
399b37
399b37
During OS initialization, the first kernel registers the sections of memory
with the Power firmware for dump preservation. These registered sections of
memory are reserved by the first kernel during early boot. When a system
crashes, the Power firmware fully resets the system, preserves all the system
memory contents, and saves the low memory (boot memory, whose size is the
larger of 5% of system RAM or 256MB) to the previously registered region. It
also saves system registers and hardware PTEs.
399b37
399b37
Fadump is supported only on ppc64 platform. The standard kernel and capture
399b37
kernel are one and the same on ppc64.
399b37
399b37
If you're reading this document, you should already have kexec-tools
installed. If not, you can install it with the following command:
399b37
399b37
    # yum install kexec-tools
399b37
399b37
Fadump Operational Flow:
399b37
399b37
Like kdump, fadump also exports the ELF formatted kernel crash dump through
/proc/vmcore. Hence the existing kdump infrastructure can be used to capture
the fadump vmcore. The idea is to keep the functionality transparent to the end
user. From the user's perspective, there is no change in the way the kdump init
script works.
399b37
399b37
However, unlike kdump, fadump does not pre-load the kdump kernel and initrd
into reserved memory; instead, it always uses the default OS initrd during the
second boot after a crash. Hence, for fadump, we rebuild the kdump initrd and
replace the default initrd with it. Before replacing the existing default
initrd, we take a backup of the original for the user's reference. The dracut
package has been enhanced to rebuild the default initrd with vmcore capture
steps. The initrd image is rebuilt as per the configuration in the
/etc/kdump.conf file.
399b37
399b37
The control flow of fadump works as follows:
399b37
01. System panics.
399b37
02. At the crash, the kernel informs the Power firmware that the kernel has
    crashed.
399b37
03. Firmware takes the control and reboots the entire system preserving
399b37
    only the memory (resets all other devices).
399b37
04. The reboot follows the normal booting process (non-kexec).
399b37
05. The boot loader loads the default kernel and initrd from /boot.
399b37
06. The default initrd loads and runs /init
399b37
07. dracut-kdump.sh script present in fadump aware default initrd checks if
399b37
    '/proc/device-tree/rtas/ibm,kernel-dump'  file exists  before executing
399b37
    steps to capture vmcore.
399b37
    (This check will help to bypass the vmcore capture steps during normal boot
399b37
     process.)
399b37
08. Captures dump according to /etc/kdump.conf
09. Is dump capture successful (yes goto 11, no goto 10)
10. Perform the failure action specified in /etc/kdump.conf
    (The default failure action is reboot, if unspecified)
11. Perform the final action specified in /etc/kdump.conf
    (The default final action is reboot, if unspecified)
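
The check that separates a crash boot from a normal boot can be sketched as
follows. This is only an illustration of the test described in the flow above:
the function name is_fadump_capture_boot is hypothetical, and the actual logic
lives in dracut-kdump.sh. The optional argument exists only so the check can be
tried against a different path.

```shell
# Hypothetical helper; the real check is performed by dracut-kdump.sh.
# Succeeds when the device-tree node that marks a crash boot is present.
is_fadump_capture_boot() {
    node=${1:-/proc/device-tree/rtas/ibm,kernel-dump}
    [ -e "$node" ]
}

if is_fadump_capture_boot; then
    echo "crash boot: run the vmcore capture steps"
else
    echo "normal boot: skip the vmcore capture steps"
fi
```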
399b37
399b37
399b37
How to configure fadump:
399b37
399b37
Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it with the following command:
399b37
399b37
    # yum install kexec-tools
399b37
399b37
Make the kernel to be configured with FADump as the default boot entry, if
399b37
it isn't already:
399b37
399b37
   # grubby --set-default=/boot/vmlinuz-<kver>
399b37
399b37
Boot into the kernel to be configured for FADump. To be able to do much of
399b37
anything interesting in the way of debug analysis, you'll also need to install
399b37
the kernel-debuginfo package, of the same arch as your running kernel, and the
399b37
crash utility:
399b37
399b37
    # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash
399b37
399b37
Next up, we need to modify some boot parameters to enable firmware assisted
dump. With the help of grubby, it's very easy to append "fadump=on" to the end
of your kernel boot parameters. To reserve the appropriate amount of memory
for boot memory preservation, pass the 'crashkernel=X' kernel cmdline
parameter. For the recommended value of X, see the 'FADump Memory
Requirements' section.
399b37
399b37
   # grubby --args="fadump=on crashkernel=6G" --update-kernel=/boot/vmlinuz-`uname -r`
399b37
ca2458
By default, FADump reserved memory will be initialized as CMA area to make the
ca2458
memory available through CMA allocator on the production kernel. We can opt out
ca2458
of this, making reserved memory unavailable to production kernel, by booting the
ca2458
linux kernel with 'fadump=nocma' instead of 'fadump=on'.
ca2458
399b37
The term 'boot memory' means the size of the low memory chunk that is required
for a kernel to boot successfully when booted with restricted memory. By
default, the boot memory size is the larger of 5% of system RAM or 256MB.
Alternatively, the user can specify the boot memory size through the boot
parameter 'fadump_reserve_mem=', which overrides the default calculated size.
Use this option if the default boot memory size is not sufficient for the
second kernel to boot successfully.
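
As a rough illustration of the default rule above, the arithmetic can be
written as a small shell function. This is a sketch, not the kernel's own
calculation: default_boot_mem_mb is a hypothetical helper, it works in whole
megabytes, and it ignores any rounding or alignment the kernel may apply.

```shell
# Hypothetical helper: larger of 5% of system RAM or 256 MB, in MB.
# $1 is the total system RAM in MB.
default_boot_mem_mb() {
    total_mb=$1
    five_pct=$(( total_mb * 5 / 100 ))
    if [ "$five_pct" -gt 256 ]; then
        echo "$five_pct"
    else
        echo 256
    fi
}

# Usage on a live system (MemTotal in /proc/meminfo is reported in kB):
#   total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   default_boot_mem_mb $(( total_kb / 1024 ))
```

For example, a 4 GB system falls back to the 256 MB floor, while on a 64 GB
system the 5% term dominates.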
399b37
399b37
After making said changes, reboot your system, so that the specified memory is
399b37
reserved and left untouched by the normal system. Take note that the output of
399b37
'free -m' will show X MB less memory than without this parameter, which is
399b37
expected. If you see OOM (Out Of Memory) error messages while loading capture
399b37
kernel, then you should bump up the memory reservation size.
399b37
399b37
Now that you've got that reserved memory region set up, you want to turn on
399b37
the kdump init script:
399b37
399b37
    # systemctl enable kdump.service
399b37
399b37
Then, start up kdump as well:
399b37
399b37
    # systemctl start kdump.service
399b37
399b37
This should turn on the firmware assisted functionality in kernel by
399b37
echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready
399b37
to capture a vmcore upon crashing. For journaling filesystems like XFS an
399b37
additional step is required to ensure bootloader does not pick the
399b37
older initrd (without vmcore capture scripts):
399b37
399b37
  * If /boot is a separate partition, run the below commands as the root user,
399b37
    or as a user with CAP_SYS_ADMIN rights:
399b37
399b37
        # fsfreeze -f /boot
        # fsfreeze -u /boot
399b37
399b37
  * If /boot is not a separate partition, reboot the system.
399b37
399b37
After reboot check if the kdump service is up and running with:
399b37
399b37
  # systemctl status kdump.service
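
Since the kdump service registers fadump by writing 1 to
/sys/kernel/fadump_registered, that file can also be read back as an extra
check. The sketch below assumes the file simply contains 1 when registered;
the function name fadump_registered is hypothetical, and the optional argument
exists only so the check can be pointed at another file.

```shell
# Hypothetical helper: succeeds when the sysfs file reads "1".
fadump_registered() {
    f=${1:-/sys/kernel/fadump_registered}
    [ -r "$f" ] && [ "$(cat "$f")" = "1" ]
}

if fadump_registered; then
    echo "fadump is registered"
else
    echo "fadump is not registered"
fi
```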
399b37
399b37
To test out whether FADump is configured properly, you can force-crash your
399b37
system by echo'ing a 'c' into /proc/sysrq-trigger:
399b37
399b37
    # echo c > /proc/sysrq-trigger
399b37
399b37
You should see some panic output, followed by the system resetting and booting
into a fresh copy of the kernel. When the default initrd loads and runs /init,
the vmcore should be copied out to disk (by default, in
/var/crash/<YYYY.MM.DD-HH:MM:SS>/vmcore), and the system then reboots back into
your normal kernel.
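
To confirm that a dump was actually written, the crash directory can be
searched for vmcore files. The following is a minimal sketch that assumes the
default /var/crash location mentioned above; list_vmcores is a hypothetical
helper name, and a different dump target configured in /etc/kdump.conf would
need a different path.

```shell
# Hypothetical helper: print every file named "vmcore" under the crash
# directory (default /var/crash), one path per line, sorted.
list_vmcores() {
    crash_dir=${1:-/var/crash}
    [ -d "$crash_dir" ] || return 0
    find "$crash_dir" -type f -name vmcore | sort
}

# Usage:
#   list_vmcores
#   list_vmcores /some/other/dump/path
```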
399b37
399b37
Once back to your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:
399b37
399b37
    # crash /usr/lib/debug/lib/modules/<kver>/vmlinux \
          /var/crash/<YYYY.MM.DD-HH:MM:SS>/vmcore
399b37
399b37
    crash> bt
399b37
399b37
and so on...
399b37
399b37
Saving vmcore-dmesg.txt
399b37
-----------------------
399b37
Kernel log buffers are among the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffers are
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt; the
vmcore is saved afterwards. The destination disk and directory for
vmcore-dmesg.txt are the same as for the vmcore. Note that the kernel log
buffers will not be available if the dump target is a raw device.
399b37
399b37
FADump Memory Requirements:
399b37
399b37
  System Memory          Recommended memory
399b37
--------------------- ----------------------
399b37
    4 GB - 16 GB     :        768 MB
399b37
   16 GB - 64 GB     :       1024 MB
399b37
   64 GB - 128 GB    :          2 GB
399b37
  128 GB - 1 TB      :          4 GB
399b37
    1 TB - 2 TB      :          6 GB
399b37
    2 TB - 4 TB      :         12 GB
399b37
    4 TB - 8 TB      :         20 GB
399b37
    8 TB - 16 TB     :         36 GB
399b37
   16 TB - 32 TB     :         64 GB
399b37
   32 TB - 64 TB     :        128 GB
399b37
   64 TB & above     :        180 GB
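
The table can be turned into a lookup when scripting the crashkernel= value.
The sketch below is only a transcription of the rows above;
recommended_fadump_mem is a hypothetical helper, and it assumes each range
includes its lower bound and excludes its upper bound (the table itself does
not say which row a boundary value such as 16 GB belongs to). Systems below
4 GB are outside the table, so the function returns a non-zero status for them.

```shell
# Hypothetical helper: map system memory (in GB) to the recommended
# reservation from the table, printed in crashkernel= notation.
recommended_fadump_mem() {
    gb=$1
    if   [ "$gb" -lt 4 ];     then return 1   # below the table's range
    elif [ "$gb" -lt 16 ];    then echo 768M
    elif [ "$gb" -lt 64 ];    then echo 1024M
    elif [ "$gb" -lt 128 ];   then echo 2G
    elif [ "$gb" -lt 1024 ];  then echo 4G
    elif [ "$gb" -lt 2048 ];  then echo 6G
    elif [ "$gb" -lt 4096 ];  then echo 12G
    elif [ "$gb" -lt 8192 ];  then echo 20G
    elif [ "$gb" -lt 16384 ]; then echo 36G
    elif [ "$gb" -lt 32768 ]; then echo 64G
    elif [ "$gb" -lt 65536 ]; then echo 128G
    else                           echo 180G
    fi
}

# Usage: echo "crashkernel=$(recommended_fadump_mem 1536)"
```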
399b37
399b37
Things to remember:
399b37
399b37
1) The memory required to boot the capture kernel is a moving target that
   depends on many factors, such as the hardware attached to the system, the
   kernel and modules in use, the packages installed and the services enabled.
   There is no one-size-fits-all value. The above recommendations are based
   only on system memory, so they come with a few assumptions about the
   resources a system of that size could have. Please take them with a pinch
   of salt, and remember to try capturing a dump a few times to confirm that
   the system is configured successfully with dump capturing support.
399b37
399b37
2) Though the memory requirements for FADump seem high, this memory is not
399b37
   completely set aside but made available for userspace applications to use,
399b37
   through the CMA allocator.
399b37
399b37
3) As the same initrd is used for booting the production kernel as well as the
   capture kernel, and the dump is captured in a restricted memory
   environment, a few optimizations (like not including the network dracut
   module, disabling multipath and such) are applied while building the
   initrd. If the production environment needs these optimizations to be
   avoided, the dracut_args option in the /etc/kdump.conf file can be used.
   For example, if a user wishes the network module to be included in the
   initrd, adding the below entry to the /etc/kdump.conf file and restarting
   the kdump service would take care of it.
399b37
399b37
   dracut_args --add "network"
399b37
399b37
4) If FADump is configured to capture vmcore to a remote dump target using SSH
12feb9
   or NFS protocol, the corresponding network interface '<interface-name>' is
12feb9
   renamed to 'kdump-<interface-name>', if it is generic (like *eth# or net#).
12feb9
   It happens because vmcore capture scripts in the initial RAM disk (initrd)
12feb9
   add the 'kdump-' prefix to the network interface name to secure persistent
12feb9
   naming. And as capture kernel and production kernel use the same initrd in
12feb9
   case of FADump, the interface name is changed for the production kernel too.
12feb9
   This is likely to impact network configuration setup for production kernel.
12feb9
   So, it is recommended to use a non-generic name for a network interface,
12feb9
   before setting up FADump to capture vmcore to a remote dump target based on
12feb9
   that network interface, to avoid running into network configuration issues.
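
Whether an interface name counts as generic in the sense of point 4 can be
approximated with a shell pattern. This is a sketch only: is_generic_ifname is
a hypothetical helper, and the patterns are an assumption derived from the
"*eth# or net#" wording above, not the exact rule used by the vmcore capture
scripts.

```shell
# Hypothetical helper: succeeds for names that look like *eth# or net#.
is_generic_ifname() {
    case $1 in
        *eth[0-9]*|net[0-9]*) return 0 ;;
        *)                    return 1 ;;
    esac
}

# Usage:
#   is_generic_ifname eth0 && echo "eth0 would be renamed to kdump-eth0"
```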
399b37
399b37
Dump Triggering methods:
399b37
399b37
This section talks about the various ways, other than a Kernel Panic, in which
399b37
fadump can be triggered. The following methods assume that fadump is configured
399b37
on your system, with the scripts enabled as described in the section above.
399b37
399b37
1) AltSysRq C
399b37
399b37
FADump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:
399b37
399b37
https://fedoraproject.org/wiki/QA/Sysrq
399b37
399b37
In addition, on PowerPC boxes, fadump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.
399b37
399b37
2) Kernel OOPs
399b37
399b37
If we want to generate a dump every time the kernel oopses, we can achieve
this by setting the 'Panic On OOPs' option as follows:
399b37
399b37
    # echo 1 > /proc/sys/kernel/panic_on_oops
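
A value written through /proc/sys does not survive a reboot. One way to keep
the setting is a sysctl drop-in file, sketched below. The file name
90-panic-on-oops.conf and the function name are hypothetical, and the sketch
assumes the system reads /etc/sysctl.d at boot; the directory argument exists
only so the helper can be tried outside /etc.

```shell
# Hypothetical helper: write a sysctl drop-in that enables panic_on_oops.
# $1 is the drop-in directory (default /etc/sysctl.d).
write_panic_on_oops_conf() {
    conf_dir=${1:-/etc/sysctl.d}
    printf 'kernel.panic_on_oops = 1\n' > "$conf_dir/90-panic-on-oops.conf"
}

# Usage (as root), then load the setting without rebooting:
#   write_panic_on_oops_conf
#   sysctl --system
```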
399b37
399b37
3) PowerPC specific methods:
399b37
399b37
On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with the
CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON and
boot the kernel with the xmon=on option.
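
Which of the two cases applies to a given kernel can be read from its build
configuration. The sketch below assumes the configuration is available as a
plain-text file (many distributions install it as /boot/config-<kver>, but
that location is an assumption here); xmon_config_state is a hypothetical
helper name.

```shell
# Hypothetical helper: classify a kernel config file by its XMON options.
# $1 is the path to the kernel config file.
xmon_config_state() {
    cfg=$1
    if grep -q '^CONFIG_XMON_DEFAULT=y' "$cfg"; then
        echo "enabled by default"
    elif grep -q '^CONFIG_XMON=y' "$cfg"; then
        echo "built in, needs xmon=on"
    else
        echo "not built"
    fi
}

# Usage: xmon_config_state "/boot/config-$(uname -r)"
```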
399b37
399b37
The following are the ways to remotely issue a soft reset on PowerPC boxes,
which would drop you to XMON. Pressing 'X' (capital letter X) followed by
'Enter' at the XMON prompt will trigger the dump.
399b37
399b37
3.1) HMC
399b37
399b37
The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard input.
399b37
399b37
Once you have HMC configured, the following steps will enable you to trigger
399b37
fadump via a soft reset:
399b37
399b37
On Power4
399b37
  Using GUI
399b37
399b37
    * In the right pane, right click on the partition you wish to dump.
399b37
    * Select "Operating System->Reset".
399b37
    * Select "Soft Reset".
399b37
    * Select "Yes".
399b37
399b37
  Using HMC Commandline
399b37
399b37
    # reset_partition -m <machine> -p <partition> -t soft
399b37
399b37
On Power5
399b37
  Using GUI
399b37
399b37
    * In the right pane, right click on the partition you wish to dump.
399b37
    * Select "Restart Partition".
399b37
    * Select "Dump".
399b37
    * Select "OK".
399b37
399b37
  Using HMC Commandline
399b37
399b37
    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar
399b37
399b37
3.2) Blade Management Console for Blade Center
399b37
399b37
To initiate a dump operation, go to the Power/Restart option under "Blade
Tasks" in the Blade Management Console. Select the blade for which you want to
initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.
399b37
399b37
399b37
Advanced Setups & Failure action:
399b37
399b37
Kdump and fadump exhibit similar behavior in terms of setup & failure action.
For fadump advanced setup related information, see the "Advanced Setups"
section in the "kexec-kdump-howto.txt" document. For fadump failure action
related information, refer to the "Failure action" section in the same
document.
399b37
399b37
Compression and filtering
399b37
399b37
Refer to the "Compression and filtering" section in the "kexec-kdump-howto.txt"
document. Compression and filtering are the same for kdump & fadump.
399b37
399b37
399b37
Notes on rootfs mount:
399b37
Dracut is designed to mount rootfs by default. If rootfs mounting fails, it
will refuse to go on. So fadump currently leaves rootfs mounting to dracut.
For the time being, we make the assumption that a proper root= cmdline is
being passed to the dracut initramfs. If you need to modify
"KDUMP_COMMANDLINE=" in /etc/sysconfig/kdump, you will need to make sure that
the appropriate root= options are copied from /proc/cmdline. In general, it
is best to append command line options using "KDUMP_COMMANDLINE_APPEND="
instead of replacing the original command line completely.
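
If KDUMP_COMMANDLINE does have to be rewritten, the root= option can be picked
out of the running command line with a few lines of shell. This is a minimal
sketch: root_arg is a hypothetical helper, and it only extracts arguments that
start with "root=" (other root-related options, if any, would have to be
copied separately).

```shell
# Hypothetical helper: print every root=... argument found in the kernel
# command line string passed as $1. Runs in a subshell so that "set -f"
# (no pathname expansion while splitting) does not leak to the caller.
root_arg() (
    set -f
    for arg in $1; do
        case $arg in
            root=*) printf '%s\n' "$arg" ;;
        esac
    done
)

# Usage: root_arg "$(cat /proc/cmdline)"
```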
399b37
399b37
How to disable FADump:
399b37
ca2458
Remove "fadump=on"/"fadump=nocma" from kernel cmdline parameters OR replace
ca2458
it with "fadump=off" kernel cmdline parameter:
399b37
399b37
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=on"
ca2458
or
ca2458
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=nocma"
ca2458
OR
ca2458
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --args="fadump=off"
399b37
399b37
If KDump is to be used as the dump capturing mechanism, update the crashkernel
parameter (otherwise, remove the "crashkernel=" parameter too, using grubby):
399b37
399b37
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --args="crashkernel=auto"
399b37
399b37
Reboot the system for the settings to take effect.
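
After the reboot, the effective setting can be confirmed from the kernel
command line. The sketch below uses a hypothetical helper,
fadump_cmdline_state, that reports the value of the fadump= parameter or
"unset" when it is absent. If the parameter appears more than once, the sketch
reports the last occurrence, which is an assumption about how duplicates are
resolved.

```shell
# Hypothetical helper: print the value of fadump= in the command line
# string passed as $1, or "unset" if the parameter is not present.
fadump_cmdline_state() (
    set -f
    state=unset
    for arg in $1; do
        case $arg in
            fadump=*) state=${arg#fadump=} ;;
        esac
    done
    echo "$state"
)

# Usage: fadump_cmdline_state "$(cat /proc/cmdline)"
```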