Firmware assisted dump (fadump) HOWTO

Introduction

Firmware assisted dump is a feature introduced in the 3.4 mainline kernel and
supported only on the powerpc architecture. The goal of firmware-assisted dump
is to enable the dump of a crashed system, to do so from a fully-reset system,
and to minimize the total elapsed time until the system is back in production
use. Complete documentation on the implementation can be found at
Documentation/powerpc/firmware-assisted-dump.txt in the upstream Linux kernel
tree, from version 3.4 onwards.

Please note that the firmware-assisted dump feature is only available on Power6
and above systems with recent firmware versions.

Overview

Fadump

Fadump is a robust kernel crash dumping mechanism that obtains a reliable kernel
crash dump with assistance from firmware. This approach does not use kexec;
instead, firmware assists in booting the kdump kernel while preserving memory
contents. Unlike kdump, the system is fully reset and loaded with a fresh copy
of the kernel. In particular, PCI and I/O devices are reinitialized and are in
a clean, consistent state. This second kernel, often called a capture kernel,
boots with very little memory and captures the dump image.

The first kernel registers the sections of memory with the Power firmware for
dump preservation during OS initialization. These registered sections of memory
are reserved by the first kernel during early boot. When a system crashes, the
Power firmware fully resets the system, preserves all the system memory
contents, and saves the low memory (boot memory, sized at the larger of 5% of
system RAM or 256MB) to the previously registered region. It also saves the
system registers and hardware PTEs.

Fadump is supported only on the ppc64 platform. The standard kernel and capture
kernel are one and the same on ppc64.

If you're reading this document, you should already have kexec-tools
installed. If not, you can install it via the following command:

    # dnf install kexec-tools

Fadump Operational Flow:

Like kdump, fadump also exports the ELF formatted kernel crash dump through
/proc/vmcore. Hence the existing kdump infrastructure can be used to capture
the fadump vmcore. The idea is to keep the functionality transparent to the end
user. From the user's perspective there is no change in the way the kdump init
script works.

However, unlike kdump, fadump does not pre-load the kdump kernel and initrd into
reserved memory; instead it always uses the default OS initrd during the second
boot after a crash. Hence, for fadump, we rebuild the kdump initrd and replace
the default initrd with it. Before replacing the existing default initrd we take
a backup of the original default initrd for the user's reference. The dracut
package has been enhanced to rebuild the default initrd with vmcore capture
steps. The initrd image is rebuilt as per the configuration in the
/etc/kdump.conf file.

The control flow of fadump works as follows:

01. System panics.
02. At the crash, the kernel informs the Power firmware that the kernel has
    crashed.
03. Firmware takes control and reboots the entire system, preserving
    only the memory (all other devices are reset).
04. The reboot follows the normal booting process (non-kexec).
05. The boot loader loads the default kernel and initrd from /boot.
06. The default initrd loads and runs /init.
07. The dracut-kdump.sh script present in the fadump-aware default initrd checks
    whether the '/proc/device-tree/rtas/ibm,kernel-dump' file exists before
    executing the steps to capture the vmcore.
    (This check bypasses the vmcore capture steps during the normal boot
     process.)
08. Captures the dump according to /etc/kdump.conf.
09. Is the dump capture successful? (yes: goto 11, no: goto 10)
10. Perform the failure action specified in /etc/kdump.conf.
    (The default failure action is reboot, if unspecified.)
11. Perform the final action specified in /etc/kdump.conf.
    (The default final action is reboot, if unspecified.)
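
The check that decides between a normal boot and a dump-capture boot can be
sketched in shell as follows. This is only an illustration of the idea, not
the actual dracut-kdump.sh code: the function name and its optional path
argument are hypothetical, and only the default path is taken from this
document.

```shell
#!/bin/sh
# Sketch of the capture-boot check. is_fadump_capture_boot and its optional
# path argument are hypothetical names used for illustration only.
is_fadump_capture_boot() {
    dump_node="${1:-/proc/device-tree/rtas/ibm,kernel-dump}"
    # The node is expected to exist only when booting after a crash.
    [ -e "$dump_node" ]
}

if is_fadump_capture_boot; then
    echo "fadump capture boot: vmcore capture steps would run"
else
    echo "normal boot: vmcore capture steps are skipped"
fi
```

On a system that has not just crashed, the sketch is expected to report a
normal boot.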

How to configure fadump:

Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it via the following command:

    # dnf install kexec-tools

Make the kernel to be configured with FADump the default boot entry, if
it isn't already:

   # grubby --set-default=/boot/vmlinuz-<kver>

Boot into the kernel to be configured for FADump. To be able to do much of
anything interesting in the way of debug analysis, you'll also need to install
the kernel-debuginfo package, of the same arch as your running kernel, and the
crash utility:

    # dnf --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we can enable firmware assisted dump and reserve the memory for boot
memory preservation as specified in the table of the 'FADump Memory
Requirements' section:

   # kdumpctl reset-crashkernel --fadump=on

Alternatively, you can use grubby to reserve a custom amount of memory:

   # grubby --args="fadump=on crashkernel=6G" --update-kernel=/boot/vmlinuz-`uname -r`

By default, FADump reserved memory will be initialized as a CMA area to make the
memory available through the CMA allocator on the production kernel. We can opt
out of this, making the reserved memory unavailable to the production kernel, by
booting the Linux kernel with 'fadump=nocma' instead of 'fadump=on':

   # kdumpctl reset-crashkernel --fadump=nocma

The term 'boot memory' means the size of the low memory chunk that is required
for a kernel to boot successfully when booted with restricted memory. By
default, the boot memory size will be the larger of 5% of system RAM or 256MB.
Alternatively, the user can also specify the boot memory size through the boot
parameter 'fadump_reserve_mem=', which will override the default calculated
size. Use this option if the default boot memory size is not sufficient for the
second kernel to boot successfully.
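
As a rough illustration of the default sizing rule, the following shell sketch
computes the larger of 5% of system RAM or 256MB. The function name is
hypothetical, the arithmetic is integer-only, and the kernel may additionally
round or align the value, so treat the result as an estimate.

```shell
#!/bin/sh
# Hypothetical helper: default boot memory in MB for a given RAM size in MB,
# following the "larger of 5% of system RAM or 256MB" rule.
default_boot_mem_mb() {
    total_mb=$1
    five_percent=$(( total_mb * 5 / 100 ))
    if [ "$five_percent" -gt 256 ]; then
        echo "$five_percent"
    else
        echo 256
    fi
}

# Estimate for this machine from MemTotal (reported in kB). MemTotal excludes
# memory the kernel has already reserved, so this is only an approximation.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "Estimated default boot memory: $(default_boot_mem_mb $(( total_kb / 1024 ))) MB"
fi
```

If the second kernel fails to boot with a value in this range, that is the
case where 'fadump_reserve_mem=' is meant to be used.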

After making said changes, reboot your system, so that the specified memory is
reserved and left untouched by the normal system. Take note that the output of
'free -m' will show X MB less memory than without this parameter, which is
expected. If you see OOM (Out Of Memory) error messages while loading the
capture kernel, then you should bump up the memory reservation size.

Now that you've got that reserved memory region set up, you want to turn on
the kdump init script:

    # systemctl enable kdump.service

Then, start up kdump as well:

    # systemctl start kdump.service

This should turn on the firmware assisted functionality in the kernel by
echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready
to capture a vmcore upon crashing. For journaling filesystems like XFS an
additional step is required to ensure the bootloader does not pick the
older initrd (without vmcore capture scripts):

  * If /boot is a separate partition, run the below commands as the root user,
    or as a user with CAP_SYS_ADMIN rights:

        # fsfreeze -f /boot
        # fsfreeze -u /boot

  * If /boot is not a separate partition, reboot the system.

After reboot, check if the kdump service is up and running with:

  # systemctl status kdump.service
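
In addition to the service status, the registration state itself can be
inspected. The sketch below reads the /sys/kernel/fadump_registered file
mentioned above; the function name and its optional path argument are
hypothetical, and the interpretation of any value other than 1 as "not
registered" is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: report FADump registration from a sysfs-style file.
# The path argument is optional; the default is the file named in this document.
fadump_registration_status() {
    reg_file="${1:-/sys/kernel/fadump_registered}"
    if [ ! -r "$reg_file" ]; then
        echo "unavailable"
    elif [ "$(cat "$reg_file")" = "1" ]; then
        echo "registered"
    else
        echo "not registered"
    fi
}

echo "FADump: $(fadump_registration_status)"
```

A result of "unavailable" suggests the running kernel does not expose the file
at all, for example on a non-powerpc system.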

To test whether FADump is configured properly, you can force-crash your
system by echo'ing a 'c' into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system resetting and booting
into a fresh copy of the kernel. When the default initrd loads and runs /init,
the vmcore should be copied out to disk (by default, in
/var/crash/<YYYY.MM.DD-HH:MM:SS>/vmcore), and then the system is rebooted back
into your normal kernel.

Once back in your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
    /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...
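
To avoid typing the dump path by hand, a small helper can locate the newest
vmcore. This is a minimal sketch: the function name is hypothetical, it assumes
dumps live one directory level below /var/crash, and the printed command
assumes the crashed kernel is the same version as the currently running one
(uname -r).

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified vmcore under a crash
# directory (default /var/crash, the default dump location).
latest_vmcore() {
    crash_dir="${1:-/var/crash}"
    # ls -t sorts newest first; keep only the first match.
    ls -t "$crash_dir"/*/vmcore 2>/dev/null | head -n 1
}

core=$(latest_vmcore)
if [ -n "$core" ]; then
    # Print the command instead of running it, so it can be reviewed first.
    echo "crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux $core"
else
    echo "no vmcore found under /var/crash"
fi
```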

Saving vmcore-dmesg.txt
-----------------------
Kernel log buffers are among the most important information available
in a vmcore. Before the vmcore is saved, the kernel log buffers are extracted
from /proc/vmcore and saved into a file named vmcore-dmesg.txt. The vmcore is
saved after vmcore-dmesg.txt. The destination disk and directory for
vmcore-dmesg.txt are the same as for the vmcore. Note that kernel log buffers
will not be available if the dump target is a raw device.

FADump Memory Requirements:

  System Memory          Recommended memory
--------------------- ----------------------
    4 GB - 16 GB     :        768 MB
   16 GB - 64 GB     :       1024 MB
   64 GB - 128 GB    :          2 GB
  128 GB - 1 TB      :          4 GB
    1 TB - 2 TB      :          6 GB
    2 TB - 4 TB      :         12 GB
    4 TB - 8 TB      :         20 GB
    8 TB - 16 TB     :         36 GB
   16 TB - 32 TB     :         64 GB
   32 TB - 64 TB     :        128 GB
   64 TB & above     :        180 GB
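
The table can be expressed as a simple lookup. The sketch below uses a
hypothetical function name and takes system memory in whole GB. Because
adjacent rows share a boundary value, it assumes that each lower bound is
inclusive and each upper bound exclusive (so a 16 GB system falls in the
16 GB - 64 GB row); sizes below 4 GB are not covered by the table.

```shell
#!/bin/sh
# Hypothetical helper: map system memory (whole GB) to the recommended
# reservation from the table. Assumes inclusive lower and exclusive upper
# bounds for each row.
fadump_recommended_mem() {
    gb=$1
    if   [ "$gb" -lt 4 ];     then echo "unspecified"
    elif [ "$gb" -lt 16 ];    then echo "768M"
    elif [ "$gb" -lt 64 ];    then echo "1024M"
    elif [ "$gb" -lt 128 ];   then echo "2G"
    elif [ "$gb" -lt 1024 ];  then echo "4G"
    elif [ "$gb" -lt 2048 ];  then echo "6G"
    elif [ "$gb" -lt 4096 ];  then echo "12G"
    elif [ "$gb" -lt 8192 ];  then echo "20G"
    elif [ "$gb" -lt 16384 ]; then echo "36G"
    elif [ "$gb" -lt 32768 ]; then echo "64G"
    elif [ "$gb" -lt 65536 ]; then echo "128G"
    else                           echo "180G"
    fi
}

# Example: a system with 1.5 TB (1536 GB) of memory.
fadump_recommended_mem 1536
```

The value printed could then be used as the crashkernel= size in the grubby
command shown in the configuration section.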

Things to remember:

1) The memory required to boot the capture kernel is a moving target that
   depends on many factors, such as the hardware attached to the system, the
   kernel and modules in use, the packages installed and the services enabled;
   there is no one-size-fits-all value. The above recommendations are based
   only on system memory, and so come with a few assumptions about the
   resources a system of that size could have. So, please take the
   recommendations with a pinch of salt and remember to try capturing a dump
   a few times to confirm that the system is configured successfully with dump
   capturing support.

2) Though the memory requirements for FADump seem high, this memory is not
   completely set aside but made available for userspace applications to use,
   through the CMA allocator (unless 'fadump=nocma' is used).

3) As the same initrd is used for booting the production kernel as well as the
   capture kernel, and with the dump being captured in a restricted memory
   environment, a few optimizations (like not including the network dracut
   module, disabling multipath and such) are applied while building the initrd.
   In case the production environment needs these optimizations to be avoided,
   the dracut_args option in the /etc/kdump.conf file could be leveraged. For
   example, if a user wishes for the network module to be included in the
   initrd, adding the below entry to the /etc/kdump.conf file and restarting
   the kdump service would take care of it.

   dracut_args --add "network"

4) If FADump is configured to capture the vmcore to a remote dump target using
   the SSH or NFS protocol, the corresponding network interface
   '<interface-name>' is renamed to 'kdump-<interface-name>' if its name is
   generic (like *eth# or net#). This happens because the vmcore capture
   scripts in the initial RAM disk (initrd) add the 'kdump-' prefix to the
   network interface name to secure persistent naming. And as the capture
   kernel and production kernel use the same initrd in the case of FADump, the
   interface name is changed for the production kernel too. This is likely to
   impact the network configuration setup for the production kernel. So, it is
   recommended to use a non-generic name for a network interface before
   setting up FADump to capture the vmcore to a remote dump target based on
   that network interface, to avoid running into network configuration issues.

Dump Triggering methods:

This section talks about the various ways, other than a kernel panic, in which
fadump can be triggered. The following methods assume that fadump is configured
on your system, with the scripts enabled as described in the section above.

1) AltSysRq C

FADump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

https://fedoraproject.org/wiki/QA/Sysrq

In addition, on PowerPC boxes, fadump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.

2) Kernel OOPs

If we want to generate a dump every time the kernel oopses, we can achieve this
by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

3) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with
the CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with
CONFIG_XMON and boot the kernel with the xmon=on option.

Following are the ways to remotely issue a soft reset on PowerPC boxes, which
would drop you to XMON. Pressing 'X' (capital letter X) followed by
'Enter' there will trigger the dump.

3.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard input.

Once you have HMC configured, the following steps will enable you to trigger
fadump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

3.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under "Blade Tasks"
in the Blade Management Console. Select the blade for which you want
to initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.

Advanced Setups & Failure action:

Kdump and fadump exhibit similar behavior in terms of setup & failure action.
For information on fadump advanced setups, see the "Advanced Setups" section in
the "kexec-kdump-howto.txt" document. Refer to the "Failure action" section in
the "kexec-kdump-howto.txt" document for information on fadump failure actions.

Compression and filtering

Refer to the "Compression and filtering" section in the "kexec-kdump-howto.txt"
document. Compression and filtering are the same for kdump & fadump.

Notes on rootfs mount:
Dracut is designed to mount rootfs by default. If rootfs mounting fails it
will refuse to go on. So fadump currently leaves rootfs mounting to dracut.
For the time being, we make the assumption that a proper root= cmdline is being
passed to the dracut initramfs. If you need to modify "KDUMP_COMMANDLINE=" in
/etc/sysconfig/kdump, you will need to make sure that the appropriate root=
options are copied from /proc/cmdline. In general it is best to append
command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing
the original command line completely.

How to disable FADump:

Remove "fadump=on"/"fadump=nocma" from the kernel cmdline parameters OR replace
it with the "fadump=off" kernel cmdline parameter:

   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=on"
or
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=nocma"
OR
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --args="fadump=off"

Remove "crashkernel=" from the kernel cmdline parameters:

   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="crashkernel"

If KDump is to be used as the dump capturing mechanism, reset the crashkernel
parameter:

   # kdumpctl reset-crashkernel --fadump=off

Reboot the system for the settings to take effect.
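
After the reboot, the running kernel's command line can be checked to confirm
the change. The sketch below uses a hypothetical function name and simply
reports the value of the last fadump= token it finds, or "absent" if there is
none. It assumes whitespace-separated parameters and does not interpret what
the kernel does with the value.

```shell
#!/bin/sh
# Hypothetical helper: report the fadump= setting found in a kernel command
# line string ("absent" if the parameter is not present).
fadump_cmdline_state() {
    state="absent"
    for tok in $1; do
        case "$tok" in
            # Matches only fadump=..., not fadump_reserve_mem=...
            fadump=*) state="${tok#fadump=}" ;;
        esac
    done
    echo "$state"
}

if [ -r /proc/cmdline ]; then
    echo "fadump: $(fadump_cmdline_state "$(cat /proc/cmdline)")"
fi
```

A result of "absent" or "off" is what one would expect once FADump has been
disabled as described above.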