Firmware assisted dump (fadump) HOWTO

Introduction

Firmware-assisted dump (fadump) is a feature introduced in the 3.4 mainline
kernel and supported only on the powerpc architecture. Its goal is to enable
the dump of a crashed system from a fully reset system, and to minimize the
total elapsed time until the system is back in production use. Complete
documentation of the implementation can be found at
Documentation/powerpc/firmware-assisted-dump.txt in the upstream Linux kernel
tree, version 3.4 and above.

Please note that the firmware-assisted dump feature is only available on Power6
and above systems with recent firmware versions.

Overview

Fadump

Fadump is a robust kernel crash dumping mechanism that obtains a reliable
kernel crash dump with assistance from firmware. This approach does not use
kexec; instead, firmware assists in booting the kdump kernel while preserving
memory contents. Unlike kdump, the system is fully reset and loaded with a
fresh copy of the kernel. In particular, PCI and I/O devices are reinitialized
and are in a clean, consistent state. This second kernel, often called a
capture kernel, boots with very little memory and captures the dump image.

The first kernel registers the sections of memory with the Power firmware for
dump preservation during OS initialization. These registered sections of memory
are reserved by the first kernel during early boot. When a system crashes, the
Power firmware fully resets the system, preserves all the system memory
contents, and saves the low memory (boot memory, whose size is the larger of
5% of system RAM or 256MB) to the previously registered region. It also saves
system registers and hardware PTEs.

Fadump is supported only on the ppc64 platform. The standard kernel and the
capture kernel are one and the same on ppc64.

If you're reading this document, you should already have kexec-tools
installed. If not, you can install it via the following command:

    # dnf install kexec-tools

Fadump Operational Flow:

Like kdump, fadump also exports the ELF formatted kernel crash dump through
/proc/vmcore. Hence, the existing kdump infrastructure can be used to capture
the fadump vmcore. The idea is to keep the functionality transparent to the end
user. From the user's perspective, there is no change in the way the kdump init
script works.

However, unlike kdump, fadump does not pre-load the kdump kernel and initrd
into reserved memory; instead, it always uses the default OS initrd during the
second boot after a crash. Hence, for fadump, we rebuild the kdump initrd and
replace the default initrd with it. Before replacing the existing default
initrd, we take a backup of the original for the user's reference. The dracut
package has been enhanced to rebuild the default initrd with vmcore capture
steps. The initrd image is rebuilt as per the configuration in the
/etc/kdump.conf file.

The control flow of fadump works as follows:
01. System panics.
02. At the crash, the kernel informs the Power firmware that it has crashed.
03. Firmware takes control and reboots the entire system, preserving
    only the memory (all other devices are reset).
04. The reboot follows the normal booting process (non-kexec).
05. The boot loader loads the default kernel and initrd from /boot.
06. The default initrd loads and runs /init.
07. The dracut-kdump.sh script present in the fadump-aware default initrd
    checks whether the '/proc/device-tree/rtas/ibm,kernel-dump' file exists
    before executing the steps to capture the vmcore.
    (This check bypasses the vmcore capture steps during the normal boot
     process.)
08. Captures dump according to /etc/kdump.conf
09. Is dump capture successful (yes goto 11, no goto 10)
10. Perform the failure action specified in /etc/kdump.conf
    (The default failure action is reboot, if unspecified)
11. Perform the final action specified in /etc/kdump.conf
    (The default final action is reboot, if unspecified)
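
The check described in step 07 can be sketched in shell as follows. This is
only an illustration of the idea, not the actual dracut-kdump.sh code; the
function name and its optional path argument (useful for testing) are
hypothetical.

```shell
# Hypothetical helper: succeeds when the firmware has exported a kernel dump,
# i.e. when this boot follows a crash and the vmcore should be captured.
# The optional argument overrides the device-tree path (for testing only).
is_fadump_capture_boot() {
    [ -e "${1:-/proc/device-tree/rtas/ibm,kernel-dump}" ]
}
```

On a normal boot the file is absent, the function fails, and the vmcore
capture steps would be skipped.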


How to configure fadump:

Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it via the following command:

    # dnf install kexec-tools

Set the kernel that is to be configured with FADump as the default boot entry,
if it isn't already:

   # grubby --set-default=/boot/vmlinuz-<kver>

Boot into the kernel to be configured for FADump. To be able to do much of
anything interesting in the way of debug analysis, you'll also need to install
the kernel-debuginfo package, of the same arch as your running kernel, and the
crash utility:

    # dnf --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we can enable firmware assisted dump and reserve the memory for boot
memory preservation as specified in the table in the 'FADump Memory
Requirements' section:

   # kdumpctl reset-crashkernel --fadump=on

Alternatively, you can use grubby to reserve a custom amount of memory:

   # grubby --args="fadump=on crashkernel=6G" --update-kernel=/boot/vmlinuz-`uname -r`

By default, FADump reserved memory will be initialized as a CMA area to make
the memory available through the CMA allocator on the production kernel. We can
opt out of this, making the reserved memory unavailable to the production
kernel, by booting the Linux kernel with 'fadump=nocma' instead of 'fadump=on':

   # kdumpctl reset-crashkernel --fadump=nocma

The term 'boot memory' means the size of the low memory chunk that is required
for a kernel to boot successfully when booted with restricted memory. By
default, the boot memory size will be the larger of 5% of system RAM or 256MB.
Alternatively, the user can specify the boot memory size through the boot
parameter 'fadump_reserve_mem=', which overrides the default calculated size.
Use this option if the default boot memory size is not sufficient for the
second kernel to boot successfully.
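
As a rough illustration of the default rule, the following sketch computes the
larger of 5% of system RAM or 256MB. The function name is hypothetical, and the
kernel may round or align the real value differently, so treat the result as an
estimate only.

```shell
# Hypothetical helper: print the default boot memory size in MB for a system
# with the given amount of RAM in MB, i.e. the larger of 5% of RAM or 256MB.
# Integer arithmetic only; the kernel may round or align the value differently.
default_boot_mem_mb() {
    five_percent=$(( $1 * 5 / 100 ))
    if [ "$five_percent" -gt 256 ]; then
        echo "$five_percent"
    else
        echo 256
    fi
}
```

For example, the estimate for the running system could be obtained with:

    # default_boot_mem_mb $(( $(awk '/^MemTotal:/ {print $2}' /proc/meminfo) / 1024 ))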

After making said changes, reboot your system, so that the specified memory is
reserved and left untouched by the normal system. Take note that the output of
'free -m' will show X MB less memory than without this parameter, which is
expected. If you see OOM (Out Of Memory) error messages while loading the
capture kernel, then you should bump up the memory reservation size.

Now that you've got that reserved memory region set up, you want to turn on
the kdump init script:

    # systemctl enable kdump.service

Then, start up kdump as well:

    # systemctl start kdump.service

This should turn on the firmware-assisted functionality in the kernel by
echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready
to capture a vmcore upon crashing. For journaling filesystems like XFS, an
additional step is required to ensure the bootloader does not pick the
older initrd (without vmcore capture scripts):

  * If /boot is a separate partition, run the below commands as the root user,
    or as a user with CAP_SYS_ADMIN rights:

        # fsfreeze -f /boot
        # fsfreeze -u /boot

  * If /boot is not a separate partition, reboot the system.
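
Whether registration actually took effect can be checked by reading the sysfs
file mentioned above. The following is a minimal sketch; the function name and
its optional path argument (useful for testing) are hypothetical.

```shell
# Hypothetical helper: succeeds when fadump is registered with the firmware.
# The optional argument overrides the sysfs path (for testing only).
fadump_is_registered() {
    file=${1:-/sys/kernel/fadump_registered}
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}
```

The function fails both when the file contains 0 and when the file does not
exist, for instance on a kernel booted without fadump support.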

After reboot, check if the kdump service is up and running with:

  # systemctl status kdump.service

To test out whether FADump is configured properly, you can force-crash your
system by echo'ing a 'c' into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system resetting and booting
into a fresh copy of the kernel. When the default initrd loads and runs /init,
the vmcore should be copied out to disk (by default, in
/var/crash/<YYYY.MM.DD-HH:MM:SS>/vmcore), and then the system reboots back into
your normal kernel.
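
To locate the dump that was just written, something like the following sketch
can help. It assumes the default /var/crash layout described above, with one
directory per dump and no whitespace in directory names; the function name is
hypothetical.

```shell
# Hypothetical helper: print the most recently modified dump directory that
# contains a vmcore file. The optional argument overrides /var/crash.
# Assumes directory names without whitespace, as with timestamped dump dirs.
latest_vmcore_dir() {
    crash_root=${1:-/var/crash}
    # ls -t sorts by modification time, newest first
    for dir in $(ls -td "$crash_root"/*/ 2>/dev/null); do
        if [ -f "${dir}vmcore" ]; then
            echo "${dir%/}"
            return 0
        fi
    done
    return 1
}
```

Directories without a vmcore file are skipped, and the function fails when no
dump is found at all.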

Once back in your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
          /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...

Saving vmcore-dmesg.txt
-----------------------
Kernel log buffers are among the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffers are
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt. The
vmcore is saved after vmcore-dmesg.txt. The destination disk and directory for
vmcore-dmesg.txt are the same as for the vmcore. Note that kernel log buffers
will not be available if the dump target is a raw device.

FADump Memory Requirements:

  System Memory          Recommended memory
--------------------- ----------------------
    4 GB - 16 GB     :        768 MB
   16 GB - 64 GB     :       1024 MB
   64 GB - 128 GB    :          2 GB
  128 GB - 1 TB      :          4 GB
    1 TB - 2 TB      :          6 GB
    2 TB - 4 TB      :         12 GB
    4 TB - 8 TB      :         20 GB
    8 TB - 16 TB     :         36 GB
   16 TB - 32 TB     :         64 GB
   32 TB - 64 TB     :        128 GB
   64 TB & above     :        180 GB
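
The table can be expressed as a small lookup, sketched below. The function
name is hypothetical, and the handling of boundary values (each range includes
its lower bound and excludes its upper bound) is an assumption, since the table
does not say which row a system with exactly 16 GB, 64 GB and so on belongs to.

```shell
# Hypothetical helper: print the recommended FADump reservation from the table
# above for a system with the given amount of RAM in GB.
# Assumption: each range includes its lower bound and excludes its upper bound.
fadump_recommended_mem() {
    gb=$1
    if   [ "$gb" -lt 4 ];     then return 1          # below the table's range
    elif [ "$gb" -lt 16 ];    then echo "768M"
    elif [ "$gb" -lt 64 ];    then echo "1024M"
    elif [ "$gb" -lt 128 ];   then echo "2G"
    elif [ "$gb" -lt 1024 ];  then echo "4G"
    elif [ "$gb" -lt 2048 ];  then echo "6G"
    elif [ "$gb" -lt 4096 ];  then echo "12G"
    elif [ "$gb" -lt 8192 ];  then echo "20G"
    elif [ "$gb" -lt 16384 ]; then echo "36G"
    elif [ "$gb" -lt 32768 ]; then echo "64G"
    elif [ "$gb" -lt 65536 ]; then echo "128G"
    else                           echo "180G"
    fi
}
```

The output uses the same unit suffixes as the crashkernel= parameter, so a
value such as 6G could be passed to grubby as crashkernel=6G.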

Things to remember:

1) The memory required to boot the capture kernel is a moving target that
   depends on many factors, like the hardware attached to the system, the
   kernel and modules in use, the packages installed and the services enabled;
   there is no one-size-fits-all value. The above recommendations are based
   only on system memory and therefore assume what resources a system of that
   size is likely to have. So, please take the recommendations with a pinch of
   salt and remember to try capturing a dump a few times to confirm that the
   system is configured successfully with dump capturing support.

2) Though the memory requirements for FADump seem high, this memory is not
   completely set aside but made available for userspace applications to use,
   through the CMA allocator.

3) As the same initrd is used for booting the production kernel as well as the
   capture kernel, and with the dump being captured in a restricted memory
   environment, a few optimizations (like not including the network dracut
   module, disabling multipath and such) are applied while building the
   initrd. In case the production environment needs these optimizations to be
   avoided, the dracut_args option in the /etc/kdump.conf file can be used.
   For example, if a user wishes for the network module to be included in the
   initrd, adding the below entry to the /etc/kdump.conf file and restarting
   the kdump service would take care of it.

   dracut_args --add "network"

4) If FADump is configured to capture vmcore to a remote dump target using SSH
   or NFS protocol, the corresponding network interface '<interface-name>' is
   renamed to 'kdump-<interface-name>', if it is generic (like *eth# or net#).
   This happens because the vmcore capture scripts in the initial RAM disk
   (initrd) add the 'kdump-' prefix to the network interface name to secure
   persistent naming. And as the capture kernel and the production kernel use
   the same initrd in case of FADump, the interface name is changed for the
   production kernel too. This is likely to impact the network configuration
   setup for the production kernel. So, it is recommended to use a non-generic
   name for a network interface, before setting up FADump to capture vmcore to
   a remote dump target based on that network interface, to avoid running into
   network configuration issues.
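
The renaming rule from item 4 can be previewed with a sketch like the one
below. It is not the code used by the capture scripts; the function name is
hypothetical, and the pattern is only an interpretation of "*eth# or net#" as
"a name ending in eth or equal to net, followed by digits only".

```shell
# Hypothetical helper: print the name an interface is expected to carry once
# the fadump initrd is in use. Generic names (anything ending in eth<N>, or
# net<N>) get the 'kdump-' prefix; other names are printed unchanged.
fadump_ifname() {
    name=$1
    base=${name%%[0-9]*}      # part before the first digit
    suffix=${name#"$base"}    # the rest, starting at the first digit
    case $suffix in
        ''|*[!0-9]*) echo "$name"; return 0 ;;   # no purely numeric suffix
    esac
    case $base in
        *eth|net) echo "kdump-$name" ;;
        *)        echo "$name" ;;
    esac
}
```

With this interpretation, eth0 would become kdump-eth0, while a name such as
enP1p3s0 would be left alone.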

Dump Triggering methods:

This section talks about the various ways, other than a kernel panic, in which
fadump can be triggered. The following methods assume that fadump is configured
on your system, with the scripts enabled as described in the section above.

1) AltSysRq C

Fadump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

https://fedoraproject.org/wiki/QA/Sysrq

In addition, on PowerPC boxes, fadump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.

2) Kernel OOPs

If we want to generate a dump every time the kernel oopses, we can achieve this
by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

3) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with the
CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON
and boot the kernel with the xmon=on option.

Following are the ways to remotely issue a soft reset on PowerPC boxes, which
would drop you to XMON. Pressing 'X' (capital letter X) followed by 'Enter'
there will trigger the dump.

3.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard inputs.

Once you have HMC configured, the following steps will enable you to trigger
fadump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

3.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under "Blade
Tasks" in the Blade Management Console. Select the blade for which you want to
initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.


Advanced Setups & Failure action:

Kdump and fadump exhibit similar behavior in terms of setup & failure action.
For fadump advanced setup related information, see the "Advanced Setups"
section in the "kexec-kdump-howto.txt" document. Refer to the "Failure action"
section in the same document for fadump failure action related information.

Compression and filtering

Refer to the "Compression and filtering" section in the "kexec-kdump-howto.txt"
document. Compression and filtering are the same for kdump & fadump.


Notes on rootfs mount:
Dracut is designed to mount rootfs by default. If rootfs mounting fails, it
will refuse to go on. So fadump currently leaves rootfs mounting to dracut.
We make the assumption that a proper root= cmdline is being passed to the
dracut initramfs for the time being. If you need to modify
"KDUMP_COMMANDLINE=" in /etc/sysconfig/kdump, you will need to make sure that
the appropriate root= options are copied from /proc/cmdline. In general, it is
best to append command line options using "KDUMP_COMMANDLINE_APPEND=" instead
of replacing the original command line completely.
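
If KDUMP_COMMANDLINE does have to be replaced, the root-related options can be
pulled out of the running kernel's command line with a sketch like this one.
The function name is hypothetical, and treating root=, rootflags= and
rootfstype= as the options of interest is an assumption; quoted values
containing spaces are not handled.

```shell
# Hypothetical helper: print the root-related options found in a kernel
# command line (default: /proc/cmdline), separated by single spaces.
# Assumption: root=, rootflags= and rootfstype= are the options of interest.
root_cmdline_args() {
    cmdline=${1:-$(cat /proc/cmdline)}
    out=""
    for arg in $cmdline; do
        case $arg in
            root=*|rootflags=*|rootfstype=*) out="${out:+$out }$arg" ;;
        esac
    done
    echo "$out"
}
```

The printed options can then be copied into the new KDUMP_COMMANDLINE value.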

How to disable FADump:

Remove "fadump=on"/"fadump=nocma" from the kernel cmdline parameters, or
replace it with the "fadump=off" kernel cmdline parameter:

   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=on"
or
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=nocma"
or
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --args="fadump=off"

Remove "crashkernel=" from kernel cmdline parameters:

   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="crashkernel"

If KDump is to be used as the dump capturing mechanism, reset the crashkernel parameter:

   # kdumpctl reset-crashkernel --fadump=off

Reboot the system for the settings to take effect.
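
After the reboot, the effective setting can be confirmed from the kernel
command line. The sketch below is one way to do so; the function name is
hypothetical, and it assumes that when fadump= is given more than once the
last occurrence is the one that matters.

```shell
# Hypothetical helper: print the value of the last fadump= parameter found in
# a kernel command line (default: /proc/cmdline), or "unset" if there is none.
fadump_cmdline_mode() {
    cmdline=${1:-$(cat /proc/cmdline)}
    mode=unset
    for arg in $cmdline; do
        case $arg in
            fadump=*) mode=${arg#fadump=} ;;
        esac
    done
    echo "$mode"
}
```

A result of "unset" or "off" indicates that FADump was not requested for the
current boot.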