Firmware assisted dump (fadump) HOWTO

Introduction

Firmware-assisted dump (fadump) is a feature introduced in the 3.4 mainline
kernel and supported only on the powerpc architecture. Its goal is to enable
the dump of a crashed system from a fully reset system, and to minimize the
total elapsed time until the system is back in production use. Complete
documentation of the implementation can be found at
Documentation/powerpc/firmware-assisted-dump.txt in the upstream Linux kernel
tree, from version 3.4 onward.

Please note that the firmware-assisted dump feature is only available on Power6
and above systems with recent firmware versions.

Overview

Fadump

Fadump is a robust kernel crash dumping mechanism that obtains a reliable
kernel crash dump with assistance from firmware. This approach does not use
kexec; instead, firmware assists in booting the kdump kernel while preserving
memory contents. Unlike kdump, the system is fully reset and loaded with a
fresh copy of the kernel. In particular, PCI and I/O devices are reinitialized
and are in a clean, consistent state. This second kernel, often called a
capture kernel, boots with very little memory and captures the dump image.

During OS initialization, the first kernel registers with the Power firmware
the sections of memory to be preserved for the dump. These registered sections
of memory are reserved by the first kernel during early boot. When a system
crashes, the Power firmware fully resets the system, preserves all the system
memory contents, and saves the low memory (boot memory, whose size is the
larger of 5% of system RAM or 256MB) to the previously registered region. It
also saves system registers and hardware PTEs.

Fadump is supported only on the ppc64 platform. The standard kernel and the
capture kernel are one and the same on ppc64.

If you're reading this document, you should already have kexec-tools
installed. If not, you can install it with the following command:

    # yum install kexec-tools

Fadump Operational Flow:

Like kdump, fadump exports the ELF-formatted kernel crash dump through
/proc/vmcore. Hence, the existing kdump infrastructure can be used to capture
the fadump vmcore. The idea is to keep the functionality transparent to the
end user. From the user's perspective, there is no change in the way the kdump
init script works.

However, unlike kdump, fadump does not pre-load the kdump kernel and initrd
into reserved memory. Instead, it always uses the default OS initrd during the
second boot after a crash. Hence, for fadump, we rebuild the kdump initrd and
replace the default initrd with it. Before replacing the existing default
initrd, we take a backup of the original for the user's reference. The dracut
package has been enhanced to rebuild the default initrd with vmcore capture
steps. The initrd image is rebuilt as per the configuration in the
/etc/kdump.conf file.

The control flow of fadump works as follows:
01. System panics.
02. At the crash, the kernel informs the Power firmware that it has crashed.
03. Firmware takes control and reboots the entire system, preserving
    only the memory (all other devices are reset).
04. The reboot follows the normal booting process (non-kexec).
05. The boot loader loads the default kernel and initrd from /boot.
06. The default initrd loads and runs /init.
07. The dracut-kdump.sh script present in the fadump-aware default initrd
    checks whether the '/proc/device-tree/rtas/ibm,kernel-dump' file exists
    before executing the steps to capture the vmcore.
    (This check bypasses the vmcore capture steps during the normal boot
     process.)
08. The dump is captured according to /etc/kdump.conf.
09. Is the dump capture successful? (yes goto 11, no goto 10)
10. Perform the failure action specified in /etc/kdump.conf.
    (The default failure action is reboot, if unspecified.)
11. Perform the final action specified in /etc/kdump.conf.
    (The default final action is reboot, if unspecified.)


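The device-tree check in step 07 can be sketched in a few lines of shell. This
is only an illustration of the idea, not the actual dracut-kdump.sh code; the
function name is hypothetical, and the optional path argument exists only so
the check can be tried against an arbitrary file.

```shell
# Sketch: decide whether this boot follows a crash (dump pending) or is a
# normal boot. The function name is hypothetical; only the device-tree path
# comes from the flow described above.
is_fadump_capture_boot() {
    dump_node=${1:-/proc/device-tree/rtas/ibm,kernel-dump}
    [ -e "$dump_node" ]
}

if is_fadump_capture_boot; then
    echo "dump pending: run the vmcore capture steps"
else
    echo "normal boot: skip the vmcore capture steps"
fi
```

On a system that has not crashed, the node is absent and the second branch is
taken, which is what lets the same initrd serve both kinds of boot.
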
How to configure fadump:

Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it with the following command:

    # yum install kexec-tools

Make the kernel to be configured with FADump the default boot entry, if it
isn't already:

   # grubby --set-default=/boot/vmlinuz-<kver>

Boot into the kernel to be configured for FADump. To be able to do much of
anything interesting in the way of debug analysis, you'll also need to install
the kernel-debuginfo package, of the same arch as your running kernel, and the
crash utility:

    # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we need to modify some boot parameters to enable firmware assisted
dump. With the help of grubby, it's very easy to append "fadump=on" to the end
of your kernel boot parameters. To reserve the appropriate amount of memory
for boot memory preservation, also pass the 'crashkernel=X' kernel cmdline
parameter. For the recommended value of X, see the 'FADump Memory
Requirements' section.

   # grubby --args="fadump=on crashkernel=6G" --update-kernel=/boot/vmlinuz-`uname -r`

By default, FADump reserved memory is initialized as a CMA area, which makes
the memory available to the production kernel through the CMA allocator. To
opt out of this, making the reserved memory unavailable to the production
kernel, boot the Linux kernel with 'fadump=nocma' instead of 'fadump=on'.

The term 'boot memory' means the size of the low memory chunk that is required
for a kernel to boot successfully when booted with restricted memory. By
default, the boot memory size is the larger of 5% of system RAM or 256MB.
Alternatively, the user can specify the boot memory size through the boot
parameter 'fadump_reserve_mem=', which overrides the default calculated size.
Use this option if the default boot memory size is not sufficient for the
second kernel to boot successfully.

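The default rule (the larger of 5% of system RAM or 256MB) can be worked
through with a small sketch. The function name is hypothetical, and the
kernel's own calculation may round or align the result differently; this only
illustrates the arithmetic.

```shell
# Sketch: default boot memory in MB for a given amount of system RAM in MB.
# Hypothetical helper; mirrors only the "larger of 5% or 256MB" rule.
default_boot_mem_mb() {
    total_mb=$1
    five_pct_mb=$(( total_mb * 5 / 100 ))   # integer division rounds down
    if [ "$five_pct_mb" -gt 256 ]; then
        echo "$five_pct_mb"
    else
        echo 256
    fi
}

default_boot_mem_mb 4096     # 5% of 4 GB is 204 MB, so the 256 MB floor applies
default_boot_mem_mb 65536    # 5% of 64 GB is 3276 MB
```
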
After making said changes, reboot your system, so that the specified memory is
reserved and left untouched by the normal system. Take note that the output of
'free -m' will show X MB less memory than without this parameter, which is
expected. If you see OOM (Out Of Memory) error messages while loading capture
kernel, then you should bump up the memory reservation size.

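After the reboot, one way to confirm that the boot parameters took effect is
to look them up in /proc/cmdline. The helper below is a hypothetical sketch,
not part of kexec-tools; it prints the value of the last occurrence of a key,
or nothing if the key is absent.

```shell
# Sketch: print the value of KEY from a kernel command line string.
# Hypothetical helper. Runs in a subshell so 'set -f' (no globbing while
# splitting the command line into words) does not leak to the caller.
cmdline_value() (
    set -f
    key=$1
    value=""
    for arg in $2; do
        case "$arg" in
            "$key"=*) value=${arg#"$key"=} ;;
        esac
    done
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    fi
)

cmdline=$(cat /proc/cmdline 2>/dev/null || true)
echo "fadump=$(cmdline_value fadump "$cmdline")"
echo "crashkernel=$(cmdline_value crashkernel "$cmdline")"
```

An empty value after the equals sign means the parameter is not on the running
kernel's command line, in which case the grubby step should be rechecked.
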
Now that you've got that reserved memory region set up, you want to turn on
the kdump init script:

    # systemctl enable kdump.service

Then, start up kdump as well:

    # systemctl start kdump.service

This should turn on the firmware assisted functionality in the kernel by
echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready
to capture a vmcore upon crashing. For journaling filesystems like XFS, an
additional step is required to ensure the bootloader does not pick the
older initrd (without vmcore capture scripts):

  * If /boot is a separate partition, run the below commands as the root user,
    or as a user with CAP_SYS_ADMIN rights:

        # fsfreeze -f /boot
        # fsfreeze -u /boot

  * If /boot is not a separate partition, reboot the system.

After the reboot, check whether the kdump service is up and running with:

  # systemctl status kdump.service

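Beyond the service status, the kernel's own view can be read back from sysfs.
The sketch below assumes the /sys/kernel/fadump_registered file mentioned
earlier, plus a companion /sys/kernel/fadump_enabled file described in the
kernel's fadump documentation; newer kernels may expose these under a
different path, so treat both names as assumptions about the running kernel.
The function name is hypothetical.

```shell
# Sketch: report the fadump sysfs state. The directory argument is optional
# and exists only so the function can be pointed at a test directory.
fadump_state() {
    sysfs_dir=${1:-/sys/kernel}
    for name in fadump_enabled fadump_registered; do
        if [ -r "$sysfs_dir/$name" ]; then
            echo "$name=$(cat "$sysfs_dir/$name")"
        else
            echo "$name=unavailable"
        fi
    done
}

fadump_state
```

A value of 1 for both files indicates that fadump is enabled and registered
with firmware; "unavailable" suggests the kernel was not booted with fadump
support or uses a different sysfs layout.
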
To test out whether FADump is configured properly, you can force-crash your
system by echo'ing a 'c' into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system resetting and booting
into a fresh copy of the kernel. When the default initrd loads and runs /init,
the vmcore should be copied out to disk (by default, in
/var/crash/<YYYY.MM.DD-HH:MM:SS>/vmcore), and then the system reboots back
into your normal kernel.

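Once the system is back up, the most recent dump can be located with a short
helper. This is a sketch that assumes the default /var/crash/<timestamp>/vmcore
layout described above; the function name is hypothetical, and a dump target
configured differently in /etc/kdump.conf will need another path.

```shell
# Sketch: print the path of the most recently modified vmcore under the crash
# directory, or return non-zero if none is found. Hypothetical helper.
latest_vmcore() {
    crash_dir=${1:-/var/crash}
    # ls -t sorts by modification time, newest first
    newest=$(ls -t "$crash_dir"/*/vmcore 2>/dev/null | head -n 1)
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    else
        return 1
    fi
}

latest_vmcore || echo "no vmcore found under /var/crash"
```
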
Once back in your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
      /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...

Saving vmcore-dmesg.txt
-----------------------
The kernel log buffer is one of the most important pieces of information
available in the vmcore. Before the vmcore is saved, the kernel log buffer is
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt. The
vmcore is saved after vmcore-dmesg.txt. The destination disk and directory for
vmcore-dmesg.txt are the same as for the vmcore. Note that the kernel log
buffer will not be available if the dump target is a raw device.

FADump Memory Requirements:

  System Memory          Recommended memory
--------------------- ----------------------
    4 GB - 16 GB     :        768 MB
   16 GB - 64 GB     :       1024 MB
   64 GB - 128 GB    :          2 GB
  128 GB - 1 TB      :          4 GB
    1 TB - 2 TB      :          6 GB
    2 TB - 4 TB      :         12 GB
    4 TB - 8 TB      :         20 GB
    8 TB - 16 TB     :         36 GB
   16 TB - 32 TB     :         64 GB
   32 TB - 64 TB     :        128 GB
   64 TB & above     :        180 GB

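The table can be turned into a lookup for scripting. The sketch below is a
hypothetical helper, not something shipped with kexec-tools. The table does not
say which row applies at a shared boundary (for example, exactly 16 GB), so
the code assumes the upper bound of each row is inclusive, except that exactly
64 TB falls into the "64 TB & above" row.

```shell
# Sketch: recommended crashkernel= value for a system memory size given in GB.
# Boundary handling is an assumption; see the note above.
recommended_fadump_mem() {
    gb=$1
    if   [ "$gb" -lt 4 ];     then echo "unlisted"   # below the table's range
    elif [ "$gb" -le 16 ];    then echo "768M"
    elif [ "$gb" -le 64 ];    then echo "1024M"
    elif [ "$gb" -le 128 ];   then echo "2G"
    elif [ "$gb" -le 1024 ];  then echo "4G"
    elif [ "$gb" -le 2048 ];  then echo "6G"
    elif [ "$gb" -le 4096 ];  then echo "12G"
    elif [ "$gb" -le 8192 ];  then echo "20G"
    elif [ "$gb" -le 16384 ]; then echo "36G"
    elif [ "$gb" -le 32768 ]; then echo "64G"
    elif [ "$gb" -lt 65536 ]; then echo "128G"
    else                           echo "180G"
    fi
}

recommended_fadump_mem 1536   # a 1.5 TB system falls in the 1 TB - 2 TB row
```
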
Things to remember:

1) The memory required to boot the capture kernel is a moving target that
   depends on many factors, such as the hardware attached to the system, the
   kernel and modules in use, the packages installed and the services enabled.
   There is no one-size-fits-all value. The above recommendations are based
   only on system memory, and so carry a few assumptions about the resources a
   system with that much memory could have. Please take the recommendations
   with a pinch of salt, and remember to try capturing a dump a few times to
   confirm that the system is configured successfully with dump capturing
   support.

2) Though the memory requirements for FADump seem high, this memory is not
   completely set aside but made available for userspace applications to use,
   through the CMA allocator.

3) As the same initrd is used for booting the production kernel as well as the
   capture kernel, and the dump is captured in a restricted memory
   environment, a few optimizations (like not including the network dracut
   module, disabling multipath and such) are applied while building the
   initrd. If the production environment needs these optimizations to be
   avoided, the dracut_args option in the /etc/kdump.conf file can be
   leveraged. For example, if a user wishes for the network module to be
   included in the initrd, adding the below entry to the /etc/kdump.conf file
   and restarting the kdump service would take care of it.

   dracut_args --add "network"

4) If FADump is configured to capture vmcore to a remote dump target using SSH
   or NFS protocol, the corresponding network interface '<interface-name>' is
   renamed to 'kdump-<interface-name>', if it is generic (like *eth# or net#).
   This happens because the vmcore capture scripts in the initial RAM disk
   (initrd) add the 'kdump-' prefix to the network interface name to secure
   persistent naming. As the capture kernel and the production kernel use the
   same initrd in the case of FADump, the interface name is changed for the
   production kernel too. This is likely to impact the network configuration
   setup for the production kernel. So, it is recommended to use a non-generic
   name for a network interface before setting up FADump to capture vmcore to
   a remote dump target based on that network interface, to avoid running into
   network configuration issues.

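To see ahead of time whether an interface is likely to be renamed, its name
can be tested against the generic patterns mentioned in item 4. This sketch is
an approximation: the exact matching rule used by the capture scripts is not
spelled out here, so the pattern below (a name starting with "eth" or "net"
followed by a digit) and the function name are assumptions.

```shell
# Sketch: succeed if the interface name looks generic (eth# or net#).
# The pattern is an assumption about what the capture scripts treat as generic.
is_generic_ifname() {
    case "$1" in
        eth[0-9]*|net[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

for ifname in eth0 net1 enp1s0; do
    if is_generic_ifname "$ifname"; then
        echo "$ifname: generic, would become kdump-$ifname"
    else
        echo "$ifname: non-generic, name is kept"
    fi
done
```
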
Dump Triggering methods:

This section talks about the various ways, other than a kernel panic, in which
fadump can be triggered. The following methods assume that fadump is configured
on your system, with the scripts enabled as described in the section above.

1) AltSysRq C

Fadump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

https://fedoraproject.org/wiki/QA/Sysrq

In addition, on PowerPC boxes, fadump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.

2) Kernel OOPs

If we want to generate a dump every time the kernel oopses, we can achieve
this by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

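The current setting can be read back with a small check. The function name is
hypothetical, and the optional path argument exists only so the check can be
tried against an arbitrary file.

```shell
# Sketch: succeed if panic_on_oops is currently set to 1.
panic_on_oops_enabled() {
    knob=${1:-/proc/sys/kernel/panic_on_oops}
    [ "$(cat "$knob" 2>/dev/null)" = "1" ]
}

if panic_on_oops_enabled; then
    echo "an oops will panic the kernel and trigger a dump"
else
    echo "an oops will not trigger a dump"
fi
```

Note that a value echoed into /proc/sys does not survive a reboot. Placing a
"kernel.panic_on_oops = 1" line in a file under /etc/sysctl.d/ is the usual
way to make it persistent.
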
3) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with the
CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON
and boot the kernel with the xmon=on option.

Following are the ways to remotely issue a soft reset on PowerPC boxes, which
would drop you to XMON. Pressing 'X' (capital letter X) followed by 'Enter'
at the XMON prompt will trigger the dump.

3.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard inputs.

Once you have the HMC configured, the following steps will enable you to
trigger fadump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

3.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under "Blade
Tasks" in the Blade Management Console. Select the blade for which you want to
initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.


Advanced Setups & Failure action:

Kdump and fadump exhibit similar behavior in terms of setup and failure
action. For information on fadump advanced setups, see the "Advanced Setups"
section in the "kexec-kdump-howto.txt" document. For information on the fadump
failure action, refer to the "Failure action" section in the same document.

Compression and filtering

Refer to the "Compression and filtering" section in the
"kexec-kdump-howto.txt" document. Compression and filtering are the same for
kdump and fadump.


Notes on rootfs mount:
Dracut is designed to mount rootfs by default. If rootfs mounting fails, it
will refuse to go on. So fadump currently leaves rootfs mounting to dracut.
For the time being, we make the assumption that a proper root= cmdline is
being passed to the dracut initramfs. If you need to modify
"KDUMP_COMMANDLINE=" in /etc/sysconfig/kdump, you will need to make sure that
the appropriate root= options are copied from /proc/cmdline. In general, it is
best to append command line options using "KDUMP_COMMANDLINE_APPEND=" instead
of replacing the original command line completely.

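If "KDUMP_COMMANDLINE=" does have to be replaced, the root-related options can
be pulled out of the running command line first. The helper below is a
hypothetical sketch; which options count as root-related (here root=,
rootflags= and rootfstype=) is an assumption and may need extending for a
particular setup.

```shell
# Sketch: print the root-related options found in a kernel command line string.
# Runs in a subshell so 'set -f' (no globbing during word splitting) stays local.
root_args() (
    set -f
    out=""
    for arg in $1; do
        case "$arg" in
            root=*|rootflags=*|rootfstype=*) out="${out:+$out }$arg" ;;
        esac
    done
    printf '%s\n' "$out"
)

root_args "$(cat /proc/cmdline 2>/dev/null || true)"
```

The printed options can then be carried over into the replacement
"KDUMP_COMMANDLINE=" value so that dracut can still find and mount rootfs.
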
How to disable FADump:

Remove "fadump=on"/"fadump=nocma" from the kernel cmdline parameters OR
replace it with the "fadump=off" kernel cmdline parameter:

   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=on"
or
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --remove-args="fadump=nocma"
OR
   # grubby --update-kernel=/boot/vmlinuz-`uname -r` --args="fadump=off"

If KDump is to be used as the dump capturing mechanism, update the crashkernel
parameter (otherwise, remove the "crashkernel=" parameter too, using grubby):

   # grubby --update-kernel=/boot/vmlinuz-$kver --args="crashkernel=auto"

Reboot the system for the settings to take effect.