Petr Šabata f5bf49
Kdump-in-cluster-environment HOWTO
Introduction

Kdump is a kexec-based crash dumping mechanism for Linux. This document
illustrates how to configure kdump in a cluster environment so that the kdump
crash recovery service can complete without being preempted by traditional
power fencing methods.
Overview

Kexec/Kdump

Details about Kexec/Kdump are available in the Kexec-Kdump-howto file and are
not described here.
fence_kdump

fence_kdump is an I/O fencing agent to be used with the kdump crash recovery
service. When the fence_kdump agent is invoked, it listens for a message
from the failed node acknowledging that the failed node is executing the
kdump crash kernel. Note that fence_kdump is not a replacement for traditional
fencing methods. The fence_kdump agent can only detect that a node has entered
the kdump crash recovery service. This allows the kdump crash recovery service
to complete without being preempted by traditional power fencing methods.
fence_kdump_send

fence_kdump_send is a utility that sends messages acknowledging that the
node itself has entered the kdump crash recovery service. The fence_kdump_send
utility is typically run in the kdump kernel after a cluster node has
encountered a kernel panic. Once the cluster node has entered the kdump crash
recovery service, fence_kdump_send periodically sends messages to all
cluster nodes. When the fence_kdump agent receives a valid message from the
failed node, fencing is complete.
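
To make the sender side more concrete, the following dry-run sketch assembles
a fence_kdump_send command line for a list of peers and prints it instead of
running it. The option letters (-p port, -f address family, -c message count,
-i interval in seconds) and the values used are assumptions taken from the
fence_kdump_send manual page, so check them against the version installed on
your nodes. node2 and node3 are placeholder host names, and build_send_cmd is
a hypothetical helper, not part of any package.

```shell
#!/bin/sh
# Dry-run sketch: print the fence_kdump_send command a failed node could
# run to notify its peers. Nothing is sent; the command is only printed.

# build_send_cmd PORT INTERVAL NODE...
build_send_cmd() {
    port=$1
    interval=$2
    shift 2
    # -c 0 is assumed to mean "keep sending, no message limit"
    printf '%s\n' "fence_kdump_send -p $port -f auto -c 0 -i $interval $*"
}

build_send_cmd 7410 10 node2 node3
```

In normal operation kdump runs fence_kdump_send for you in the kdump kernel,
so a command like this is mainly useful for understanding or manual testing.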
How to configure Pacemaker cluster environment:

To use kdump in a Pacemaker cluster environment, fence-agents-kdump
should be installed on every node in the cluster. You can achieve this with
the following command:

   # yum install -y fence-agents-kdump
Next, add fence_kdump to the cluster. The examples below assume a cluster of
three nodes, node1, node2 and node3, that uses Pacemaker for resource
management and pcs as the command-line configuration tool.

With pcs it is easy to add a stonith resource to the cluster. For example, add
a stonith resource named mykdumpfence with the fence type fence_kdump via the
following commands:

   # pcs stonith create mykdumpfence fence_kdump \
     pcmk_host_check=static-list pcmk_host_list="node1 node2 node3"
   # pcs stonith update mykdumpfence pcmk_monitor_action=metadata --force
   # pcs stonith update mykdumpfence pcmk_status_action=metadata --force
   # pcs stonith update mykdumpfence pcmk_reboot_action=off --force

Then enable stonith:

   # pcs property set stonith-enabled=true
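
For clusters with a different set of nodes, the create command above can be
generated rather than typed by hand. The sketch below is a dry run: it only
prints the pcs command for the given resource name and node list, using the
same parameters as the example above. build_stonith_cmd is a hypothetical
helper introduced for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the pcs command that creates a fence_kdump stonith
# resource for the given nodes. The command is printed, not executed.

# build_stonith_cmd RESOURCE_NAME NODE...
build_stonith_cmd() {
    name=$1
    shift
    # "$*" joins the remaining arguments with single spaces
    printf '%s\n' "pcs stonith create $name fence_kdump pcmk_host_check=static-list pcmk_host_list=\"$*\""
}

build_stonith_cmd mykdumpfence node1 node2 node3
```

Review the printed command before running it as root on one cluster node.
Afterwards, the output of pcs status should list the new resource.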
How to configure kdump:

There are two ways to configure fence_kdump support:

1) Pacemaker based clusters
     If you have successfully configured fence_kdump in Pacemaker, no
     special configuration is needed in kdump. Please refer to the
     Kexec-Kdump-howto file for more information.
2) Generic clusters
     For other types of clusters, two configuration options in kdump.conf
     enable fence_kdump support:

       fence_kdump_nodes <node(s)>
            Space-separated list of cluster node(s) to send the
            fence_kdump notification to (this option is mandatory to enable
            fence_kdump)

       fence_kdump_args <arg(s)>
            Command line arguments for fence_kdump_send (it can contain
            all valid arguments except the hosts to send the notification to)

     These options will most probably be configured by your cluster software,
     so please refer to your cluster documentation for how to enable
     fence_kdump support.
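
As an illustration of the generic method, the sketch below writes an example
pair of fence_kdump lines to a scratch file so they can be reviewed before
being merged into kdump.conf. It assumes the local node is node3 and that
node1 and node2 are the peers to notify. The fence_kdump_args values are
assumed defaults from the fence_kdump_send manual page, not something this
document prescribes.

```shell
#!/bin/sh
# Sketch: write example fence_kdump settings to a scratch file for review.
# It deliberately does not touch the real kdump.conf.
conf=$(mktemp)

cat > "$conf" <<'EOF'
# Nodes that should receive the fence_kdump notification (mandatory)
fence_kdump_nodes node1 node2
# Extra arguments for fence_kdump_send; hosts must not be listed here
fence_kdump_args -p 7410 -f auto -c 0 -i 10
EOF

# Show the result; the scratch file is left in place for inspection
cat "$conf"
```

After adding such lines to kdump.conf, the kdump service normally has to be
restarted so that the settings are picked up.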
Please be aware that these two methods cannot be combined and that 2) takes
precedence over 1). This means that if fence_kdump is configured using the
fence_kdump_nodes and fence_kdump_args options in kdump.conf, the Pacemaker
configuration is not used even if it exists.
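
The precedence rule can be checked quickly on a node. The sketch below reports
"generic" when a kdump.conf file contains an uncommented fence_kdump_nodes
line and "pacemaker-or-none" otherwise. It relies only on the fact that
fence_kdump_nodes is mandatory for the generic method. The path
/etc/kdump.conf and the helper name fence_kdump_mode are assumptions made for
this example.

```shell
#!/bin/sh
# Sketch: report which fence_kdump configuration path applies, based on
# whether an uncommented fence_kdump_nodes line exists in a kdump.conf file.

# fence_kdump_mode CONF_FILE -> prints "generic" or "pacemaker-or-none"
fence_kdump_mode() {
    if grep -Eq '^[[:space:]]*fence_kdump_nodes[[:space:]]+[^[:space:]]' "$1"; then
        echo "generic"
    else
        echo "pacemaker-or-none"
    fi
}

# Example run against the assumed default location, if it is readable
if [ -r /etc/kdump.conf ]; then
    fence_kdump_mode /etc/kdump.conf
fi
```

A result of "generic" means that any fence_kdump setup in Pacemaker is not
used by kdump on that node.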