Kdump-in-cluster-environment HOWTO

Introduction

Kdump is a kexec-based crash dumping mechanism for Linux. This document
illustrates how to configure kdump in a cluster environment so that the kdump
crash recovery service can complete without being preempted by traditional
power fencing methods.

Overview

Kexec/Kdump

Details about Kexec/Kdump are available in the Kexec-Kdump-howto file and are
not described here.

fence_kdump

fence_kdump is an I/O fencing agent to be used with the kdump crash recovery
service. When the fence_kdump agent is invoked, it listens for a message from
the failed node acknowledging that the failed node is executing the kdump
crash kernel. Note that fence_kdump is not a replacement for traditional
fencing methods. The fence_kdump agent can only detect that a node has entered
the kdump crash recovery service. This allows the kdump crash recovery service
to complete without being preempted by traditional power fencing methods.

fence_kdump_send

fence_kdump_send is a utility that sends messages acknowledging that the node
itself has entered the kdump crash recovery service. The fence_kdump_send
utility is typically run in the kdump kernel after a cluster node has
encountered a kernel panic. Once the cluster node has entered the kdump crash
recovery service, fence_kdump_send periodically sends messages to all cluster
nodes. When the fence_kdump agent receives a valid message from the failed
node, fencing is complete.
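
The exchange between the two tools can be tried by hand before the cluster
relies on it. The following is only a sketch, not part of the original
procedure. It assumes the option names listed in the fence_kdump(8) and
fence_kdump_send(8) man pages (-n, -t, -c, -i, -v); check them against the
installed version. The node names are placeholders, and run, listen_for_kdump
and announce_kdump are hypothetical helpers. By default the commands are only
printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# On a surviving node: wait up to TIMEOUT seconds (default 60) for a
# message from the node named in the first argument.
listen_for_kdump() {
    failed=$1
    timeout=${2:-60}
    run fence_kdump -n "$failed" -t "$timeout" -v
}

# On the node under test: send COUNT messages, INTERVAL seconds apart,
# to each of the remaining arguments.
announce_kdump() {
    count=$1
    interval=$2
    shift 2
    run fence_kdump_send -c "$count" -i "$interval" -v "$@"
}

# Example: run the first call on node2 and the second call on node1.
listen_for_kdump node1 60
announce_kdump 5 2 node2 node3
```

In a real test the two calls are made on different nodes: the listener on a
surviving node, the sender on the node that plays the failed one.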

How to configure cluster environment:

To use kdump in a cluster environment, fence-agents-kdump must be installed on
every node in the cluster. You can do this with the following command:

  # yum install -y fence-agents-kdump
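
Because the package is needed on every node, it can help to check all nodes
from one place. The following is a minimal sketch under stated assumptions:
root can reach each node over ssh, the node names are placeholders, and run
and check_fence_agent are hypothetical helpers. By default the commands are
only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Query the package on each node given as an argument. rpm -q exits
# non-zero when the package is not installed.
check_fence_agent() {
    for node in "$@"; do
        run ssh "root@$node" rpm -q fence-agents-kdump
    done
}

check_fence_agent node1 node2 node3
```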

Next, add fence_kdump to the cluster. Assume that the cluster consists of
three nodes (node1, node2 and node3) and uses Pacemaker for resource
management with pcs as the command-line configuration tool.

With pcs it is easy to add a stonith resource to the cluster. For example, add
a stonith resource named mykdumpfence with the fence type fence_kdump using
the following commands:

   # pcs stonith create mykdumpfence fence_kdump \
     pcmk_host_check=static-list pcmk_host_list="node1 node2 node3"
   # pcs stonith update mykdumpfence pcmk_monitor_action=metadata --force
   # pcs stonith update mykdumpfence pcmk_status_action=metadata --force
   # pcs stonith update mykdumpfence pcmk_reboot_action=off --force

Then enable stonith:

   # pcs property set stonith-enabled=true
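
After these commands it is worth confirming that the settings were stored.
The following is a sketch only. It assumes the older pcs syntax, in which
"pcs stonith show" and "pcs property show" exist; newer pcs releases use
"config" in place of "show", so adjust it to the installed version. run and
verify_kdump_fence are hypothetical helpers. By default the commands are only
printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

verify_kdump_fence() {
    name=$1
    # Show the options stored for the stonith resource.
    run pcs stonith show "$name"
    # Show the current value of the stonith-enabled property.
    run pcs property show stonith-enabled
}

verify_kdump_fence mykdumpfence
```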

How to configure kdump:

The kdump configuration in a cluster environment does not differ from a normal
kdump configuration. Please refer to the Kexec-Kdump-howto file for more
information.