Instructions for how to set up the watchdog daemon to work with IPMI's hardware watchdog
----------------------------------------------------------------------------------------

First, verify that the ipmitool utility is present on the system (run
"which ipmitool").  It allows the hardware watchdog timer to be turned off
gracefully from the command line should that ever become necessary.  If
ipmitool is not present, install it, or download the latest version from
http://ipmitool.sourceforge.net and build and install it on your system.

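The check described above can be sketched in a few lines of shell.  The
helper name require_cmd is hypothetical and is used only for this
illustration; it is not part of ipmitool or the watchdog package.

```shell
#!/bin/sh
# require_cmd is a hypothetical helper: it reports whether a command can be
# found on the PATH and returns non-zero when it cannot.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
        return 0
    fi
    echo "$1 not found; install it before enabling the watchdog" >&2
    return 1
}

# Check for ipmitool without aborting the shell if it is missing.
require_cmd ipmitool || true
```
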
Next, before starting the watchdog daemon, set the BMC BIOS to enable the
IPMI/BMC hardware watchdog timer, insert the OpenIPMI watchdog driver module
with the desired configuration/startup settings, and modify the watchdog
daemon's configuration file to use /dev/watchdog:

     1. To set up the IPMI/BMC BIOS to enable the hardware watchdog
     timer, see the BMC documentation. The main settings in the BMC BIOS
     that must be changed to turn on the IPMI watchdog timer are:

      - Set the BMC POST Watchdog to "ENABLED".
      - Set the BMC POST Watchdog Timeout to "5 Minutes".

     2. To insert the OpenIPMI watchdog driver module with the
     desired configuration settings, two steps are necessary:

        i.) Configure the OpenIPMI watchdog driver by editing the
            /etc/sysconfig/ipmi configuration file:

          - Set "IPMI_WATCHDOG=yes".
          - Set desired options via the IPMI_WATCHDOG_OPTIONS
            config entry.

           EXAMPLE: 'IPMI_WATCHDOG_OPTIONS="timeout=60 start_now=1 \
                   preop=preop_give_data action=power_cycle pretimeout=1"'

            Execute "modinfo ipmi_watchdog" for more detailed information
            on the available ipmi watchdog timer options.

          - Execute "service ipmi start" (the watchdog driver starts
            automatically along with the other ipmi drivers).

          IMPORTANT: If "start_now=1" has been set as one of the
               configuration options, be sure to start the watchdog
               daemon before the BMC timer expires!

       ii.) Set the OpenIPMI service and the watchdog daemon to start during bootup:

          - chkconfig ipmi on
          - chkconfig watchdog on

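Put together, the /etc/sysconfig/ipmi entries from step 2 might read as
follows.  This is only an excerpt that mirrors the example values above;
the rest of the file, and the set of options your driver version accepts,
are assumed to be as shipped by your distribution.

```shell
# /etc/sysconfig/ipmi (excerpt); values copied from the example in step 2.
# Load the ipmi_watchdog driver along with the other ipmi drivers.
IPMI_WATCHDOG=yes
# Options passed to the ipmi_watchdog module when it is loaded.
IPMI_WATCHDOG_OPTIONS="timeout=60 start_now=1 \
preop=preop_give_data action=power_cycle pretimeout=1"
```

"modinfo ipmi_watchdog" remains the place to confirm which options are
available on a given system.
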
     3. Configure the watchdog daemon by editing the
     /etc/watchdog.conf configuration file:

      - Uncomment the "watchdog-device = /dev/watchdog" line.
      - Ensure that "realtime = yes" and "priority = 1" are set and not
        commented out.
      - Uncomment the "interval" line, and set the interval to less
        than the timeout option set in the /etc/sysconfig/ipmi file
        (e.g. with "timeout=60" you might set the interval to 50).

     So in the example described here, the BMC BIOS setting is in
     minutes (5), while the watchdog daemon "interval" and the
     ipmi_watchdog "timeout" settings are both in seconds (50 and 60
     respectively).  Therefore, the BMC POST watchdog is set to 5 minutes,
     the ipmi_watchdog driver arms the hardware timer to trigger a system
     power cycle unless it is reset within 60 seconds, and the watchdog
     daemon resets the timer every 50 seconds.

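The relationship between the two settings given in seconds can be checked
with a short shell sketch.  The helper name check_interval is hypothetical,
the interval of 50 is passed in by hand, and the option string is copied
from the example above rather than read from /etc/sysconfig/ipmi.

```shell
#!/bin/sh
# check_interval is a hypothetical helper: it succeeds only when the daemon
# interval (seconds) is strictly less than the driver timeout (seconds).
check_interval() {
    wd_interval=$1
    wd_timeout=$2
    if [ "$wd_interval" -lt "$wd_timeout" ]; then
        echo "ok: interval ${wd_interval}s < timeout ${wd_timeout}s"
        return 0
    fi
    echo "error: interval ${wd_interval}s is not less than timeout ${wd_timeout}s" >&2
    return 1
}

# Option string as in the IPMI_WATCHDOG_OPTIONS example.
opts="timeout=60 start_now=1 preop=preop_give_data action=power_cycle pretimeout=1"

# Put one option per line, then keep the value of the line that starts
# with "timeout=" (this does not match "pretimeout=").
timeout=$(printf '%s\n' "$opts" | tr ' ' '\n' | sed -n 's/^timeout=//p')

check_interval 50 "$timeout"
```
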
     4. Start the watchdog daemon:

      - Execute "service watchdog start".

IMPORTANT:  To gracefully stop the watchdog daemon, be sure
to use "service watchdog stop" (which executes "kill -s SIGTERM <pid>")
and do *not* use "kill -9 <pid>".  Using "kill -9 <pid>" shuts the
daemon off without stopping the BMC's watchdog timer, so a system
reboot will be triggered when the BMC's watchdog timer expires.

Alternatively, or if the watchdog daemon is killed "ungracefully",
you can stop the BMC timer by executing the following ipmitool
command before the watchdog timer expires:

 # ipmitool -v raw 0x06 0x24 0x04 0x01 0x00 0x10 0x00 0x0a

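For reference, the bytes of the raw request above can be read against the
IPMI "Set Watchdog Timer" command (NetFn App 0x06, command 0x24).  The
sketch below only labels the fields and computes the countdown value; the
variable names and the stop_bmc_watchdog wrapper are illustrative and are
not part of ipmitool.

```shell
#!/bin/sh
# Fields of the raw request, in the order they appear on the command line.
timer_use=0x04      # SMS/OS timer; "don't stop" bit is clear, so the timer stops
timer_action=0x01   # hard reset on expiry, no pre-timeout interrupt
pretimeout=0x00     # pre-timeout interval in seconds
clear_flags=0x10    # clear the SMS/OS timer-use expiration flag
count_lsb=0x00      # initial countdown, low byte (100 ms per count)
count_msb=0x0a      # initial countdown, high byte

# Countdown in seconds: (msb * 256 + lsb) counts of 100 ms each.
countdown_s=$(( ($count_msb * 256 + $count_lsb) / 10 ))
echo "countdown value: ${countdown_s}s"

# Hypothetical wrapper that issues the same request; it is not called here.
stop_bmc_watchdog() {
    ipmitool -v raw 0x06 0x24 "$timer_use" "$timer_action" "$pretimeout" \
        "$clear_flags" "$count_lsb" "$count_msb"
}
```

Depending on the installed ipmitool version, "ipmitool mc watchdog get" and
"ipmitool mc watchdog off" may also be available for inspecting and turning
off the timer; check the ipmitool man page on your system before relying on
them.
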
----------------------------------------------------------------------

To test the watchdog after system configuration and setup:

.  Use "kill -9" on the watchdog daemon so that it is not shut down
   gracefully.  Verify that the system gets reset after the BMC timer expires.

.  Use "service watchdog stop" and verify that the watchdog daemon shuts off
   the BMC watchdog timer gracefully (the system doesn't get reset).

.  Set the timer on the watchdog daemon to be greater than the time set in
   the BMC BIOS for system reset and verify that the system is reset.

.  Set the timer on the daemon to be less than the time set in the
   BMC timer and verify that the BMC watchdog is poked regularly and the
   system is not reset.

.  Test some of the other actions the BMC can take when the watchdog timer
   goes off (see modinfo ipmi_watchdog for some other settings to try).