Configuring the Behavior of Fenced Nodes With Kdump

If the heartbeat mechanism detects that a node with a mounted OCFS2 volume has lost contact with the other cluster nodes, that node is removed from the cluster in a process called fencing. Fencing prevents other nodes from hanging while trying to access resources that are held by the fenced node. By default, a fenced node automatically restarts so that it can rejoin the cluster as soon as possible.

However, under some circumstances, you might not want this default behavior. For example, if a node often restarts for no obvious reason, then causing the node to panic instead of restarting is preferable, so that you can troubleshoot the issue. By enabling Kdump on the node, you can obtain a vmcore crash dump from the fenced node and analyze it to diagnose the cause of frequent node restarts.

  1. Configure the fence method.

    To configure a node to panic the next time it is fenced, set fence_method to panic by running the following command on the node after the cluster starts, replacing cluster-name with the name of your cluster:

    echo "panic" | sudo tee /sys/kernel/config/cluster/cluster-name/fence_method

  2. Persist the change across reboots.

    To set the value after each system reboot, add the same line to the /etc/rc.local file.
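
The persistence step above can be sketched as a small shell helper that appends the command to the rc file only if it is not already there, so that repeated runs do not create duplicate entries. This is a minimal sketch, not part of the documented procedure: the function name persist_fence_method and its arguments are hypothetical, and the example cluster name mycluster is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not from the OCFS2 tools): append the fence_method
# command to an rc file once. Run it as a user who can write to the file.
#   $1 = cluster name (assumption: matches the directory under
#        /sys/kernel/config/cluster/)
#   $2 = rc file to edit (defaults to /etc/rc.local)
persist_fence_method() {
    pfm_cluster="$1"
    pfm_rc_file="${2:-/etc/rc.local}"
    # The same command line that step 1 runs interactively.
    pfm_line="echo \"panic\" | sudo tee /sys/kernel/config/cluster/${pfm_cluster}/fence_method"

    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "$pfm_line" "$pfm_rc_file" 2>/dev/null; then
        printf '%s\n' "$pfm_line" >> "$pfm_rc_file"
    fi
}
```

For example, calling persist_fence_method mycluster as root would add the line to /etc/rc.local, and calling it a second time would leave the file unchanged.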

To restore the default behavior, change the value of fence_method back to reset:

echo "reset" | sudo tee /sys/kernel/config/cluster/cluster-name/fence_method

Then, remove the panic line from /etc/rc.local if the line exists in the file.
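
Removing the panic line can also be scripted. The following is a minimal sketch under stated assumptions: the function name remove_fence_line is hypothetical, and it assumes that the line to remove contains both the word panic and the path component fence_method, as in the command shown earlier. All other lines in the file are kept.

```shell
#!/bin/sh
# Hypothetical helper (not from the OCFS2 tools): delete the panic
# fence_method line from an rc file, if the file and the line exist.
#   $1 = rc file to edit (defaults to /etc/rc.local)
remove_fence_line() {
    rfl_rc_file="${1:-/etc/rc.local}"
    # Nothing to do if the file does not exist.
    [ -f "$rfl_rc_file" ] || return 0

    rfl_scratch="$(mktemp)" || return 1
    # Keep every line except the one that sets fence_method to panic.
    # grep -v exits with status 1 when it prints nothing, which is
    # acceptable here, so the exit status is ignored.
    grep -v 'panic.*fence_method' "$rfl_rc_file" > "$rfl_scratch" || true
    # Write back with cat so the original file keeps its owner and mode.
    cat "$rfl_scratch" > "$rfl_rc_file"
    rm -f "$rfl_scratch"
}
```

Run as root, remove_fence_line with no arguments would edit /etc/rc.local in place. Reviewing the file afterward is a sensible precaution, because the pattern match is only an approximation of "the panic line".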