Fencing Configuration Examples
The following examples describe various types of fencing configurations that you can implement.
IPMI LAN Fencing
Intelligent Platform Management Interface (IPMI) is an interface to a subsystem that provides management features of the host system's hardware and firmware, and includes facilities to power cycle a system over a dedicated network without any requirement to access the system's operating system. You can configure the fence_ipmilan fencing agent for the cluster so that stonith can be achieved across the IPMI LAN.
If your systems are configured for IPMI, you can run the following commands on one of the nodes in the cluster to enable the ipmilan fencing agent and configure stonith for both nodes, for example:
sudo pcs stonith create ipmilan_n1_fencing fence_ipmilan pcmk_host_list=node1 delay=5 \
ipaddr=203.0.113.1 login=root passwd=password lanplus=1 op monitor interval=60s
sudo pcs stonith create ipmilan_n2_fencing fence_ipmilan pcmk_host_list=node2 \
ipaddr=203.0.113.2 login=root passwd=password lanplus=1 op monitor interval=60s
In the example, node1 is a host that has an IPMI LAN interface configured on the IP address 203.0.113.1. The host named node2 has an IPMI LAN interface configured on the IP address 203.0.113.2. The root user password for the IPMI login on both systems is specified in this example as password. In each instance, you should replace these configuration variables with the appropriate values for your particular environment.
Note that the delay option should be set on only one node. This setting ensures that, in the rare case of a fence race condition, only one node is killed and the other continues to run. Without this option set, it is possible that both nodes assume they are the only surviving node and then simultaneously reset each other.
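Before relying on this configuration, you might optionally confirm that the fencing resources are defined and that the IPMI interfaces respond. The following commands are a sketch only, reusing the address and credentials from the preceding example; pcs stonith config is available in recent pcs releases:
sudo pcs stonith config
# Query the IPMI interface of node1 directly, using the same credentials as above
sudo fence_ipmilan --ip=203.0.113.1 --username=root --password=password --lanplus --action=status
A full end-to-end test can be run with the pcs stonith fence command against a node that can tolerate a reboot.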
Attention:
The IPMI LAN agent exposes the login credentials of the IPMI subsystem in plain text. Your security policy should ensure that it is acceptable for users with access to the Pacemaker configuration and tools to also have access to these credentials and the underlying subsystems that are involved.
SCSI Fencing
The SCSI fencing agent is used to provide storage-level fencing. This configuration protects storage resources from being written to by two nodes simultaneously by using SCSI-3 Persistent Reservations (PR). Used in conjunction with a watchdog service, a node can be reset automatically by using stonith when it attempts to access the SCSI resource without a reservation.
To configure an environment in this way:
-
Install the watchdog service on both nodes and then copy the provided fence_scsi_check script to the watchdog configuration before enabling the service, as shown in the following example:
sudo dnf install watchdog
sudo cp /usr/share/cluster/fence_scsi_check /etc/watchdog.d/
sudo systemctl enable --now watchdog
-
Enable the iscsid service that is provided in the iscsi-initiator-utils package on both nodes:
sudo dnf install -y iscsi-initiator-utils
sudo systemctl enable --now iscsid
-
After both nodes are configured with the watchdog service and the iscsid service, you can configure the fence_scsi fencing agent on one of the cluster nodes to monitor a shared storage device, such as an iSCSI target, for example:
sudo pcs stonith create scsi_fencing fence_scsi pcmk_host_list="node1 node2" \
devices="/dev/sdb" meta provides="unfencing"
In the example, node1 and node2 represent the hostnames of the nodes in the cluster and /dev/sdb is the shared storage device. Replace these variables with the appropriate values for your particular environment.
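Optionally, once the cluster has unfenced the nodes, you can confirm that persistent reservation keys have been registered on the shared device. This is a sketch only and assumes that the sg3_utils package is installed and that /dev/sdb remains the shared device:
sudo pcs stonith config scsi_fencing
# List the SCSI-3 persistent reservation keys registered on the shared device
sudo sg_persist --in --read-keys --device=/dev/sdb
Each node that has been unfenced should have a registration key listed in the output.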
SBD Fencing
The Storage Based Death (SBD) daemon can run on a system and monitor shared storage. The SBD daemon can use a messaging system to track cluster health. SBD can also trigger a reset if the appropriate fencing agent determines that stonith should be implemented.
Note:
SBD Fencing is the method used with Oracle Linux HA clusters running on Oracle Cloud Infrastructure, as documented in Create a High Availability Cluster on Oracle Cloud Infrastructure (OCI).
To set up and configure SBD fencing:
-
Stop the cluster by running the following command on one of the nodes:
sudo pcs cluster stop --all
-
On each node, install and configure the SBD daemon:
sudo dnf install sbd
-
Enable the sbd systemd service:
sudo systemctl enable sbd
Note that the sbd systemd service is automatically started and stopped as a dependency of the pacemaker service, so you do not need to run this service independently. Attempting to start or stop the sbd systemd service fails and returns an error indicating that it is controlled as a dependency service.
-
Edit the /etc/sysconfig/sbd file and set the SBD_DEVICE parameter to identify the shared storage device. For example, if your shared storage device is available on /dev/sdc, make sure the file contains the following line:
SBD_DEVICE="/dev/sdc"
-
On one of the nodes, create the SBD messaging layout on the shared storage device and confirm that it is in place. For example, to set up and verify messaging on the shared storage device at /dev/sdc, run the following commands:
sudo sbd -d /dev/sdc create
sudo sbd -d /dev/sdc list
-
Finally, start the cluster and configure the fence_sbd fencing agent for the shared storage device. For example, to configure the shared storage device, /dev/sdc, run the following commands on one of the nodes:
sudo pcs cluster start --all
sudo pcs stonith create sbd_fencing fence_sbd devices=/dev/sdc
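Once the cluster is running again, you might optionally verify that the SBD metadata header is present on the device and that the fencing resource has started. These commands are a sketch only and assume that the shared device remains at /dev/sdc:
# Show the SBD header metadata written by the create command
sudo sbd -d /dev/sdc dump
sudo pcs stonith status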
IF-MIB Fencing
IF-MIB fencing takes advantage of SNMP to access the IF-MIB on an Ethernet network switch and to shut down the port on the switch, which effectively takes a host offline. This configuration leaves the host running while disconnecting it from the network. Bear in mind that any Fibre Channel or InfiniBand connections could remain intact, even after the Ethernet connection has been stopped, which means that any data made available on these connections could still be at risk. Thus, consider configuring this fencing method as a fallback fencing mechanism. See Configuring Fencing Levels for more information about how to use multiple fencing agents in combination to maximize stonith success.
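As an illustrative sketch only, assuming the IPMI fencing resources created earlier in this chapter and the IF-MIB fencing resources that are created in the following procedure, you could register IF-MIB fencing as a second fencing level so that the switch port is only disabled if IPMI fencing fails:
sudo pcs stonith level add 1 node1 ipmilan_n1_fencing
sudo pcs stonith level add 2 node1 ifmib_n1_fencing
sudo pcs stonith level add 1 node2 ipmilan_n2_fencing
sudo pcs stonith level add 2 node2 ifmib_n2_fencing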
To configure IF-MIB fencing:
-
Configure the switch for SNMP v2c, at minimum, and ensure that SNMP SET messages are enabled. For example, on an Oracle Switch, by using the ILOM CLI, you could run the following commands:
sudo set /SP/services/snmp/ sets=enabled
sudo set /SP/services/snmp/ v2c=enabled
-
On one of the nodes in the cluster, configure the fence_ifmib fencing agent for each node in the environment, as shown in the following example:
sudo pcs stonith create ifmib_n1_fencing fence_ifmib pcmk_host_list=node1 \
ipaddr=203.0.113.10 community=private port=1 delay=5 op monitor interval=60s
sudo pcs stonith create ifmib_n2_fencing fence_ifmib pcmk_host_list=node2 \
ipaddr=203.0.113.10 community=private port=2 op monitor interval=60s
In the example, the SNMP IF-MIB switch is accessible at the IP address 203.0.113.10; the node1 host is connected to port 1 on the switch, and the node2 host is connected to port 2 on the switch. Replace these variables with the appropriate values for the particular environment.
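As an optional check, and assuming that the net-snmp-utils package is installed on a cluster node, you can query the switch directly to confirm that the community string and port numbering are correct before relying on the agent. This is a sketch only:
# Read the administrative status of switch ports 1 and 2 over SNMP v2c
snmpget -v2c -c private 203.0.113.10 IF-MIB::ifAdminStatus.1
snmpget -v2c -c private 203.0.113.10 IF-MIB::ifAdminStatus.2
A value of up(1) indicates that the port is administratively enabled; the fencing agent sets the port to down(2) when it fences the attached node.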
Azure ARM Fencing
If your high availability Oracle Linux cluster is hosted on Azure virtual machines, you need to use the Azure Resource Manager (ARM) fencing agent.
Note:
For systems hosted on Azure, clustering with Pacemaker and Corosync is only available for Azure x86 VMs.
To set up and configure Azure ARM fencing:
-
On each node in the cluster, install the package with the Azure SDK dependency:
sudo dnf install fence-agents-azure-arm python3-azure-sdk
-
On the node where you have set up the pcs cluster, run the following command once for each node in your cluster:
sudo pcs stonith create resource_stonith_azure fence_azure_arm msi=true \
resourceGroup="Azure_resource_group" \
subscriptionId="Azure_subscription_id" \
pcmk_host_map="resolvable_host_name:Azure_VM_Name" \
power_timeout=240 \
pcmk_reboot_timeout=900 \
pcmk_monitor_timeout=120 \
pcmk_monitor_retries=4 \
pcmk_action_limit=3 \
op monitor interval=3600 \
--group fencegroup
When running the preceding command (a filled-in example follows this list):
-
Replace resource_stonith_azure with a node-specific resource name of your choice.
For example, you might specify the resource name resource_stonith_azure-1 when you run the command for the first server, resource_stonith_azure-2 when you run the command for the second server, and so on.
-
Replace Azure_resource_group with the name of the Azure portal resource group that holds the VMs and other resources.
-
Replace Azure_subscription_id with your subscription ID in Azure.
-
Replace resolvable_host_name with the resolvable hostname of the node you are running the command for, and Azure_VM_Name with the name of the host in Azure.
Note:
The option pcmk_host_map is only required if the hostnames and the Azure VM names are not identical.
-
Replace fencegroup with a group name of your choice.
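For example, using entirely hypothetical values, the command for the first node might look like the following. Replace every value with the details of your own Azure environment:
# Hypothetical values only: the resource group, subscription ID, and VM name are placeholders
sudo pcs stonith create resource_stonith_azure-1 fence_azure_arm msi=true \
resourceGroup="example-resource-group" \
subscriptionId="00000000-0000-0000-0000-000000000000" \
pcmk_host_map="node1:node1-vm" \
power_timeout=240 \
pcmk_reboot_timeout=900 \
pcmk_monitor_timeout=120 \
pcmk_monitor_retries=4 \
pcmk_action_limit=3 \
op monitor interval=3600 \
--group fencegroup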