High Availability Clustering
With Oracle Linux instances running on Oracle Cloud Infrastructure (OCI), you can create high availability (HA) clusters that deliver continuous access to applications and services running across multiple nodes. HA clustering minimizes downtime and keeps services available when individual system components fail.
You can create HA clusters with OCI instances by installing and using Pacemaker, an open source high availability resource manager, and Corosync, an open source cluster engine. For more information about HA clustering and the Pacemaker and Corosync technologies, see the upstream documentation for each project.
Prerequisite
Before you begin, configure a shared storage device to be accessible from all nodes that you want in the HA cluster. A shared storage device is needed for cluster service and application messaging, and for cluster SBD fencing. For more information about setting up a shared storage device, see the Oracle Cloud Infrastructure Block Volume documentation.
Setting Up High Availability Clustering With OCI Instances
To set up high availability clustering with OCI instances, complete the following tasks in order:
- Install Pacemaker. See Installing Pacemaker.
- Create an HA cluster. See Creating an HA Cluster.
- Configure fencing. See Configuring Fencing.
Installing Pacemaker
To create a high availability (HA) cluster with Oracle Cloud Infrastructure (OCI) instances, you must first install the Pacemaker and Corosync packages on each instance, or node, that you want in the cluster. You can then configure each cluster node, and ensure that the Pacemaker service automatically starts and runs on each node at boot time.
Note
The terms OCI instance, node, and cluster node are used interchangeably in the context of HA clustering for OCI.
Best Practice
For each OCI instance that you want in the cluster, open a terminal window and connect to the instance with ssh:
ssh instance-IP-address
Having a terminal window open for each node prevents the need to repeatedly log in and out of the nodes when configuring the HA cluster.
Installing Pacemaker and Corosync
To install the Pacemaker and Corosync packages and configure the HA cluster nodes:
- Complete the prerequisite in High Availability Clustering.
- Enable the repository on the Oracle Linux yum server where the Pacemaker and Corosync packages reside.
Oracle Linux 10:
sudo dnf config-manager --enable ol10_addons
Oracle Linux 9:
sudo dnf config-manager --enable ol9_addons
Oracle Linux 8:
sudo dnf config-manager --enable ol8_addons
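To confirm that the repository is enabled, you can list it by ID. The following command assumes Oracle Linux 8; substitute the repository ID that matches the release:
sudo dnf repolist ol8_addons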
- On each node, install the pcs command shell, the Pacemaker software packages, the resource agents, and the SBD fence agent:
sudo dnf install pcs pacemaker resource-agents fence-agents-sbd
- Configure the firewall so that the service components can communicate across the network:
sudo firewall-cmd --permanent --add-service=high-availability
sudo firewall-cmd --add-service=high-availability
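To confirm that the rule is active, you can list the services that the firewall currently allows; high-availability must appear in the output on every node:
sudo firewall-cmd --list-services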
- On each node, set a password for the hacluster user:
sudo passwd hacluster
Tip
Set the same password on each node to avoid authorization issues when running pcs commands on different nodes within the same cluster.
- On each node, set the pcsd service to run now and start at boot:
sudo systemctl enable --now pcsd.service
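You can confirm that the daemon is running on each node before continuing:
sudo systemctl status pcsd.service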
- Create an HA cluster using the nodes you have configured. See Creating an HA Cluster.
Creating an HA Cluster
With the Pacemaker and Corosync software, you can create a high availability (HA) cluster with Linux instances running on Oracle Cloud Infrastructure (OCI).
To create an HA cluster:
- Install the Pacemaker and Corosync software packages on each node you want in the cluster. See Installing Pacemaker.
- From one of the nodes, authenticate the pcs cluster configuration tool for the hacluster user of each cluster node.
For example, if you want two nodes to make up the HA cluster, run the following command from one of the cluster nodes:
sudo pcs host auth node1 node2 -u hacluster
Replace node1 and node2 with the resolvable host names of the nodes that are to form the cluster.
Alternatively, if the host names are not resolvable, specify the IP address for each node as shown in the following example:
sudo pcs host auth node1 addr=192.0.2.1 node2 addr=192.0.2.2 -u hacluster
Replace 192.0.2.1 and 192.0.2.2 with the IP address of each of the respective hosts in the cluster.
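If the host names aren't registered in DNS, one option is to make them resolvable by adding entries to the /etc/hosts file on each node. The following sketch reuses the example names and addresses from the commands above:
192.0.2.1 node1
192.0.2.2 node2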
- When prompted, enter the password that you defined for the hacluster user when you installed and configured the Pacemaker software on each node.
- Create the HA cluster by using the pcs cluster setup command, specifying the following:
- Name of the cluster
- The host name and IP address of each node that you want in the cluster
For example, to create an HA cluster with two nodes:
sudo pcs cluster setup cluster-name node1 addr=192.0.2.1 node2 addr=192.0.2.2
Replace 192.0.2.1 and 192.0.2.2 with the IP address of each of the respective hosts in the cluster.
- From one of the nodes, start the cluster on all nodes:
sudo pcs cluster start --all
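Once the cluster is started, you can confirm that all nodes have joined it:
sudo pcs cluster status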
- Optionally, you can enable these services to start at boot time so that if a node reboots, it automatically rejoins the cluster. To do this, run the following command on one of the nodes:
sudo pcs cluster enable --all
Note
Some users prefer not to enable these services so that a node failure resulting in a full system reboot can be properly debugged before it rejoins the cluster.
- Configure SBD fencing for the newly created HA cluster. See Configuring Fencing.
Configuring Fencing
STONITH Block Device (SBD) fencing works with the Pacemaker software to protect data when a node in a high availability (HA) cluster becomes unresponsive. Fencing prevents an unresponsive node from accessing shared cluster data until the Pacemaker software takes that node offline.
SBD fencing configuration is the last step in completing the setup of an HA cluster with OCI instances. For information about creating an HA cluster, see Creating an HA Cluster.
To create HA clusters with OCI instances, you must use only the SBD cluster fencing mechanism. Other cluster fencing mechanisms aren't currently supported in this environment.
Configuring SBD Fencing for an HA Cluster
To configure SBD fencing for an HA cluster:
- From one of the cluster nodes, enable stonith (Shoot The Other Node In The Head), a fencing technique that's used as part of the SBD fencing strategy:
sudo pcs property set stonith-enabled=true
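To confirm the change, you can display the property value. The exact subcommand depends on the installed pcs version; recent releases use the first form, older releases the second:
sudo pcs property config stonith-enabled
sudo pcs property show stonith-enabled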
- From one of the nodes, stop the cluster:
sudo pcs cluster stop --all
- On each node, install the SBD daemon:
sudo dnf install sbd
- On each node, enable the sbd systemd service:
sudo systemctl enable sbd
Note
When enabled, the sbd systemd service automatically starts and stops as a dependency of the Pacemaker service. This means that you don't need to run the sbd service independently, and you can't manually start or stop it. If you try to manually start or stop the service, its state remains the same and an error message is displayed, indicating that the service is a dependent service.
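Because the service can't be started or stopped manually, you can instead confirm that the unit is enabled:
sudo systemctl is-enabled sbd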
- On each node, edit the /etc/sysconfig/sbd file and set the SBD_DEVICE parameter to identify the shared storage device. Use a persistent device path, such as a link within the /dev/disk/by-id/ directory, to do this.
For example, if the shared storage device is available on /dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90, ensure that the /etc/sysconfig/sbd file on each node contains the following line:
SBD_DEVICE="/dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90"
For information about shared storage devices, see the links provided in the Prerequisite section.
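If you're unsure which persistent path corresponds to the shared storage device, you can list the stable symlinks that udev maintains under /dev/disk/by-id/ and match the link target to the device:
ls -l /dev/disk/by-id/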
- Continue to edit the /etc/sysconfig/sbd file on each node by setting the watchdog device to /dev/null:
SBD_WATCHDOG_DEV=/dev/null
- From one of the nodes, create the SBD messaging layout on the shared storage device, and confirm that it's in place.
For example, to set up and verify messaging on the shared storage device at /dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90:
sudo sbd -d /dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90 create
sudo sbd -d /dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90 list
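Optionally, you can also inspect the on-disk metadata that the create operation wrote, such as the configured timeouts and the slot layout:
sudo sbd -d /dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90 dump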
- From one of the nodes, start the cluster and configure the fence_sbd fencing agent for the shared storage device.
For example, to start the cluster and configure the shared storage device at /dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90:
sudo pcs cluster start --all
sudo pcs stonith create sbd_fencing fence_sbd devices=/dev/disk/by-id/wwn-0XY3acy9d40afd88083ACR90
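Optionally, you can test fencing end to end by fencing one of the nodes. Be aware that this powers off or restarts the target node, so only run this test on a cluster that isn't yet in production. For example, to fence node2:
sudo pcs stonith fence node2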
- To check that the stonith configuration has been set up correctly, run the following commands:
sudo pcs stonith config
sudo pcs cluster verify --full
- To check the status of the stonith configuration, run the following command:
sudo pcs stonith
- To check the status of the cluster, run the following command:
sudo pcs status