6 Scaling a Kubernetes Cluster

Scaling a Kubernetes cluster involves updating the cluster to add nodes to it or remove nodes from it. When you add nodes, you're scaling up the cluster; when you remove nodes, you're scaling down the cluster.

Reasons for scaling up a cluster might include the need to handle a larger workload, increased network traffic, or the need to run more applications in the cluster. Reasons for scaling down a cluster might include temporarily removing a node for maintenance or troubleshooting.

Before adding a new node, you must set up the node to meet all the requirements for it to be part of the Kubernetes cluster. For information on setting up a Kubernetes node, see Installation. Depending on the type of nodes you're adding and the system setup, you might also need to add the new nodes to the load balancer configured for the cluster.

An environment with a single control plane node (created, for example, for testing purposes) can be scaled up to a highly available (HA) cluster with many control plane nodes, provided the cluster was created with the --load-balancer or --virtual-ip option.
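
For example, a single control plane node cluster created with a command of the following general form can later be scaled up to an HA cluster. The registry and virtual IP values shown here are placeholders, and module creation is covered fully in Installation, so treat this as a sketch rather than a complete command:

    olcnectl module create \
    --environment-name myenvironment \
    --module kubernetes \
    --name mycluster \
    --container-registry container.oracle.com/olcne \
    --virtual-ip 192.0.2.100 \
    --control-plane-nodes control1.example.com:8090 \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090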

When you scale a Kubernetes cluster:

  1. A backup is taken of the cluster. If something goes wrong during scaling up or scaling down, you can use this backup to restore the cluster to its previous state. For more information about backing up and restoring a Kubernetes cluster, see Backing up and Restoring a Kubernetes Cluster. (An example of taking a backup manually is shown after this list.)

  2. Any nodes that you want to add to the cluster are validated. If the nodes have any validation issues, such as firewall issues, then the update to the cluster can't proceed, and the nodes can't be added to the cluster. You're prompted for what to do to resolve the validation issues so that the nodes can be added to the cluster.

  3. The control plane nodes and worker nodes are added to or removed from the cluster.

  4. The cluster is checked to ensure all nodes are healthy. After validation of the cluster is completed, the cluster is scaled and you can access it.
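
Although a backup is taken automatically as part of the update, you can also take one manually before scaling. Assuming the environment and module names used in the examples in this chapter, a command of the following form backs up the cluster configuration:

    olcnectl module backup \
    --environment-name myenvironment \
    --name mycluster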

Best Practices for Scaling a Kubernetes Cluster

The following list describes best practices to be followed when scaling a Kubernetes cluster in a production environment:

Scale Up and Down in Separate Steps

We recommend that you don't scale the cluster up and down in one step. Instead, scale up and then scale down using two separate commands.

Scaling Control Plane Nodes

To avoid split-brain scenarios, the number of control plane nodes in a cluster must always be an odd number equal to or greater than three, for example, 3, 5, or 7. Thus, control plane nodes must be scaled up and down two nodes at a time to maintain cluster quorum.

We recommend that clusters are provisioned with a minimum of five control plane nodes in case two nodes need to be removed during a maintenance operation.
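
For example, a five-node control plane has a quorum of three nodes (a majority of five), so two nodes can be removed for maintenance while the remaining three still hold quorum. With only three control plane nodes, removing two would leave a single node, which can't form a majority.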

Scaling Worker Nodes

Replace worker nodes in the cluster one node at a time to allow applications running on the node to migrate to other nodes.

The cluster must always have a minimum of three worker nodes. Thus, we recommend that clusters are provisioned with a minimum of four worker nodes in case a node is removed during a maintenance operation.
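
Before removing a worker node, you can use standard kubectl commands to check which pods are still running on it. This isn't part of the olcnectl scaling procedure, just a way to confirm that workloads have moved off the node (the node name here matches the examples later in this chapter):

    kubectl get pods --all-namespaces --field-selector spec.nodeName=worker4.example.com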

Tip:

The examples in this chapter show you how to scale up and down by changing the control plane nodes and worker nodes at the same time, providing all the nodes to be included in the cluster using the --control-plane-nodes and --worker-nodes options. If you only want to scale the control plane nodes, you only need to provide the list of control plane nodes to include in the cluster using the --control-plane-nodes option (you don't need to provide the worker node list). Similarly, if you only want to scale the worker nodes, you only need to provide the list of worker nodes using the --worker-nodes option.
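
For example, a command of the following form (using the environment and module names from the examples in this chapter) scales only the worker nodes and leaves the control plane nodes unchanged:

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090,\
    worker4.example.com:8090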

Scaling Up a Kubernetes Cluster

Before you scale up a Kubernetes cluster, you must set up the new nodes so they can be added to the cluster. Depending on the type of nodes you're adding and the system setup, you might also need to add the new nodes to the load balancer configured for the cluster.

Setting up the New Kubernetes Nodes

To prepare a node:

  1. Set up the node so it can be added to a Kubernetes cluster. For information on setting up a Kubernetes node, see Installation.

  2. If you're using private X.509 certificates for nodes, you need to generate and copy the certificates to the node. You don't need to do anything if you're using Vault to provide certificates for nodes. For information on using X.509 certificates, see Installation.

  3. Start the Platform Agent service. For information on starting the Platform Agent, see Installation.
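
    For example, on a standard installation the Platform Agent runs as a systemd service named olcne-agent, and a command of the following form (run on the new node) enables and starts it. Check Installation for the exact service name used by your release:

    sudo systemctl enable --now olcne-agent.service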

Adding New Nodes to the Load Balancer

If you're using an external load balancer for the Kubernetes cluster (set with the --load-balancer option when you created the Kubernetes module), add any new control plane nodes to it. If you're using an Oracle Cloud Infrastructure load balancer, add any new control plane nodes to the appropriate backend set and set the port for the control plane nodes to 6443. If you're using the load balancer deployed by the Platform CLI (set with the --virtual-ip option when you created the Kubernetes module), you don't need to add the control plane nodes to it. This is done automatically when you scale the nodes into the cluster.
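
For example, if you manage an Oracle Cloud Infrastructure load balancer with the OCI CLI, a command of the following general form adds a control plane node to a backend set. The load balancer OCID, backend set name, and IP address shown here are placeholders, and the parameters accepted can vary between OCI CLI versions, so check oci lb backend create --help before running it:

    oci lb backend create \
    --load-balancer-id ocid1.loadbalancer.oc1..example \
    --backend-set-name k8s-control-plane \
    --ip-address 192.0.2.21 \
    --port 6443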

If you have the Istio module installed and set up with a load balancer for the Istio ingress gateway, and you're adding new worker nodes, add the new worker nodes to the Istio ingress gateway load balancer. If you're using an Oracle Cloud Infrastructure load balancer, add any new worker nodes to the appropriate backend set.

Adding New Nodes to the Kubernetes Cluster

After completing the preparatory steps in the preceding sections, use the instructions in this procedure to add nodes to a Kubernetes cluster.

To scale up a Kubernetes cluster:

  1. From a control plane node of the Kubernetes cluster, use the kubectl get nodes command to see the control plane nodes and worker nodes of the cluster.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version

    In this example, three control plane nodes are in the Kubernetes cluster:

    • control1.example.com

    • control2.example.com

    • control3.example.com

    Three worker nodes are also in the cluster:

    • worker1.example.com

    • worker2.example.com

    • worker3.example.com

  2. Use the olcnectl module update command to scale up a Kubernetes cluster.

    In this example, the Kubernetes cluster is scaled up so that it has the recommended minimum of five control plane nodes and four worker nodes. This example adds two new control plane nodes (control4.example.com and control5.example.com) and one new worker node (worker4.example.com) to the Kubernetes module named mycluster. From the operator node, run:

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --control-plane-nodes control1.example.com:8090,control2.example.com:8090,control3.example.com:8090,\
    control4.example.com:8090,control5.example.com:8090 \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090,\
    worker4.example.com:8090

    You can optionally include the --generate-scripts option. This option generates scripts you can run for each node in the event of any validation failures during scaling. A script is created for each node in the module, saved to the local directory, and named hostname:8090.sh.

    You can also optionally include the --force option to suppress the prompt displayed to confirm you want to continue with scaling the cluster.

    You can optionally use the --log-level option to set the level of logging displayed in the command output. By default, error messages are displayed. For example, you can set the logging level to show all messages when you include:

    --log-level debug

    The log messages are also saved as an operation log. You can view operation logs as commands are running, or when they've completed. For more information on using operation logs, see Platform Command-Line Interface.
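
    For example, a sketch of the same scale-up command with these optional options added might look like this (the node lists are unchanged from the previous example):

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --control-plane-nodes control1.example.com:8090,control2.example.com:8090,control3.example.com:8090,\
    control4.example.com:8090,control5.example.com:8090 \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090,\
    worker4.example.com:8090 \
    --generate-scripts \
    --force \
    --log-level debug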

  3. On a control plane node of the Kubernetes cluster, use the kubectl get nodes command to verify the cluster has been scaled up to include the new control plane nodes and worker node.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    control4.example.com   Ready    control-plane   2m38s   version
    control5.example.com   Ready    control-plane   2m38s   version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version
    worker4.example.com    Ready    <none>          2m38s   version
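
    You can also list only the control plane nodes by filtering on the node role label, assuming the cluster applies the standard node-role.kubernetes.io/control-plane label (as the ROLES column in this output suggests):

    kubectl get nodes -l node-role.kubernetes.io/control-plane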

Scaling Down a Kubernetes Cluster

This procedure shows you how to remove nodes from a Kubernetes cluster.

Attention:

Be careful if you're scaling down the control plane nodes of the cluster. If you have two control plane nodes and you scale down to only one control plane node, that remaining node becomes a single point of failure.

To scale down a Kubernetes cluster:

  1. From a control plane node of the Kubernetes cluster, use the kubectl get nodes command to see the control plane nodes and worker nodes of the cluster.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    control4.example.com   Ready    control-plane   2m38s   version
    control5.example.com   Ready    control-plane   2m38s   version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version
    worker4.example.com    Ready    <none>          2m38s   version

    In this example, five control plane nodes are in the Kubernetes cluster:

    • control1.example.com

    • control2.example.com

    • control3.example.com

    • control4.example.com

    • control5.example.com

    Four worker nodes are also in the cluster:

    • worker1.example.com

    • worker2.example.com

    • worker3.example.com

    • worker4.example.com

  2. Use the olcnectl module update command to scale down a Kubernetes cluster.

    In this example, the Kubernetes cluster is scaled down so that it has three control plane nodes and three worker nodes. This example removes two control plane nodes (control4.example.com and control5.example.com) and one worker node (worker4.example.com) from the Kubernetes module named mycluster. From the operator node, run:

    olcnectl module update \
    --environment-name myenvironment \
    --name mycluster \
    --control-plane-nodes control1.example.com:8090,control2.example.com:8090,control3.example.com:8090 \
    --worker-nodes worker1.example.com:8090,worker2.example.com:8090,worker3.example.com:8090

  3. On a control plane node of the Kubernetes cluster, use the kubectl get nodes command to verify the cluster has been scaled down to remove the control plane nodes and worker node.

    kubectl get nodes

    The output looks similar to:

    NAME                   STATUS   ROLES           AGE     VERSION
    control1.example.com   Ready    control-plane   26h     version
    control2.example.com   Ready    control-plane   26h     version
    control3.example.com   Ready    control-plane   26h     version
    worker1.example.com    Ready    <none>          26h     version
    worker2.example.com    Ready    <none>          26h     version
    worker3.example.com    Ready    <none>          26h     version

  4. The removed nodes can be added back into the cluster in a scale-up operation after any necessary maintenance has been completed. However, if the nodes are to be replaced by new ones, then you might need to remove the old nodes from the load balancer. For information on load balancers, see Adding New Nodes to the Load Balancer.