Use Vertical Pod Autoscaling with Oracle Cloud Native Environment

Introduction

The VerticalPodAutoscaler (VPA) is the Kubernetes component designed to optimize the allocation of CPU and memory resources for deployed pods based on their usage. Unlike the HorizontalPodAutoscaler (HPA), the VerticalPodAutoscaler does not ship with Kubernetes by default. Instead, Kubernetes provides it as a separate project on GitHub.

The Vertical Pod Autoscaler has three components:

  • Recommender - Monitors current and past resource consumption and provides recommended CPU and memory request values for each container.
  • Updater - Checks which pods have incorrect resources set and, depending on the update mode, evicts them so they can be recreated with the recommended values.
  • Admission Controller - Sets the recommended resource requests on new pods as they are created or recreated.

VPA Benefits

Kubernetes Vertical Pod Autoscaling provides the following:

  • Automatic tuning of CPU and memory requests to match each pod's actual usage.
  • Improved cluster resource utilization, because containers no longer need generous, hand-set resource requests.
  • Reduced manual effort spent benchmarking and adjusting per-container resource values.

VPA Limitations

Some limitations of the Kubernetes Vertical Pod Autoscaling include:

  • It should not be used alongside the HorizontalPodAutoscaler on the same CPU or memory metrics.
  • Applying a new recommendation requires recreating (evicting) the pod, which can cause brief disruption.
  • It only considers CPU and memory, not network or disk I/O usage.
  • It cannot react quickly to sudden, short-lived spikes in demand.

Given these benefits and limitations, Kubernetes VPA is best used to provide automation for predictable variations in demand, but the overall cluster health still needs to be monitored.

Objectives

In this tutorial, you will learn to:

  • Install the Metrics Server
  • Install the Vertical Pod Autoscaler (VPA)
  • Deploy a sample application and watch the VPA adjust its resource requests
  • View and interpret the VPA's recommendations

Prerequisites

Deploy Oracle Cloud Native Environment

Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.

  1. Open a terminal on the Luna Desktop.

  2. Clone the linux-virt-labs GitHub project.

    git clone https://github.com/oracle-devrel/linux-virt-labs.git
    
  3. Change into the working directory.

    cd linux-virt-labs/ocne2
    
  4. Install the required collections.

    ansible-galaxy collection install -r requirements.yml
    
  5. Deploy the lab environment.

    ansible-playbook create_instance.yml -e localhost_python_interpreter="/usr/bin/python3.6" -e install_ocne_rpm=true -e create_ocne_cluster=true -e "ocne_cluster_node_options='-n 1 -w 1'"
    

    The free lab environment requires the extra variable localhost_python_interpreter, which sets ansible_python_interpreter for plays running on localhost. This variable is needed because the environment installs the RPM package for the Oracle Cloud Infrastructure SDK for Python under the python3.6 module libraries.

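    For reference, this variable ultimately feeds Ansible's standard interpreter setting. A minimal sketch of an equivalent inventory entry (hypothetical; the lab playbook wires this up for you from the extra variable):

    # inventory.ini (sketch) - pin the Python interpreter used for plays on localhost
    localhost ansible_connection=local ansible_python_interpreter=/usr/bin/python3.6
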
    The default deployment shape uses an AMD CPU and Oracle Linux 8. To use an Intel CPU or Oracle Linux 9, add -e instance_shape="VM.Standard3.Flex" or -e os_version="9" to the deployment command.

    Important: Wait for the playbook to run successfully and reach the pause task. At this stage of the playbook, the installation of Oracle Cloud Native Environment is complete, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys and any other deployment information needed while running the lab.

Access the Kubernetes Cluster

It helps to know the number and names of nodes in your Kubernetes cluster.

  1. Open a terminal and connect via SSH to the ocne instance.

    ssh oracle@<ip_address_of_node>
    
  2. Show the two nodes and verify that they are running.

    kubectl get nodes
    

Install the Metrics Server

The VPA leverages the resource usage metrics from the metrics.k8s.io Metrics API, which the Metrics Server usually provides. The Metrics Server records CPU and memory data for the Kubernetes cluster's nodes and pods, and the VPA's Recommender component uses this data to calculate recommended resource requests for the pods it monitors.

The Metrics API currently serves these metrics under the metrics.k8s.io/v1beta1 group version, which you confirm below.

  1. Check whether the Metrics API is available.

    kubectl get apiservices | grep metrics.k8s.io
    

    The lack of output confirms that the Metrics API is unavailable by default on an Oracle Cloud Native Environment cluster.

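    Once the Metrics Server is running (after the next steps), the same query should return a line similar to the following (illustrative output; the pod name suffix and AGE will differ):

    v1beta1.metrics.k8s.io   kube-system/metrics-server   True   2m
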
  2. Deploy the Metrics Server.

    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    
  3. Patch the deployment to trust the self-signed X509 certificates used in the default install.

    kubectl patch deployment metrics-server -n kube-system --type 'json' -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
    
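    For reference, the patch appends one argument to the metrics-server container. The resulting args section of the Deployment looks roughly like this (abridged sketch; the other flags come from the upstream manifest and may vary by release):

    args:
      - --cert-dir=/tmp
      - --secure-port=10250
      - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      - --kubelet-use-node-status-port
      - --metric-resolution=15s
      - --kubelet-insecure-tls   # added by the patch; skips kubelet certificate verification
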
  4. Confirm the Metrics Server Pod is running.

    kubectl get pods -w -A | grep metrics
    

    Wait for the metrics-server pod to stabilize and report READY as 1/1 and STATUS as Running. Enter Ctrl-C to exit the watch command.

    The -w (watch) option watches the kubectl output and prints changes to the terminal as they occur. Note that, unlike the Linux watch command, it appends to the output instead of refreshing it (see the related GitHub issue for more detail).

  5. Confirm the Metrics API is available.

    This command is an alternative to using kubectl get apiservices.

    kubectl get --raw "/apis/metrics.k8s.io/" | jq
    

    Example Output:

    [oracle@ocne ~]$ kubectl get --raw "/apis/metrics.k8s.io/" | jq
    {
      "kind": "APIGroup",
      "apiVersion": "v1",
      "name": "metrics.k8s.io",
      "versions": [
        {
          "groupVersion": "metrics.k8s.io/v1beta1",
          "version": "v1beta1"
        }
      ],
      "preferredVersion": {
        "groupVersion": "metrics.k8s.io/v1beta1",
        "version": "v1beta1"
      }
    }
    
  6. Confirm the reporting of metrics.

    Show the metrics for the pods within the cluster.

    kubectl top pods -A
    

    Example Output:

    [oracle@ocne ~]$ kubectl top pods -A
    NAMESPACE      NAME                                           CPU(cores)   MEMORY(bytes)   
    kube-flannel   kube-flannel-ds-78vdr                          15m          15Mi            
    kube-flannel   kube-flannel-ds-jwx9h                          16m          16Mi            
    kube-system    coredns-f7d444b54-7p4j2                        2m           15Mi            
    kube-system    coredns-f7d444b54-m5pm6                        2m           15Mi            
    kube-system    etcd-ocne-control-plane-1                      16m          30Mi            
    kube-system    kube-apiserver-ocne-control-plane-1            44m          193Mi           
    kube-system    kube-controller-manager-ocne-control-plane-1   15m          56Mi            
    kube-system    kube-proxy-dlz2l                               1m           18Mi            
    kube-system    kube-proxy-s89gq                               1m           19Mi            
    kube-system    kube-scheduler-ocne-control-plane-1            3m           21Mi            
    kube-system    metrics-server-b79d5c976-vz8p4                 3m           19Mi            
    ocne-system    ocne-catalog-578c959566-88vff                  1m           5Mi             
    ocne-system    ui-84dd57ff69-gtrgf                            1m           14Mi            
    

    Next, show the metrics for the nodes.

    kubectl top nodes
    

    Example Output:

    [oracle@ocne ~]$ kubectl top nodes
    NAME                   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
    ocne-control-plane-1   264m         13%    713Mi           20%       
    ocne-worker-1          34m          1%     373Mi           10%   
    

Install the Vertical Pod Autoscaler (VPA)

  1. Install git.

    sudo dnf install -y git
    
  2. Download VPA from GitHub.

    git clone -b cluster-autoscaler-release-1.30 https://github.com/kubernetes/autoscaler.git
    
  3. Change into the vertical-pod-autoscaler directory.

    cd autoscaler/vertical-pod-autoscaler
    
  4. Deploy the VPA.

    ./hack/vpa-up.sh
    
  5. Verify the VPA deployed.

    kubectl get pods -n kube-system
    

    Note: You may need to repeat the query a few times while the VPA deploys.

    Example Output:

    [oracle@ocne vertical-pod-autoscaler]$ kubectl get pods -n kube-system
    NAME                                           READY   STATUS    RESTARTS      AGE
    coredns-f7d444b54-lwxzz                        1/1     Running   0             11m
    coredns-f7d444b54-xg9rc                        1/1     Running   0             11m
    etcd-ocne-control-plane-1                      1/1     Running   0             11m
    kube-apiserver-ocne-control-plane-1            1/1     Running   1 (11m ago)   11m
    kube-controller-manager-ocne-control-plane-1   1/1     Running   0             11m
    kube-proxy-2bsct                               1/1     Running   0             10m
    kube-proxy-lwfhr                               1/1     Running   0             11m
    kube-scheduler-ocne-control-plane-1            1/1     Running   0             11m
    metrics-server-b79d5c976-5g48b                 1/1     Running   0             9m
    vpa-admission-controller-54b69c7587-q8g7t      1/1     Running   0             7m49s
    vpa-recommender-b9c8fc874-tlzf4                1/1     Running   0             7m49s
    vpa-updater-c79c94dd9-cs876                    1/1     Running   0             7m49s
    

Deploy the Sample Application

Next, you will deploy the hamster sample deployment and VPA configuration included in the Git repository. This sample creates a deployment with two pods, each requesting 100 millicores of CPU while trying to use just over 500 millicores. The deployment also requests less memory than it needs.

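For reference, the container spec in the hamster deployment looks roughly like this (abridged sketch; the image, requests, and busy-loop command match the kubectl describe output shown later in this tutorial):

    # Abridged sketch of the hamster Deployment's container spec
    containers:
      - name: hamster
        image: registry.k8s.io/ubuntu-slim:0.1
        resources:
          requests:
            cpu: 100m      # deliberately below the ~500m the loop actually uses
            memory: 50Mi
        command: ["/bin/sh"]
        args:
          - "-c"
          - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"
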
The deployed VPA configuration analyzes the deployed pods so it can monitor their behavior. There are several options available for the updateMode in VPA, which are:

  • Off - The VPA only provides recommendations and never changes pod resources.
  • Initial - The VPA assigns resource requests only at pod creation and never changes them afterward.
  • Recreate - The VPA evicts running pods so they restart with the updated resource requests.
  • Auto - The VPA currently behaves like Recreate, recreating pods to apply new recommendations.

This deployment example uses the updateMode: “Auto” value.

Note: This setting is usually only recommended for non-production environments because it allows the VPA to make changes to deployments based on their current usage. The usual setting in production environments is updateMode: Off, which enables the administrator to manually apply VPA recommendations by updating the deployment manifest with the most recent recommendations.

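For reference, the VPA resource in hamster.yaml resembles the following sketch (field names follow the autoscaling.k8s.io/v1 API; the updatePolicy is shown explicitly here, although Auto is the default, and the allowed values come from the upstream example and may change):

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: hamster-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: hamster
      updatePolicy:
        updateMode: "Auto"        # Off | Initial | Recreate | Auto
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            minAllowed:
              cpu: 100m
              memory: 50Mi
            maxAllowed:
              cpu: 1
              memory: 500Mi
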
  1. Deploy the Sample Application.

    kubectl apply -f examples/hamster.yaml
    

    Example Output:

    [oracle@ocne vertical-pod-autoscaler]$ kubectl apply -f examples/hamster.yaml 
    verticalpodautoscaler.autoscaling.k8s.io/hamster-vpa created
    
  2. Verify the Sample Application pods deployed successfully.

    kubectl get pods -l app=hamster
    

    Example Output:

    [oracle@ocne vertical-pod-autoscaler]$ kubectl get pods -l app=hamster
    NAME                      READY   STATUS    RESTARTS   AGE
    hamster-b7b46759d-5wvj7   1/1     Running   0          18s
    hamster-b7b46759d-nzvvw   1/1     Running   0          18s
    
  3. View the current CPU and memory values requested by the deployed pods.

    kubectl describe pod <deployed-name>
    

    Note: Use one of the pod names returned in the last step instead of the <deployed-name> placeholder shown above.

    Example Output:

    [oracle@ocne vertical-pod-autoscaler]$ kubectl describe pod hamster-b7b46759d-nzvvw
    Name:             hamster-b7b46759d-nzvvw
    Namespace:        default
    Priority:         0
    Service Account:  default
    Node:             ocne-worker-1/192.168.122.14
    Start Time:       Fri, 20 Dec 2024 14:26:01 +0000
    Labels:           app=hamster
                      pod-template-hash=b7b46759d
    Annotations:      vpaObservedContainers: hamster
                      vpaUpdates: Pod resources updated by hamster-vpa: container 0: 
    Status:           Running
    IP:               10.244.1.12
    IPs:
      IP:           10.244.1.12
    Controlled By:  ReplicaSet/hamster-b7b46759d
    Containers:
      hamster:
        Container ID:  cri-o://1cfeff4cf2de04f8f176300781d9b81f76df8a011e0149e350bf096fcba476ad
        Image:         registry.k8s.io/ubuntu-slim:0.1
        Image ID:      registry.k8s.io/ubuntu-slim@sha256:b6f8c3885f5880a4f1a7cf717c07242eb4858fdd5a84b5ffe35b1cf680ea17b1
        Port:          <none>
        Host Port:     <none>
        Command:
          /bin/sh
        Args:
          -c
          while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done
        State:          Running
          Started:      Fri, 20 Dec 2024 14:26:04 +0000
        Ready:          True
        Restart Count:  0
        Requests:
          cpu:        100m
          memory:     50Mi
        Environment:  <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qndx2 (ro)
    Conditions:
      Type                        Status
      PodReadyToStartContainers   True 
      Initialized                 True 
      Ready                       True 
      ContainersReady             True 
      PodScheduled                True 
    Volumes:
      kube-api-access-qndx2:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   Burstable
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type    Reason     Age   From               Message
      ----    ------     ----  ----               -------
      Normal  Pulling    61s   kubelet            Pulling image "registry.k8s.io/ubuntu-slim:0.1"
      Normal  Scheduled  61s   default-scheduler  Successfully assigned default/hamster-b7b46759d-nzvvw to ocne-worker-1
      Normal  Pulled     59s   kubelet            Successfully pulled image "registry.k8s.io/ubuntu-slim:0.1" in 1.991s (1.991s including waiting). Image size: 57409822 bytes.
      Normal  Created    59s   kubelet            Created container hamster
      Normal  Started    59s   kubelet            Started container hamster
    

Watch the VPA Scale the Application

The VPA analyzes the pods and determines the optimal settings to recommend for the monitored deployment.

  1. Monitor the pods in the sample application.

    watch kubectl get pods -l app=hamster
    

    Enter Ctrl-C to exit the watch command after the VPA creates a new pod, which may take a few minutes. Note the newly created pod’s name.

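    You can also watch the VPA object itself while you wait for recommendations to appear (the vpa short name is defined by the VPA custom resource definition; enter Ctrl-C to exit):

    kubectl get vpa hamster-vpa -w
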
  2. Review the CPU and memory reservations assigned to the newly created pod.

    kubectl describe pod <new_pod_name>
    

    Replace the <new_pod_name> placeholder with the newly created pod's name.

    Example Output:

    [oracle@ocne vertical-pod-autoscaler]$ kubectl describe pod hamster-b7b46759d-gqghl
    Name:             hamster-b7b46759d-gqghl
    Namespace:        default
    Priority:         0
    Service Account:  default
    Node:             ocne-worker-1/192.168.122.14
    Start Time:       Fri, 20 Dec 2024 14:28:31 +0000
    Labels:           app=hamster
                      pod-template-hash=b7b46759d
    Annotations:      vpaObservedContainers: hamster
                      vpaUpdates: Pod resources updated by hamster-vpa: container 0: cpu request, memory request
    Status:           Running
    IP:               10.244.1.15
    IPs:
      IP:           10.244.1.15
    Controlled By:  ReplicaSet/hamster-b7b46759d
    Containers:
      hamster:
        Container ID:  cri-o://bc7dd0d39e0a86a03ea62b57431aa42495a3babef61ec6e97a05d0512ce829a0
        Image:         registry.k8s.io/ubuntu-slim:0.1
        Image ID:      registry.k8s.io/ubuntu-slim@sha256:b6f8c3885f5880a4f1a7cf717c07242eb4858fdd5a84b5ffe35b1cf680ea17b1
        Port:          <none>
        Host Port:     <none>
        Command:
          /bin/sh
        Args:
          -c
          while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done
        State:          Running
          Started:      Fri, 20 Dec 2024 14:28:32 +0000
        Ready:          True
        Restart Count:  0
        Requests:
          cpu:        587m
          memory:     262144k
        Environment:  <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5d4nk (ro)
    Conditions:
      Type                        Status
      PodReadyToStartContainers   True 
      Initialized                 True 
      Ready                       True 
      ContainersReady             True 
      PodScheduled                True 
    Volumes:
      kube-api-access-5d4nk:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   Burstable
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type    Reason     Age   From               Message
      ----    ------     ----  ----               -------
      Normal  Pulled     82s   kubelet            Container image "registry.k8s.io/ubuntu-slim:0.1" already present on machine
      Normal  Created    82s   kubelet            Created container hamster
      Normal  Started    82s   kubelet            Started container hamster
      Normal  Scheduled  82s   default-scheduler  Successfully assigned default/hamster-b7b46759d-gqghl to ocne-worker-1
    

    Notice that the output’s Requests: section contains the updated CPU and memory reservations of the newly created pod.

    Example output:

    Requests:
      cpu:        587m
      memory:     262144k
    

    This output confirms that the VPA corrected the previously under-resourced pod and created a new pod with revised values.

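    A quick note on units: Kubernetes treats the lowercase k suffix as decimal, so 262144k is 262,144,000 bytes, which works out to exactly 250Mi; the binary suffix Ki (262144Ki) would instead equal 256Mi.
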
    Note: The values you see for the CPU and memory reservations may differ.

View the VPA Recommendation

You can view the detailed recommendations from the VPA’s ‘Recommender’ process.

  1. View the VPA’s recommendations.

    kubectl describe vpa hamster-vpa
    

    Example output:

    [oracle@ocne vertical-pod-autoscaler]$ kubectl describe vpa
    Name:         hamster-vpa
    Namespace:    default
    Labels:       <none>
    Annotations:  <none>
    API Version:  autoscaling.k8s.io/v1
    Kind:         VerticalPodAutoscaler
    ...
    ...
      Target Ref:
        API Version:  apps/v1
        Kind:         Deployment
        Name:         hamster
      Update Policy:
        Update Mode:  Auto
    Status:
      Conditions:
        Last Transition Time:  2024-12-20T14:26:33Z
        Status:                True
        Type:                  RecommendationProvided
      Recommendation:
        Container Recommendations:
          Container Name:  hamster
          Lower Bound:
            Cpu:     402m
            Memory:  262144k
          Target:
            Cpu:     587m
            Memory:  262144k
          Uncapped Target:
            Cpu:     587m
            Memory:  262144k
          Upper Bound:
            Cpu:     1
            Memory:  500Mi
    Events:          <none>
    

    Note: The recommendations you see may be different.

    Inspect the following fields:

    • Update Mode: Auto - Automatically instructs the VPA to recreate pods as needed based on the recommendation.
    • Lower Bound - Identifies the minimum estimated values to use for the deployment.
    • Upper Bound - Identifies the maximum estimated values to use for the deployment.
    • Target - The values the VPA currently recommends and applies.

    The VPA bases these estimations on the minAllowed and maxAllowed values you set in the VPA resource within the deployment YAML file.

    • Uncapped Target - The estimated values the VPA would recommend if no minimum (minAllowed) or maximum (maxAllowed) values were set.

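    If you only need the target values, you can extract them directly with a JSONPath query against the VPA's status (a sketch; the field path follows the status fields shown in the describe output above):

    kubectl get vpa hamster-vpa -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
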
Clean Up

You can remove the VPA using these steps.

  1. Remove the sample application.

    kubectl delete -f examples/hamster.yaml
    

    Example output:

    [oracle@ocne vertical-pod-autoscaler]$ kubectl delete -f examples/hamster.yaml 
    verticalpodautoscaler.autoscaling.k8s.io "hamster-vpa" deleted
    deployment.apps "hamster" deleted
    
  2. Delete the Vertical Pod Autoscaler deployment.

    ./hack/vpa-down.sh
    

    Example output:

    [oracle@ocne vertical-pod-autoscaler]$ ./hack/vpa-down.sh 
    customresourcedefinition.apiextensions.k8s.io "verticalpodautoscalercheckpoints.autoscaling.k8s.io" deleted
    customresourcedefinition.apiextensions.k8s.io "verticalpodautoscalers.autoscaling.k8s.io" deleted
    clusterrole.rbac.authorization.k8s.io "system:metrics-reader" deleted
    clusterrole.rbac.authorization.k8s.io "system:vpa-actor" deleted
    clusterrole.rbac.authorization.k8s.io "system:vpa-status-actor" deleted
    clusterrole.rbac.authorization.k8s.io "system:vpa-checkpoint-actor" deleted
    clusterrole.rbac.authorization.k8s.io "system:evictioner" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:metrics-reader" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:vpa-actor" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:vpa-status-actor" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:vpa-checkpoint-actor" deleted
    clusterrole.rbac.authorization.k8s.io "system:vpa-target-reader" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:vpa-target-reader-binding" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:vpa-evictioner-binding" deleted
    serviceaccount "vpa-admission-controller" deleted
    serviceaccount "vpa-recommender" deleted
    serviceaccount "vpa-updater" deleted
    clusterrole.rbac.authorization.k8s.io "system:vpa-admission-controller" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:vpa-admission-controller" deleted
    clusterrole.rbac.authorization.k8s.io "system:vpa-status-reader" deleted
    clusterrolebinding.rbac.authorization.k8s.io "system:vpa-status-reader-binding" deleted
    deployment.apps "vpa-updater" deleted
    deployment.apps "vpa-recommender" deleted
    Deleting VPA Admission Controller certs.
    secret "vpa-tls-certs" deleted
    Unregistering VPA admission controller webhook
    Warning: deleting cluster-scoped resources, not scoped to the provided namespace
    mutatingwebhookconfiguration.admissionregistration.k8s.io "vpa-webhook-config" deleted
    deployment.apps "vpa-admission-controller" deleted
    service "vpa-webhook" deleted
    resource mapping not found for name: "verticalpodautoscalers.autoscaling.k8s.io" namespace: "" from "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    ensure CRDs are installed first
    resource mapping not found for name: "verticalpodautoscalercheckpoints.autoscaling.k8s.io" namespace: "" from "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    ensure CRDs are installed first
    

Next Steps

Vertical pod scaling lets your deployments automatically adapt their resource requests to variations in demand, keeping the Kubernetes cluster's workloads right-sized. Continue expanding your knowledge of Kubernetes and Oracle Cloud Native Environment by looking at our other tutorials posted to the Oracle Linux Training Station.

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.