Note:
- This tutorial is available in an Oracle-provided free lab environment.
- It uses example values for Oracle Cloud Infrastructure credentials, tenancy, and compartments. When completing your lab, substitute these values with ones specific to your cloud environment.
Use Vertical Pod Autoscaling with Oracle Cloud Native Environment
Introduction
The VerticalPodAutoscaler (VPA) is a Kubernetes component designed to optimize the allocation of CPU and memory resources available to deployed pods based on their usage. Unlike the HorizontalPodAutoscaler, the VerticalPodAutoscaler does not ship with Kubernetes by default. Instead, Kubernetes provides it as a separate project on GitHub.
The Vertical Pod Autoscaler has three components:
- The VPA Recommender: This monitors current and past resource usage and provides recommendations for the CPU and memory request values the container should use. Its recommendations are periodically updated to provide continued adaptation as the workload requirements vary over time.
- The VPA Updater: Watches the deployment for pods using incorrect values based on their CPU and memory usage over time. It recreates pods with the values suggested by the VPA Recommender, but only if updateMode: Auto is defined. The updater endeavors to adapt to workload changes as smoothly as possible to minimize downtime.
- The VPA Admission Controller: Modifies the CPU and memory values based on the VPA recommendations at pod creation. This approach ensures Kubernetes does not create pods with resource requests set too high or too low, even if these differ from the values defined in the deployment descriptor, thus providing the optimum resource values when deploying new pods.
VPA Benefits
Kubernetes Vertical Pod Autoscaling provides the following:
- Reduces costs by allocating resources efficiently, cutting the waste associated with unused CPU and memory in the cluster.
- Optimizes cluster resources by ensuring all pods deployed on the cluster have sufficient resources.
- Removes the need for administrators to benchmark deployments manually.
VPA Limitations
Some limitations of the Kubernetes Vertical Pod Autoscaling include:
- Limited, non-persistent data storage, which means any usage history is lost when the deployment restarts.
- Not compatible with the Horizontal Pod Autoscaler (HPA) on the same metric. Kubernetes also provides HPA, a similar mechanism that scales the number of deployed pods based on CPU utilization metrics. Upstream documentation recommends that administrators not use HPA and VPA on the same resource metric.
- VPA is unaware of resources available to the cluster, so it may recommend more resources than are actually present. You can avoid this by tuning the VPA’s LimitRange definition to the maximum present on your cluster.
- No awareness of other potential cluster bottlenecks. For example, network traffic and disk I/O, which may hinder its usefulness for data-intensive deployments.
Given these benefits and limitations, Kubernetes VPA is best used to provide automation for predictable variations in demand, but the overall cluster health still needs to be monitored.
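As a hedged illustration of the LimitRange tuning mentioned in the limitations above, a namespace-level LimitRange caps what the VPA may recommend. The object name and values below are placeholders, not taken from this lab; adjust them to your cluster's actual capacity:

```yaml
# Hypothetical LimitRange capping container resources in the default
# namespace; the VPA keeps its recommendations within these bounds.
apiVersion: v1
kind: LimitRange
metadata:
  name: vpa-cap          # illustrative name
  namespace: default
spec:
  limits:
    - type: Container
      max:
        cpu: "2"         # never recommend more than 2 cores per container
        memory: 2Gi      # or more than 2 GiB of memory
      min:
        cpu: 50m
        memory: 32Mi
```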
Objectives
In this tutorial, you will learn to:
- Install the Kubernetes Metrics Server
- Use Vertical Pod Autoscaling to enable deployments to respond to a changing workload
Prerequisites
- Installation of Oracle Cloud Native Environment
- A single control plane node and one worker node
Deploy Oracle Cloud Native Environment
Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.
-
Open a terminal on the Luna Desktop.
-
Clone the linux-virt-labs GitHub project.

git clone https://github.com/oracle-devrel/linux-virt-labs.git
-
Change into the working directory.
cd linux-virt-labs/ocne2
-
Install the required collections.
ansible-galaxy collection install -r requirements.yml
-
Deploy the lab environment.
ansible-playbook create_instance.yml -e localhost_python_interpreter="/usr/bin/python3.6" -e install_ocne_rpm=true -e create_ocne_cluster=true -e "ocne_cluster_node_options='-n 1 -w 1'"
The free lab environment requires the extra variable localhost_python_interpreter, which sets ansible_python_interpreter for plays running on localhost. This variable is needed because the environment installs the RPM package for the Oracle Cloud Infrastructure SDK for Python, located under the python3.6 modules.

The default deployment shape uses the AMD CPU and Oracle Linux 8. To use an Intel CPU or Oracle Linux 9, add -e instance_shape="VM.Standard3.Flex" or -e os_version="9" to the deployment command.

Important: Wait for the playbook to run successfully and reach the pause task. At this stage of the playbook, the installation of Oracle Cloud Native Environment is complete, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys and any other deployment information needed while running the lab.
Access the Kubernetes Cluster
It helps to know the number and names of nodes in your Kubernetes cluster.
-
Open a terminal and connect via SSH to the ocne instance.
ssh oracle@<ip_address_of_node>
-
Show the two nodes and verify that they are running.
kubectl get nodes
Install the Metrics Server
The VPA leverages the resource usage metrics from the metrics.k8s.io Metrics API, which the Metrics Server usually provides. The Metrics Server records CPU and memory data for the Kubernetes cluster's nodes and pods, and the VPA Recommender uses this data to calculate the CPU and memory request values it recommends for each container.

For reference, the Horizontal Pod Autoscaler consumes the same metrics and currently supports two API versions:
- Version 1 (autoscaling/v1 API) - tracks the Metrics Server's CPU utilization data.
- Version 2 (autoscaling/v2 API) - adds support for scaling based on memory usage, and custom and external metrics.
-
Check whether the Metrics API is available.
kubectl get apiservices | grep metrics.k8s.io
The lack of output confirms that the Metrics API server is unavailable by default on an Oracle Cloud Native Environment cluster.
-
Deploy the Metrics Server.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
-
Patch the deployment to trust the self-signed X509 certificates used in the default install.
kubectl patch deployment metrics-server -n kube-system --type 'json' -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
-
Confirm the Metrics Server Pod is running.
kubectl get pods -w -A | grep metrics
Wait for the metrics-server pod to stabilize and report READY as 1/1 and STATUS as Running. Enter Ctrl-C to exit the watch command.

The -w (watch) option watches the kubectl output and prints changes to the terminal as they occur. Note that, unlike the Linux watch command, it appends to the output instead of refreshing it (see the GitHub issue for more detail).
-
Confirm the Metrics API is available.
This command is an alternative to using kubectl get apiservices.

kubectl get --raw "/apis/metrics.k8s.io/" | jq
Example Output:
[oracle@ocne ~]$ kubectl get --raw "/apis/metrics.k8s.io/" | jq
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "metrics.k8s.io",
  "versions": [
    {
      "groupVersion": "metrics.k8s.io/v1beta1",
      "version": "v1beta1"
    }
  ],
  "preferredVersion": {
    "groupVersion": "metrics.k8s.io/v1beta1",
    "version": "v1beta1"
  }
}
-
Confirm the reporting of metrics.
Show the metrics for the pods within the cluster.
kubectl top pods -A
Example Output:
[oracle@ocne ~]$ kubectl top pods -A
NAMESPACE      NAME                                           CPU(cores)   MEMORY(bytes)
kube-flannel   kube-flannel-ds-78vdr                          15m          15Mi
kube-flannel   kube-flannel-ds-jwx9h                          16m          16Mi
kube-system    coredns-f7d444b54-7p4j2                        2m           15Mi
kube-system    coredns-f7d444b54-m5pm6                        2m           15Mi
kube-system    etcd-ocne-control-plane-1                      16m          30Mi
kube-system    kube-apiserver-ocne-control-plane-1            44m          193Mi
kube-system    kube-controller-manager-ocne-control-plane-1   15m          56Mi
kube-system    kube-proxy-dlz2l                               1m           18Mi
kube-system    kube-proxy-s89gq                               1m           19Mi
kube-system    kube-scheduler-ocne-control-plane-1            3m           21Mi
kube-system    metrics-server-b79d5c976-vz8p4                 3m           19Mi
ocne-system    ocne-catalog-578c959566-88vff                  1m           5Mi
ocne-system    ui-84dd57ff69-gtrgf                            1m           14Mi
Next, show the metrics for the nodes.
kubectl top nodes
Example Output:
[oracle@ocne ~]$ kubectl top nodes
NAME                   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ocne-control-plane-1   264m         13%    713Mi           20%
ocne-worker-1          34m          1%     373Mi           10%
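Once kubectl top is reporting, you can post-process its output with standard shell tools. The helper below is an illustrative sketch (the function name and threshold are not part of kubectl or the Metrics Server); it reads `kubectl top pods -A --no-headers` output on stdin and prints the pods exceeding a millicore threshold:

```shell
# cpu_hogs: print pods whose CPU usage exceeds a millicore threshold.
# Illustrative helper, not part of kubectl or the Metrics Server.
cpu_hogs() {
  local limit="${1:-100}"
  awk -v limit="$limit" '{
    cpu = $3              # CPU column, e.g. "264m"
    sub(/m$/, "", cpu)    # strip the millicore suffix
    if (cpu + 0 > limit)  # numeric comparison against the threshold
      print $1 "/" $2 " is using " $3
  }'
}

# Usage: kubectl top pods -A --no-headers | cpu_hogs 40
```

For example, piping the node output shown above through `cpu_hogs 40` would flag only the pods using more than 40 millicores.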
Install the Vertical Pod Autoscaler (VPA)
-
Install git.
sudo dnf install -y git
-
Download VPA from GitHub.
git clone -b cluster-autoscaler-release-1.30 https://github.com/kubernetes/autoscaler.git
-
Change into the vertical-pod-autoscaler directory.

cd autoscaler/vertical-pod-autoscaler
-
Deploy the VPA.
./hack/vpa-up.sh
-
Verify the VPA deployed.
kubectl get pods -n kube-system
Note: You may need to requery a couple of times while the VPA deploys.
Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl get pods -n kube-system
NAME                                           READY   STATUS    RESTARTS      AGE
coredns-f7d444b54-lwxzz                        1/1     Running   0             11m
coredns-f7d444b54-xg9rc                        1/1     Running   0             11m
etcd-ocne-control-plane-1                      1/1     Running   0             11m
kube-apiserver-ocne-control-plane-1            1/1     Running   1 (11m ago)   11m
kube-controller-manager-ocne-control-plane-1   1/1     Running   0             11m
kube-proxy-2bsct                               1/1     Running   0             10m
kube-proxy-lwfhr                               1/1     Running   0             11m
kube-scheduler-ocne-control-plane-1            1/1     Running   0             11m
metrics-server-b79d5c976-5g48b                 1/1     Running   0             9m
vpa-admission-controller-54b69c7587-q8g7t      1/1     Running   0             7m49s
vpa-recommender-b9c8fc874-tlzf4                1/1     Running   0             7m49s
vpa-updater-c79c94dd9-cs876                    1/1     Running   0             7m49s
Deploy the Sample Application
Next, you will deploy the hamster sample deployment and VPA configuration provided in the Git repository. This sample creates a deployment with two pods, each requesting 100 millicores of CPU while trying to use just over 500 millicores. The deployment therefore requests fewer resources than it needs.
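For orientation, the deployment half of the sample looks roughly like the abridged sketch below. This is a paraphrase, not the verbatim file; consult examples/hamster.yaml in the cloned repository for the authoritative content:

```yaml
# Abridged sketch of the hamster Deployment (paraphrased, not verbatim).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hamster
  template:
    metadata:
      labels:
        app: hamster
    spec:
      containers:
        - name: hamster
          image: registry.k8s.io/ubuntu-slim:0.1
          resources:
            requests:
              cpu: 100m     # deliberately lower than the ~500m actually used
              memory: 50Mi
          command: ["/bin/sh"]
          args:
            - "-c"
            - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"
```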
The deployed VPA configuration analyzes the deployed pods so it can monitor their behavior. There are several options available for the updateMode in VPA, which are:
- Off - VPA only produces recommendations and does not apply any changes automatically.
- Initial - VPA assigns the recommended values when it creates a pod but never changes them later.
- Recreate - VPA assigns the recommended values when it creates a pod and also updates existing pods by evicting and recreating them automatically.
- Auto - Currently equivalent to Recreate; VPA automatically recreates pods based on its current recommendation.
This deployment example uses the updateMode: "Auto" value.

Note: This setting is usually only recommended for non-production environments because it allows the VPA to make changes to deployments based on their current usage. The usual setting in production environments is updateMode: Off, which lets the administrator apply VPA recommendations manually by updating the deployment manifest with the most recent recommendations.
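In a VPA manifest, this mode is set under spec.updatePolicy. The following sketch mirrors the shape of the sample's VPA object (abridged; see examples/hamster.yaml for the exact definition):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:             # the workload whose pods the VPA manages
    apiVersion: apps/v1
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Auto"   # evict and recreate pods with the recommended values
```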
-
Deploy the Sample Application.
kubectl apply -f examples/hamster.yaml
Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl apply -f examples/hamster.yaml
verticalpodautoscaler.autoscaling.k8s.io/hamster-vpa created
-
Verify the Sample Application pods deployed successfully.
kubectl get pods -l app=hamster
Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl get pods -l app=hamster
NAME                      READY   STATUS    RESTARTS   AGE
hamster-b7b46759d-5wvj7   1/1     Running   0          18s
hamster-b7b46759d-nzvvw   1/1     Running   0          18s
-
View the current CPU and memory values requested by the deployed pods.
kubectl describe pod <deployed-name>
Note: Use one of the pod names returned in the last step instead of the <deployed-name> placeholder shown above.

Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl describe pod hamster-b7b46759d-nzvvw
Name:             hamster-b7b46759d-nzvvw
Namespace:        default
Priority:         0
Service Account:  default
Node:             ocne-worker-1/192.168.122.14
Start Time:       Fri, 20 Dec 2024 14:26:01 +0000
Labels:           app=hamster
                  pod-template-hash=b7b46759d
Annotations:      vpaObservedContainers: hamster
                  vpaUpdates: Pod resources updated by hamster-vpa: container 0:
Status:           Running
IP:               10.244.1.12
IPs:
  IP:           10.244.1.12
Controlled By:  ReplicaSet/hamster-b7b46759d
Containers:
  hamster:
    Container ID:  cri-o://1cfeff4cf2de04f8f176300781d9b81f76df8a011e0149e350bf096fcba476ad
    Image:         registry.k8s.io/ubuntu-slim:0.1
    Image ID:      registry.k8s.io/ubuntu-slim@sha256:b6f8c3885f5880a4f1a7cf717c07242eb4858fdd5a84b5ffe35b1cf680ea17b1
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done
    State:          Running
      Started:      Fri, 20 Dec 2024 14:26:04 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qndx2 (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  kube-api-access-qndx2:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Pulling    61s   kubelet            Pulling image "registry.k8s.io/ubuntu-slim:0.1"
  Normal  Scheduled  61s   default-scheduler  Successfully assigned default/hamster-b7b46759d-nzvvw to ocne-worker-1
  Normal  Pulled     59s   kubelet            Successfully pulled image "registry.k8s.io/ubuntu-slim:0.1" in 1.991s (1.991s including waiting). Image size: 57409822 bytes.
  Normal  Created    59s   kubelet            Created container hamster
  Normal  Started    59s   kubelet            Started container hamster
Watch the VPA Scale the Application
The VPA analyzes the pods and determines the optimum settings it recommends for the monitored deployment.
-
Monitor the pods in the sample application.
watch kubectl get pods -l app=hamster
Enter Ctrl-C to exit the watch command after the VPA creates a new pod, which may take a few minutes. Note the newly created pod's name.
-
Review the CPU and memory reservations assigned to the newly created pod.
kubectl describe pod <new_pod_name>
Replace the <new_pod_name> placeholder with the newly created pod's name.

Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl describe pod hamster-b7b46759d-gqghl
Name:             hamster-b7b46759d-gqghl
Namespace:        default
Priority:         0
Service Account:  default
Node:             ocne-worker-1/192.168.122.14
Start Time:       Fri, 20 Dec 2024 14:28:31 +0000
Labels:           app=hamster
                  pod-template-hash=b7b46759d
Annotations:      vpaObservedContainers: hamster
                  vpaUpdates: Pod resources updated by hamster-vpa: container 0: cpu request, memory request
Status:           Running
IP:               10.244.1.15
IPs:
  IP:           10.244.1.15
Controlled By:  ReplicaSet/hamster-b7b46759d
Containers:
  hamster:
    Container ID:  cri-o://bc7dd0d39e0a86a03ea62b57431aa42495a3babef61ec6e97a05d0512ce829a0
    Image:         registry.k8s.io/ubuntu-slim:0.1
    Image ID:      registry.k8s.io/ubuntu-slim@sha256:b6f8c3885f5880a4f1a7cf717c07242eb4858fdd5a84b5ffe35b1cf680ea17b1
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done
    State:          Running
      Started:      Fri, 20 Dec 2024 14:28:32 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        587m
      memory:     262144k
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5d4nk (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  kube-api-access-5d4nk:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Pulled     82s   kubelet            Container image "registry.k8s.io/ubuntu-slim:0.1" already present on machine
  Normal  Created    82s   kubelet            Created container hamster
  Normal  Started    82s   kubelet            Started container hamster
  Normal  Scheduled  82s   default-scheduler  Successfully assigned default/hamster-b7b46759d-gqghl to ocne-worker-1
Notice that the output’s Requests: section contains the updated CPU and memory reservations of the newly created pod.
Example output:
Requests:
  cpu:        587m
  memory:     262144k
This output confirms that the VPA corrected the previously under-resourced pod and created a new pod with revised values.
Note: The values you see for the CPU and memory reservations may differ.
View the VPA Recommendation
You can view the detailed recommendations produced by the VPA Recommender.
-
View the VPA’s recommendations.
kubectl describe vpa hamster-vpa
Example output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl describe vpa
Name:         hamster-vpa
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  autoscaling.k8s.io/v1
Kind:         VerticalPodAutoscaler
...
...
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         hamster
  Update Policy:
    Update Mode:  Auto
Status:
  Conditions:
    Last Transition Time:  2024-12-20T14:26:33Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     402m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi
Events:  <none>
Note: The recommendations you see may be different.
Inspect the following fields:
- Update Mode: Auto - Automatically instructs the VPA to recreate pods as needed based on the recommendation.
- Lower Bound - Identifies the minimum estimated values to use for the deployment.
- Upper Bound - Identifies the maximum estimated values to use for the deployment.
- Target - The values the VPA currently recommends and applies.
- Uncapped Target - The estimated values recommended if no minimum (min) or maximum (max) allowed values have been set.

The VPA bases these estimations on the min and max allowed values you set in the deployment YAML file.
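These allowed bounds are set per container in the VPA object's resourcePolicy. The fragment below is a hedged sketch, with values chosen to match the Upper Bound shown in the example output; see examples/hamster.yaml for the actual definition:

```yaml
# Fragment of a VerticalPodAutoscaler spec constraining recommendations;
# without it, the Uncapped Target and Target values would be identical.
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: "*"   # apply to all containers in the target pods
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: "1"           # matches the Upper Bound of 1 CPU above
          memory: 500Mi
```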
Clean Up
You can remove the VPA using these steps.
-
Remove the sample application.
kubectl delete -f examples/hamster.yaml
Example output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl delete -f examples/hamster.yaml
verticalpodautoscaler.autoscaling.k8s.io "hamster-vpa" deleted
deployment.apps "hamster" deleted
-
Delete the Vertical Pod Autoscaler deployment.
./hack/vpa-down.sh
Example output:
[oracle@ocne vertical-pod-autoscaler]$ ./hack/vpa-down.sh
customresourcedefinition.apiextensions.k8s.io "verticalpodautoscalercheckpoints.autoscaling.k8s.io" deleted
customresourcedefinition.apiextensions.k8s.io "verticalpodautoscalers.autoscaling.k8s.io" deleted
clusterrole.rbac.authorization.k8s.io "system:metrics-reader" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-status-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-checkpoint-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:evictioner" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:metrics-reader" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-actor" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-status-actor" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-checkpoint-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-target-reader" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-target-reader-binding" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-evictioner-binding" deleted
serviceaccount "vpa-admission-controller" deleted
serviceaccount "vpa-recommender" deleted
serviceaccount "vpa-updater" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-admission-controller" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-admission-controller" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-status-reader" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-status-reader-binding" deleted
deployment.apps "vpa-updater" deleted
deployment.apps "vpa-recommender" deleted
Deleting VPA Admission Controller certs.
secret "vpa-tls-certs" deleted
Unregistering VPA admission controller webhook
Warning: deleting cluster-scoped resources, not scoped to the provided namespace
mutatingwebhookconfiguration.admissionregistration.k8s.io "vpa-webhook-config" deleted
deployment.apps "vpa-admission-controller" deleted
service "vpa-webhook" deleted
resource mapping not found for name: "verticalpodautoscalers.autoscaling.k8s.io" namespace: "" from "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
ensure CRDs are installed first
resource mapping not found for name: "verticalpodautoscalercheckpoints.autoscaling.k8s.io" namespace: "" from "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
ensure CRDs are installed first
Next Steps
The ability to configure your deployments with vertical pod autoscaling allows the Kubernetes cluster to respond to variations in demand as workloads change. Continue expanding your knowledge of Kubernetes and Oracle Cloud Native Environment by looking at our other tutorials posted to the Oracle Linux Training Station.
Related Links
- Oracle Cloud Native Environment Documentation
- Oracle Cloud Native Environment Track
- Oracle Linux Training Station
More Learning Resources
Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.
For product documentation, visit Oracle Help Center.
G25249-01
January 2025