Note:
- This tutorial is available in an Oracle-provided free lab environment.
- It uses example values for Oracle Cloud Infrastructure credentials, tenancy, and compartments. When completing your lab, substitute these values with ones specific to your cloud environment.
Use Vertical Pod Autoscaling with Oracle Cloud Native Environment
Introduction
The VerticalPodAutoscaler (VPA) is a Kubernetes component designed to optimize the allocation of CPU and memory resources available to deployed pods based on their usage. Unlike the HorizontalPodAutoscaler, the VerticalPodAutoscaler does not ship with Kubernetes by default. Instead, Kubernetes provides it as a separate project on GitHub.
The Vertical Pod Autoscaler has three components:
- The VPA Recommender: This monitors current and past resource usage and provides recommendations for the CPU and memory request values the container should use. Its recommendations are periodically updated to provide continued adaptation as the workload requirements vary over time.
- The VPA Updater: Watches the deployment for pods using incorrect values based on their CPU and memory usage over time. It recreates pods with the values suggested by the VPA Recommender, but only if updateMode: Auto is defined. The updater endeavors to adapt to workload changes as smoothly as possible to minimize downtime.
- The VPA Admission Controller: Modifies the CPU and memory values based on the VPA recommendations at pod creation. This approach ensures Kubernetes does not create pods with resource requests set too high or too low, even if these differ from the values defined in the deployment descriptor, thus providing the optimum resource values when deploying new pods.
VPA Benefits
Kubernetes Vertical Pod Autoscaling provides the following:
- Reduces costs by allocating resources efficiently, cutting the waste associated with unused CPU and memory in the cluster.
- Optimizes cluster resources by ensuring all pods deployed on the cluster have sufficient resources.
- Removes the need for administrators to benchmark deployments manually.
VPA Limitations
Some limitations of the Kubernetes Vertical Pod Autoscaling include:
- Limited, non-persistent data storage, which means any usage history is lost when the deployment restarts.
- Not compatible with the Horizontal Pod Autoscaler (HPA) on the same metric. Kubernetes also provides HPA, a similar mechanism that scales the number of deployed pods based on CPU utilization metrics. Upstream documentation recommends that administrators not use HPA and VPA on the same resource metric.
- VPA is unaware of resources available to the cluster, so it may recommend more resources than are actually present. You can avoid this by tuning the VPA’s LimitRange definition to the maximum present on your cluster.
- No awareness of other potential cluster bottlenecks. For example, network traffic and disk I/O, which may hinder its usefulness for data-intensive deployments.
Given these benefits and limitations, Kubernetes VPA is best used to provide automation for predictable variations in demand, but the overall cluster health still needs to be monitored.
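As a hedged illustration of the LimitRange tuning mentioned in the limitations above, a namespace-level LimitRange caps what the VPA may recommend. The object name and values below are placeholders, not taken from this lab; adjust them to your cluster's actual capacity:

```yaml
# Hypothetical LimitRange capping container resources in the default
# namespace; the VPA keeps its recommendations within these bounds.
apiVersion: v1
kind: LimitRange
metadata:
  name: vpa-cap          # illustrative name
  namespace: default
spec:
  limits:
    - type: Container
      max:
        cpu: "2"         # never recommend more than 2 cores per container
        memory: 2Gi      # or more than 2 GiB of memory
      min:
        cpu: 50m
        memory: 32Mi
```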
Objectives
In this tutorial, you will learn to:
- Install the Kubernetes Metrics Server
- Use Vertical Pod Autoscaling to enable deployments to respond to a changing workload
Prerequisites
- Installation of Oracle Cloud Native Environment
- A single control plane node and one worker node
Deploy Oracle Cloud Native Environment
Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.
-
Open a terminal on the Luna Desktop.
-
Clone the linux-virt-labs GitHub project.

git clone https://github.com/oracle-devrel/linux-virt-labs.git
-
Change into the working directory.
cd linux-virt-labs/ocne2
-
Install the required collections.
ansible-galaxy collection install -r requirements.yml
-
Deploy the lab environment.
ansible-playbook create_instance.yml -e localhost_python_interpreter="/usr/bin/python3.6" -e install_ocne_rpm=true -e create_ocne_cluster=true -e "ocne_cluster_node_options='-n 1 -w 1'"
The free lab environment requires the extra variable localhost_python_interpreter, which sets ansible_python_interpreter for plays running on localhost. This variable is needed because the environment installs the RPM package for the Oracle Cloud Infrastructure SDK for Python, located under the python3.6 modules.

The default deployment shape uses the AMD CPU and Oracle Linux 8. To use an Intel CPU or Oracle Linux 9, add -e instance_shape="VM.Standard3.Flex" or -e os_version="9" to the deployment command.

Important: Wait for the playbook to run successfully and reach the pause task. At this stage of the playbook, the installation of Oracle Cloud Native Environment is complete, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys and any other deployment information needed while running the lab.
Access the Kubernetes Cluster
It helps to know the number and names of nodes in your Kubernetes cluster.
-
Open a terminal and connect via SSH to the ocne instance.
ssh oracle@<ip_address_of_node>
-
Show the two nodes and verify that they are running.
kubectl get nodes
Install the Metrics Server
The VPA leverages the resource usage metrics from the metrics.k8s.io Metrics API, which the Metrics Server usually provides. The Metrics Server records CPU and memory data for the Kubernetes cluster's nodes and pods, and the VPA Recommender uses this data to calculate the CPU and memory request values it recommends for each container.

For reference, the Horizontal Pod Autoscaler consumes the same metrics and currently supports two API versions:
- Version 1 (autoscaling/v1 API) - tracks the Metrics Server's CPU utilization data.
- Version 2 (autoscaling/v2 API) - adds support for scaling based on memory usage, and custom and external metrics.
-
Check whether the Metrics API is available.
kubectl get apiservices | grep metrics.k8s.io
The lack of output confirms that the Metrics API server is unavailable by default on an Oracle Cloud Native Environment cluster.
-
Deploy the Metrics Server.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
-
Patch the deployment to trust the self-signed X509 certificates used in the default install.
kubectl patch deployment metrics-server -n kube-system --type 'json' -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
-
Confirm the Metrics Server Pod is running.
kubectl get pods -w -A | grep metrics
Wait for the metrics-server pod to stabilize and report READY as 1/1 and STATUS as Running. Enter Ctrl-C to exit the watch command.

The -w (watch) option watches the kubectl output and prints changes to the terminal as they occur. Note that, unlike the Linux watch command, it appends to the output instead of refreshing it (see the GitHub issue for more detail).
-
Confirm the Metrics API is available.
This command is an alternative to using kubectl get apiservices.

kubectl get --raw "/apis/metrics.k8s.io/" | jq
Example Output:
[oracle@ocne ~]$ kubectl get --raw "/apis/metrics.k8s.io/" | jq
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "metrics.k8s.io",
  "versions": [
    {
      "groupVersion": "metrics.k8s.io/v1beta1",
      "version": "v1beta1"
    }
  ],
  "preferredVersion": {
    "groupVersion": "metrics.k8s.io/v1beta1",
    "version": "v1beta1"
  }
}
-
Confirm the reporting of metrics.
Show the metrics for the pods within the cluster.
kubectl top pods -A
Example Output:
[oracle@ocne ~]$ kubectl top pods -A
NAMESPACE      NAME                                           CPU(cores)   MEMORY(bytes)
kube-flannel   kube-flannel-ds-78vdr                          15m          15Mi
kube-flannel   kube-flannel-ds-jwx9h                          16m          16Mi
kube-system    coredns-f7d444b54-7p4j2                        2m           15Mi
kube-system    coredns-f7d444b54-m5pm6                        2m           15Mi
kube-system    etcd-ocne-control-plane-1                      16m          30Mi
kube-system    kube-apiserver-ocne-control-plane-1            44m          193Mi
kube-system    kube-controller-manager-ocne-control-plane-1   15m          56Mi
kube-system    kube-proxy-dlz2l                               1m           18Mi
kube-system    kube-proxy-s89gq                               1m           19Mi
kube-system    kube-scheduler-ocne-control-plane-1            3m           21Mi
kube-system    metrics-server-b79d5c976-vz8p4                 3m           19Mi
ocne-system    ocne-catalog-578c959566-88vff                  1m           5Mi
ocne-system    ui-84dd57ff69-gtrgf                            1m           14Mi
Next, show the metrics for the nodes.
kubectl top nodes
Example Output:
[oracle@ocne ~]$ kubectl top nodes
NAME                   CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
ocne-control-plane-1   264m         13%    713Mi           20%
ocne-worker-1          34m          1%     373Mi           10%
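Once kubectl top is reporting, you can post-process its output with standard shell tools. The helper below is an illustrative sketch (the function name and threshold are not part of kubectl or the Metrics Server); it reads `kubectl top pods -A --no-headers` output on stdin and prints the pods exceeding a millicore threshold:

```shell
# cpu_hogs: print pods whose CPU usage exceeds a millicore threshold.
# Illustrative helper, not part of kubectl or the Metrics Server.
cpu_hogs() {
  local limit="${1:-100}"
  awk -v limit="$limit" '{
    cpu = $3              # CPU column, e.g. "264m"
    sub(/m$/, "", cpu)    # strip the millicore suffix
    if (cpu + 0 > limit)  # numeric comparison against the threshold
      print $1 "/" $2 " is using " $3
  }'
}

# Usage: kubectl top pods -A --no-headers | cpu_hogs 40
```

For example, piping the node output shown above through `cpu_hogs 40` would flag only the pods using more than 40 millicores.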
Install the Vertical Pod Autoscaler (VPA)
-
Install git.
sudo dnf install -y git
-
Download VPA from GitHub.
git clone -b cluster-autoscaler-release-1.30 https://github.com/kubernetes/autoscaler.git
-
Change into the vertical-pod-autoscaler directory.

cd autoscaler/vertical-pod-autoscaler
-
Deploy the VPA.
./hack/vpa-up.sh
-
Verify the VPA deployed.
kubectl get pods -n kube-system
Note: You may need to requery a couple of times while the VPA deploys.
Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl get pods -n kube-system
NAME                                           READY   STATUS    RESTARTS      AGE
coredns-f7d444b54-lwxzz                        1/1     Running   0             11m
coredns-f7d444b54-xg9rc                        1/1     Running   0             11m
etcd-ocne-control-plane-1                      1/1     Running   0             11m
kube-apiserver-ocne-control-plane-1            1/1     Running   1 (11m ago)   11m
kube-controller-manager-ocne-control-plane-1   1/1     Running   0             11m
kube-proxy-2bsct                               1/1     Running   0             10m
kube-proxy-lwfhr                               1/1     Running   0             11m
kube-scheduler-ocne-control-plane-1            1/1     Running   0             11m
metrics-server-b79d5c976-5g48b                 1/1     Running   0             9m
vpa-admission-controller-54b69c7587-q8g7t      1/1     Running   0             7m49s
vpa-recommender-b9c8fc874-tlzf4                1/1     Running   0             7m49s
vpa-updater-c79c94dd9-cs876                    1/1     Running   0             7m49s
Deploy the Sample Application
Next, you will deploy the hamster sample deployment and VPA configuration provided in the Git repository. This sample creates a deployment with two pods, each requesting 100 millicores of CPU while trying to use just over 500 millicores. The deployment therefore requests fewer resources than it needs.
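For orientation, the deployment half of the sample looks roughly like the abridged sketch below. This is a paraphrase, not the verbatim file; consult examples/hamster.yaml in the cloned repository for the authoritative content:

```yaml
# Abridged sketch of the hamster Deployment (paraphrased, not verbatim).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hamster
  template:
    metadata:
      labels:
        app: hamster
    spec:
      containers:
        - name: hamster
          image: registry.k8s.io/ubuntu-slim:0.1
          resources:
            requests:
              cpu: 100m     # deliberately lower than the ~500m actually used
              memory: 50Mi
          command: ["/bin/sh"]
          args:
            - "-c"
            - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"
```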
The deployed VPA configuration analyzes the deployed pods so it can monitor their behavior. There are several options available for the updateMode in VPA, which are:
- Off - VPA only produces recommendations and does not apply any changes automatically.
- Initial - VPA assigns the recommended values when it creates a pod but never changes them later.
- Recreate - VPA assigns the recommended values when it creates a pod and also updates existing pods by evicting and recreating them automatically.
- Auto - Currently equivalent to Recreate; VPA automatically recreates pods based on its current recommendation.
This deployment example uses the updateMode: "Auto" value.

Note: This setting is usually only recommended for non-production environments because it allows the VPA to make changes to deployments based on their current usage. The usual setting in production environments is updateMode: Off, which lets the administrator apply VPA recommendations manually by updating the deployment manifest with the most recent recommendations.
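In a VPA manifest, this mode is set under spec.updatePolicy. The following sketch mirrors the shape of the sample's VPA object (abridged; see examples/hamster.yaml for the exact definition):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:             # the workload whose pods the VPA manages
    apiVersion: apps/v1
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Auto"   # evict and recreate pods with the recommended values
```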
-
Deploy the Sample Application.
kubectl apply -f examples/hamster.yaml
Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl apply -f examples/hamster.yaml
verticalpodautoscaler.autoscaling.k8s.io/hamster-vpa created
-
Verify the Sample Application pods deployed successfully.
kubectl get pods -l app=hamster
Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl get pods -l app=hamster
NAME                      READY   STATUS    RESTARTS   AGE
hamster-b7b46759d-5wvj7   1/1     Running   0          18s
hamster-b7b46759d-nzvvw   1/1     Running   0          18s
-
View the current CPU and memory values requested by the deployed pods.
kubectl describe pod <deployed-name>
Note: Use one of the pod names returned in the last step instead of the <deployed-name> placeholder shown above.

Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl describe pod hamster-b7b46759d-nzvvw
Name:             hamster-b7b46759d-nzvvw
Namespace:        default
Priority:         0
Service Account:  default
Node:             ocne-worker-1/192.168.122.14
Start Time:       Fri, 20 Dec 2024 14:26:01 +0000
Labels:           app=hamster
                  pod-template-hash=b7b46759d
Annotations:      vpaObservedContainers: hamster
                  vpaUpdates: Pod resources updated by hamster-vpa: container 0:
Status:           Running
IP:               10.244.1.12
IPs:
  IP:           10.244.1.12
Controlled By:  ReplicaSet/hamster-b7b46759d
Containers:
  hamster:
    Container ID:  cri-o://1cfeff4cf2de04f8f176300781d9b81f76df8a011e0149e350bf096fcba476ad
    Image:         registry.k8s.io/ubuntu-slim:0.1
    Image ID:      registry.k8s.io/ubuntu-slim@sha256:b6f8c3885f5880a4f1a7cf717c07242eb4858fdd5a84b5ffe35b1cf680ea17b1
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done
    State:          Running
      Started:      Fri, 20 Dec 2024 14:26:04 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qndx2 (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  kube-api-access-qndx2:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Pulling    61s   kubelet            Pulling image "registry.k8s.io/ubuntu-slim:0.1"
  Normal  Scheduled  61s   default-scheduler  Successfully assigned default/hamster-b7b46759d-nzvvw to ocne-worker-1
  Normal  Pulled     59s   kubelet            Successfully pulled image "registry.k8s.io/ubuntu-slim:0.1" in 1.991s (1.991s including waiting). Image size: 57409822 bytes.
  Normal  Created    59s   kubelet            Created container hamster
  Normal  Started    59s   kubelet            Started container hamster
Watch the VPA Scale the Application
The VPA analyzes the pods and determines the optimum settings it recommends for the monitored deployment.
-
Monitor the pods in the sample application.
watch kubectl get pods -l app=hamster
Enter Ctrl-C to exit the watch command after the VPA creates a new pod, which may take a few minutes. Note the newly created pod's name.
-
Review the CPU and memory reservations assigned to the newly created pod.
kubectl describe pod <new_pod_name>
Replace the <new_pod_name> placeholder with the newly created pod's name.

Example Output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl describe pod hamster-b7b46759d-gqghl
Name:             hamster-b7b46759d-gqghl
Namespace:        default
Priority:         0
Service Account:  default
Node:             ocne-worker-1/192.168.122.14
Start Time:       Fri, 20 Dec 2024 14:28:31 +0000
Labels:           app=hamster
                  pod-template-hash=b7b46759d
Annotations:      vpaObservedContainers: hamster
                  vpaUpdates: Pod resources updated by hamster-vpa: container 0: cpu request, memory request
Status:           Running
IP:               10.244.1.15
IPs:
  IP:           10.244.1.15
Controlled By:  ReplicaSet/hamster-b7b46759d
Containers:
  hamster:
    Container ID:  cri-o://bc7dd0d39e0a86a03ea62b57431aa42495a3babef61ec6e97a05d0512ce829a0
    Image:         registry.k8s.io/ubuntu-slim:0.1
    Image ID:      registry.k8s.io/ubuntu-slim@sha256:b6f8c3885f5880a4f1a7cf717c07242eb4858fdd5a84b5ffe35b1cf680ea17b1
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done
    State:          Running
      Started:      Fri, 20 Dec 2024 14:28:32 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        587m
      memory:     262144k
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5d4nk (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  kube-api-access-5d4nk:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Pulled     82s   kubelet            Container image "registry.k8s.io/ubuntu-slim:0.1" already present on machine
  Normal  Created    82s   kubelet            Created container hamster
  Normal  Started    82s   kubelet            Started container hamster
  Normal  Scheduled  82s   default-scheduler  Successfully assigned default/hamster-b7b46759d-gqghl to ocne-worker-1
Notice that the output’s Requests: section contains the updated CPU and memory reservations of the newly created pod.
Example output:
Requests:
  cpu:        587m
  memory:     262144k
This output confirms that the VPA corrected the previously under-resourced pod and created a new pod with revised values.
Note: The values you see for the CPU and memory reservations may differ.
View the VPA Recommendation
You can view the detailed recommendations produced by the VPA Recommender.
-
View the VPA’s recommendations.
kubectl describe vpa hamster-vpa
Example output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl describe vpa
Name:         hamster-vpa
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  autoscaling.k8s.io/v1
Kind:         VerticalPodAutoscaler
...
...
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         hamster
  Update Policy:
    Update Mode:  Auto
Status:
  Conditions:
    Last Transition Time:  2024-12-20T14:26:33Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  hamster
      Lower Bound:
        Cpu:     402m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     1
        Memory:  500Mi
Events:  <none>
Note: The recommendations you see may be different.
Inspect the following fields:
- Update Mode: Auto - Automatically instructs the VPA to recreate pods as needed based on the recommendation.
- Lower Bound - Identifies the minimum estimated values to use for the deployment.
- Upper Bound - Identifies the maximum estimated values to use for the deployment.
- Target - The values the VPA currently recommends and applies.
- Uncapped Target - The estimated values recommended if no minimum (min) or maximum (max) allowed values have been set.

The VPA bases these estimations on the min and max allowed values you set in the deployment YAML file.
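These allowed bounds are set per container in the VPA object's resourcePolicy. The fragment below is a hedged sketch, with values chosen to match the Upper Bound shown in the example output; see examples/hamster.yaml for the actual definition:

```yaml
# Fragment of a VerticalPodAutoscaler spec constraining recommendations;
# without it, the Uncapped Target and Target values would be identical.
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: "*"   # apply to all containers in the target pods
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: "1"           # matches the Upper Bound of 1 CPU above
          memory: 500Mi
```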
Clean Up
You can remove the VPA using these steps.
-
Remove the sample application.
kubectl delete -f examples/hamster.yaml
Example output:
[oracle@ocne vertical-pod-autoscaler]$ kubectl delete -f examples/hamster.yaml
verticalpodautoscaler.autoscaling.k8s.io "hamster-vpa" deleted
deployment.apps "hamster" deleted
-
Delete the Vertical Pod Autoscaler deployment.
./hack/vpa-down.sh
Example output:
[oracle@ocne vertical-pod-autoscaler]$ ./hack/vpa-down.sh
customresourcedefinition.apiextensions.k8s.io "verticalpodautoscalercheckpoints.autoscaling.k8s.io" deleted
customresourcedefinition.apiextensions.k8s.io "verticalpodautoscalers.autoscaling.k8s.io" deleted
clusterrole.rbac.authorization.k8s.io "system:metrics-reader" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-status-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-checkpoint-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:evictioner" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:metrics-reader" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-actor" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-status-actor" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-checkpoint-actor" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-target-reader" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-target-reader-binding" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-evictioner-binding" deleted
serviceaccount "vpa-admission-controller" deleted
serviceaccount "vpa-recommender" deleted
serviceaccount "vpa-updater" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-admission-controller" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-admission-controller" deleted
clusterrole.rbac.authorization.k8s.io "system:vpa-status-reader" deleted
clusterrolebinding.rbac.authorization.k8s.io "system:vpa-status-reader-binding" deleted
deployment.apps "vpa-updater" deleted
deployment.apps "vpa-recommender" deleted
Deleting VPA Admission Controller certs.
secret "vpa-tls-certs" deleted
Unregistering VPA admission controller webhook
Warning: deleting cluster-scoped resources, not scoped to the provided namespace
mutatingwebhookconfiguration.admissionregistration.k8s.io "vpa-webhook-config" deleted
deployment.apps "vpa-admission-controller" deleted
service "vpa-webhook" deleted
resource mapping not found for name: "verticalpodautoscalers.autoscaling.k8s.io" namespace: "" from "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
ensure CRDs are installed first
resource mapping not found for name: "verticalpodautoscalercheckpoints.autoscaling.k8s.io" namespace: "" from "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
ensure CRDs are installed first
Next Steps
The ability to configure your deployments with vertical pod autoscaling allows the Kubernetes cluster to respond to variations in demand as workloads change. Continue expanding your knowledge of Kubernetes and Oracle Cloud Native Environment by looking at our other tutorials posted to the Oracle Linux Training Station.
Related Links
- Oracle Cloud Native Environment Documentation
- Oracle Cloud Native Environment Track
- Oracle Linux Training Station
More Learning Resources
Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.
For product documentation, visit Oracle Help Center.
G25249-01
January 2025