11 Automated Scaling of Node Manager Pods Using HPA

Learn how to configure the automatic scaling of Node Manager pods in Oracle Communications Offline Mediation Controller cloud native.

About Automated Scaling of Node Manager Pods

Offline Mediation Controller cloud native supports the Kubernetes Horizontal Pod Autoscaler (HPA), enabling dynamic scaling of Node Manager pods to handle varying workloads. With this feature, the Node Manager pods are replicated as the application scales up, distributing the load evenly and ensuring optimal resource utilization during processing.

Offline Mediation Controller uses StatefulSets to run groups of Node Manager pods. You can configure Offline Mediation Controller to automatically scale Node Manager pods in StatefulSets at the following levels:
  • Global Level: Provides a default, system-wide approach to HPA. These settings apply to all Node Manager StatefulSets unless explicitly overridden.

  • Node Manager Set Level: Offers control for the pods in each Node Manager StatefulSet. Set-level configuration takes precedence over the global level, as shown in the example after this list.
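For example, you can combine both levels in your override-values.yaml file, with a global default and a per-set override. The following is a minimal sketch using keys described later in this chapter; the values are illustrative:

  nodeManagerConfigurations:
    scaling:
      hpa:
        enabled: true        # Global default: HPA applies to all Node Manager sets
        maxReplicas: 3
  sets:
    nm-cc:
      name: "nm-cc"
      scaling:
        hpa:
          enabled: false     # Set-level value takes precedence over the global default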

Enabling Scaling Replication

When HPA scaling is enabled, new Node Manager instances are created, and the node chain is replicated to distribute the load across the scaled instances. You control replication using the scaling.replication.enabled flag: when it is set, each new instance replicates the node chain and shares the load during scale-up events. In addition to scaling up, you can configure Node Manager pods to scale down when resource usage decreases by setting the scaling.hpa.hpaScaleDownEnabled key.
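For example, the following override-values.yaml entries enable both behaviors. This is a minimal sketch; combine these keys with the HPA settings described in "Configuring HPA":

  nodeManagerConfigurations:
    scaling:
      replication:
        enabled: true               # Replicate the node chain to each new instance
      hpa:
        hpaScaleDownEnabled: true   # Remove instances when resource usage decreases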

Scaling a Node Manager up or down modifies the routes linked to that Node Manager, which triggers a restart of the nodes that use those routes.

Note:

To enable REST Services Manager authentication, you must set the ocomc.secrets.rsmOAuthToken key in your override-values.yaml file.
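For example, in a minimal sketch (replace the placeholder with your actual token value):

  ocomc:
    secrets:
      rsmOAuthToken: "<oauth-token>"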

Configuring HPA

The Node Manager scaling and resource configurations can be managed at both global and Node Manager set levels.

  • Global Configuration: Provides a default, system-wide approach to HPA. When defined in the global nodeManagerConfigurations block, these settings apply to all Node Manager sets unless explicitly overridden.

  • Node Manager Set Configuration: Offers control for each Node Manager set.

Note:

RDM configurations, such as thread count, require appropriate tuning when HPA is enabled.

Configuring Global HPA Values

The nodeManagerConfigurations block in the override-values.yaml file defines the global settings for Node Managers. These settings can be overridden at the Node Manager set level if needed. For example, if the global log level configuration is set to WARN, you can configure the nm-voice-cc set to use the DEBUG log level instead.
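The following sketch shows this override pattern. The logLevel key name is illustrative only; substitute the actual logging key used by your release:

  nodeManagerConfigurations:
    logLevel: "WARN"          # Global default for all Node Manager sets
  sets:
    nm-voice-cc:
      name: "nm-voice-cc"
      logLevel: "DEBUG"       # Overrides the global WARN value for this set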

To configure global HPA values:

  1. Set the scaling.hpa.enabled key to true.
  2. Configure the following global HPA parameters:
    • scaling.replication.enabled: Set to true to replicate the node chain from the root Node Manager (that is, the first pod of the StatefulSet) to each new instance.

    • scaling.hpa.maxReplicas: Specify the maximum number of pod replicas allowed.

    • scaling.hpa.metrics: Define the scaling triggers based on resource utilization.

    • scaling.serviceAccount.createServiceAccount: Set to true to create a dedicated service account for scaling operations.

    • scaling.serviceAccount.name: Specify the name for the service account to be used for scaling operations.

    • scaling.hpa.hpaScaleDownEnabled: Set to true to enable scaling down of Node Manager pods when resource usage decreases.

    For example:

    nodeManagerConfigurations:
      scaling:
        replication:
          enabled: true    # Replicate the node chain to each new instance
        hpa:
          enabled: true    # Enable HPA for all Node Manager sets
          maxReplicas: 3   # Never scale beyond three replicas
          metrics:
            - type: Resource
              resource:
                name: cpu
                target:
                  type: Utilization
                  averageUtilization: 50
            - type: Resource
              resource:
                name: memory
                target:
                  type: Utilization
                  averageUtilization: 70
    This example configures the HPA to scale based on resource utilization as follows (see the worked example after this procedure):
    • CPU Utilization Metric: The HPA monitors the CPU usage of each Node Manager pod. If the average CPU utilization across all replicas exceeds 50%, the HPA adds instances, up to the maximum specified in maxReplicas.
    • Memory Utilization Metric: The HPA monitors the memory usage of each pod. If the average memory utilization reaches 70%, the HPA adds instances so that enough are available to handle the workload.
  3. Run the helm upgrade command to update the Offline Mediation Controller Helm release with the values you have set:

    helm upgrade ReleaseName oc-cn-ocomc-core-helm-chart --values OverrideValuesFile -n Namespace
    where:
    • ReleaseName is the release name, which is used to track the installation instance.
    • OverrideValuesFile is the path to a YAML file that overrides the default configurations in the chart's values.yaml file.
    • Namespace is the namespace in which to create Offline Mediation Controller Kubernetes objects.
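As a worked example of how these metrics drive scaling: the Kubernetes HPA computes the desired replica count as desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization). If two Node Manager replicas average 80% CPU utilization against the 50% target, the HPA requests ceil(2 × 80 / 50) = 4 replicas, which is then capped at the maxReplicas value (3 in the example above).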

Configuring Node Manager Set HPA Values

Each Node Manager set can have its own specific configuration. If a configuration is not explicitly defined for a set, it inherits the values from the global nodeManagerConfigurations block.

To configure HPA values for Node Manager sets:
  1. In your override-values.yaml file, locate the sets block.
  2. For each Node Manager set, use the scaling.hpa.enabled key to enable scaling.
  3. Configure the following HPA parameters:
    • name: Specify the name of the Node Manager set.
    • replicas: Specify the maximum number of pod replicas for this set.
    • resources: Define the minimum (requests) and maximum (limits) CPU and memory resources for each Node Manager pod in the set.
    For example:
    sets:
      nm-cc:
        name: "nm-cc"       # Name of the Node Manager set
        replicas: 1         # Pod replica count for this set
        resources:
          requests:
            cpu: "800m"
            memory: "5Gi"
          limits:
            cpu: "800m"
            memory: "5Gi"
        scaling:
          hpa:
            enabled: true   # Enable HPA for this set
        gcOptions: "${GLOBAL_OPTS}"
        memoryOptions: "-Xms4g -Xmx4g"
    
  4. Run the helm upgrade command to update the Offline Mediation Controller Helm release:
    helm upgrade ReleaseName oc-cn-ocomc-core-helm-chart --values OverrideValuesFile -n Namespace
    where ReleaseName, OverrideValuesFile, and Namespace are the same as described in "Configuring Global HPA Values".
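Putting both procedures together, a consolidated override-values.yaml excerpt might look like the following sketch. The service account name is illustrative; the other values follow the examples in this chapter:

  nodeManagerConfigurations:
    scaling:
      replication:
        enabled: true
      serviceAccount:
        createServiceAccount: true
        name: "ocomc-scaling"      # Illustrative service account name
      hpa:
        enabled: true
        hpaScaleDownEnabled: true
        maxReplicas: 3
        metrics:
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 50
  sets:
    nm-cc:
      name: "nm-cc"
      replicas: 1
      scaling:
        hpa:
          enabled: true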