Management Agent for Kubernetes (OCMA) in Failed State After Upgrade Failure
If the Docker image URL specified for the management agent is incorrect or inaccessible during a Helm upgrade of the oci-kubernetes-monitoring chart, the management agent pod remains in a Failed state.
In this state, a subsequent Helm upgrade with the correct image version does not recover the pod on its own, because Kubernetes does not automatically restart pods that remain in a Failed state after an image pull error.
To resolve this issue:
- Upgrade the Helm release with the correct, accessible image URL:
helm upgrade <release-name> --values <path-to-override-values.yaml> <path-to-helm-chart>
- Delete the failed pod so Kubernetes can recreate it with the correct image version:
kubectl delete pod oci-onm-mgmt-agent-0 -n oci-onm
After deletion, Kubernetes will automatically recreate the pod using the corrected configuration, and the pod should start successfully.
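The two steps above can be combined into a small helper, shown here as a minimal sketch rather than a supported tool. The function name recover_mgmt_agent, the run wrapper, and the variables RELEASE, VALUES, CHART, and DRY_RUN are hypothetical names introduced for illustration; the pod name and namespace are the ones used in the commands above. The final kubectl get pods call is an added convenience for checking the result and is not part of the documented procedure.

```shell
#!/bin/sh
# Sketch only: RELEASE, VALUES, CHART and DRY_RUN are hypothetical variables.
# Pod name and namespace default to the values used in the steps above.
NAMESPACE="${NAMESPACE:-oci-onm}"
POD="${POD:-oci-onm-mgmt-agent-0}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

recover_mgmt_agent() {
  # Refuse to run without the three values the helm command needs.
  if [ -z "${RELEASE:-}" ] || [ -z "${VALUES:-}" ] || [ -z "${CHART:-}" ]; then
    echo "set RELEASE, VALUES, and CHART first" >&2
    return 1
  fi
  # Step 1: upgrade the release with the corrected, accessible image URL.
  run helm upgrade "$RELEASE" --values "$VALUES" "$CHART" || return 1
  # Step 2: delete the failed pod so Kubernetes recreates it.
  run kubectl delete pod "$POD" -n "$NAMESPACE" || return 1
  # Optional check (an addition to the documented steps): list pod status.
  run kubectl get pods -n "$NAMESPACE"
}
```

Setting DRY_RUN=1 before calling recover_mgmt_agent prints the three commands without contacting the cluster, which is a reasonable way to confirm the release name, values file, and chart path before running them for real.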