15.3 Troubleshooting the Metrics Server
If the Kubernetes Metrics Server does not reach the READY 1/1 state, run the following commands:
kubectl describe pod <metrics-server-pod> -n kube-system
kubectl logs <metrics-server-pod> -n kube-system
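The pod name in the commands above has to be looked up first. The following is a minimal sketch of one way to do that, not a definitive procedure. It assumes the pod carries the label k8s-app=metrics-server, as in the upstream components.yaml. The function names are hypothetical.

```shell
# Sketch: find the Metrics Server pod by label instead of copying its name by hand.
# Assumption: the pod is labeled k8s-app=metrics-server; adjust the selector
# if your manifest uses a different label.
metrics_server_pod() {
  kubectl get pods -n kube-system -l k8s-app=metrics-server \
    -o jsonpath='{.items[0].metadata.name}'
}

# Run the two diagnostic commands from this section against that pod.
diagnose_metrics_server() {
  local pod
  pod="$(metrics_server_pod)" || return 1
  if [ -z "$pod" ]; then
    echo "no metrics-server pod found in kube-system" >&2
    return 1
  fi
  kubectl describe pod "$pod" -n kube-system
  kubectl logs "$pod" -n kube-system
}
```

After defining these functions in your shell, run diagnose_metrics_server to print the pod description followed by its logs.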
If you see errors such as:
Readiness probe failed: HTTP probe failed with statuscode: 500
and:
E0907 13:07:50.937308 1 scraper.go:140] "Failed to scrape node" err="Get \"https://X.X.X.X:10250/metrics/resource\": x509: cannot validate certificate for 100.105.18.113 because it doesn't contain any IP SANs" node="worker-node1"
then you may need to install a valid cluster certificate for your Kubernetes cluster.
For testing purposes, you can work around this issue as follows:
- Delete the Kubernetes Metrics Server by running the following command:
kubectl delete -f $WORKDIR/kubernetes/hpa/components.yaml
- Edit the $WORKDIR/kubernetes/hpa/components.yaml file and locate the args: section. Add --kubelet-insecure-tls to the arguments. This flag disables verification of the kubelet serving certificate, so use it for testing only. For example:
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --kubelet-insecure-tls
    - --metric-resolution=15s
    image: registry.k8s.io/metrics-server/metrics-server:v0.6.4
    ...
- Deploy the Kubernetes Metrics Server by running the following command:
kubectl apply -f $WORKDIR/kubernetes/hpa/components.yaml
- Run the following command and make sure the READY status shows 1/1:
kubectl get pods -n kube-system | grep metric
The output should look similar to the following:
metrics-server-d9694457-mf69d 1/1 Running 0 40s
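The edit and verification steps above can be sketched as shell helpers. This is an illustrative sketch under stated assumptions, not part of the documented procedure. It assumes the manifest lists --kubelet-use-node-status-port on its own line, as in the example above. The function names add_insecure_tls_flag and metrics_server_ready are hypothetical. As with the manual steps, this is intended for testing only.

```shell
# Insert --kubelet-insecure-tls right after the --kubelet-use-node-status-port
# argument, reusing that line's indentation. Does nothing if the flag is
# already present. Returns non-zero if the flag could not be added.
add_insecure_tls_flag() {
  file="$1"
  if grep -q -- '- --kubelet-insecure-tls' "$file"; then
    return 0
  fi
  tmp="$(mktemp)" || return 1
  awk '
    /^[ \t]*- --kubelet-use-node-status-port[ \t]*$/ {
      print                                   # keep the original argument
      sub(/--kubelet-use-node-status-port/, "--kubelet-insecure-tls")
      print                                   # emit the new argument below it
      next
    }
    { print }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back in place so the file keeps its original permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
  # Succeed only if the flag is now in the file.
  grep -q -- '- --kubelet-insecure-tls' "$file"
}

# Read `kubectl get pods` output on stdin and succeed only if a
# metrics-server pod shows READY 1/1 and STATUS Running.
metrics_server_ready() {
  awk '$1 ~ /^metrics-server/ && $2 == "1/1" && $3 == "Running" { ok = 1 }
       END { exit(ok ? 0 : 1) }'
}
```

For example, run add_insecure_tls_flag $WORKDIR/kubernetes/hpa/components.yaml between the delete and apply steps, then check the result with kubectl get pods -n kube-system | metrics_server_ready && echo "Metrics Server is ready".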