14 Monitoring BRM REST Services Manager Cloud Native
Learn how to use external applications, such as Prometheus, Grafana, and Helidon MP, to monitor BRM REST Services Manager in a cloud native environment.
Topics in this document:
About Monitoring BRM REST Services Manager Cloud Native
Setting Up Monitoring for BRM REST Services Manager
Creating Grafana Dashboards for BRM REST Services Manager
Modifying Prometheus and Grafana Alert Rules After Deployment
About REST Endpoints for Monitoring BRM REST Services Manager

About Monitoring BRM REST Services Manager Cloud Native
You set up monitoring for BRM REST Services Manager by using the following applications:
-
Helidon MP: Use this Eclipse Microprofile application to run health checks and collect metrics. Helidon MP is configured and ready to use in the BRM REST Services Manager deployment package.
For information about using the health check and metrics endpoints, see "About REST Endpoints for Monitoring BRM REST Services Manager". For more information about Helidon MP, see "Helidon MP Introduction" in the Helidon MP documentation.
-
Prometheus: Use this open-source toolkit to scrape metric data and then store it in a time-series database. Use Prometheus Operator for BRM REST Services Manager.
See "prometheus-operator" on GitHub.
-
Grafana: Use this open-source tool to view all BRM REST Services Manager metric data stored in Prometheus on a graphical dashboard.
See "Grafana Support for Prometheus" in the Prometheus documentation for information about using Grafana and Prometheus together.
Setting Up Monitoring for BRM REST Services Manager
To set up monitoring for BRM REST Services Manager cloud native:
-
Install Prometheus Operator:
-
Ensure that the BRM cloud native prerequisite software, such as the Kubernetes cluster and Helm, is running and that Git is installed on the node from which you run the Helm chart.
-
Create a namespace for monitoring. For example:
kubectl create namespace monitoring
-
Set the HTTP_PROXY and HTTPS_PROXY environment variables on all cluster nodes with the following commands:
export HTTP_PROXY="proxy_host"
export HTTPS_PROXY=$HTTP_PROXY
where proxy_host is the hostname or IP address of your proxy server.
-
Download the Prometheus Operator helm charts with the following commands:
helm repo add stable https://charts.helm.sh/stable
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm fetch prometheus-community/kube-prometheus-stack
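To confirm that the chart archive downloaded, and to keep a copy of the chart's default values to refer to when you create override-values.yaml, you can optionally run commands along these lines (the archive version suffix depends on the chart version that helm fetch retrieved):
# List the fetched chart archive
ls kube-prometheus-stack-*.tgz
# Save the chart's default values for reference
helm show values prometheus-community/kube-prometheus-stack > default-values.yaml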
-
Unset the HTTP_PROXY and HTTPS_PROXY environment variables with the following commands:
unset HTTP_PROXY
unset HTTPS_PROXY
-
Create an override-values.yaml file for Prometheus Operator and configure optional values to:
-
Add alert rules, such as the two rules in the sample below.
-
Make Prometheus, Alert Manager, and Grafana accessible outside the cluster and host machine by changing the service type to LoadBalancer.
-
Enable Grafana to send email alerts.
The following sample override-values.yaml shows alert rules and configuration options.
additionalPrometheusRulesMap:
  - rule-name: BRM-RSM-rule
    groups:
      - name: brm-rsm-alert-rules
        rules:
          - alert: CPU_UsageWarning
            annotations:
              message: CPU has reached 80% utilization
            expr: avg without(cpu) (rate(node_cpu_seconds_total{job="node-exporter", instance="instance", mode!="idle"}[5m])) > 0.8
            for: 5m
            labels:
              severity: critical
          - alert: Memory_UsageWarning
            annotations:
              message: Memory has reached 80% utilization
            expr: node_memory_MemTotal_bytes{job="node-exporter", instance="instance"} - node_memory_MemFree_bytes{job="node-exporter", instance="instance"} - node_memory_Cached_bytes{job="node-exporter", instance="instance"} - node_memory_Buffers_bytes{job="node-exporter", instance="instance"} > 22322927872
            for: 5m
            labels:
              severity: critical
alertmanager:
  service:
    type: LoadBalancer
grafana:
  service:
    type: LoadBalancer
  grafana.ini:
    smtp:
      enabled: true
      host: email_host
      user: "email_address"
      password: "password"
      skip_verify: true
prometheus:
  service:
    type: LoadBalancer
For details about the default Prometheus Operator values to base your override-values.yaml on, see "prometheus-operator/values.yaml" on the GitHub website.
-
Save and close the override-values.yaml file.
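Before installing Prometheus Operator, you can optionally verify that your overrides are well-formed and render cleanly against the chart. The following dry run reports syntax and template errors without creating any resources; the chart reference and release name are assumed to match the install command in the next step:
# Renders the chart with your overrides but does not install anything
helm install prometheus kube-prometheus-stack --values override-values.yaml -n monitoringNamespace --dry-run --debug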
-
Install Prometheus Operator with the following command:
helm install prometheus kube-prometheus-stack --values override-values.yaml -n monitoringNamespace
where monitoringNamespace is the namespace you created for monitoring.
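You can also confirm that the Helm release itself deployed before checking the individual pods; for example:
# The release status should be "deployed"
helm list -n monitoringNamespace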
-
Verify the installation with the following command:
kubectl get all -n monitoringNamespace
Pods and services for the following components should be listed:
-
Alert Manager
-
Grafana
-
Prometheus Operator
-
Prometheus
-
Node Exporter
-
kube-state-metrics
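If you changed the service types to LoadBalancer in override-values.yaml, a command such as the following shows the external addresses assigned to the Prometheus, Alertmanager, and Grafana services (the exact service names depend on your release name and may differ):
kubectl get svc -n monitoringNamespace -o wide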
-
For the list of compatible software versions, see "BRM Cloud Native Deployment Software Compatibility" in BRM Compatibility Matrix.
-
Configure BRM REST Services Manager ServiceMonitor, which specifies how to monitor groups of services. Prometheus Operator automatically generates the scrape configuration based on this definition.
-
Ensure that BRM REST Services Manager is running.
-
Create an rsm-sm.yaml file with the following content:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: releaseName
    meta.helm.sh/release-namespace: rsm_namespace
  labels:
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: brm-rest-services-manager
    app.kubernetes.io/version: rsm_version
    chart: brmrestservicesmanager-15.0.0.0.0
    heritage: Helm
    release: prometheus
  name: brm-rest-services-manager-monitoring
  namespace: rsm_namespace
spec:
  endpoints:
    - path: /metrics
      port: api-http-prt
  namespaceSelector:
    matchNames:
      - rsm_namespace
  selector:
    matchLabels:
      app.kubernetes.io/name: brm-rest-services-manager
where:
releaseName is the name given to the BRM REST Services Manager deployment during Helm installation
-
rsm_namespace is the namespace where BRM REST Services Manager is deployed
-
rsm_version is the version of BRM REST Services Manager, for example, 15.0.0.0.0
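Note that the port value under spec.endpoints (api-http-prt in the sample) must match a named port on the BRM REST Services Manager service, because Prometheus Operator resolves ServiceMonitor ports by name. A check along these lines, using the label selector from the sample, lists the port names that the service exposes:
kubectl get svc -n rsm_namespace -l app.kubernetes.io/name=brm-rest-services-manager -o jsonpath='{.items[*].spec.ports[*].name}'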
-
Save and close the file.
-
Apply the changes with the following command:
kubectl apply -f rsm-sm.yaml -n rsm_namespace
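To confirm that the ServiceMonitor resource was created, you can run, for example:
kubectl get servicemonitor brm-rest-services-manager-monitoring -n rsm_namespace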
-
Verify the configuration in the Prometheus user interface. From the Status menu, select Targets and confirm that the /metrics endpoint appears.
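If Prometheus is not exposed outside the cluster (for example, you did not change the service type to LoadBalancer), a port-forward along these lines makes the user interface available at http://localhost:9090. The service name shown is the kube-prometheus-stack default for a release named prometheus and may differ in your deployment:
kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090 -n monitoringNamespace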
-
Configure Grafana to display BRM REST Services Manager metric data. See "Creating Grafana Dashboards for BRM REST Services Manager".
-
Access the health and metrics REST endpoints. See "About REST Endpoints for Monitoring BRM REST Services Manager".
Creating Grafana Dashboards for BRM REST Services Manager
Create a dashboard in Grafana for displaying your BRM REST Services Manager metric data. Alternatively, you can use the sample dashboard JSON model that is included in the oc-cn-docker-files-15.0.x.0.0.tgz package.
To use the sample dashboard:
-
Open the oc-cn-docker-files/samples/monitoring/ocrsm-rsm-dashboard.json file in a text editor.
-
Search for instance=\" and replace all occurrences of the default host and port with the host where your instance of Prometheus Operator is running and your prometheus-node-exporter port. (A sed sketch at the end of this procedure shows one way to replace every occurrence at once.)
For example, for the node_memory_MemFree_bytes expression, replace Prometheus_Operator_host and Prometheus_Node_Exporter_Port:
{ "exemplar": true, "expr": "node_memory_MemFree_bytes{job=\"node-exporter\", instance=\"Prometheus_Operator_host:Prometheus_Node_Exporter_Port\"}", "hide": false, "interval": "", "legendFormat": "Free", "refId": "D" }
-
Save and close the file.
-
In Grafana, import the edited oc-cn-docker-files/samples/monitoring/ocrsm-rsm-dashboard.json dashboard file. See "Export and Import" in the Grafana Dashboards documentation for more information.
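As an alternative to editing each occurrence by hand, a substitution along these lines updates every instance=\" value in the file in one pass. The host and port shown are placeholders for your own values, and the command assumes GNU sed (the -i option differs on some platforms):
# Back up the original file, then replace every instance=\"host:port\" value
cp ocrsm-rsm-dashboard.json ocrsm-rsm-dashboard.json.bak
sed -i 's/instance=\\"[^\\]*\\"/instance=\\"myhost.example.com:9100\\"/g' ocrsm-rsm-dashboard.json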
Modifying Prometheus and Grafana Alert Rules After Deployment
After deploying Prometheus Operator, you can add alert rules in Prometheus or make changes in the Grafana user interface.
You have the following options for editing or adding Prometheus alert rules:
-
Edit the override-values.yaml file and upgrade the Helm release (see the sketch after this list).
-
If you added rules in override-values.yaml before installing Prometheus Operator, use the following command to edit the rules file:
kubectl edit prometheusrule kube-prometheus-stack-0 -n monitoringNamespace
-
If you didn't add any rules in override-values.yaml, use the following command to edit the rules file:
kubectl edit prometheusrule prometheus-kube-prometheus-alertmanager -n monitoringNamespace
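For the first option, after you edit override-values.yaml, apply the changes to the running release with a command along these lines; the release name, chart reference, and namespace are assumed to match the original installation:
helm upgrade prometheus kube-prometheus-stack --values override-values.yaml -n monitoringNamespace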
You can also configure alert rules and add or remove email recipients in the Grafana user interface. See "Legacy Grafana Alerts" in the Grafana documentation for more information.
About REST Endpoints for Monitoring BRM REST Services Manager
You can use REST endpoints to monitor metrics and run a health check on BRM REST Services Manager.
Use a browser to send HTTP/HTTPS requests to the endpoints listed in Table 14-1, where hostname and port are the host name and port of your BRM REST Services Manager server.
Table 14-1 BRM REST Services Manager Monitoring Endpoints
Type | Description | Endpoint
---|---|---
Health | Returns details for both the health/live and health/ready endpoints. | https://hostname:port/health
Liveness | Confirms that the application can run in the environment. Checks disk space, heap memory, and deadlocks. | https://hostname:port/health/live
Readiness | Confirms that the application is ready to perform work. | https://hostname:port/health/ready
Metrics | Returns standard Helidon MP monitoring metrics for BRM REST Services Manager. | https://hostname:port/metrics
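You can also exercise these endpoints from the command line with curl. The -k option skips certificate verification and is shown here only as a quick check against a self-signed certificate:
# Overall health (liveness and readiness combined)
curl -k https://hostname:port/health
# Metrics in Prometheus exposition format
curl -k https://hostname:port/metrics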
Sample Response for the Health Endpoint
The following example shows a response for the health endpoint, which includes both liveness and readiness details:
{
"outcome": "UP",
"status": "UP",
"checks": [
{
"name": "deadlock",
"state": "UP",
"status": "UP"
},
{
"name": "diskSpace",
"state": "UP",
"status": "UP",
"data": {
"free": "144.85 GB",
"freeBytes": 155532308480,
"percentFree": "62.71%",
"total": "231.00 GB",
"totalBytes": 248031531008
}
},
{
"name": "heapMemory",
"state": "UP",
"status": "UP",
"data": {
"free": "225.08 MB",
"freeBytes": 236014824,
"max": "3.48 GB",
"maxBytes": 3739746304,
"percentFree": "97.37%",
"total": "319.00 MB",
"totalBytes": 334495744
}
}
]
}
Sample Response for the Metrics Endpoint
The response for the metrics endpoint contains the standard Helidon application and vendor metrics. The following example shows some of the metrics in the response:
# TYPE base_classloader_loadedClasses_count gauge
# HELP base_classloader_loadedClasses_count Displays the number of classes that are currently loaded in the Java virtual machine.
base_classloader_loadedClasses_count 9095
# TYPE base_classloader_loadedClasses_total counter
# HELP base_classloader_loadedClasses_total Displays the total number of classes that have been loaded since the Java virtual machine has started execution.
base_classloader_loadedClasses_total 9097
...
# TYPE base_memory_usedHeap_bytes gauge
# HELP base_memory_usedHeap_bytes Displays the amount of used heap memory in bytes.
base_memory_usedHeap_bytes 138109824
# TYPE base_thread_count gauge
# HELP base_thread_count Displays the current number of live threads including both daemon and nondaemon threads
base_thread_count 20
...
# TYPE vendor_requests_count_total counter
# HELP vendor_requests_count_total Each request (regardless of HTTP method) will increase this counter
vendor_requests_count_total 4
# TYPE vendor_requests_meter_total counter
# HELP vendor_requests_meter_total Each request will mark the meter to see overall throughput
vendor_requests_meter_total 4
# TYPE vendor_requests_meter_rate_per_second gauge
vendor_requests_meter_rate_per_second 0.008296727017772145
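Helidon MP also follows the MicroProfile Metrics convention of exposing each metric registry scope at its own path, so you can request only the base, vendor, or application metrics when you do not need the full output; for example:
curl -k https://hostname:port/metrics/base
curl -k https://hostname:port/metrics/vendor
curl -k https://hostname:port/metrics/application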
For details about all of the metrics and more information about Helidon monitoring, see:
- "Helidon MP Metrics Guide" in the Helidon MP documentation
- "MicroProfile Metrics specification" on the GitHub website