DX Operational Intelligence


 OpenShift probe - how to detect pod restarts

Glenn Weavind posted Aug 06, 2024 07:39 AM

We've suffered a number of apmservices-nass pod restarts, which have caused other issues. Support rapidly identified CoreOS's use of Transparent Huge Pages as the cause (it affects both k8s and OpenShift implementations), and addressing that resolved the pod crashes.

Is anyone able to tell me if I can use the UIM OpenShift probe to detect pods restarting/crashing (ideally within a given namespace) - and if so, how, please?

Broadcom Employee Nestor Falcon Gonzalez

Hi Glenn, it seems that is perfectly possible:
https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/ca-unified-infrastructure-management-probes/GA/monitoring/clouds-containers-and-virtualization/openshift-monitoring/openshift-metrics.html#concept.dita_d3303fde-e786-4fd4-b0b6-e3a28fd60a82_pod

Just monitor the metric "Total Restarts" from the apmservices-nass pod.

HTH

Nestor
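
For anyone who wants to cross-check the probe's "Total Restarts" value from outside UIM, here is a minimal sketch in Python. It is not the probe's implementation, and it assumes that a per-pod restart total corresponds to summing the standard `restartCount` field across the pod's `status.containerStatuses`, as found in the JSON that `oc get pods -n <namespace> -o json` returns. The function names, the pod names, and the sample data are all hypothetical and only there for illustration.

```python
def restart_counts(pod_list):
    """Map each pod name to its restartCount summed over all containers.

    `pod_list` is the parsed JSON of a pod list, for example the output of
    `oc get pods -n <namespace> -o json` loaded with json.loads().
    """
    counts = {}
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        # containerStatuses can be missing while a pod is still being scheduled.
        statuses = pod.get("status", {}).get("containerStatuses") or []
        counts[name] = sum(s.get("restartCount", 0) for s in statuses)
    return counts


def restarted_since(previous, current):
    """Return {pod name: number of new restarts} for pods whose total grew."""
    return {
        name: total - previous.get(name, 0)
        for name, total in current.items()
        if total > previous.get(name, 0)
    }


# Hypothetical samples taken at two polling intervals (not real cluster data).
sample_before = {"items": [
    {"metadata": {"name": "apmservices-nass-example"},
     "status": {"containerStatuses": [{"restartCount": 2}]}},
    {"metadata": {"name": "other-example"},
     "status": {"containerStatuses": [{"restartCount": 0}]}},
]}
sample_after = {"items": [
    {"metadata": {"name": "apmservices-nass-example"},
     "status": {"containerStatuses": [{"restartCount": 3}]}},
    {"metadata": {"name": "other-example"},
     "status": {"containerStatuses": [{"restartCount": 0}]}},
]}

if __name__ == "__main__":
    before = restart_counts(sample_before)
    after = restart_counts(sample_after)
    print(restarted_since(before, after))
```

Comparing two samples rather than alerting on the absolute total means a pod that restarted long ago does not keep raising alarms; the same idea applies if you set the probe's threshold on a change in "Total Restarts" rather than on its raw value.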