I am trying to monitor application uptime using CA APM/CEM, and I would like to receive an email alert when one of the application servers is down. In the Investigator, under "CPU" >> "CPU Processor Count", I configured the "Comparison Operator" as "Equals To" and set both "Danger" and "Caution" to zero, meaning that an alert should be sent when no CPU processor is active. I then brought down the application server, but the graph showed "no current value" instead of zero, which I think is the main reason I did not receive an email alert. Kindly advise.
You cannot alert on the lack of data, but the lack of data usually indicates that the agent is not reporting and is therefore disconnected. Fortunately, there is a metric that holds the connection status for an agent: it has the value 1 when the agent is connected and 3 when it isn't.
The metric can be found at
Custom Metric Host (virtual)
- Custom Metric Process (virtual)
- Custom Metric Agent (virtual)(collector_host@port)(SuperDomain)
and then look at the value of ConnectionStatus.
Obviously, in your environment you will need to know which collector the agent should be connected to, and it may not always be the same one. So you can define a metric grouping to look for the value across all collectors, i.e.
(.)\|Custom Metric Process \(Virtual\)\|Custom Metric Agent \(Virtual\) .\)
If that approach is viable then you can set up an alert on that grouping to notify when the connection status value becomes 3.
For some reason the content got messed up when I replied to this comment by mail. See the correct regex below:
Agent expression: (.*)\|Custom Metric Process \(Virtual\)\|Custom Metric Agent \(Virtual\) .*\)
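As a quick sanity check of the agent expression above, here is a minimal Python sketch that tests it against a few sample agent paths. The sample paths and collector names are made up for illustration, and Python's `re` module is only a stand-in for whatever regex engine the Enterprise Manager uses, so treat this as a way to reason about the pattern rather than a guarantee of how the metric grouping will behave.

```python
import re

# Agent expression copied verbatim from the reply above.
AGENT_EXPRESSION = (
    r"(.*)\|Custom Metric Process \(Virtual\)"
    r"\|Custom Metric Agent \(Virtual\) .*\)"
)
pattern = re.compile(AGENT_EXPRESSION)

# Hypothetical agent paths (host|process|agent); names are assumptions.
candidates = [
    "Custom Metric Host (Virtual)|Custom Metric Process (Virtual)"
    "|Custom Metric Agent (Virtual) (collector01@5001)(SuperDomain)",
    "Custom Metric Host (Virtual)|Custom Metric Process (Virtual)"
    "|Custom Metric Agent (Virtual) (collector02@5001)(SuperDomain)",
    "apphost01|WebSphere|AppServer1",  # an ordinary application agent
]


def matching_agents(paths):
    """Return the paths that the agent expression matches in full."""
    return [p for p in paths if pattern.fullmatch(p)]


for path in matching_agents(candidates):
    print(path)
```

With these sample paths, the virtual agent entries for both collectors match and the ordinary application agent does not, which is the "across all collectors" behaviour the grouping is meant to give.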
Hi Mike, thank you very much. Your reply was very helpful. I got it configured and also received the email alert.
The above is not a complete solution for an agent reporting to a cluster, where the agent might switch collectors. Have a look at (and vote for) the following idea:
Java Agent Availability (up/down) status metric
Best method for implementing a JVM "Offline" check (although the more correct answer there is from jakbutler @ Sep 21, 2012 12:11 AM)
Let us know if Mike's answer is helpful or if additional assistance is needed.
Hello, Mike's answer is helpful and solved my problem. Thank you!
Just to add that the Sample Management Module also has an example of Mike's alert resolution.
The problem with using ConnectionStatus is that it doesn't take into account the agent moving from one collector to another.
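To make that limitation concrete, here is a toy Python model (not CA APM code or its API) of the ConnectionStatus values described earlier, 1 for connected and 3 for disconnected. The collector names and the dictionaries are hypothetical. It shows how a per-collector check on the value 3 would fire after an agent simply moves to another collector, whereas a check that looks at all collectors together would not.

```python
# ConnectionStatus values as described in the thread.
CONNECTED = 1
DISCONNECTED = 3


def per_collector_alerts(status_by_collector):
    """Collectors that report ConnectionStatus 3 for the agent."""
    return sorted(
        name
        for name, status in status_by_collector.items()
        if status == DISCONNECTED
    )


def agent_is_down(status_by_collector):
    """True only if no collector reports the agent as connected."""
    return not any(
        status == CONNECTED for status in status_by_collector.values()
    )


# Hypothetical snapshot: the agent moved from collector01 to collector02.
after_move = {"collector01@5001": DISCONNECTED, "collector02@5001": CONNECTED}

# Hypothetical snapshot: the agent is really gone from every collector.
truly_down = {"collector01@5001": DISCONNECTED, "collector02@5001": DISCONNECTED}

print(per_collector_alerts(after_move))  # collector01 still triggers
print(agent_is_down(after_move))
print(agent_is_down(truly_down))
```

In the first snapshot the per-collector check still flags collector01 even though the agent is connected elsewhere. Whether the product can express the combined check directly is a separate question, which is what the idea linked above is about.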