We are getting two different kinds of alarms from the cdm probe. In Alarm 1, the top-process percentages add up to 67.4%, which stays within the reported physical memory usage of 82%. In Alarm 2, however, total CPU on CHCLPRRPDNAP01 is reported as 95.41%, while the individual top processes are [grep[128837]-(101.00%)];[grep[126788]-(100.00%)];[grep[127023]-(97.30%)];[grep[129056]-(95.50%)];[perl[100798]-(93.00%)]. These add up to 486.8%, not 95.41%, and four of the five are individually higher than the reported total.
Why is there this deviation? Are the top-process figures in Alarm 2 a false alert, given that they exceed the actual total CPU percentage? If so, how can this be resolved? Please provide a proper analysis and justification.
Alarm 1
Severity : major
Host Name : CHEEFRMDNOD02
IP : 10.1.114.138
Element : Memory
Message : Physical memory usage on CHEEFRMDNOD02 is now 82%, which is above the warning threshold (80%). Top Processes [java[421184]-(61.90%)];[java[279277]-(2.30%)];[java[9380]-(1.50%)];[java[231616]-(1.10%)];[java[186058]-(0.60%)]
Time : 08/13/19 14:25:31
Probe : cdm
Alarm 2
Severity : critical
Host Name : CHCLPRRPDNAP01
IP : 10.200.121.28
Element : CPU
Message : Total cpu on CHCLPRRPDNAP01 is now 95.41%, which is above the error threshold (90%).Top Processes [grep[128837]-(101.00%)];[grep[126788]-(100.00%)];[grep[127023]-(97.30%)];[grep[129056]-(95.50%)];[perl[100798]-(93.00%)]
Time : 08/13/19 14:12:07
Probe : cdm
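
To make the comparison concrete, here is a minimal Python sketch that extracts the reported total and the top-process percentages from the two alarm messages above and sums them. The function names are my own and are not part of the cdm probe. One possible reading, which I have not verified for this probe, is that the per-process CPU figures are relative to a single core while the total is averaged over all cores; in that case per-process sums above 100% would be expected on a multi-core host.

```python
import re

# Alarm messages copied from Alarm 1 and Alarm 2 in this post.
ALARM_1 = (
    "Physical memory usage on CHEEFRMDNOD02 is now 82%, which is above the "
    "warning threshold (80%). Top Processes [java[421184]-(61.90%)];"
    "[java[279277]-(2.30%)];[java[9380]-(1.50%)];[java[231616]-(1.10%)];"
    "[java[186058]-(0.60%)]"
)
ALARM_2 = (
    "Total cpu on CHCLPRRPDNAP01 is now 95.41%, which is above the error "
    "threshold (90%).Top Processes [grep[128837]-(101.00%)];"
    "[grep[126788]-(100.00%)];[grep[127023]-(97.30%)];"
    "[grep[129056]-(95.50%)];[perl[100798]-(93.00%)]"
)


def reported_total(message):
    """Return the overall percentage that follows 'is now' in the message."""
    match = re.search(r"is now (\d+(?:\.\d+)?)%", message)
    return float(match.group(1))


def top_processes(message):
    """Return (name, pid, percent) for each entry such as [grep[128837]-(101.00%)]."""
    pattern = r"\[(\w+)\[(\d+)\]-\((\d+(?:\.\d+)?)%\)\]"
    return [
        (name, int(pid), float(pct))
        for name, pid, pct in re.findall(pattern, message)
    ]


def summarize(message):
    """Compare the reported total with the sum of the listed top processes."""
    total = reported_total(message)
    processes = top_processes(message)
    process_sum = round(sum(pct for _, _, pct in processes), 2)
    # Entries whose own percentage is already above the reported total.
    above_total = [(name, pid) for name, pid, pct in processes if pct > total]
    return total, process_sum, above_total


if __name__ == "__main__":
    for label, message in (("Alarm 1", ALARM_1), ("Alarm 2", ALARM_2)):
        total, process_sum, above_total = summarize(message)
        print(label, "reported:", total, "sum of top processes:", process_sum)
        print("  processes above the reported total:", above_total)
```

For Alarm 1 the top processes sum to 67.4% against a reported 82%; for Alarm 2 they sum to 486.8% against a reported 95.41%, which is the discrepancy I am asking about.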
Regards
Amar