My apologies - I understood the question to be that you wanted to detect the problem with logmon but clear the issue with the processes probe. For something like that you need scripting, because you are crossing the boundary between probes. If you are doing all the testing with logmon then, generally speaking, all you need is a logmon watcher that sends a clear and has the same suppression id as the watcher that created the alarm. For the return code checking, I believe the clear happens automatically when the command gives a zero return code.
From my experience though, I'd suggest that you avoid the return code checking and instead have the script return some filterable text value (OK or FAIL, for instance) because, if nothing else, it makes debugging the whole thing easier.
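As a rough illustration of the text-value approach, here is a minimal sketch of a wrapper script. The function name report_status and the exact output strings are made up for this example, and 'true' only stands in for your real verification command. It prints OK or FAIL (with the exit code) and always finishes with a zero status itself, so only the text needs to be matched.

```shell
#!/bin/sh
# Hypothetical wrapper: run a verification command and print a filterable
# status word instead of relying on the command's exit code.
report_status() {
    rc=0
    # "$@" is the verification command plus its arguments; its own output
    # is discarded so that only the status line is printed.
    "$@" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "OK"
    else
        echo "FAIL rc=$rc"
    fi
}

# Example call; replace 'true' with the real verification command.
report_status true
```

With output like this, one watcher could match FAIL to raise the alarm and a second could match OK to send the clear, both using the same suppression id as described above.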
Original Message:
Sent: 06-28-2019 12:10 PM
From: Christian McHugh
Subject: Clearing alarm 'x' when alarm 'y' is received
So we can have the logmon probe run a verification command every 5 minutes and check the exit code. If the exit code is not 0 it generates an alert, but when the service later recovers and the verification command succeeds with an exit code of 0, it requires custom scripting to close out the alert?
Since this is how nagios and sensu operate, I'd expect this to be a fairly normal feature. Is it just not supported by the logmon probe when running commands, or is there a better probe to use for this?
Original Message:
Sent: 06-28-2019 11:58 AM
From: Garin Walsh
Subject: Clearing alarm 'x' when alarm 'y' is received
You need to script this (suggest using Lua) in an AO.
You need to find the alarm you want to close with something like:
alarm1=alarm.list("where","robot = '" .. robot .. "' and supp_key = '" .. supp .. "'")
You'd adjust the where criteria to match what you need - here I already knew the supp_key for the alarm I was looking for.
alarm1 then holds a list of the matching alarms. Identify the one you need to close (a below stands for that entry) and then
action.close(a.nimid)
will close it.
Original Message:
Sent: 06-28-2019 09:20 AM
From: Stiofan Conlon
Subject: Clearing alarm 'x' when alarm 'y' is received
Hi David!
Thanks for your quick response,
I have the processes probe running as well, but I was running logmon to verify that the service is not only up but also working as expected.
Is it possible to use the auto-operator to close out the alert generated by logmon? Say, if it returns a '0' on the next run, or if the server is rebooted?
Thanks again!
Original Message:
Sent: 06-28-2019 08:59 AM
From: DAVID MICHEL
Subject: Clearing alarm 'x' when alarm 'y' is received
Oh and this page shows the compatibility between probes and OS versions.
https://docops.ca.com/ca-unified-infrastructure-management/9-0-2/en/files/490068425/537402493/6/1561451411753/Platform_Support_Availability_current.pdf
------------------------------
Support Engineer
Broadcom
Original Message:
Sent: 06-28-2019 07:12 AM
From: Stiofan Conlon
Subject: Clearing alarm 'x' when alarm 'y' is received
Hi all,
Another noob question inbound!
I am monitoring a Linux service with logmon (checking the exit code of a command).
If the service fails, the alarm triggers. I would like to be able to clear this alert when the service recovers.
Could anyone point me in the right direction?
Thanks for your time and help!