DX Infrastructure Management

Alarm policies not clearing alarms

  • 1.  Alarm policies not clearing alarms

    Posted 24 days ago

    Hello!

    Since we are monitoring close to 100 robots, we decided to switch to using MCS for CDM to make life a bit easier.

    We've deployed the enhanced profiles:

    Setup cdm, CPU Monitor, Default Disk(s), Disk IO and Memory Monitor.

    We have 4 alarm policies. One for CPU, one for memory and two for disk.

    The problem is that while alarms are being generated by the policies, they are not cleared. 

    Taking one of the disk policies as an example, we've set up a condition on the disk free % metric.

    The thresholds are set to <= 12% (warning), <= 8% (major) and <= 4% (critical). Alarms are being triggered when these thresholds are reached, but they are not cleared when free space is back to more than 12%.
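
    To make the expected behaviour concrete, here is a minimal sketch of the threshold logic described above. It is only an illustrative model, not how MCS or the alarm policies are actually implemented; the names THRESHOLDS, severity_for and evaluate are hypothetical.

```python
from typing import Optional

# Thresholds from the post: disk free % at or below a limit raises that severity.
# Ordered from most to least severe so the first match wins.
THRESHOLDS = [(4.0, "critical"), (8.0, "major"), (12.0, "warning")]


def severity_for(free_pct: float) -> Optional[str]:
    """Return the severity for a disk free % sample, or None if no threshold is breached."""
    for limit, severity in THRESHOLDS:
        if free_pct <= limit:
            return severity
    return None


def evaluate(previous: Optional[str], free_pct: float) -> Optional[str]:
    """Return the message expected for this sample, or None when the state is unchanged.

    Assumption: a "clear" is expected once an open alarm no longer breaches any threshold.
    """
    current = severity_for(free_pct)
    if current == previous:
        return None
    return current if current is not None else "clear"
```

    In this model, a sample of 15% free following a warning would produce a "clear", which is the step that is missing in our environment.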

    Has anyone experienced similar issues?

    Versions:

    Hub 9.20HF6, Data_engine 9.20HF2, mon_config_service 9.20hf1, nis_server 9.10.

    Regards

    Espen B Hanssen



  • 2.  RE: Alarm policies not clearing alarms

    Posted 23 days ago
    I tested this in a similar environment, but there the alarms are cleared.
    The only difference is that I changed the thresholds to test it.


  • 3.  RE: Alarm policies not clearing alarms

    Posted 23 days ago

    Hm....
    For a moment I thought the problem could be related to the fact that we had added a prefix to the alarm messages (Disk - for disk-alarms, CPU - for CPU-alarms etc) in order to make sorting easier.

    But even after reverting to the default setup, alarms are generated but not cleared.

    The weird thing is that we have a condition that triggers if processor queue length hits 80% over the baseline. These alarms are cleared automatically.

    I've exhausted all my ideas, so I guess I just have to open a support ticket to see if they have any ideas.

    Regards
    Espen




  • 4.  RE: Alarm policies not clearing alarms

    Posted 23 days ago
    You could use DrNimBus to check whether you see a clear (I'm not sure it sees all clears).
    In my environment I changed the nas so that it also shows the clear message in the console (plus a rule that closes these clears after xx minutes).


  • 5.  RE: Alarm policies not clearing alarms

    Posted 22 days ago

    I tried DrNimBus, and I can only see the alarms, no clears. I'm going to examine plugin_metric.cfg more closely to see if it gives any clues.

    I have opened a support ticket to see if they have any tips.

    //Espen