DX NetOps

  • 1.  How to - suppress alarms after a certain number of events

    Posted May 04, 2015 02:21 PM


    Some devices become very insistent when an environmental condition is present, such as high temperature, and send traps every 10 seconds. We recently had this problem and it was flooding our Spectrum system. Even worse, it occurred during the database backup interval, when Archive Manager was shut down, so the events really piled up.

     

    What I would like is a way in EventDisp to allow 'n' alarms to be generated, then have a single alarm generated that says 'suppressing xyz alarm', with no more alarms generated until no events have been seen for 's' seconds. The latter is almost exactly an EventRateWindowRule, but I'm not sure how to tie the rules together.

     

    Example for 'n'=2 and a 2-minute window ('s'=120)

    10:00:00 event -> alarm FAN BAD

    10:00:10 event -> alarm FAN BAD

    10:00:20 event -> alarm SUPPRESS FAN BAD ERRORS FOR 2 MINUTES

    10:00:30 event

    10:00:40 event

    10:00:50 event

    10:01:00 event

    <no more events>

    10:03:00 alarm -> (Clear) FAN BAD ERRORS HAVE STOPPED
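
    To make the desired behaviour concrete, here is a minimal Python sketch of the logic in the timeline above. It is only a model of what I am asking for, not EventDisp syntax and not a Spectrum API; the class name `AlarmSuppressor`, its methods, and the wording of the suppression message (in seconds rather than minutes) are all made up for illustration. Times are seconds since 10:00:00.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmSuppressor:
    """Hypothetical model of the requested behaviour (not a Spectrum API)."""

    max_alarms: int      # 'n': alarms allowed before suppression starts
    quiet_seconds: int   # 's': silence required before the clear is raised
    count: int = 0
    suppressing: bool = False
    last_event: Optional[int] = None

    def on_event(self, now: int) -> Optional[str]:
        """Handle one incoming event; return the alarm to raise, if any."""
        self.last_event = now
        if self.suppressing:
            return None                      # swallow the event silently
        if self.count < self.max_alarms:
            self.count += 1
            return "FAN BAD"
        self.suppressing = True              # event n+1 starts suppression
        return f"SUPPRESS FAN BAD ERRORS FOR {self.quiet_seconds} SECONDS"

    def on_tick(self, now: int) -> Optional[str]:
        """Call periodically; returns the clear once 's' quiet seconds pass."""
        if self.last_event is None or now - self.last_event < self.quiet_seconds:
            return None
        was_suppressing = self.suppressing
        # Reset all state so the next burst starts counting from zero.
        self.count, self.suppressing, self.last_event = 0, False, None
        if was_suppressing:
            return "(Clear) FAN BAD ERRORS HAVE STOPPED"
        return None                          # quiet reset, nothing to clear
```

    With `AlarmSuppressor(max_alarms=2, quiet_seconds=120)`, events at 0 and 10 seconds raise FAN BAD, the event at 20 raises the suppression alarm, later events are swallowed, and a tick 120 seconds after the last event raises the clear. Resetting the counter silently when fewer than 'n' events arrived is my own assumption about what would be sensible.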

     

    Thanks,

    Al




  • 2.  Re: How to - suppress alarms after a certain number of events

    Posted May 06, 2015 08:27 AM

    Almon,

     

    Do you get both the threshold-breached and reset traps, or only the threshold-exceeded traps for temperature? If it is only the threshold breach, Spectrum will not generate multiple alarms; the new events simply update the existing alarm, unless the event is configured to generate a unique alarm for each occurrence.

     

    For the event counter rule to work, both the breached and reset events need to occur repeatedly until they reach a set count, at which point a new threshold-breached alarm is generated.

     

    ex: alarm fan bad

         alarm fan good

     

      Both occur n times, and then the new threshold event is generated with an alarm.



  • 3.  Re: How to - suppress alarms after a certain number of events

    Posted May 06, 2015 09:45 AM

    Thanks for your reply.

    Unfortunately, what we were seeing was continuous fan alarms, once every 10 seconds. There was a temperature problem which wasn't resolved for multiple hours and we had multiple devices sending these alerts every 10 seconds with no resets.

     

    SEC (Simple Event Correlator) has an easy way to handle this through the use of a context variable: one rule generates an alarm only if a context (too_many_fan_bad) isn't set. Another rule looks for a given number of FAN BAD events occurring within a set window of time; if found, it sets the context with a particular time to live. When the problem stops, the context is deleted and a new message can be sent.
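
    Roughly, the context idea looks like the Python sketch below. This is not SEC rule syntax, just an illustration of the mechanism; the `Contexts` class, `handle_fan_bad`, and the default numbers are hypothetical, and refreshing the time to live on every suppressed event is my assumption about how the context would stay alive while the problem persists.

```python
from collections import deque
from typing import Deque, Dict, List

CONTEXT = "too_many_fan_bad"


class Contexts:
    """Named flags with a time to live (loosely modelled on SEC contexts)."""

    def __init__(self) -> None:
        self._expiry: Dict[str, int] = {}

    def set(self, name: str, now: int, ttl: int) -> None:
        self._expiry[name] = now + ttl

    def is_set(self, name: str, now: int) -> bool:
        return self._expiry.get(name, now) > now

    def expire(self, now: int) -> List[str]:
        """Delete and return every context whose lifetime has run out."""
        gone = [n for n, t in self._expiry.items() if t <= now]
        for n in gone:
            del self._expiry[n]
        return gone


def handle_fan_bad(now: int, ctx: Contexts, recent: Deque[int],
                   threshold: int = 2, window: int = 60,
                   ttl: int = 120) -> List[str]:
    """Process one FAN BAD event and return the messages to send."""
    if ctx.is_set(CONTEXT, now):
        ctx.set(CONTEXT, now, ttl)      # assumption: refresh TTL while events continue
        return []
    messages = ["FAN BAD"]              # context not set: alarm as usual
    recent.append(now)
    while now - recent[0] >= window:    # keep only events inside the window
        recent.popleft()
    if len(recent) >= threshold:        # too many events: start suppressing
        ctx.set(CONTEXT, now, ttl)
        recent.clear()
        messages.append("SUPPRESSING FAN BAD")
    return messages
```

    A periodic call to `ctx.expire(now)` returns `too_many_fan_bad` once the events have been quiet for the time to live, and that is the point where the "errors have stopped" message would be sent.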

     

     

     




  • 4.  Re: How to - suppress alarms after a certain number of events

    Broadcom Employee
    Posted Jun 26, 2017 01:02 PM

    Thanks for using the CA Community!  It looks like your question has been answered so we are closing this thread out.  If you have additional questions, please feel free to contact us again.