DX Unified Infrastructure Management

How suppress robot inactive alarm over ping failure?

  • 1.  How suppress robot inactive alarm over ping failure?

    Posted Mar 26, 2019 03:48 PM

    I receive two alerts when a server is down: one for the server itself (ping failure) and another for robot inactive. Can someone help me correlate these alerts in UIM?

     

    1. Ping failure alert

    2. Robot inactive alert

     

    Someone asked this question earlier, but I could not find an answer in that thread.



  • 2.  Re: How suppress robot inactive alarm over ping failure?

    Posted Mar 27, 2019 09:11 AM

    So what you want is a Lua script that, when a robot inactive alarm comes in, checks whether there is a pre-existing ping failure alarm and suppresses the robot inactive alarm only if there is one?

     

    Maybe a script with a SQL query against the nas_alarms table to check for the ping alarm.



  • 3.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 02, 2019 08:32 AM

    Thanks David. I will check this option.



  • 4.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 02, 2019 02:25 PM

    Check out the nas technical document also. https://c.na53.content.force.com/servlet/fileField?id=0BE60000000PBnT 

     

    There are a bunch of alarm.whatever function calls to interact with the current set of alarms. You probably want to consider alarm.query().

     

    Also consider using triggers, because the nas lets you build logic on top of them. If you have a ping trigger and a robot inactive trigger, you can create an alarm on "robot inactive trigger fired and not ping trigger fired" to generate the alert.

     

    It doesn't scale well unless you create a process to generate the triggers automatically, but it does work well in specific cases.



  • 5.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 10, 2019 06:01 AM

    Thanks Garin,

     

    I do not have login access to https://c.na53.content.force.com/servlet/fileField?id=0BE60000000PBnT

    Could you provide a PDF version of this document, if possible?



  • 6.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 10, 2019 06:13 AM

    Hi David,

     

    I have prepared a script that works perfectly in testing but fails to work in the live/prod environment.

    The script does not throw any error.

     

    One more thing I want to mention:

    We have multiple NAS instances, one on the secondary hub and one on the primary hub.

    I am trying to deploy this script on the primary hub, which is the only place where we have access to the data_engine.

     

    Can you please advise on this?



  • 7.  Re: How suppress robot inactive alarm over ping failure?

    Broadcom Employee
    Posted Apr 12, 2019 03:59 AM


  • 8.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 12, 2019 09:00 AM

    Have the script echo each step so it is possible to see what's happening and find the break point, or break the script up into parts so each part can be tested, again with the goal of finding the break point.



  • 9.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 25, 2019 03:02 AM

    I am getting this error:

    Apr 25 12:26:23:765 [10384] nas: PREPROCESSOR ERROR: scripts/test\visible_test:2: attempt to index global 'action' (a nil value)

     

    My sample script is:

    ------------------------------------------------

    ipadr = event.source

    if(action.ping (ipadr)) then

          action.visibility (false)

    end

    return event

    -----------------------------------------------------

    Please advise.



  • 10.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 25, 2019 10:30 AM

    It appears that you are trying to use the NAS preprocessing rules as opposed to an AO profile. 

     

    In a preprocessor script only a limited set of functions is available. To quote the documentation:

    Note that only a subset of the lua methods are available to the pre-processing script. The
    following classes and methods are not available: exit, sleep, nimbus, pds, trigger, action,
    database, alarm and note. The trigger.state method through the state method is however
    available.

     

    So when you try to call action.ping, the variable action is nil because it's not defined. That's why you get the error.
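
    As a minimal sketch of that point: a script can check whether the class exists before calling into it. This does not make action.ping available in the preprocessor; it only avoids the "attempt to index global 'action'" error when the same code runs in both contexts. The helper name ping_if_available is hypothetical, and nothing is assumed here about what action.ping returns.

```lua
-- Hypothetical helper: call action.ping only where the 'action' class exists.
-- In a preprocessing script 'action' is nil, so this returns nil instead of
-- raising an error.
function ping_if_available(ip)
   if type(action) == "table" and type(action.ping) == "function" then
      return action.ping(ip)
   end
   return nil  -- ping is not available in this context
end
```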

     

    The documentation further states:

    The script is expected to return the event (modified or not) or nil. The nil will indicate that the event
    is to be skipped.

    So if you want the event to never become an alarm, all your preprocessing script needs to do is to return nil.
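
    A rough sketch of such a script is below. It can only look at the fields of the event itself, so it drops every matching event unconditionally; it cannot check whether a ping alarm exists. The message pattern and the use of an event.message field are assumptions, so adjust both to what your robot inactive events actually contain. The function name filter_event is hypothetical.

```lua
-- Assumed pattern; change it to match the real text of your robot inactive alarm.
ROBOT_INACTIVE_PATTERN = "[Rr]obot.*inactive"

-- Returns nil to skip the event, or the event itself to let it through.
function filter_event(ev)
   if ev == nil then
      return nil
   end
   local msg = ev.message or ""   -- 'message' field name is an assumption
   if string.find(msg, ROBOT_INACTIVE_PATTERN) then
      return nil                  -- skipped: never becomes an alarm
   end
   return ev                      -- passed through unmodified
end

-- In the nas preprocessing script, the last line would then be:
--   return filter_event(event)
```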

     

    Also, the visibility setting you were trying to manipulate isn't specifically for suppressing alerts. It's intended to provide a flag to filter on in the alarm viewer and/or to use as a criterion for AO profiles. Note too that if you replicate between nas instances, the visibility flag will not replicate: it is perfectly valid for an alarm to be visible in one nas and not visible in another.

     

    -Garin



  • 11.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 25, 2019 10:48 AM

    Thanks for the information Garin.

     

    What I am trying to do is change some event parameters (IP address or hostname) based on the ping response, using a pre-processing rule.

    Can you suggest another way to use action.ping and modify the event parameters?



  • 12.  Re: How suppress robot inactive alarm over ping failure?

    Posted Apr 25, 2019 02:35 PM

    I'd suggest going back over the discussions of correlation in the forums. This topic has come up every couple of months over the past several years, and there is some good content available.

     

    To take a couple steps backwards, the directly supported way to correlate multiple events is using triggers.

     

    It requires some setup and assumptions. The main assumption is that you are going to use visibility (or one of the custom fields) to control whether an alarm is a real alarm (visible = true) or the input to a trigger (visible = false), and then use that visibility flag to keep these trigger-alarms from polluting the real alarms. (There's an idea out there to allow for an alarm type that's natively intended to drive a trigger and never show up as a real alarm, but it hasn't seen much interest. Here's that use case.)

     

    The second big assumption is that you have a very small number of systems you want to build such a correlation for, since triggers are just fancy "pre-computed filters".

     

    Then you create an AO profile (on arrival and visible=true) that runs a script which, for the messages matched by the trigger, sets the visibility to false and reposts the message.

     

    Then you create two triggers, one for the ping message and one for the robot message.

     

    Then you create the nas trigger logic to create an alarm that "ors" the two triggers.

     

    If you have a large number of systems where it is prohibitive to create the triggers manually, you have to script it. That can be a little difficult to wrap one's head around, because it's easy to forget that you are just processing messages as they arrive, not updating the status of alarms in a database. It is worthwhile to do a little reading on state-table-based programming; otherwise you can lose yourself in all the various permutations of the message sequences.
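
    To illustrate the state table idea, here is a small sketch in plain Lua. It uses no nas functions at all: the message kinds ("ping_fail", "ping_clear", "robot_inactive", "robot_clear"), the on_message function, and the VISIBLE table are all made up for the example. It keeps two flags per host and looks up which alarm should be visible for each combination, so the result is the same whatever order the messages arrive in.

```lua
-- Which alarm should be visible for each combination of
-- (ping down?, robot inactive?). Key is "<ping><robot>", where 1 = true.
local VISIBLE = {
   ["00"] = "none",
   ["10"] = "ping",
   ["01"] = "robot",
   ["11"] = "ping",   -- robot inactive is suppressed while ping is failing
}

-- state[host] = { ping_down = bool, robot_down = bool }
local state = {}

-- Process one incoming message and return which alarm should now be visible.
function on_message(host, kind)
   local s = state[host]
   if s == nil then
      s = { ping_down = false, robot_down = false }
      state[host] = s
   end
   if kind == "ping_fail" then
      s.ping_down = true
   elseif kind == "ping_clear" then
      s.ping_down = false
   elseif kind == "robot_inactive" then
      s.robot_down = true
   elseif kind == "robot_clear" then
      s.robot_down = false
   end
   local key = (s.ping_down and "1" or "0") .. (s.robot_down and "1" or "0")
   return VISIBLE[key]
end

-- The robot message arrives first, then the ping failure:
print(on_message("web01", "robot_inactive"))  -- robot
print(on_message("web01", "ping_fail"))       -- ping
```

    Note the second call: the robot alarm was already showing when the ping failure arrived, so a real script would have to hide it after the fact. That is the kind of permutation that is easy to miss without writing the table out.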