DX Unified Infrastructure Management

  • 1.  LUA script close alarm

    Posted Apr 08, 2020 11:09 AM
    Hello Broadcom team,

    I would appreciate your help. Here is some background on my environment:

    In the snmpget probe, I have created variables in the Dynamic Index Tracking section. When a variable is created, it has no value to display or alarm on. Each variable is configured so that it can raise an alarm.

    When the equipment raises one of the conditions defined in Dynamic Index Tracking, the variable returns a value and, because of the configuration above, it is shown as an alarm.
    When the equipment returns to normal or no longer has that condition, the alarm remains stuck in the USM alarm window (the probe no longer collects data for it), so it needs to be closed automatically.

    I have looked for solutions in the NAS probe using an AO profile with the close option, but if the alarm lasts longer than the AO time, the alarm count is reset.
    If the alarm lasts less than the AO cycle (default 5 min), the close works until the AO cycle is completed.

    Another option that I think may work is a LUA script, although I do not have extensive knowledge of LUA. I found a script that closes alarms, but I think it needs some condition so that it closes only the alarms that were left active without recurring or collecting values, since as written it closes all those that are active.

    The script:

    local alarm_list = alarm.list("sid", "1.1.3")
    for i,al in ipairs(alarm_list) do
       print("Closing alarm "..al.nimid.." with message: "..al.message)
       action.close(al.nimid)
    end



    I found it in this thread:
    https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=741152

    Could you help me with this problem?

    If you have any questions, please let me know.

    Regards.


  • 2.  RE: LUA script close alarm

    Broadcom Employee
    Posted Apr 08, 2020 12:31 PM
    Hi,

    So I guess I am not clear on what you want to happen.
    I understand you are having alarms come in from the snmpget probe.
    When the issue is resolved the probe does not send a clear alarm so these alarms have to be cleared manually.

    I do not understand how a LUA script would do anything additional for you that an AO profile will not.
    I would set up an AO profile to auto-close the alarm when it is overdue by, say, 2 hours.
    If the problem is still going on, the alarm will come back, and yes, the count will be reset, but a LUA script is just going to do the same thing.

    Maybe I am missing something here and others can help.

    ------------------------------
    Gene Howard
    Principal Support Engineer
    Broadcom
    ------------------------------



  • 3.  RE: LUA script close alarm

    Posted Apr 08, 2020 08:44 PM
    Hi Gene

    What you say is correct. I receive alarms from the snmpget probe, and when they stop collecting data they need to be closed automatically rather than manually.

    I am hoping for a LUA script that can automatically close them after 20 seconds.
    NOTE: It should close only the alarms that stopped collecting data.

    If you have any questions, please let me know.

    Regards


  • 4.  RE: LUA script close alarm
    Best Answer

    Broadcom Employee
    Posted Apr 09, 2020 01:12 PM
    I will have to think about this and how it might be done.
    The only thing I can think of to make this work would be to get the alarm and then check the time received against the current time and if it is more than 1 minute then close it.

    Does that sound like what you want?
    The main problem I see with this is that it would have to be run on an interval. This is usually discouraged, as it can impact the performance of nas if not done correctly.
    An on-interval profile has to scan all open alarms.
    If you have 10K open alarms, as some clients do, and you set an on-interval profile to run every 1 minute, it scans all 10K alarms every minute and will cause major issues with alarm processing.

    ------------------------------
    Gene Howard
    Principal Support Engineer
    Broadcom
    ------------------------------



  • 5.  RE: LUA script close alarm

    Posted Apr 09, 2020 04:45 PM
    Instead of creating an AO profile that runs on an interval to execute the script, you could use the nas scheduler to run it. As in the LUA snippet included in this thread, you would want to limit the number of alarms the script retrieves. Filtering on the subsystem or subsystem ID (sid) is one way. I believe you may be able to set a custom subsystem ID in the snmpget probe. You might have to create a custom message to do so.
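
    A minimal sketch of that idea, not a tested implementation: it combines the alarm.list("sid", ...) call from the first post with an idle-time check. The sid value "1.1.3" and the 300-second threshold are placeholders, and the sketch assumes the entries returned by alarm.list carry the arrival and supptime fields as epoch seconds. The helper names last_received and find_idle are made up for illustration.

```lua
-- Sketch for a nas scheduler job: close idle alarms in one subsystem only.
-- SID and IDLE_SECONDS are placeholder values; adjust both for your environment.
local SID = "1.1.3"
local IDLE_SECONDS = 300

-- Return the time the alarm was last received: the suppression time
-- if the alarm was suppressed, otherwise its arrival time.
-- ASSUMPTION: both fields are epoch seconds.
local function last_received(a)
   if a.supptime ~= nil then
      return a.supptime
   end
   return a.arrival
end

-- Pure helper: collect the alarms whose last update is older than
-- idle_seconds before now.
local function find_idle(alarms, now, idle_seconds)
   local idle = {}
   for _, a in ipairs(alarms) do
      if last_received(a) < now - idle_seconds then
         idle[#idle + 1] = a
      end
   end
   return idle
end

-- This part only runs inside the nas, where the alarm and action
-- libraries exist; outside the nas it is skipped.
if alarm ~= nil and action ~= nil then
   local alarm_list = alarm.list("sid", SID) or {}
   for _, a in ipairs(find_idle(alarm_list, os.time(), IDLE_SECONDS)) do
      print("Closing idle alarm "..a.nimid.." with message: "..a.message)
      action.close(a.nimid)
   end
end
```

    Because the list is limited to one sid, each scheduler run touches only the snmpget alarms in that subsystem instead of every open alarm.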

    ------------------------------
    [Designation]
    [City]
    ------------------------------



  • 6.  RE: LUA script close alarm

    Posted Apr 09, 2020 04:49 PM
    Question: how often is the snmpget probe polling? Is it less than 20 seconds? How would the script determine that the alarm stopped collecting data? Would it be by the alarm not recurring (suppression count not going up) and/or by the difference between now and the time received being >= X (say, 20 seconds)?
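
    The two criteria in that question could be combined roughly as follows. This is only a sketch under assumptions: the suppression count is assumed to be exposed on the alarm as suppcount, the arrival and supptime fields are assumed to be epoch seconds, and the function name has_stopped is made up for illustration. Storing the previous count between runs is left out.

```lua
-- Decide whether an alarm has stopped recurring.
-- a          : alarm table
-- prev_count : suppression count seen on the previous run, or nil if unknown
-- now        : current time in epoch seconds
-- max_idle   : seconds without an update before the alarm counts as stale
local function has_stopped(a, prev_count, now, max_idle)
   -- Most recent time the nas received this alarm
   local received = a.supptime or a.arrival

   -- Criterion 1: no update within the last max_idle seconds
   local idle = (now - received) >= max_idle

   -- With no previous count available, use the time check alone
   if prev_count == nil then
      return idle
   end

   -- Criterion 2: the suppression count did not go up since the last run.
   -- ASSUMPTION: the count is available as a.suppcount.
   local count = a.suppcount or 0
   return idle and count <= prev_count
end
```

    With a 20-second polling interval, a max_idle of several polling cycles (for example 60 seconds) leaves room for a late or missed poll before the alarm is treated as stale.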

    ------------------------------
    [Designation]
    [City]
    ------------------------------



  • 7.  RE: LUA script close alarm

    Posted Apr 09, 2020 05:07 PM
    We have a script that does this. Here is an adaptation that would be more likely to work for you. Hopefully there are no typos.

    I wouldn't recommend closing an alarm after it has been idle for 20 seconds. That seems way too aggressive. Like Jim said, you would have to be polling the device more often than every 20 seconds, which also seems like a lot. Plus you would have to run the script way too frequently. Like Gene said, that could overload the NAS.

    We run this script in AO profiles that generally have intervals in hours or days. Some run like every 15 minutes but have specific match criteria to avoid doing that on a lot of alarms.

    -- Get idle time from script argument
    local idle_time = tonumber(SCRIPT_ARGUMENT)
    
    -- Check argument
    if idle_time == nil then
       print("ERROR: Idle time must be passed in as the script argument")
       return
    end
    
    -- Get alarm that kicked this script
    local a = alarm.get()
    
    -- Check alarm
    if a == nil then
       print("ERROR: Script must be run by AO profile")
       return
    end
    
    -- Use arrival time as received time by default
    local received = a.arrival
    
    -- Use suppression time if alarm was suppressed
    if a.supptime ~= nil then
       received = a.supptime
    end
    
    -- Reference time is idle_time seconds ago
    local ref_time = os.time() - idle_time
    
    -- Close alarm if received time is prior to reference time
    if received < ref_time then
       action.close(a.nimid)
       print("Closed alarm: "..a.nimid)
    end



  • 8.  RE: LUA script close alarm

    Posted Apr 10, 2020 12:15 PM
    Hello all

    Thank you for all your comments.

    Answering the questions chronologically:

    Gene: That is exactly what I'm looking for. These alarms are very sporadic, since the mission of our service is to resolve them as soon as possible.

    James: The device is polled every 20 seconds. The script should detect a stopped alarm by it not recurring (suppression count not going up).

    Keith: Thanks for the script. I will load it into the NAS and report the results in the next few days.

    Thanks for your support.

    Regards.