DX Unified Infrastructure Management


clearing alerts which not updated for last X time

  • 1.  clearing alerts which not updated for last X time

    Posted Jul 07, 2015 04:20 AM

    Hi guys,

     

    Does anyone know how to clear alerts that have not been updated (the count has not grown, or the time received field has not changed) within the past X time range?

     

     

    I have created alerts with an AO profile that performs a new_alarm action after a few triggers are met.

     

    I can't clear these alerts, because the original alerts that fire these triggers have a pre-processing rule that makes them invisible, so I can see only the new alert I created.

     

    best regards!



  • 2.  Re: clearing alerts which not updated for last X time

    Posted Jul 07, 2015 07:14 AM

    Hi,

     

    I use a Lua script for the cleanup myself. I'm currently on vacation, so I can't get the script for you, but this is basically what it does:

     

    1. Runs once an hour (or something similar), scheduled by nas
    2. Gets a list of all alarms and loops through them
    3. Checks severity, I only want to clean alarms with severity lower than major
      1. Each lower severity has a cleanup threshold specified: informational = 24h, warning = 48h, minor = 72h, or something similar
      2. If the last update for an alarm of this severity occurred longer ago than the specified threshold, close it.
        1. I've also built a scheduling system for this, like "business hours". I don't remember in detail how smart it is, but the idea is that it doesn't clean alarms that haven't been active long enough during business hours, so that people have had a chance to see them. For example, on Sunday I don't want to clean up alarms created on Saturday, since it's likely no one has seen them.
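
The threshold part of the steps above can be sketched in plain Lua, independent of the nas API. This is only an illustration under stated assumptions: the `thresholds` table, the `is_stale` function, and the use of epoch seconds are hypothetical names and choices made for the example, and the level numbers (1 = informational, 2 = warning, 3 = minor) are the ones used by the script posted later in this thread.

```lua
-- Hypothetical per-severity cleanup thresholds, in hours.
-- Keys are alarm levels: 1 = informational, 2 = warning, 3 = minor.
local thresholds = { [1] = 24, [2] = 48, [3] = 72 }

-- Returns true when an alarm of the given level was last updated more
-- than its threshold ago. Times are epoch seconds (an assumption for
-- this sketch; nas has its own timestamp helpers).
local function is_stale(level, last_update, now)
   local limit = thresholds[level]
   if not limit then
      return false   -- no threshold (e.g. major, critical): leave it alone
   end
   return (now - last_update) / 3600 > limit
end

local now = os.time()
print(is_stale(1, now - 25 * 3600, now))   -- informational, 25 hours old
print(is_stale(2, now - 25 * 3600, now))   -- warning, 25 hours old
print(is_stale(4, now - 500 * 3600, now))  -- major, no threshold defined
```

Keeping the thresholds in one table means a new severity or a different limit is a one-line change rather than another `elseif` branch.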

     

    If you're interested, I might be able to get the script for you next week.

     

    -jon



  • 3.  Re: clearing alerts which not updated for last X time

    Posted Jul 07, 2015 08:45 AM

    hi,

     

    If you can send it to me when you are back at work, that would be great, because I don't know Lua scripting.

     

    Best regards, and have a great vacation!



  • 4.  Re: clearing alerts which not updated for last X time

    Posted Jul 13, 2015 05:29 AM

    Hi,

     

    I'm not sure if it'll make much sense without scripting knowledge, but here it is anyway:

     

    local debug = 0
    local t_close_alarms = {}
    local t_time = {}
    local alarms = alarm.list()
    local now = timestamp.now()
    
    -- Current hour (0-23) and weekday (0 = Sunday ... 6 = Saturday)
    t_time["hour"] = tonumber(timestamp.format(now, "%H"))
    t_time["day"] = tonumber(timestamp.format(now, "%w"))
    
    -- Collect the nimids of alarms that are old enough for their severity
    for _,a in pairs(alarms) do
       local time_diff = timestamp.diff(timestamp.fromISO(a.time_origin),"hours", now)
       local print_line = 0
    
       if (a.level == 3 and time_diff > 72) then
          -- minor: close after 72 hours
          table.insert(t_close_alarms, a.nimid)
          print_line = 1
       elseif (a.level == 2 and time_diff > 48 and t_time["day"] ~= 0 and t_time["day"] ~= 1) then
          -- warning: close after 48 hours, but not on Sunday or Monday
          if (t_time["day"] == 1 and t_time["hour"] <= 12) then
             -- never reached: day 1 (Monday) is already excluded above
          else
             table.insert(t_close_alarms, a.nimid)
             print_line = 1
          end
       elseif (a.level == 1 and time_diff > 24 and t_time["day"] ~= 0 and t_time["day"] ~= 1) then
          -- informational: close after 24 hours, but not on Sunday or Monday
          if (t_time["day"] == 6 and t_time["hour"] >= 12) then
             -- Saturday from noon on: leave informational alarms open
          else
             table.insert(t_close_alarms, a.nimid)
             print("close info")
             print_line = 1
          end
       end
    
       if print_line == 1 and debug == 1 then
          printf("%s: %s - %d: %d", a.nimid, a.time_origin, time_diff, a.level)
       end
    end
    
    -- Close all collected alarms in one call (comma-separated nimids)
    local s_close_alarms = table.concat(t_close_alarms, ",")
    
    printf("closing %d alarms: %s", #t_close_alarms, s_close_alarms)
    action.close(s_close_alarms)
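
The weekday rules in the script can be pulled out into a small function so they can be checked outside nas. The sketch below is an assumption-laden illustration, not part of the original script: `cleanup_allowed` is a hypothetical name, it mirrors the conditions as posted, and it leaves out the age check. As posted, the Monday-morning branch for warnings can never run, because Mondays are already excluded by the outer condition, so the sketch omits it.

```lua
-- Hypothetical helper: may an alarm of this level be closed right now?
-- day follows the "%w" convention (0 = Sunday ... 6 = Saturday),
-- hour is 0-23. The age threshold is checked elsewhere.
local function cleanup_allowed(level, day, hour)
   if level == 3 then
      return true      -- minor: no weekday restriction
   end
   if level ~= 1 and level ~= 2 then
      return false     -- other levels are never cleaned up
   end
   if day == 0 or day == 1 then
      return false     -- hold off on Sunday and Monday
   end
   if level == 1 and day == 6 and hour >= 12 then
      return false     -- informational: also hold off on Saturday from noon
   end
   return true
end

-- Plain-Lua equivalents of the script's t_time fields.
local day  = tonumber(os.date("%w"))
local hour = tonumber(os.date("%H"))
print(cleanup_allowed(1, day, hour))
```

With the rules in one place, changing the quiet period (for example, to hold warnings only until Monday noon) means editing a single function.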
    

     

    -jon



  • 5.  Re: clearing alerts which not updated for last X time

    Posted Jul 19, 2015 07:12 AM

    hi,

     

    Thank you! I will look into it and try to adapt it to my environment.

     

    best regards!



  • 6.  Re: clearing alerts which not updated for last X time

    Posted Jul 20, 2015 02:21 AM

    hi,

     

    I changed the script to the one below. It looks for critical alerts that are visible, contain the string NetworkTeam in the message, and have not had their time received field updated in the last hour. Everything worked until I changed line 13 to use time_supp instead of time_origin. Do you know what I did wrong?

    The error I get:

    Error in line 13: bad argument #1 to 'fromISO' (string expected, got nil)

     

    local debug = 0
    local t_close_alarms = {}
    local t_time = {}
    local alarms = alarm.list()
    local now = timestamp.now()
    
    
    t_time["hour"] = tonumber(timestamp.format(now, "%H"))
    t_time["day"] = tonumber(timestamp.format(now, "%w"))
    
    
    for _,a in pairs(alarms) do
       local time_diff = timestamp.diff(timestamp.fromISO(a.time_supp),"hours", timestamp.now())
       local print_line = 0
    
    
       if (a.level == 5 and time_diff > 1 and string.match(a.message, "NetworkTeam") and a.visible == 1) then
          table.insert(t_close_alarms, a.nimid)
          print_line = 1
       end
    
    
       if print_line == 1 and debug == 1 then
          printf("%s: %s - %d: %d", a.nimid, a.time_origin, timestamp.diff(timestamp.fromISO(a.time_origin), "hours", timestamp.now()),a.level)
       end
    end
    
    
    local s_close_alarms = table.concat(t_close_alarms, ",")
    
    
    printf("closing %d alarms: %s", #t_close_alarms, s_close_alarms)
    action.close(s_close_alarms)


    regards



  • 7.  Re: clearing alerts which not updated for last X time

    Posted Jul 20, 2015 05:44 PM

    I suspect this might be getting hung up on an alarm with only one occurrence. If the alarm has not repeated, it has never been suppressed, and the suppression time is empty.

     

    You probably need to check if time_supp is nil and then use time_arrival instead. If the alarm has never been suppressed, time_arrival is the correct time to show when you last received the alarm.



  • 8.  Re: clearing alerts which not updated for last X time

    Posted Jul 20, 2015 06:48 PM

    good point



  • 9.  Re: clearing alerts which not updated for last X time

    Posted Jul 20, 2015 07:02 PM

    As Keith suggested, time_supp might be nil. Without having tested it, you could remedy that with something like this:

     

    local time_diff = timestamp.diff(timestamp.fromISO(a.time_supp or a.time_arrival),"hours", timestamp.now())
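
The same fallback can be written as a small helper that also copes with an alarm carrying neither field. This is a sketch in plain Lua with stand-in alarm tables: `last_seen` is a hypothetical name, the sample time strings are made up, and the empty-string check is a defensive assumption rather than documented nas behaviour.

```lua
-- Hypothetical helper: return the time string that best represents when
-- the alarm was last received, or nil when no usable value is present.
local function last_seen(a)
   local t = a.time_supp
   if t == nil or t == "" then
      t = a.time_arrival   -- never suppressed: fall back to arrival time
   end
   if t == "" then
      t = nil
   end
   return t
end

-- Stand-in alarm tables; real ones would come from alarm.list().
local repeated = { time_arrival = "2015-07-20 10:00:00", time_supp = "2015-07-20 16:30:00" }
local single   = { time_arrival = "2015-07-20 10:00:00" }
local broken   = {}

for _, a in ipairs({ repeated, single, broken }) do
   local t = last_seen(a)
   if t then
      print("last seen: " .. t)
   else
      print("no timestamp, skipping")   -- avoids passing nil on to fromISO
   end
end
```

In the cleanup loop, the returned value would be passed to timestamp.fromISO only when it is not nil.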
    
    


  • 10.  Re: clearing alerts which not updated for last X time

    Posted Jul 21, 2015 08:54 AM

    hi,

     

    Thank you for the answers! I tried this and it is now working well.

    I hope it will do the job I need: removing alerts that haven't been updated for the last hour.

     

    regards!



  • 11.  Re: clearing alerts which not updated for last X time

    Posted Oct 03, 2015 12:57 AM

    Hello,

     

    I was trying to understand how to clear an alarm created by a script that tests the correlation between three ping tests for a site URL.

     

    I managed to create the script that ping-tests the three sites, but I did not understand how to remove the alarm once one of the pings comes back.

     

    I think this script will serve me.

     

    Or do you think I should take its logic and restrict it to closing this specific alert?

     

    Or is there an easier way to do this with a basic AO profile?

     

    Thank you.



  • 12.  Re: clearing alerts which not updated for last X time

    Posted Oct 12, 2017 11:06 AM

    Great question and a great response from John.

     

    Please mark this as answered