DX Unified Infrastructure Management


Event correlation (sharing ideas)


    Posted Apr 19, 2016 06:17 PM

    Hi,

     

    I want to share a requirement that may be useful to others, and to ask whether my logic is correct.

     

    We have a net_connect probe pinging multiple external IPs.

     

    But when the Internet link is heavily utilized, several "threshold ping" or "ping failure" alarms are raised. These are false positives!

     

    Adjusting the probe will not solve this, because it only happens when the primary link is heavily utilized, which in fact does not occur often.

     

    Our idea is to create a condition under which a "threshold ping" or "ping failure" alarm is generated for the profiles only if there is no problem on our main Internet link.

     

    First try:

    - Create two triggers that, when activated, fire an AO (Auto-Operator) profile that runs a script:

    Trigger one:

    --level = "Critical"

    --hostname = "WEB *"

    --message Count = "1"

    --NMS Probe Name = "net_connect"

     

    AND

     

    Trigger Two:

    --level = "Critical"

    --hostname = "WEB-MAIN"

    --message Count = "1"

    --NMS Probe name = "interface_traffic"

    --message String = "Internet overloads ..."

     

    Action

     

    Script (Close Alarms 1)

     

    al = alarm.list("severity", "Critical")
    if al ~= nil then
       -- Close every net_connect alarm that has repeated at most twice
       for _, a in pairs(al) do
          --print(a.nimid, "-", a.sid, "-", a.source, "-", a.hostname)
          if a.suppcount <= 2 and a.prid == "net_connect" then
             --print(a.suppcount, "-", a.nimid)
             action.close(a.nimid)
          end
       end
    end

     

    Result:

    --The triggers only fire on new events, so once trigger two has already fired, the high-utilization event is not identified again when trigger one catches a new "threshold ping" or "ping failure" alarm.

     

     

    Second attempt:

     

    - Create an AO

     

    AO profile (script type):

    --level: "Critical"

    --hostname: "WEB *"

    --message Counter: "1"

    --NMS Probe Name: "net_connect"

     

    Action

     

    Script (Close Alarms 2)

     

    al = alarm.list("severity", "Critical")
    if al ~= nil then
       -- Outer loop: look for the link-utilization alarm
       for _, a in pairs(al) do
          if a.prid == "interface_traffic" and a.source == "LNK_NTC_MTZ_TESTE" then
             --print(a.nimid, "-", a.sid, "-", a.source, "-", a.hostname)
             -- Inner loop: close net_connect alarms that have repeated at most twice
             for _, b in pairs(al) do
                if b.suppcount <= 2 and b.prid == "net_connect" then
                   --print(b.suppcount, "-", b.nimid)
                   action.close(b.nimid)
                end
             end
          end
       end
    end

     

    Result:

    --It works, but the alarm appears in the alarm panel and then disappears!

     

    NOTE: Pre-processing rules do not work for this case, unless I set the alarm to invisible and then make it visible again with a script.

    Is that the best option?
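
    The invisible-then-visible idea can be sketched as plain decision logic, kept separate from any NAS calls so it can be tried outside the probe. This is only a sketch under assumptions: the alarm fields (nimid, prid, source, suppcount) and the link source name are taken from the scripts above, the function names link_overloaded and triage are hypothetical, and it assumes a pre-processing rule has already hidden the incoming net_connect alarms.

```lua
-- Sketch only: pure decision logic, no NAS calls are made here.
-- LINK_SOURCE is the source used in "Close Alarms 2"; adjust as needed.
local LINK_SOURCE = "LNK_NTC_MTZ_TESTE"

-- Returns true if the list contains the link-utilization alarm.
local function link_overloaded(alarms)
   for _, a in pairs(alarms) do
      if a.prid == "interface_traffic" and a.source == LINK_SOURCE then
         return true
      end
   end
   return false
end

-- Splits net_connect alarms into two lists of nimids (hypothetical helper):
--   to_close  : link is overloaded and the alarm repeated at most twice
--   to_reveal : every other net_connect alarm, treated as genuine
local function triage(alarms)
   local to_close, to_reveal = {}, {}
   local overloaded = link_overloaded(alarms)
   for _, a in pairs(alarms) do
      if a.prid == "net_connect" then
         if overloaded and a.suppcount <= 2 then
            table.insert(to_close, a.nimid)
         else
            table.insert(to_reveal, a.nimid)
         end
      end
   end
   return to_close, to_reveal
end

-- Made-up sample data, for illustration only.
local sample = {
   { nimid = "A1", prid = "interface_traffic", source = LINK_SOURCE, suppcount = 0 },
   { nimid = "B2", prid = "net_connect", source = "WEB-01", suppcount = 1 },
   { nimid = "C3", prid = "net_connect", source = "WEB-02", suppcount = 5 },
}
local to_close, to_reveal = triage(sample)
```

    Inside NAS, the list would come from alarm.list and each nimid in to_close would be passed to action.close, as in the scripts above. The call that restores visibility for the nimids in to_reveal depends on the NAS version and is deliberately left out here. Because the alarms stay hidden until the script decides, nothing should flicker in the alarm panel, at the cost of a delay equal to the AO schedule interval.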

     

    Thank you.