One of our users is asking for an interesting set of logic for processing BGP alarms. We already have thousands of standing alarms in our environment, and we are trying to cut that number down to help people focus on the real problems. One of our big pain points with alarms is BGP backward transition traps. What my user would like us to do is:
Generate one alarm when a router reports a BGP backward transition.
Increase the Occurrence count as more are generated.
Keep track of the peer router identified in each BGP backward transition trap.
As traps are received indicating that peers have returned to established, remove those peers from the list and decrease the occurrence count.
When the occurrence count transitions from 1 to 0, clear the alarm.
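To make the requested behavior concrete, here is a minimal Python sketch of the logic only. It is not tied to any product's API: the class names (BgpAlarm, BgpAlarmTracker) and method names are hypothetical, and it assumes each trap can be reduced to a (router, peer) pair.

```python
from dataclasses import dataclass, field


@dataclass
class BgpAlarm:
    """Hypothetical stand-in for the single standing alarm on one router."""
    router: str
    peers_down: set = field(default_factory=set)

    @property
    def occurrence_count(self) -> int:
        # The count is derived from the peer list, so the two cannot drift apart.
        return len(self.peers_down)


class BgpAlarmTracker:
    """Hypothetical tracker: at most one alarm per router, keyed by router name."""

    def __init__(self):
        self.alarms = {}  # router name -> BgpAlarm

    def backward_transition(self, router, peer):
        # Create the alarm on the first trap; later traps reuse the same alarm.
        alarm = self.alarms.setdefault(router, BgpAlarm(router))
        alarm.peers_down.add(peer)
        return alarm

    def established(self, router, peer):
        alarm = self.alarms.get(router)
        if alarm is None:
            return None  # nothing outstanding for this router
        alarm.peers_down.discard(peer)
        if not alarm.peers_down:
            # Count went from 1 to 0: clear the alarm.
            del self.alarms[router]
            return None
        return alarm
```

One design choice in this sketch: because the count is the size of the peer set, repeated backward transition traps for the same peer do not inflate it. If the count should instead reflect every trap received, it would need to be stored separately from the peer list.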
Maintaining a list of peer routers is not that difficult. I was looking to use Event Procedures to maintain an additional alarm attribute where we would keep the current peer router list.
However, the Occurrence count is where I'm stuck. I know that when a second alarm of the same type comes in and is not unique, the count increases. What I don't know is how to decrease the occurrence count. Is that even possible?