CRITICAL Chassis down alarm did not clear after router back online

1. CRITICAL Chassis down alarm did not clear after router back online

Jon Velazquez
Posted Jan 10, 2018 02:18 PM

From time to time we notice a CRITICAL chassis down alarm that does not clear by itself. How do we troubleshoot this to find the root cause? BGP re-established a few minutes later. The alarm shows as stale, but it should have cleared by itself.

ITGC2W000145%/d/CA_Spectrum/vnmsh > show alarms -a mh=0x50e171
ID Date Time PCauseId MHandle MName MTypeName Severity LastOccurDate&Time Ack Stale Assignment Status
46955415 01/05/2018 02:12:41 0x10f71 0x50e171 VNDDRTR1.vfc.com Rtr_Cisco MAJOR 01/05/2018 02:12:41 No Yes
13845619 05/20/2017 15:28:40 0x1030a 0x50e171 VNDDRTR1.vfc.com Rtr_Cisco OK 01/06/2018 06:06:18 Yes Yes
46955413 01/05/2018 02:12:41 0x10f69 0x50e171 VNDDRTR1.vfc.com Rtr_Cisco CRITICAL 01/05/2018 02:12:41 No Yes
ITGC2W000145%/d/CA_Spectrum/vnmsh > ./disconnect
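
To track which alarms on a model are stale, the `show alarms` output above can be captured to a file and filtered. The following is a minimal Python sketch, not part of the original post. It assumes one alarm per line, the column order shown in the header, and that no field (for example MName) contains spaces. The function names `parse_alarms` and `stale_alarms` are hypothetical.

```python
from typing import Dict, List

# Column order copied from the header line of the vnmsh output.
# Assumption: fields contain no spaces, so whitespace splitting is safe.
# The trailing Assignment and Status columns are ignored here.
FIELDS = ["ID", "Date", "Time", "PCauseId", "MHandle", "MName",
          "MTypeName", "Severity", "LastOccurDate", "LastOccurTime",
          "Ack", "Stale"]


def parse_alarms(text: str) -> List[Dict[str, str]]:
    """Turn captured `show alarms` text into one dict per alarm row."""
    alarms = []
    for line in text.splitlines():
        tokens = line.split()
        # Alarm rows start with a numeric alarm ID; this skips the
        # header line, shell prompts, and blank lines.
        if len(tokens) < len(FIELDS) or not tokens[0].isdigit():
            continue
        alarms.append(dict(zip(FIELDS, tokens)))
    return alarms


def stale_alarms(alarms: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the alarms whose Stale column reads Yes."""
    return [alarm for alarm in alarms if alarm["Stale"] == "Yes"]
```

Running this on periodic captures and comparing the stale alarm IDs between runs is one way to note when a stale alarm first appears.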

Events Tab
Jan 5, 2018 2:14:30 AM EST   VNDDRTR1.vfc.com   A "cbgpFsmStateChange" event has occurred, from Rtr_Cisco device, named VNDDRTR1.vfc.com.

The BGP cbgpFsmStateChange notification is generated
        for every BGP FSM state change. The bgpPeerRemoteAddr
        value is attached to the notification object ID.

bgpPeerLastError = 0.0
bgpPeerLastError.bgpPeerRemoteAddr = 10.255.28.29
bgpPeerState = established
cbgpPeerLastErrorTxt =
cbgpPeerPrevState = openconfirm
Jan 5, 2018 2:14:30 AM EST   VNDDRTR1.vfc.com   A bgpEstablished trap has been received for this device. The peer router is 10.255.28.29, the current state is established, and the LastError is 0.0.
Jan 5, 2018 2:14:30 AM EST   VNDDRTR1.vfc.com_Se0/0/0   The BGP Peering session from VNDDRTR1.vfc.com to US MPLS Sprint AS1803 is established.
2. Re: CRITICAL Chassis down alarm did not clear after router back online

Broadcom Employee

Jason Meader
Posted Jan 11, 2018 10:37 AM

The BGP notification isn't configured to clear chassis alarms, unless you were just posting that to note the device was back online. If the alarm is stale, that may be why it didn't clear. Were you able to clear it manually?
Cheers
Jay
3. Re: CRITICAL Chassis down alarm did not clear after router back online

Jon Velazquez
Posted Jan 11, 2018 05:08 PM

I can clear it manually. I posted the BGP event only to show that the device had connectivity again about 2 minutes after the alarm timestamp. I don't understand why this particular alarm did not clear on its own. Notice the other minor/major/critical stale alarms. I can clear all of those and start monitoring for when they occur.

Since the device seems to have been reachable within 3 minutes of the alarm timestamp, how can I investigate why the chassis down alarm did not clear on its own?
4. Re: CRITICAL Chassis down alarm did not clear after router back online

Best Answer

Broadcom Employee

Jason Meader
Posted Jan 12, 2018 03:53 PM
If the device comes back online, the chassis down alarm should clear on its own. The only case I've seen where it doesn't is when the Chassis event/alarm has been customized instead of using the default (check the /custom/Events/EventDisp). If you have the default event configuration and this keeps happening, you may need to open a case so we can review the data.
Cheers
Jay
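
The check suggested in the reply above can be sketched as a plain text search of a custom EventDisp file for lines that mention a given hexadecimal code. This is a minimal Python sketch under stated assumptions: the file is treated as plain text with no knowledge of its syntax, the code searched for in the usage example (0x10f69) is the PCauseId of the CRITICAL alarm from the first post, and whether a customization would reference that exact code is not confirmed by the thread. The function name `lines_referencing` is hypothetical, and the file path is left for the reader to supply.

```python
import re
from typing import List, Tuple

# Matches hexadecimal tokens such as 0x10f69 or 0x00010F69.
HEX_TOKEN = re.compile(r"0x[0-9a-fA-F]+")


def lines_referencing(text: str, code: int) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs whose hex tokens include `code`.

    Tokens are compared as integers, so zero padding and letter case
    in the file do not matter.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        if any(int(token, 16) == code for token in HEX_TOKEN.findall(stripped)):
            hits.append((number, stripped))
    return hits
```

To use it, read the contents of the custom EventDisp file into a string and call `lines_referencing(contents, 0x10f69)`. An empty result only means this code does not appear in that file; it does not rule out other customizations.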
5. Re: CRITICAL Chassis down alarm did not clear after router back online

Jon Velazquez
Posted Jan 15, 2018 10:04 AM

OK, we don't have an EventDisp under /custom/Events. I'll keep an eye out from this point forward to note when stale alarms occur, especially right before and after a SpectroSERVER shutdown/restart, as I know that is one situation in which this can occur.

DX NetOps
