DX NetOps

  • 1.  MODULE REMOVAL DETECTED alarms on blade servers

    Posted Oct 31, 2012 01:08 PM
    Our HP blade servers keep triggering bunches of "MODULE REMOVAL DETECTED" alarms every few minutes in Spectrum. They all clear at the next poll, but then a few minutes later they pop back up again. I can't figure out what might be triggering these. All the modules in a chassis seem to throw the alarms at the same time. I see a "This module has been pulled" event followed immediately by a "This module is online" event, then a couple of minutes later I'll get a "This module is present", a "This module is online", and a "This module is now functioning properly" event and the offline event clears. Can anyone suggest what it is that Spectrum is not finding when it polls the device?


  • 2.  RE: MODULE REMOVAL DETECTED alarms on blade servers

    Posted Nov 01, 2012 04:20 AM
    I think a lot of the blade events in Spectrum are based on SNMP traps instead of polling, so my first step would be to check if the bladesystem is sending traps about module removal and to check the logging on that side.


  • 3.  RE: MODULE REMOVAL DETECTED alarms on blade servers

    Posted Nov 01, 2012 06:18 AM

    MichielHelder wrote:

    I think a lot of the blade events in Spectrum are based on SNMP traps instead of polling, so my first step would be to check if the bladesystem is sending traps about module removal and to check the logging on that side.
    Agree with this. If you can run tcpdump, you will more than likely see SNMP traps being received from your HP management system (the source might not be the blades themselves but the software/system managing them, much as vCenter is to VMs).
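
A minimal sketch of the tcpdump check suggested above. It assumes traps arrive on the default SNMP trap port (UDP 162), that tcpdump is run as root on the host receiving the traps, and that 10.10.5.1 is only a placeholder for the blade management address. The function names `trap_filter` and `capture_traps` are hypothetical helpers, not part of Spectrum or tcpdump.

```shell
#!/bin/sh
# Build a tcpdump capture filter matching SNMP traps (UDP destination
# port 162) sent from a single source host.
# trap_filter is a hypothetical helper name used only for illustration.
trap_filter() {
    printf 'udp dst port 162 and src host %s' "$1"
}

# Capture traps from the given host (needs root).
# -n      : do not resolve addresses or ports to names
# -i any  : listen on every interface (Linux pseudo-interface)
capture_traps() {
    tcpdump -n -i any "$(trap_filter "$1")"
}
```

Running `capture_traps 10.10.5.1` while the alarms are firing should show whether any traps arrive from that address. If nothing is captured, the events are more likely generated by polling than by traps.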


  • 4.  RE: MODULE REMOVAL DETECTED alarms on blade servers

    Posted Nov 02, 2012 05:15 AM
    Joeh,

    Can you post the event messages for the module removal issue here? In parallel, you can enable trap debugging on the VNM for this IP, as shown below, to see whether it's a trap.

    ./update action=0x10245 mh=<VNM mh> index=0,attr=1,type=0x13,val=10.10.5.1


    kalyan


  • 5.  RE: MODULE REMOVAL DETECTED alarms on blade servers

    Posted Nov 02, 2012 05:16 AM
    One more thing: you can also use the ECE columns to see whether it is a trap event or a polled event. These columns are hidden by default.

    kalyan


  • 6.  RE: MODULE REMOVAL DETECTED alarms on blade servers

    Posted Nov 05, 2012 02:27 PM
    The alarm is being triggered by event 0x10f6d. Interestingly, the only trap I can find associated with that cause code is an old Cabletron trap. I've at least gotten the OK to change the alarm to a "minor" severity, although I think I'd be safe enough suppressing it entirely. I've got a consultant running this cause code down to see if it's hard-coded somewhere. Thanks for all the suggestions. I'll update this post when I get some more information.


  • 7.  RE: MODULE REMOVAL DETECTED alarms on blade servers

    Posted Nov 08, 2012 11:03 AM
    More info - the event is derived by polling the BladeCenter devices and is generated in code. It's not a trap.