  • 1.  HBA has failed alarm?

    Posted Mar 15, 2007 10:26 PM

    Does anyone have a way of getting an alert when an HBA fails?

    The client's fiber switch doesn't send alerts and no SNMP monitoring is configured, so I'm looking for a way in ESX or VC to be notified when I lose an HBA connection from the server to the switch. The same goes for a network card failure.




  • 2.  RE: HBA has failed alarm?

    Posted Mar 18, 2007 06:20 AM

    There is probably something in /var/log/vmkwarning that you could monitor, or you could use some kind of syslog script.

  • 3.  RE: HBA has failed alarm?

    Posted Mar 20, 2007 06:12 AM

    Or could your SAN controllers alert you?


  • 4.  RE: HBA has failed alarm?

    Posted Mar 26, 2007 03:01 AM

    I can get alarms from the SAN controllers but I would really like to get an alert from ESX as well.

  • 5.  RE: HBA has failed alarm?

    Posted Apr 12, 2007 04:41 AM

    I was thinking about this some more and was wondering if you are running dual HBAs and dual NICs on your host server?

    You could try pulling one of the HBA cables and see what shows up in the logs. If you found a key phrase to search on in a particular log file, maybe DISK or VMHBA, that would help.

    Also, I was wondering whether you were looking for real-time notification or whether a daily notification would be OK? If daily is OK, you could throw together a script that looks for new entries in the log files and alerts you if there are any errors.
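
    As a rough illustration of the "look for new entries" idea, here is a minimal Python sketch that remembers how far it read last time and returns only new lines containing a keyword. The log path is the one suggested earlier in this thread; the state-file path, the keyword list, and the function names are hypothetical, and it assumes a Python 3 interpreter on whatever machine can read the log. It does not send any alert itself.

```python
import os

# Assumptions for illustration only: the log path is the one suggested
# earlier in this thread; the state-file location is made up.
LOG_PATH = "/var/log/vmkwarning"
STATE_PATH = "/tmp/hba_log_offset"
KEYWORDS = ("switchover", "marked offline")


def read_offset(state_path):
    """Return the byte offset saved by the previous run, or 0 if none."""
    try:
        with open(state_path) as f:
            return int(f.read().strip() or 0)
    except (OSError, ValueError):
        return 0


def new_matching_lines(log_path, state_path, keywords=KEYWORDS):
    """Return lines added since the last run that contain any keyword."""
    offset = read_offset(state_path)
    # If the log was rotated (file is now shorter), start again from the top.
    if os.path.getsize(log_path) < offset:
        offset = 0
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
        new_offset = log.tell()
    # Save the new position so the next run skips what was just read.
    with open(state_path, "w") as f:
        f.write(str(new_offset))
    lines = data.decode("utf-8", errors="replace").splitlines()
    return [ln for ln in lines
            if any(k.lower() in ln.lower() for k in keywords)]


if __name__ == "__main__":
    for line in new_matching_lines(LOG_PATH, STATE_PATH):
        print(line)
```

    Run once a day from cron, something like this would print only the new matching lines, which could then be mailed or otherwise forwarded.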

  • 6.  RE: HBA has failed alarm?
    Best Answer

    Posted Apr 12, 2007 06:08 AM

    If a path fails (or fails over to a secondary HBA), you may see some or all of the following:

    "Manual switchover to path vmhbax:y:z begins"

    "Changing active path to vmhbax.y.z"

    "Manual switchover to vmhbax.y.z completed successfully"


    "Delaying failover to path vmhbax.y.z"

    "Manual switchover to path vmhbax.y.z begins."

    "Manual switchover to vmhbax.y.z completed successfully."

    -or, our worst nightmare-

    "Manual switchover to path vmhbax.y.z begins."

    "Did not switchover to vmhbax.y.z"

    (where x.y.z are placeholders, of course)

    Further, if ESX 3.x detects a hardware failure of the HBA, it may also log the following text:

    "Unrecoverable hardware error : Adapter being marked offline"

    Where to look? If it has already been written to the logs, you can check for it here:


    Or, for real time access to logging that hasn't been written to disk yet:


    What to look for? The key word here is 'switchover'
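
    To make the messages above easier to act on, here is a small Python sketch that sorts a log line into a severity bucket. The patterns come from the messages quoted in this post; the severity labels and the `classify` function are assumptions for illustration, not anything ESX itself reports.

```python
import re

# Patterns are based on the messages quoted in this post. The severity
# labels are an illustrative assumption, not something ESX reports.
PATTERNS = [
    ("critical", re.compile(r"Did not switchover to", re.I)),
    ("critical", re.compile(r"Adapter being marked offline", re.I)),
    ("warning", re.compile(r"Delaying failover to path", re.I)),
    ("warning", re.compile(r"switchover to (path )?\S+ begins", re.I)),
    ("info", re.compile(r"switchover to \S+ completed successfully", re.I)),
    ("info", re.compile(r"Changing active path to", re.I)),
]


def classify(line):
    """Return the severity of the first matching pattern, or None."""
    for severity, pattern in PATTERNS:
        if pattern.search(line):
            return severity
    return None


if __name__ == "__main__":
    sample = "Manual switchover to path vmhba1:0:0 begins"
    print(classify(sample), sample)
```

    The failed switchover and the offline adapter are checked first so that the worst cases are never masked by a more general 'switchover' match.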

  • 7.  RE: HBA has failed alarm?

    Posted Apr 20, 2007 08:44 PM

    Thanks everyone, it's all very helpful. We are using IBM x366 servers, and yes, I am looking for a real-time notification to let me know an HBA has failed.

    Grasshopper, your input was very helpful, thanks.

    I can look through logs myself, but I really don't want to have to do that. I'm looking into what damage will be done by loading a Director agent onto ESX. The last time I loaded it, it was using Java, which I really didn't like.

    I don't get anything on the HBAs from SNMP or in the normal Director agent, but I'm going to load in an RSA card and see if that does anything for me.


  • 8.  RE: HBA has failed alarm?

    Posted Apr 22, 2007 04:53 AM

    Unfortunately, I don't think Director or an RSA card will alert you on failed HBAs.

    I agree with you that loading Director is not a good idea. In my opinion the Director software causes more trouble than it is worth. The RSAs are really nice, but both Director and the RSAs seem limited in their ability to report on HBA problems.

    How are you at shell scripts? :)

  • 9.  RE: HBA has failed alarm?

    Posted Apr 26, 2007 10:00 PM

    I'm starting to get the impression that the only way is going to be with a shell script. We've loaded the RSA card but so far it doesn't look too promising.

    Thanks for everyone's input. If lightning strikes in the middle of the night and you get some genius idea, please let me know.

    If I script something that works :) I'll be more than happy to post it.


  • 10.  RE: HBA has failed alarm?

    Posted Apr 12, 2007 08:33 AM

    What hardware are you using?

    We've got HP DL585s. We have installed the HP Insight agents on each of our hosts and set them up to send SNMP traps to HP OpenView Operations when things like HBA or NIC connections fail.

    I'm sure there are similar agents for Dell and IBM hardware.