I’ve been working at this IT department for over 10 years now and this year our company decided it would be rebuilding our remote offices server infrastructure on HP DL380 G7 servers running ESXI 5.x. These servers would be spread around the continental US. The bandwidth requirements depends on the data load are equipped with fractional T3 anywhere from 5-45meg. Our rollout so far has gone well but there have been some alerting and visibility issues with the hardware. Some of these problems can be attributed to knowledge transfer issues. The main build engineer left the company as the build process was engineered and SNMP was never enabled on these servers. Since operations expected SNMP hardware alerts from enterprise ID 232 (HP/Compaq Prolaint) when one of the components failed, there was massive confusion on why this alerting didn’t work because nobody at our company totally understood what was intended to get this new version of ESXI working to report hardware faults. Lucky for us we have a couple of seasoned support engineers here that assumed that the legacy SNMP infrastructure was a supported direction. Once SNMP was configured, the traps began to flow and we got over that hump. Lesson learned.
Another issue has come up with the sphere hardware tab. As hardware faults report into vCenter, that alert status doesn’t seem to clear on its own. I’ve had stale alerts as old as 30 days. Some I’ve been able to clear with the update link and others I had to being the server down and reboot it.