VMware vSphere

  • 1.  Esxi 6.7 U3 installed on a HP Proliant DL380 Gen 10

    Posted Jun 20, 2024 03:42 AM

    Strange issue here on a standalone ESXi 6.7 host. HP iLO shows the server is healthy with no hardware issues. The host only has 2 VMs running, and they use e1000 drivers. The host has gone down 2 days in a row at around the same time. No one has touched any networking. The host is up but has no connectivity: I can't ping it, but I can log in over iLO and see that the ESXi host is on and has the correct static IP. If it's rebooted, it works again without any issues. Although I did not try it, I assume restarting the management services would fix it too. What is going on? I downloaded the support logs after the reboot. Any suggestions on which logs to look in? I have no idea what is causing this.



  • 2.  RE: Esxi 6.7 U3 installed on a HP Proliant DL380 Gen 10

    Posted Jun 28, 2024 11:45 AM

    Do you have the latest NIC driver and HPE firmware for the host that are compatible with 6.7? Also, make sure the NIC firmware version matches the driver version listed on the HCL.




  • 3.  RE: Esxi 6.7 U3 installed on a HP Proliant DL380 Gen 10

    Posted Jun 28, 2024 12:43 PM

    I discovered in this process that the RAID controller would crap out and lose connectivity. ESXi is running on the HDD and not off a USB. That's when I discovered the controller error in the vmw logs, then updated the firmware, followed by the HP service pack. HP iLO shows the server is healthy, but I still see some RAID errors. The most important one, 1441 volume offline, has stopped since the service pack update.
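
    For anyone looking for the same kind of error in a downloaded support bundle, here is a minimal Python sketch of one way to search it. It assumes the bundle has already been extracted to a folder. The log name prefixes (vmkernel, vobd, hostd) are common ESXi logs, but the keyword list and the `scan_bundle` helper are only illustrative guesses, not an official list of messages, so adjust them to what your controller actually reports.

```python
import gzip
import re
from pathlib import Path

# Common ESXi log name prefixes; rotated copies usually end in .gz.
LOG_PREFIXES = ("vmkernel", "vobd", "hostd")

# Hypothetical search terms (an assumption, not an official message list).
KEYWORDS = re.compile(r"offline|link down|lost connectivity|timed out",
                      re.IGNORECASE)


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.suffix == ".gz":
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def scan_bundle(bundle_dir, time_prefix=None):
    """Yield (file name, line) for every log line that matches KEYWORDS.

    time_prefix optionally keeps only lines that start with a timestamp
    prefix such as "2024-06-19T03", to focus on the outage window.
    """
    for path in sorted(Path(bundle_dir).rglob("*")):
        if not path.is_file() or not path.name.startswith(LOG_PREFIXES):
            continue
        with open_log(path) as handle:
            for line in handle:
                if time_prefix and not line.startswith(time_prefix):
                    continue
                if KEYWORDS.search(line):
                    yield path.name, line.rstrip()
```

    Calling `scan_bundle("extracted-bundle", "2024-06-19T03")` would list only matching lines from that hour, which helps when the outage hits at about the same time each day. The folder name and timestamp here are placeholders.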




  • 4.  RE: Esxi 6.7 U3 installed on a HP Proliant DL380 Gen 10

    Posted Jun 28, 2024 01:24 PM

    So, if the volume or disk seemed to be suddenly removed from the OS, the host may still seem to partially function but have issues. We had a hardware issue with one of our fabric switches years ago (we were booting from SAN). The symptoms started showing up as randomly dropped connectivity to VMs and not being able to connect to vCenter or, once connected, being dropped. Pings to the host did not seem to work, so we got Cisco involved (we had them for both the switches and UCS). It turned out that a bug in the fabric switch was dropping traffic to the storage, but since the port still showed as up, there was no failover. We had to hard boot all hosts to force a failover to the other switch and get the environment stable.