VMware vSphere


Unexpected, repeated shutdowns of ESXi server

  • 1.  Unexpected, repeated shutdowns of ESXi server

    Posted Dec 21, 2017 05:05 PM

    For about two weeks, my ESXi server on an HP ProLiant DL380 G5 has been shutting down unexpectedly every two days.

    After that I have to restart it manually.

    When I look at the vSphere dashboard, the only alert message is this:

    Any idea on how to fix it?

    Thanks in advance



  • 2.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Dec 21, 2017 05:08 PM

    That usually means the cache battery on the storage controller is dead and you must replace it. Whether that is responsible for the ESXi host shutdowns, I don't know, but if you are using those internal drives in a RAID configuration, and especially if you are using write-back caching, you should plan to replace it ASAP.



  • 3.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Dec 26, 2017 03:18 PM

    Hi daphnissov, thanks for your reply.

    I'll try that ASAP.



  • 4.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Dec 21, 2017 05:58 PM

    Hello m4biz,

    I advise logging in via SSH and going through the ESXi host logs. The article below lists the log files responsible for each function and their respective locations.

    VMware Knowledge Base

    Then look at the HP server logs through HP Systems Insight Manager.

    https://www.hpe.com/us/en/product-catalog/detail/pip.489496.html
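
    As a rough sketch of the log review suggested above, the snippet below scans log lines for keywords that often show up around an unexpected stop. The paths in `LOG_FILES` are the usual ESXi locations as far as I know, while the keyword list and the helper names `find_suspect_lines` and `scan_file` are assumptions made up for illustration. It is meant to be run with a recent Python against copies of the logs, passing their paths as arguments.

```python
import re
import sys
from pathlib import Path

# Usual ESXi log locations (an assumption; confirm them for your ESXi version).
LOG_FILES = ("/var/log/vmkernel.log", "/var/log/vmksummary.log", "/var/log/hostd.log")

# Hypothetical keyword list; tune it to what your host actually logs.
KEYWORDS = ("shutdown", "halt", "power", "panic", "reboot")


def find_suspect_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


def scan_file(path, keywords=KEYWORDS):
    """Scan one log file; return an empty list if the file does not exist."""
    log_path = Path(path)
    if not log_path.is_file():
        return []
    with log_path.open(errors="replace") as handle:
        return find_suspect_lines(handle, keywords)


if __name__ == "__main__":
    # Use the paths given on the command line, or fall back to the defaults.
    for log in sys.argv[1:] or LOG_FILES:
        for number, text in scan_file(log):
            print(f"{log}:{number}: {text}")
```

    A keyword match only narrows down where to read; the lines just before the last entry written ahead of each outage are usually the ones worth looking at by hand.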



  • 5.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Dec 21, 2017 06:23 PM

    For the unexpected shutdown, log in to the server's iLO and check the System Management Logs.

    André



  • 6.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Dec 21, 2017 08:00 PM

    Check whether there are any events in the iLO log, as the controller cache may be experiencing problems. The cause could be faulty hardware or firmware that has not been updated.

    Drivers & software for hp proliant dl380 g5 server:

    https://support.hpe.com/hpsc/swd/public/detail?sp4ts.oid=1121413&swItemId=MTX_a1fd0e6f7fdc42549704ee1582&swEnvOid=4184

    https://support.hpe.com/hpsc/swd/public/detail?sp4ts.oid=1121413&swItemId=MTX_575981f040124616bcad8db1e7&swEnvOid=4184

    Installing async drivers in ESXi 5.x and 6.x using esxcli and async driver VIB file (2137854):

    https://support.hpe.com/hpesc/public/home/driverHome?sp4ts.oid=1121413&swLangOid=2&swEnvOid=4166
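
    For reference, a minimal sketch of the command shape that the esxcli approach uses for a single VIB file. The helper name `build_vib_install_command` and the example path are hypothetical, and the sketch only handles a VIB stored locally on the host. It defaults to `--dry-run` so nothing changes until you have reviewed the output.

```python
def build_vib_install_command(vib_path, dry_run=True):
    """Build the esxcli argument list for installing one local VIB file."""
    # This sketch only accepts a full path to a file on the host.
    if not vib_path.startswith("/"):
        raise ValueError("pass the absolute path to the VIB file")
    command = ["esxcli", "software", "vib", "install", "-v", vib_path]
    if dry_run:
        # Report what would be done without changing the system.
        command.append("--dry-run")
    return command


if __name__ == "__main__":
    # Hypothetical path; replace it with the VIB you actually downloaded.
    print(" ".join(build_vib_install_command("/tmp/example-driver.vib")))
```

    Paste the printed command into an SSH session on the host, and drop `--dry-run` only once the reported changes look right.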

    Hugs,



  • 7.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Jan 16, 2018 06:45 PM

    No, a failed Smart Array battery won't bug your box, but any one or a combination of the following will: a failing RAID controller (especially the onboard ones), a bad motherboard, bad power supplies, or bad AC/DC power regulators.

    Yes, you are seeing a sensor alert for the battery, but that is just a coincidence given all the other stuff that can go bad over time in a decade-old box.

    The ProLiant sensors just tell you whether something is present or not present; they don't account for "works some of the time".

    You can try swapping in a new battery, but the cost of the battery will be the same as replacing the entire box. Your call.

    You can also try dropping in a better Smart Array card, but the downtime to replace the entire box is just as long.



  • 8.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Feb 14, 2018 02:10 PM

    Hi Dave.

    Sorry for the delay in my reply.

    Is there any way to disable this check and stop the repeated shutdowns without replacing the battery pack?

    My disks are not in any RAID configuration.



  • 9.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Feb 14, 2018 02:30 PM

    I guess you can try to physically detach the battery pack from the RAID controller.



  • 10.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Feb 14, 2018 06:53 PM

    Hi Finikiez,

    thanks for your reply.

    What happens if I do this?

    Will the server still work?



  • 11.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Feb 15, 2018 08:03 AM

    The cache battery is necessary to avoid data corruption in case of an unexpected power loss when the write cache is enabled.

    So first try to disable the write cache in the controller's BIOS or from the ACU CLI.

    If this doesn't help, try to detach it physically.

    When you disable write cache expect write performance degradation.
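
    To illustrate the trade-off described above, here is a toy model of a disk behind a controller cache. It is only a sketch: the class `CachedDisk` and its fields are invented for the example and do not reflect how any real controller is implemented.

```python
class CachedDisk:
    """Toy model of a disk behind a controller write cache (illustration only)."""

    def __init__(self, write_back, battery_ok):
        self.write_back = write_back   # True: acknowledge writes from cache; False: write-through
        self.battery_ok = battery_ok   # True: cache contents survive a power loss
        self.disk = {}                 # data that has actually reached the disk
        self.cache = {}                # acknowledged writes not yet flushed
        self.slow_disk_writes = 0      # times the caller had to wait for the disk

    def write(self, block, data):
        if self.write_back:
            self.cache[block] = data   # fast path: acknowledged before reaching the disk
        else:
            self.disk[block] = data    # write-through: the caller waits for the disk
            self.slow_disk_writes += 1

    def flush(self):
        """Move every cached write to the disk."""
        self.disk.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        """Simulate a sudden power cut; return the blocks whose acknowledged data was lost."""
        if self.battery_ok:
            self.flush()               # the battery kept the cache alive, so it can be flushed later
            return []
        lost = sorted(self.cache)
        self.cache.clear()
        return lost


if __name__ == "__main__":
    for write_back, battery_ok in [(True, True), (True, False), (False, False)]:
        disk = CachedDisk(write_back, battery_ok)
        disk.write(1, "a")
        disk.write(2, "b")
        print(write_back, battery_ok, "lost:", disk.power_loss(), "slow writes:", disk.slow_disk_writes)
```

    In this model, only the combination of write-back caching and a dead battery loses acknowledged writes, while write-through never loses them but makes every write wait for the disk, which is the performance cost mentioned above.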



  • 12.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Feb 15, 2018 09:20 AM

    The check alone would not be causing the shutdown, so there's no need to remove it. Your best bet is to check through the iLO to see what's actually happening; it sounds like it could be a hardware fault, as you're using a G5 server. Apart from that, check through the ESXi host logs to see if there are any software errors.



  • 13.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Feb 15, 2018 04:00 PM

    Hi imacfj, thanks for your reply.

    I've just configured iLO 2 and found this:

    At this point I think that I must replace the battery pack:

    What do you think?



  • 14.  RE: Unexpected, repeated shutdowns of ESXi server

    Posted Feb 16, 2018 08:17 AM

    If you can replace the battery, obviously you should do that.

    If you can't, disable the write cache on the controller.



  • 15.  RE: Unexpected, repeated shutdowns of ESXi server
    Best Answer

    Posted Apr 12, 2018 02:44 PM

    Hi!

    I've solved the issue.

    I replaced the battery pack, but that did not solve the problem.

    The real problem was related to my APC Smart-UPS.

    After contacting APC support, I simply restarted the UPS using a very simple procedure that they emailed to me, and everything has been working fine for about three weeks.

    I hope my feedback will be useful to others with a similar issue.