VMware vSphere

  • 1.  Data loss of two virtual machines on two different ESXi hosts

    Posted Apr 09, 2022 02:40 PM

    Hi everyone,

    This is a rather bizarre one that I'm struggling to figure out. We have suffered data loss on a handful of VMs running on two different ESXi hosts. I first noticed it when one of the Linux VMs I was SSH'd into dropped the connection, and our monitoring software showed the VM as unavailable at the same time. The VM appeared to have rebooted: when I SSH'd back in a short time later, the uptime reflected the reboot. I then shut down the VM, but vSphere showed it as still running even though it was no longer reachable over SSH. Odd. I used the vSphere web console to connect to the VM, found that the network interface was down, and brought it up. While on the VM I ran uptime, and the same VM that had just been rebooted and shut down showed an uptime of 1 day (there had been a power outage the day before). Okay, now this is super weird.

    Looking further into the health of the VM, I found data loss going back almost two months (this VM runs Jenkins, and pipelines/builds were missing). What's odd is that /var/log/messages on the VM appears to be populated for the period in which data was lost, yet other files are missing. Regarding the power loss: because of where in the world these ESXi hosts are located, they are subject to frequent power cuts, probably every other month.

    Some further background on our ESXi hosts: a little over a month ago we started using ghettoVCB. When I restored this VM from a ghettoVCB backup made a week ago, the backup showed the same data loss described above, so the ghettoVCB backup is missing the data as well. However, when I restored a backup of a different VM, its data appeared to be up to date as of the time the backup was made.

    The data loss in both the VM described and its backup appears to reach back to before ghettoVCB backups were implemented. Does anything described here sound familiar or indicate anything? I've checked logs in a few places, both at the VM level and on the ESXi host, and nothing stands out. Unfortunately I'm unable to determine whether the power losses, the ghettoVCB backups, or something else caused this. Any suggestions would be much appreciated.

    My questions are:
    Is there any way to check the health of virtual machines and their disks, so that data loss on other VMs can be prevented?
    Has anyone experienced something weird like this?
    Is the lost data recoverable? For the time being we have restored from file-based backups rather than ghettoVCB's image-based backups.
    What is the behaviour of ESXi data storage during a power loss?
    Could ghettoVCB have caused this?

    If you have any questions, feel free to ask and I'll provide as much information as possible.

    Thanks for your time in reading this,
    Kind regards



  • 2.  RE: Data loss of two virtual machines on two different ESXi hosts

    Posted Apr 09, 2022 03:56 PM

    Honestly, this sounds like you don't have an idea of how snapshots and virtual disks actually work.
    Connect to both hosts with WinSCP. If you don't know what WinSCP is, do your homework now.
    Then create screenshots or file lists of both VM directories and attach the latest 3 vmware.log files to your next reply.
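
    For anyone who would rather script the file list than take screenshots, here is a minimal Python sketch. It assumes the VM directory has already been copied to a local folder (for example with WinSCP) and that the logs follow the usual vmware.log / vmware-N.log naming. The function names are hypothetical and not part of any VMware tool.

```python
from pathlib import Path


def file_list(vm_dir):
    """Return (name, size_in_bytes) for every file in vm_dir, sorted by name."""
    return sorted(
        (p.name, p.stat().st_size)
        for p in Path(vm_dir).iterdir()
        if p.is_file()
    )


def latest_vmware_logs(vm_dir, count=3):
    """Return the names of the most recently modified vmware*.log files."""
    logs = [p for p in Path(vm_dir).glob("vmware*.log") if p.is_file()]
    # Newest first, judged by modification time of the local copy.
    logs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.name for p in logs[:count]]


def print_report(vm_dir):
    """Print a plain-text listing that can be pasted into a forum reply."""
    for name, size in file_list(vm_dir):
        print(f"{size:>15,d}  {name}")
    print("Latest logs:", ", ".join(latest_vmware_logs(vm_dir)))
```

    Calling print_report() on the local copy of each VM directory gives a listing with file sizes. Note that modification times are only meaningful if the copy preserved the original timestamps.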

    Ulli



  • 3.  RE: Data loss of two virtual machines on two different ESXi hosts

    Posted Apr 09, 2022 04:57 PM

    If you need to run ESXi in an area where you have to expect regular power failures, you have two options:

    THINK BIG: run vSphere as large as VMware suggests and replicate your datastores to a second continent or, better, a second planet.
    Then you can live with the data loss because you can easily switch to healthy older copies.
    or

    AVOID THIN PROVISIONING
    In a small environment with a small budget you must avoid the extra risk of thin provisioning.
    Use only eager-zeroed thick-provisioned VMDKs and write down the exact mapping of each flat.vmdk.
    Then you can restore the flat.vmdk if VMFS fails and have data-loss protection similar to what you get with native NTFS.
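
    One small part of this can be scripted. The sketch below reads a VMDK descriptor file (the small text .vmdk next to the -flat.vmdk) and reports which extent file it points to, whether the disk is flagged thin, and whether it is a snapshot child with a parent disk. It is an illustration only: it assumes the common text descriptor layout (extent lines, createType, parentFileNameHint, ddb.thinProvisioned), the function name is hypothetical, and it records descriptor-level facts, not the on-disk VMFS block mapping of the flat file.

```python
import re

# Extent line, e.g.: RW 83886080 VMFS "jenkins-flat.vmdk"
EXTENT_RE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\S+)\s+"([^"]+)"')
# Key/value line, e.g.: ddb.thinProvisioned = "1"
KEYVAL_RE = re.compile(r'^([\w.]+)\s*=\s*"?([^"]*)"?\s*$')

SECTOR_BYTES = 512  # extent sizes in the descriptor are given in 512-byte sectors


def parse_descriptor(text):
    """Extract extents, parent hint and thin flag from descriptor text."""
    info = {"extents": [], "parent": None, "thin": False, "create_type": None}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = EXTENT_RE.match(line)
        if m:
            _access, sectors, kind, filename = m.groups()
            info["extents"].append({
                "file": filename,
                "type": kind,
                "size_bytes": int(sectors) * SECTOR_BYTES,
            })
            continue
        m = KEYVAL_RE.match(line)
        if m:
            key, value = m.groups()
            if key == "parentFileNameHint":
                info["parent"] = value  # set only on snapshot children
            elif key == "ddb.thinProvisioned":
                info["thin"] = value == "1"
            elif key == "createType":
                info["create_type"] = value
    return info
```

    A descriptor whose result has a non-empty "parent" belongs to a snapshot chain, which is worth knowing before restoring or copying only the base flat file.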

    Ulli



  • 4.  RE: Data loss of two virtual machines on two different ESXi hosts

    Posted Apr 09, 2022 06:45 PM

    Hello.
    No software or hardware is power outage-proof.

    A power outage is equivalent to us taking the wheels off a moving car. Anything can happen.

    The first requirement of any hardware is a stable and continuous power supply.

    Data is the most important thing, so investing in a UPS for hardware will save us a lot of headaches.

    If you have only a few VMs (about 10), I recommend using Veeam Backup & Replication Community Edition (free).