vSAN1

  • 1.  Objects recovery/data extraction from vSAN.

    Posted Mar 04, 2025 09:39 AM

    We have a 2-node vSAN cluster with 1 vSAN Witness, running 6.7. We lost 2 disks on one of the hosts, which led to multiple VMs becoming inaccessible.

    We tried to get support from VMware; however, as we don't have extended support for version 6.7, we did not get any help. The technical support engineer at VMware said it is possible to use the disk on the working ESXi host to bring back the VM; however, he did not share the approach.

    We have 25 objects in an inaccessible state.

    I am not an expert with vSAN, hence I am asking for help. Is there anyone who can help here?

    [root@ESX-01:~] esxcli vsan debug object list --all > /tmp/objout1308
    [root@ESX-01:~] less /tmp/objout1308| grep -i inacc
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost data availability.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)
       Health: inaccessible - Lost quorum.(APD)

    The objects have the following 3 health statuses:

    1.  Health: inaccessible - Lost quorum.(APD)

    2.  Health: reduced-availability-with-no-rebuild

    3.  Health: inaccessible - Lost data availability.(APD)
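
    The three states above can be tallied directly from the saved listing. The following is a minimal sketch, assuming the file was produced by `esxcli vsan debug object list --all` as shown and that each object contributes exactly one `Health:` line. `summarize_health` is a hypothetical helper name, and the sketch assumes `grep`, `sed`, `sort` and `uniq` are available in the shell (otherwise the file can be copied to another machine first).

    ```shell
    # Tally objects per health state from a saved object listing.
    # Usage: summarize_health /tmp/objout1308
    # (summarize_health is a hypothetical helper, not part of esxcli.)
    summarize_health() {
        # Keep only the "Health:" lines, strip the label and the
        # surrounding whitespace, then count identical states and
        # print the most frequent state first.
        grep '^[[:space:]]*Health:' "$1" \
            | sed 's/^[[:space:]]*Health:[[:space:]]*//' \
            | sort | uniq -c | sort -rn
    }
    ```

    Run against the 25 inaccessible lines shown above, this would report 24 objects under "Lost quorum" and 1 under "Lost data availability"; the full file would also include the counts for the remaining states.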



  • 2.  RE: Objects recovery/data extraction from vSAN.

    Broadcom Employee
    Posted Mar 04, 2025 09:47 AM

    Are you sure those VMs were replicated across both hosts? This indicates that you lost disks on both hosts, or that the witness may also be gone...




  • 3.  RE: Objects recovery/data extraction from vSAN.

    Posted Mar 05, 2025 04:49 PM
    Edited by TheBobkin Mar 05, 2025 04:51 PM

    Hi @kplmdn

    Paul Twomey here - the support engineer that assisted with this issue.

    "We tried to get support from VMware; however, as we don't have extended support for version 6.7, we did not get any help."

    In fairness, I wouldn't call that an accurate assessment of what transpired - I provided you with a detailed analysis and explanation of what occurred here and discussed various avenues of resolution.

    Your Witness has not been participating in this cluster for at least one month (and very likely longer - that is as far as vmkernel.log goes back on the Witness here), and the data-nodes have not been updated since 2019(!). This is a security risk, to say the least, and not how clusters should be managed.

    If the Witness is not in the cluster, then your data is functionally FTT=0 (other than in later builds, as we implemented 2 features to prevent that being the case), and thus disk failures result in inaccessible objects. This is expected behaviour, and the recourse is to restore the lost VMs/vmdks from backup.
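
    The "functionally FTT=0" point can be illustrated with simple vote arithmetic. This is a rough sketch, not vSAN's actual implementation: it assumes a mirrored object with 2 data components and 1 witness component carrying 1 vote each (real vote counts may differ), and that an object stays accessible only while strictly more than half of its votes are reachable. `has_quorum` is a hypothetical helper name.

    ```shell
    # has_quorum AVAILABLE_VOTES TOTAL_VOTES
    # Succeeds (exit 0) only when strictly more than half of the
    # votes are reachable. Hypothetical helper for illustration.
    has_quorum() {
        [ $(( $1 * 2 )) -gt "$2" ]
    }

    # Assumed layout: 2 data components + 1 witness component, 1 vote each.
    # Healthy cluster, one component lost: 2 of 3 votes remain.
    has_quorum 2 3 && echo "one component lost: quorum kept"
    # Witness absent and one data component lost: 1 of 3 votes remains.
    has_quorum 1 3 || echo "witness absent plus one data component lost: quorum lost"
    ```

    Quorum is only part of the picture: an object also needs a complete copy of its data, which is presumably why one object reports "Lost data availability" rather than "Lost quorum".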

    "The technical support engineer at VMware said it is possible to use the disk on the working ESXi host to bring back the VM; however, he did not share the approach."

    I did indicate how that may be possible - it would require using a tool internal to VMware by Broadcom. To be permitted and able to use it, the nodes would need to be updated to a version that supports modern versions of that tool, and I would need allowance to work on this environment (either through extended support or by updating to an ESXi version we still generally support).

    While *possibly* toeing the line of providing support where we are not supposed to, I today dug up an old vSAN 6.7 U3b Witness OVA and provided this to you. If you would like to, you can deploy a functional Witness using this, configure the cluster as Stretched using it, and repair all (non-inaccessible) objects back to a redundant state; this will then allow you to update both data-nodes and the Witness to ESXi/vSAN 7.x/8.x without VM downtime being needed. Once that is done, I will happily assist you with an attempt to recover those objects as discussed previously.