vSAN1


2-node stretched cluster - unable to add witness host

  • 1.  2-node stretched cluster - unable to add witness host

    Posted Dec 27, 2017 05:31 PM

    I had a 2-node vSAN cluster configured with the witness, everything was fine.

    Then the witness host got deleted by mistake, so I rebuilt a new one and am trying to reconfigure the stretched cluster, but I get an error when trying to add the new witness host back. The error simply says "Failed to add witness host to a stretched cluster".

    I followed some guides regarding configuration requirements to make sure everything was in place (distributed switch/disk settings, etc.). When selecting the witness host in the wizard, it says the compatibility checks succeeded. But when I click Finish, it fails immediately.

    I also tried running Update Manager to make sure all hosts are fully updated, and I updated the vCenter appliance as well.

    I'm just not sure which logs to check.



  • 2.  RE: 2-node stretched cluster - unable to add witness host

    Broadcom Employee
    Posted Dec 27, 2017 06:10 PM

    Here is how I would approach this:

    Pre-reqs:

    1- vCenter at same or higher version as Hosts

    Make sure vSAN is not holding on to the old NodeUuid for your witness. If the vSAN cluster is still configured, check the unicast agents:

    - From a host CLI run "esxcli vsan cluster unicastagent list"

    If you see one marked as witness, remove it using "esxcli vsan cluster unicastagent remove -a <addr> -p <port> -u <uuid>"
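
    A minimal sketch of that check, written as a dry run: it reads a saved copy of the "unicastagent list" output and prints (does not execute) one remove command per entry flagged as a witness. The column order (NodeUuid, IsWitness, Supports Unicast, IP Address, Port), the helper names, and the sample rows are all assumptions for illustration; compare them with your own output before running anything it prints.

```shell
# Print (do not run) a remove command for every witness entry found in
# saved "esxcli vsan cluster unicastagent list" output read from stdin.
# ASSUMED data columns: NodeUuid IsWitness SupportsUnicast IPAddress Port
witness_remove_cmds() {
  # Skip the two header lines, keep rows whose IsWitness field is 1.
  awk 'NR > 2 && $2 == "1" {
    printf "esxcli vsan cluster unicastagent remove -a %s -p %s -u %s\n", $4, $5, $1
  }'
}

# Hypothetical sample listing (made-up UUIDs and addresses).
sample_listing() {
  cat <<'EOF'
NodeUuid                              IsWitness  Supports Unicast  IP Address     Port
------------------------------------  ---------  ----------------  -------------  -----
5a2f1c3e-0000-1111-2222-aabbccddeeff          0  true              192.168.10.11  12321
00000000-0000-0000-0000-000000000000          1  true              192.168.20.5   12321
EOF
}

sample_listing | witness_remove_cmds
```

    On a host, the same filter could be fed from the live command instead of the sample, again assuming the column layout matches.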

    Re-enable SC (disable it first, then enable it again)

         a- If disabling fails then you may need to turn it off manually

    GET state: vsish -e get /vmkModules/vsanutil/stretchedClusterMode

    If 1 then it is enabled. If 0 it is disabled.

    If enabled (1), turn off by setting to 0

    SET state: vsish -e set /vmkModules/vsanutil/stretchedClusterMode 0

         b- See my blog post about this: https://greatwhitetec.com/2017/01/13/tip-cannot-complete-file-creation-operation-failed-to-place-witnesses/

    After disabling SC/vSAN, start the creation of a new cluster.

    Hopefully this will clear any old entries and allow you to create the 2-node SC.
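
    The manual state check in step a can be sketched as a small helper that maps the value read from vsish to the next action. It only prints a suggestion and changes nothing; the function name is hypothetical, and it assumes the get command returns a bare 0 or 1.

```shell
# Map the value returned by
#   vsish -e get /vmkModules/vsanutil/stretchedClusterMode
# to the next step described in the post. Prints only; changes nothing.
sc_mode_next_step() {
  case "$1" in
    1) echo "enabled: vsish -e set /vmkModules/vsanutil/stretchedClusterMode 0" ;;
    0) echo "disabled: nothing to turn off" ;;
    *) echo "unexpected value: $1" ;;
  esac
}

sc_mode_next_step 1
sc_mode_next_step 0
```

    On a host, the argument would come from the get command itself, for example sc_mode_next_step "$(vsish -e get /vmkModules/vsanutil/stretchedClusterMode)".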



  • 3.  RE: 2-node stretched cluster - unable to add witness host

    Posted Dec 27, 2017 06:19 PM

    Thanks much for the quick reply.

    When disabling vSAN/creating a new cluster, will this cause any data to be lost? I would have just scrapped everything and started over, but I didn't want to have to recopy multiple terabytes back over.



  • 4.  RE: 2-node stretched cluster - unable to add witness host

    Broadcom Employee
    Posted Dec 27, 2017 06:23 PM

    For some reason I was assuming this was a lab environment. Do you see any entries for a witness host in the unicastagent list?



  • 5.  RE: 2-node stretched cluster - unable to add witness host

    Posted Dec 27, 2017 06:35 PM

    It is a lab environment. It's not a problem if I lose the data, just an inconvenience. I appreciate your help.



  • 6.  RE: 2-node stretched cluster - unable to add witness host

    Broadcom Employee
    Posted Dec 27, 2017 06:40 PM

    Since the old witness was never properly removed or replaced, my guess is that it is still referenced somewhere in the config, even though it is no longer active.

    In vSAN 6.6 you can replace the witness via the UI; in 6.1-6.5 you need to re-enable Stretched Cluster (not vSAN) to add the new host. Since the addition is failing, we need to find out whether vSAN thinks there is a witness host already.

    What is the output of "esxcli vsan cluster unicastagent list"?

    What procedure are you using to add the new witness?

    Storage and Availability Technical Documents



  • 7.  RE: 2-node stretched cluster - unable to add witness host

    Posted Dec 27, 2017 06:50 PM

    There was an entry on both hosts for a witness, UUID was all zeroes. I removed it from both hosts using your command, but the add witness host operation still failed afterwards. There must still be traces of that old witness host somewhere in the configs.

    The procedure I'm using is via the GUI - select the cluster > configure > fault domains & stretched cluster > configure > follow the wizard



  • 8.  RE: 2-node stretched cluster - unable to add witness host

    Broadcom Employee
    Posted Dec 27, 2017 06:56 PM

    Did you disable stretched cluster? See Storage and Availability Technical Documents.

    At this point you can disable SC, then re-enable and add witness.



  • 9.  RE: 2-node stretched cluster - unable to add witness host

    Posted Dec 27, 2017 07:03 PM

    It is disabled currently.



  • 10.  RE: 2-node stretched cluster - unable to add witness host

    Broadcom Employee
    Posted Dec 27, 2017 07:13 PM

    If you run this command from a host CLI: vsish -e get /vmkModules/vsanutil/stretchedClusterMode

    Do you get 1 or 0?



  • 11.  RE: 2-node stretched cluster - unable to add witness host

    Posted Dec 27, 2017 07:15 PM

    When I run that command I get 0



  • 12.  RE: 2-node stretched cluster - unable to add witness host

    Broadcom Employee
    Posted Dec 27, 2017 07:32 PM

    Look at the vSAN health check for any alerts. Also make sure the network side of the witness is set up properly (subnets, MTU, vmkernel, etc.)

    Also check that the license is applied and that StretchedCluster is part of the default value:

    vsish -e get /config/VSAN/strOpts/LicensedFeatures

    Other:

    - vSAN witness cannot be in the vSAN cluster

    - Also, since the old witness host was deleted and a new one was created, it is important to check the version number of the new witness and make sure that:

         - it matches the other 2 hosts

         - vCenter is at a higher or same level build (important)

         - if you post the build numbers for the witness, ESXi hosts, and vCenter, it would help
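
    The version rules in this list can be sketched as a quick check: the witness version should equal both hosts, and vCenter should be the same or higher. This is only an illustration. It compares dotted version strings (not build numbers, which are not directly comparable between vCenter and ESXi), the function names are hypothetical, and the versions passed in are made-up values you would replace with your own.

```shell
# Return 0 if dotted version $1 is >= $2, comparing numeric fields
# left to right. POSIX sh has no local variables, so a, b, x, y are global.
version_ge() {
  a=$1 b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}; y=${b%%.*}
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    # Drop the field just compared; empty string once no dot is left.
    case "$a" in *.*) a=${a#*.} ;; *) a= ;; esac
    case "$b" in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# args: vcenter_version host1_version host2_version witness_version
check_versions() {
  vc=$1 h1=$2 h2=$3 w=$4 ok=0
  if [ "$w" != "$h1" ] || [ "$w" != "$h2" ]; then
    echo "witness version does not match both hosts"; ok=1
  fi
  if ! version_ge "$vc" "$h1" || ! version_ge "$vc" "$h2"; then
    echo "vCenter is older than a host"; ok=1
  fi
  if [ "$ok" -eq 0 ]; then echo "versions look consistent"; fi
  return "$ok"
}

# Hypothetical versions, for illustration only.
check_versions 6.5.0 6.5.0 6.5.0 6.5.0
```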



  • 13.  RE: 2-node stretched cluster - unable to add witness host

    Posted Jan 03, 2018 06:08 PM

    Only metadata is stored on the vSAN Witness Appliance.

    In a 2 Node vSAN configuration, one copy of data (a replica) is on one node, and one copy of data (another replica) is on the other node.

    Disabling the Stretched Cluster configuration (required for 2 Node) will definitely cause your vSAN objects to be out of compliance, but only because they are missing the Witness component.

    Reenabling the Stretched Cluster configuration and pointing to the new vSAN Witness Appliance will return the configuration to a supported state. The witness components will be recreated on the new vSAN Witness Appliance.



  • 14.  RE: 2-node stretched cluster - unable to add witness host

    Posted Dec 27, 2017 08:30 PM

    Hello jswilmoth,

    From dealing with improperly decommissioned Witness Appliances in the past I have noted that the issue can be on the vC side (usually DB/inventory references) or on the vSAN-cluster/host side.

    If it is vC side that has the issue then it *should* be possible to just add a new Witness Appliance back manually from the CLI:

    Configure the networking and disks on the Witness and then via SSH populate the unicastagent address list on all nodes:

    # esxcli vsan cluster unicastagent add -a <IP addr of vSAN-vmk of other nodes> -i <vmk#>

    (Note: it is normal for Witness to have all 0's as UUID after addition to hosts lists)

    And then try to add it back to the cluster with the Witness flag (-t):

    # esxcli vsan cluster join -t -p <Preferred FD name> -U 1 -u <Sub-cluster UUID>

    If the above doesn't work please attach the clomd log from the hosts and the virgo log from the vCenter.
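
    As a rough aid for the manual procedure in this post, the sketch below fills the two commands in from arguments and prints them for review instead of running them. The flags are copied as posted and not independently verified; the function name and every argument value (vmk name, fault domain name, UUID, IP addresses) are hypothetical placeholders.

```shell
# Print (do not run) the commands from the procedure above.
# args: vmk preferred_fd sub_cluster_uuid node_ip [node_ip...]
witness_join_plan() {
  vmk=$1 fd=$2 uuid=$3
  shift 3
  # One unicastagent entry per vSAN vmk address of the other nodes.
  for ip in "$@"; do
    echo "esxcli vsan cluster unicastagent add -a $ip -i $vmk"
  done
  # Join with the witness flag (-t) and the preferred fault domain.
  echo "esxcli vsan cluster join -t -p $fd -U 1 -u $uuid"
}

# Hypothetical values, for illustration only.
witness_join_plan vmk1 Preferred 52aaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee \
  192.168.10.11 192.168.10.12
```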

    Bob