Hello,
I tested the vSAN shutdown in a lab many times while simulating failures, and I can say that the "shutdown button" in VCSA works well when nothing goes wrong and all hosts boot successfully. If a disk group is damaged or a host does not boot, it is hard to bring the cluster up. It is possible, but I would not want to do it in a production environment, especially since there is no official documentation. I opened a VMware case to clarify how this feature behaves when something goes wrong, and no information was provided. The answer was simply: in case of an issue, contact support. Horrible way.
I prefer the manual shutdown described in
Manually Shut Down and Restart the vSAN Cluster
Simply because if some host does not boot again or a disk group is damaged, you can still bring the vSAN cluster up without the failed host (as long as the vSAN storage policy is still fulfilled). The last host can be started later and added to the cluster in two simple steps:
1. "esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates"
2. enable the vSAN kernel port on the host
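The two steps above can be sketched as a small ESXi shell script. This is only a sketch under assumptions that are not in the original post: the vSAN VMkernel interface is assumed to be vmk1, and "enabling" the vSAN kernel port is assumed to mean re-adding that interface with `esxcli vsan network ip add`. The `run` and `rejoin_host` helpers and the `DRY_RUN` switch are hypothetical names used for illustration. By default the script only prints the commands, so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: rejoin a late-booting host to the vSAN cluster after a manual shutdown.
# Assumption: the vSAN VMkernel interface is vmk1 (check your own hosts).
VSAN_VMK="${VSAN_VMK:-vmk1}"
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rejoin_host() {
    # Step 1: let the host accept cluster member list updates again.
    run esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates
    # Step 2 (assumed form): re-enable vSAN traffic on the VMkernel interface.
    run esxcli vsan network ip add -i "$VSAN_VMK"
    # Check: show the cluster state as seen from this host.
    run esxcli vsan cluster get
}

rejoin_host
```

Run it once as-is on the affected host to see the commands, then again with DRY_RUN=0 if they match your environment.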
The difference is that the manual shutdown simply disables the vSAN kernel port on all hosts at the same time, so it "is easy" to fix. The automatic vSAN shutdown does something similar, but in the background, and starting the cluster after a failure is "officially impossible" without VMware support, as I already mentioned. There are manual steps that can fix the problem, but in a production environment it is a huge, unnecessary risk.
I would also recommend moving VCSA and one DNS server to non-vSAN storage to make startup after the outage easier.
Maybe the best option is to shut down the VM workload and cut power to the entire rack, which takes all hosts down at the same time, like an unexpected power outage. :D
After an unexpected power failure I have never seen a vSAN issue when some host did not boot, as long as the storage policy was still fulfilled.
In any case, good luck, and hopefully no HW issues show up after the move.
Original Message:
Sent: Apr 16, 2025 02:31 AM
From: Neil Colyn
Subject: Full vSAN Cluster Power-Off for Site Migration
Hi Cristian,
We recently went through the same exercise and also followed Broadcom's documentation. The one thing you need to make sure of is that when you shut down the vSAN cluster, you make a note of which vSAN cluster host is the "orchestration host".
When you start up the vSAN cluster - start up the "orchestration host" first! We didn't do that, and it caused issues.
Good luck!
Original Message:
Sent: Apr 15, 2025 03:11 PM
From: Cristian Cortese
Subject: Full vSAN Cluster Power-Off for Site Migration
Hello Community,
We're planning a full migration of all hosts in one of our vSAN clusters from one physical site to another. This will require a complete shutdown of the vSAN cluster for approximately 12 hours. The plan is to power it back on after the relocation.
We've reviewed Broadcom's documentation, but we'd really appreciate hearing from anyone who has gone through a similar process. Any tips, lessons learned, or potential pitfalls to watch out for would be very helpful.
Thanks in advance!