Yes, retreat mode did not work in my scenario. The KB I found was written or updated just a few days ago, and I confirmed I am getting the errors it describes in the logs, so perhaps this is something new VMware has discovered.
Original Message:
Sent: Sep 16, 2024 04:50 AM
From: StephenMoll
Subject: Vcenter not creating vcls vm's after host psod'd
You said "i've followed various blogs about changing the value to false so vcsa will delete them, but they were never deleted, i can't force power it on either."
I'd guess this means you attempted to put the cluster into "Retreat Mode". If so, I'm surprised it didn't work, and that is somewhat concerning, because this technique is the documented way of clearing and resetting DRS vCLS functionality.
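For anyone following along: as far as I know, Retreat Mode is toggled through a per-cluster vCenter advanced setting whose name embeds the cluster's managed object ID (the domain-c<number> part of the cluster's URL in the vSphere Client). Below is a minimal Python sketch that only builds that setting name. The helper name retreat_mode_key and the example ID domain-c8 are made up for illustration; check the key against the current KB before changing anything.

```python
def retreat_mode_key(cluster_moid: str) -> str:
    """Return the advanced-setting name that controls vCLS for one cluster.

    cluster_moid is the cluster's managed object ID, e.g. "domain-c8"
    (a hypothetical example value, not taken from this thread).
    """
    if not cluster_moid.startswith("domain-c"):
        raise ValueError(f"not a cluster managed object ID: {cluster_moid!r}")
    return f"config.vcls.clusters.{cluster_moid}.enabled"


if __name__ == "__main__":
    # Setting this key to "false" asks vCenter to remove the vCLS VMs
    # (Retreat Mode); setting it back to "true" asks it to redeploy them.
    print(retreat_mode_key("domain-c8"))
```

The key is added under the vCenter Server object's Advanced Settings, not on the cluster or host.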
Original Message:
Sent: Sep 15, 2024 09:55 PM
From: nickcasa
Subject: Vcenter not creating vcls vm's after host psod'd
OK, did some more digging. I found this article, and I see the issue it describes in wcpsvc.log. I ran the checksts Python script and my certs are fine with no errors, so that's good. The VCSA UI also reports all certs as valid, with many years until expiration.
https://knowledge.broadcom.com/external/article?legacyId=80588
I'm nervous about running the steps in the article, as they do something with the certs, which could mess up Citrix / Veeam. Obviously I would snapshot it first. Alternatively, I was thinking of just restoring the VCSA from a recent backup to a new path with Veeam; that way I can just power off the old one and power on the new one. Thoughts?
Original Message:
Sent: Sep 15, 2024 09:17 PM
From: nickcasa
Subject: Vcenter not creating vcls vm's after host psod'd
I have a small 2-node cluster with shared storage (StarWind). One of the nodes had a PSOD due to a RAID controller cache issue (R640). I power cycled the node and all is fine now; however, the node that PSOD'd will not start the vCLS VM. I've followed various blogs about changing the value to false so the VCSA will delete them, but they were never deleted, and I can't force power it on either. I can vMotion between the hosts, and honestly there are no issues at all other than the vCLS VM not powering on. I've bounced the VCSA too, no help there. Any other ideas I can try?
Edit: I should also add that under cluster Quickstart it thinks the PSOD'd node is in maintenance mode, which it clearly is not. Clicking re-validate does nothing, so I think this is contributing to my issue too: it thinks the node is in maintenance mode and hence will not start the vCLS VM.
Edit2: I changed the datastore setting for vCLS and it deleted the vCLS VMs from vCenter and the datastore, so that was good, but it will not re-create them. I'm not sure if DRS will work without the vCLS VMs.
Edit3: I put the cluster into retreat mode for an hour and then back to system managed; still no vCLS VMs. I restarted the VCSA, all services such as EAM are running, and all certs are perfectly fine for at least another year.
Someone suggested creating a new cluster and moving the hosts to it; however, I am unsure how that would affect Veeam and Citrix VDI, so I'd prefer not to do that. Thank you.