vSphere Availability

The placement of vCLS appliances needs to be more 'location aware'

    Posted Oct 12, 2022 11:01 PM

    With the simple placement engine in HA seemingly nobbled in vSphere 7, now that the HA advanced options to enforce affinity rules no longer work, more emphasis falls on the vCLS appliances being available to handle VM placement when VMs are recovered following host failures.

    We find that when starting up a system from cold, there is a high probability that hosts are started in some sort of sequence based on their physical layout.

    For example, suppose your datacentre has several cabinets of hosts, and these hosts are compute sleds in a set of chassis in each cabinet. Now suppose that there are a couple of clusters that, for redundancy, are also evenly spread across those cabinets.

    The system is cold and someone powers it up, starting at one end and powering on hosts from top to bottom in each cabinet.

    The vCLS appliances are going to start up on the first three hosts available in each cluster, and therefore in our hypothetical datacentre, all the vCLS appliances are likely to be in the first cabinet and within the first couple of compute chassis.

    This means that a location-specific failure, such as a failure in the power feed to the first cabinet, would wipe out all vCLS appliances for the clusters in an instant. HA would have no option other than to start recovery using the simple placement engine, and all the affinity rules are 'out the window'.

    There are few options to manage the placement of vCLS appliances. There is an implied anti-affinity rule between them, so they should at least be on separate hosts. There is also the vCLS VM anti-affinity policy feature, which allows users to keep them away from critical workloads. What I feel is missing, though, is a way to label hosts with tags that relate to their physical location and to define some sort of vCLS placement policy that would ensure vCLS appliances are not gathered together in too close a physical proximity to one another.
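
    To illustrate the kind of policy I mean, here is a rough sketch in plain Python. It is not a vSphere API and not how vCLS actually places appliances; the function name `place_vcls`, the host names and the cabinet labels are all made up for the example. It assumes each host carries a single location label and picks hosts so that appliances land in the least-used location first, falling back to power-on order on ties.

```python
from collections import Counter

def place_vcls(hosts, count=3):
    """Pick up to `count` hosts for vCLS appliances, spread across locations.

    hosts: list of (host_name, location_label) tuples in power-on order.
    Both the labels and this function are hypothetical, for illustration only.
    """
    used = Counter()          # appliances already placed per location
    chosen = []
    remaining = list(hosts)
    while remaining and len(chosen) < count:
        # Prefer the host whose location holds the fewest appliances so far.
        # min() returns the first minimum, so ties keep power-on order.
        pick = min(remaining, key=lambda h: used[h[1]])
        chosen.append(pick[0])
        used[pick[1]] += 1
        remaining.remove(pick)  # one appliance per host (anti-affinity)
    return chosen

# Hosts listed in the order they were powered on, cabinet by cabinet.
hosts = [
    ("esx01", "cab-A"), ("esx02", "cab-A"), ("esx03", "cab-A"),
    ("esx04", "cab-B"), ("esx05", "cab-B"), ("esx06", "cab-C"),
]
print(place_vcls(hosts))  # ['esx01', 'esx04', 'esx06']
```

    With 'first three hosts available' placement, all three appliances would sit in cab-A; with the location label taken into account, a single cabinet failure takes out at most one of them.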


    Does HA recover at least one vCLS fast enough to ensure the correct placement of recovered VMs in accordance with affinity rules?