VMware vSphere

 Problem with understanding VM-host rule conflicts

mortihh posted Jan 15, 2025 12:05 PM

Hi community,

I'm having trouble understanding how VM-Host rule conflicts behave.
VM-VM rules are clear so far: when I create conflicting rules, vCenter reports the problem immediately, deactivates the new rule, and displays the conflict in the VM/Host Rules overview. So far so good :)

For VM-Host rules in particular, however, the Broadcom TechDocs state:

"So it is possible for you to create a rule that conflicts with the other rules you are using. When two VM-Host affinity rules conflict, the older one takes precedence and the newer rule is deactivated. DRS only tries to satisfy activated rules and deactivated rules are ignored."

Using VM-Host Affinity Rules


Now I have built the following very simple test scenario:
3 hosts, 1 VM, 2 rules

- Host1 (0 VMs)
- Host2 (0 VMs)
- Host3 (VM1)

I have created two host groups, one for Host1 and one for Host2, plus one VM group for VM1.

Then I created two VM-Host rules:
Rule 1: VM1 must run on Host1
Rule 2: VM1 must run on Host2
VM1, the only VM, is located on Host3.

As expected and as documented, creating both rules works without any problems, even though they deliberately conflict.
However, based on the description in the TechDocs, I would have expected DRS to deactivate the newer rule, since it obviously creates a conflict, or at least to enforce the rule that was created first (Rule 1) and migrate VM1 to Host1. Instead, DRS no longer shows any recommendation at all, only a DRS fault. VM1 simply stays on Host3.

The fault is shown twice, and both entries are identical. I would have expected two different faults, one for the migration to Host1 and another for Host2, but both mention only Host1:

Could not fix hard VM/Host affinity rule violation

Fault:

Virtual Machine VM1 on Host1 would violate a virtual machine - host affinity rule

Prevented Recommendation:

Migrate VM1 from Host3 to any host


Why is the behavior different than described in the Techdocs? Or is this to be expected?

I am using the following recent versions:

vCenter 8.0 Update 3 Build 24322831

ESXi 8.0 Update 3 Build 2402251

Andrea Consalvi

Hi,

I've carefully read your test scenario, and the behavior you're observing doesn’t seem to fully align with the Broadcom documentation. In theory, when two VM-Host Affinity rules conflict, the older rule should take precedence while the newer one gets deactivated, and DRS should try to satisfy the active rule.

To better understand what's happening, it would be useful to verify whether Rule 1 was actually created before Rule 2. If there's any doubt, you could try deleting both rules and recreating them in the correct order, starting with the one that should take precedence. Another aspect to check is how DRS behaves when set to "Partially Automated" mode. Does it provide any migration recommendations? Also, if you manually refresh the DRS recommendations, does it offer any suggested actions?

Since DRS decisions are based on internal logic, checking the logs might provide more insights. If possible, reviewing the vmware-drmdump.log file could help in understanding how DRS is processing these affinity rules.

If you can test these aspects and share the results, we can analyze the situation further together!

Broadcom Employee Duncan Epping

Note that DRS has changed significantly over the past couple of versions, so it wouldn't surprise me if this behaviour also changed along the way. I will try to find internal docs, or reproduce it, see what happens, and report back.

mortihh

Hi Duncan, I received a reply from the internal team about our bug ticket yesterday (Internal defect ID - 3476858) with the following information:

The way to interpret multiple VM-Host must affinity rules (a.k.a VM-Host hard affinity rules) is that they are an "intersection" of the rules.

Rule1 - VM-Group (vm-1) must run on Host-Group-1 (h-1)
Rule2 - VM-Group (vm-1) must run on Host-Group-2 (h-2)

When DRS evaluates where to place vm-1, it starts with the set of compatible hosts for Rule1 (h-1) and intersects it with the set of compatible hosts (h-2) for Rule2. 
Since the compatible host sets of the two rules are disjoint, the intersection is empty.

If the VM was already powered-on on a third host (that is not part of either of these rules) before the creation of the rules, then DRS just surfaces a fault that it can't enforce either of these rules.

If however, there is an attempt to power-on the VM after these rules are created, the power-on would fail since DRS can't find a host that can satisfy both of these "must rules".
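
To make the intersection logic described above concrete, here is a minimal Python sketch. It is only a toy model of the explanation, not DRS code or any vSphere API: the names MustRule, allowed_hosts, and evaluate are made up for illustration, and the returned strings are simplified stand-ins for the real faults and recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MustRule:
    """Hypothetical stand-in for a VM-Host "must run on" rule."""
    name: str
    vms: frozenset    # members of the VM group
    hosts: frozenset  # members of the host group


def allowed_hosts(vm, rules, cluster_hosts):
    """Intersect the host groups of every must rule that applies to vm."""
    allowed = set(cluster_hosts)
    for rule in rules:
        if vm in rule.vms:
            allowed &= rule.hosts
    return allowed


def evaluate(vm, current_host, rules, cluster_hosts):
    """Describe the outcome for vm in this simplified model.

    current_host is None when the VM is powered off and is being powered on.
    """
    allowed = allowed_hosts(vm, rules, cluster_hosts)
    if not allowed:
        # Empty intersection: no host can satisfy all must rules at once.
        if current_host is None:
            return f"power-on of {vm} fails"
        return f"fault: {vm} stays on {current_host}"
    if current_host in allowed:
        return f"{vm} is compliant on {current_host}"
    return f"place {vm} on one of {sorted(allowed)}"


# The test scenario from this thread: two must rules with disjoint host groups.
cluster = {"Host1", "Host2", "Host3"}
rules = [
    MustRule("Rule1", frozenset({"VM1"}), frozenset({"Host1"})),
    MustRule("Rule2", frozenset({"VM1"}), frozenset({"Host2"})),
]

print(evaluate("VM1", "Host3", rules, cluster))      # VM already running on Host3
print(evaluate("VM1", None, rules, cluster))         # power-on attempt after rule creation
print(evaluate("VM1", "Host3", rules[:1], cluster))  # only Rule1 present
```

With both rules in place, the model reports a fault for the running VM and a failed power-on for the powered-off VM, which matches the two cases in the explanation. With only Rule1, Host1 is the sole candidate.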

The first two sentences of the next paragraph are correct in the doc - 
"When you create a VM-Host affinity rule, its ability to function in relation to other rules is not checked. So it is possible for you to create a rule that conflicts with the other rules you are using."

But then I agree that this next line in the doc is confusing -
  
"When two VM-Host affinity rules conflict, the older one takes precedence and the newer rule is deactivated. DRS only tries to satisfy activated rules and deactivated rules are ignored."

This is not true for VM-Host affinity rules and is only applicable for VM-VM rules. 
This needs to be updated. 
I will create a doc PR and assign it to the doc writer team.