I do realize that now. As this is running atop a vSAN stretched cluster, we've opted for the three-site deployment, with one NSX Manager node at each of the primary, secondary, and witness sites.
We'll run a load balancer to provide a single VIP for NSX Manager.
Original Message:
Sent: Jun 06, 2025 12:08 AM
From: Dylan Cohen
Subject: NSX Manager configuration
That sounds pretty risky.
Stretching L2 across sites in the physical network is what you really need to do, and even then, it gets pretty messy at the best of times. If you have an account team at Broadcom, I would recommend reaching out to them to discuss your specific use case. You don't want to implement something that works now but fails when you have an outage and actually need it.
Original Message:
Sent: May 29, 2025 01:23 PM
From: Charlie Silverman
Subject: NSX Manager configuration
Ok, so I understand where our disconnect happened and we're working to resolve that.
One concern about having them on a VLAN segment is that, if the VLAN in question has its gateway at the primary site and that site fails, NSX is still down, as none of the NSX Managers could communicate with the rest of the network.
The workaround that I could see would be to make it a VLAN segment but also create an overlay segment bridged to the VLAN and place the gateway in NSX. Assuming that NSX itself hasn't failed, if the primary site fails, everything would still work, as the network's gateway would exist in NSX regardless.
Any thoughts on this approach?
Original Message:
Sent: May 25, 2025 10:54 PM
From: Dylan Cohen
Subject: NSX Manager configuration
NSX Managers need to be on non-overlay segments/subnets. If they were on overlay segments, they would be running on the very segments/subnets they are attempting to manage. In this situation, you would have a chicken-and-egg issue.
The only real option is to extend the network in the physical network, for example by using VXLAN.
I would recommend reading this: https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-5-2-and-earlier/4-5/administering/stretching-clusters-admin/stretch-clusters-requirements-admin.html
Original Message:
Sent: May 22, 2025 02:00 PM
From: Charlie Silverman
Subject: NSX Manager configuration
I'm a little baffled by the recommended configuration for the NSX Manager cluster in a stretched cluster environment. The recommendation is for a 3-node management cluster with 2 manager appliances in the primary site and 1 appliance in the secondary site.
All of that works great when both sites are up but, if the primary site fails, the single remaining appliance cannot provide NSX services on its own. The guides say that you can add a temporary 4th appliance in that scenario, but that makes failover far less automatic than would be desired.
Is there a reason that intentionally running a 4-node NSX management cluster with two nodes at each site would NOT be a supportable and functional solution?
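One way to reason about this question is to sketch the quorum arithmetic. The snippet below is only an illustration, and it assumes the manager cluster stays functional only while a strict majority of its configured members is reachable. The function names and the site labels are hypothetical and not part of any NSX API.

```python
from typing import Dict


def has_quorum(total_nodes: int, surviving_nodes: int) -> bool:
    """Assumption: the cluster needs a strict majority of its members up."""
    return surviving_nodes > total_nodes // 2


def survives_site_loss(placement: Dict[str, int], failed_site: str) -> bool:
    """Return True if the nodes outside the failed site still form a majority."""
    total = sum(placement.values())
    surviving = total - placement[failed_site]
    return has_quorum(total, surviving)


# Hypothetical layouts: site name -> number of manager nodes placed there.
layouts = {
    "2+1 across two sites": {"primary": 2, "secondary": 1},
    "2+2 across two sites": {"primary": 2, "secondary": 2},
    "1+1+1 across three sites": {"primary": 1, "secondary": 1, "witness": 1},
}


def report() -> Dict[str, Dict[str, bool]]:
    """For each layout, check whether quorum survives the loss of each site."""
    return {
        name: {site: survives_site_loss(placement, site) for site in placement}
        for name, placement in layouts.items()
    }


if __name__ == "__main__":
    for name, result in report().items():
        print(name, result)
```

Under that majority assumption, a 2+2 layout does not survive the loss of either site, because two of four nodes is not a majority. Only a layout that spreads the nodes over three failure domains keeps a majority after any single site is lost.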
It also does not appear that the management appliances can function properly in an overlay network which is unfortunate as that would seem to resolve the issue. If an NSX management appliance is on an overlay network and then the VM is moved to another host, the appliance simply stops responding to the management network until it is rebooted and sometimes doesn't come back at all.
This leads to another issue which is that it is desired for the management appliances to all be on the same layer-2 network, otherwise there's no point in creating a cluster IP. How would this be handled in a scenario where, outside of an overlay network, there is no good way to extend a layer-2 network between the two sites?