VMware NSX

  • 1.  NSX Manager configuration

    Posted 26 days ago

    I'm a little baffled by the recommended configuration for the NSX Manager cluster in a stretched cluster environment. The recommendation is for a 3-node management cluster with 2 manager appliances in the primary site and 1 appliance in the secondary site.

    All of that works great when both sites are up, but if the primary site fails, the single remaining appliance cannot provide NSX services on its own. The guides say that you can add a temporary 4th appliance in that scenario, but that makes failover far less automatic than would be desired.

    Is there a reason that intentionally running a 4 node NSX management cluster with two nodes at each site would NOT be a supportable and functional solution?

    It also does not appear that the management appliances can function properly on an overlay network, which is unfortunate, as that would seem to resolve the issue. If an NSX management appliance is on an overlay network and the VM is then moved to another host, the appliance simply stops responding on the management network until it is rebooted, and sometimes it doesn't come back at all.

    This leads to another issue: the management appliances should all be on the same layer-2 network, otherwise there's no point in creating a cluster IP. How would this be handled in a scenario where, outside of an overlay network, there is no good way to extend a layer-2 network between the two sites?



  • 2.  RE: NSX Manager configuration

    Broadcom Employee
    Posted 26 days ago

    Hello Charlie,

    From what I understood of your query: it is always recommended that an NSX-T Manager cluster have an odd number of nodes, with a minimum of 3 nodes in customer production environments. This is to guard against a split-brain scenario in the cluster. So for the scenario you have described, it's never recommended to have only 2 nodes in a cluster. It's good to have a 3-node cluster at the secondary site of the stretched cluster. Please excuse me if I have not understood your query correctly.

    Regards

    Sriram




  • 3.  RE: NSX Manager configuration

    Broadcom Employee
    Posted 23 days ago

    NSX Managers need to be on non-overlay segments/subnets. If they were on overlay segments, they would be running on the very segments/subnets they are attempting to manage, which gives you a chicken-and-egg problem.

    The only real option is to extend the layer-2 network between the sites in the physical network, for example by using VXLAN.

    I would recommend reading this: https://techdocs.broadcom.com/us/en/vmware-cis/vcf/vcf-5-2-and-earlier/4-5/administering/stretching-clusters-admin/stretch-clusters-requirements-admin.html




  • 4.  RE: NSX Manager configuration

    Posted 19 days ago

    Ok, so I understand where our disconnect happened and we're working to resolve that.

    One concern about having them on a VLAN segment is that, if the VLAN in question has its gateway at the primary site and that site fails, NSX would still be down, as none of the NSX Managers could communicate with the rest of the network.

    The workaround that I could see would be to make it a VLAN segment, but also create an overlay segment bridged to the VLAN and place the gateway in NSX. Assuming that NSX itself hasn't failed, everything would still work if the primary site fails, as the network's gateway would exist in NSX regardless.

    Any thoughts on this approach?




  • 5.  RE: NSX Manager configuration

    Broadcom Employee
    Posted 12 days ago

    That sounds pretty risky.

    Stretching L2 across sites in the physical network is what you really need to do, and even then, it gets pretty messy at the best of times. If you have an account team at Broadcom, I would recommend reaching out to them to discuss your specific use case. You don't want to implement something that works now but fails when you have an issue and need it.




  • 6.  RE: NSX Manager configuration

    Posted 12 days ago

    I do realize that now. As this is running atop a vSAN stretched cluster, we've opted for the three-site deployment, with one NSX Manager at each of the primary, secondary, and witness sites.

    We'll run a load balancer to provide a single VIP for NSX Manager.
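
    For anyone comparing the layouts discussed in this thread, here is a rough sketch of the quorum arithmetic. It assumes the manager cluster needs a simple majority of its nodes to stay available, which matches the odd-number recommendation above but is a simplification, not a statement of exactly how NSX behaves. The function names and site labels are made up for illustration.

```python
from typing import Dict, List


def quorum_size(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority (assumed quorum rule)."""
    return total_nodes // 2 + 1


def survives_site_loss(layout: Dict[str, int], failed_site: str) -> bool:
    """True if the nodes outside the failed site still form a majority."""
    total = sum(layout.values())
    remaining = total - layout[failed_site]
    return remaining >= quorum_size(total)


def tolerated_failures(layout: Dict[str, int]) -> List[str]:
    """Sites whose complete loss leaves the cluster with a majority."""
    return sorted(site for site in layout if survives_site_loss(layout, site))


# Hypothetical layouts: number of manager nodes per site.
layouts = {
    "2+1 (two sites)": {"primary": 2, "secondary": 1},
    "2+2 (two sites)": {"primary": 2, "secondary": 2},
    "1+1+1 (three sites)": {"primary": 1, "secondary": 1, "witness": 1},
}

for name, layout in layouts.items():
    print(f"{name}: survives loss of {tolerated_failures(layout) or 'no site'}")
```

    Under that assumption, the 2+2 layout is no better than 2+1: a 4-node cluster needs 3 nodes for a majority, so losing either site leaves it without quorum. Only the three-site layout tolerates the loss of any single site.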