VMware NSX

  • 1.  NSX-T network stops working.

    Posted Aug 09, 2023 07:56 AM

    Hi All,

    I am new to the NSX world and am looking for any suggestions on the following issue.

    We are currently trying to migrate our NSX cluster to new nodes (replacing the old hosts). There are two NSX-T Edge VMs and three NSX-T Managers configured. For the migration we are creating a new cluster and adding the new nodes to it. The old nodes are in Cluster A, while the new nodes are added to Cluster B. After the nodes were added to Cluster B, they were attached to the same VDS that the old nodes use. The new hosts have been successfully configured from the VDS, MTU, and underlay switch perspective.

    The nodes were then host-prepared and added to NSX-T. Everything up to this point works as expected. The NSX-T segments are now available on the new nodes as well and operate correctly (tested with a newly created VM).

    After verifying that everything was working, we started to vMotion the NSX-T Managers to the new cluster. No problem there; all nodes were successfully migrated. Then we started to vMotion our NSX-T Edge nodes, and that is when the problems start: once we migrate the NSX-T Edge VMs, the NSX-T segments stop being reachable from the outside network. But if I log in to a VM on one of the segments, everything is reachable from inside it.

    Everything works fine if we leave the NSX-T Edge nodes on Cluster A and migrate all VMs connected to the NSX-T segments to Cluster B. The issue arises only when we migrate the NSX-T Edge VMs.

    At first we thought there might be an issue in the underlay network, that is, with the VLANs and MTU set on the physical switches used for NSX-T Edge connectivity to the physical network. But everything is fine on the physical side: all required VLANs are passed and the MTU is set to 9000.
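
    For anyone repeating the MTU part of this check end to end rather than only in the switch configuration: a common approach is a ping with the Don't Fragment bit set and a payload sized to fill the MTU exactly. The sketch below only does the arithmetic and builds a Linux iputils `ping` command; it assumes plain IPv4 without IP options, and the target address is a documentation placeholder, not an address from this environment. On ESXi hosts, `vmkping` with `-d` and `-s` is the usual counterpart.

```python
# Minimal sketch: size a Don't-Fragment ping so it exactly fills a given MTU.
# Assumes IPv4 without options; the target address below is a placeholder.

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Return the ping payload size that exactly fills an MTU-sized IPv4 packet."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo request")
    return payload


def df_ping_command(target: str, mtu: int, count: int = 3) -> list[str]:
    """Build a Linux iputils ping command with the Don't Fragment bit set."""
    return [
        "ping",
        "-M", "do",                            # prohibit fragmentation
        "-s", str(icmp_payload_for_mtu(mtu)),  # ICMP payload size in bytes
        "-c", str(count),                      # number of echo requests
        target,
    ]


if __name__ == "__main__":
    # prints: ping -M do -s 8972 -c 3 192.0.2.1
    print(" ".join(df_ping_command("192.0.2.1", 9000)))
```

    If the full-size ping fails while a default-size ping succeeds, some hop on that path is carrying a smaller MTU than configured.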

    On the NSX-T side there are no errors after we migrate the Edge VMs. All tunnels are up, all connectivity is working, and the BGP peering with the upstream router is also up. No errors are displayed in the NSX-T Manager UI.

    NOTE: Once the Edge VMs were migrated, an alarm was raised because of a "Display Name" and "Compute Id" mismatch. It was resolved once we redeployed the NSX-T Edge VMs.

    We also thought the issue might be that the MAC addresses were not being updated on the switch side, so we tried flushing the MAC address table on the switches as well as the ARP cache on the routers.

    Once we vMotion the Edge VMs back to the old cluster, everything starts working again.

    NSX-T Version:  4.1.0.2.0.21761691

    We are currently out of ideas on what might be causing the issue. I will be very grateful for any suggestions.



  • 2.  RE: NSX-T network stops working.
    Best Answer

    Posted Aug 16, 2023 01:36 PM

    It looks like there might be a problem with the N-S connectivity of the Tier-0 in Cluster B. You can try the following:

    Please check that ICMP traffic is allowed in the firewall, or temporarily disable all firewall rules.

    Ping from a VM to the default gateway of its segment in Cluster B. Ideally this should work; if it does not,
    validate that the required VLANs are allowed on the switch, UCS port channel, etc.

    Next, ping from the NSX Edge Tier-0 to the HA VIP and the switch IP.
    Based on the results, you can narrow down the issue further.
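
    The order of the checks above can be sketched as a small decision helper. This is only an illustration: the pings themselves have to be run by hand from the VM and from the Edge, the check names are hypothetical labels, and the hints for a failed HA VIP or switch IP ping are assumptions about where to look next, not a confirmed diagnosis.

```python
from typing import Dict

# Hypothetical labels for the three pings described in this reply.
CHECKS = ("vm_to_segment_gateway", "edge_t0_to_ha_vip", "edge_t0_to_switch_ip")


def suggest_next_step(results: Dict[str, bool]) -> str:
    """Map manually recorded ping results (True = replies received) to a next step."""
    missing = [name for name in CHECKS if name not in results]
    if missing:
        return "Run the remaining checks first: " + ", ".join(missing)

    if not results["vm_to_segment_gateway"]:
        # Stated in the reply: validate the VLANs on the physical side.
        return "Validate that the required VLANs are allowed on the switch, UCS port channel, etc."

    if not results["edge_t0_to_switch_ip"]:
        # Assumption: the Edge uplink cannot reach the physical network at all.
        return "Assumption: inspect the Edge uplink VLAN and trunk configuration toward the switch."

    if not results["edge_t0_to_ha_vip"]:
        # Assumption: the uplink works but the Tier-0 HA VIP is not reachable.
        return "Assumption: inspect the Tier-0 HA VIP and the uplink segment shared by the Edge nodes."

    return "All three pings succeed: recheck firewall rules before looking elsewhere."


if __name__ == "__main__":
    recorded = {
        "vm_to_segment_gateway": True,
        "edge_t0_to_ha_vip": True,
        "edge_t0_to_switch_ip": False,
    }
    print(suggest_next_step(recorded))
```

    Recording the three results both before and after moving the Edge VMs makes it easier to see which hop changes between Cluster A and Cluster B.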