Hi all,
I'm hitting a really puzzling issue. I configured the latest NSX-T on Cisco UCS, created a T0, an overlay TZ and a segment, and attached two VMs. The VMs can ping the gateway on the T0 and can ping each other when they are on the same host, but not when they are on different hosts. On closer inspection, it appears that no tunnels are formed between the ESXi nodes.
I'm able to ping between TEPs with a large MTU, so there are no underlay networking issues as far as I can see, but the tunnels are still not formed. BFD shows the tunnels as down (please see the output at the bottom).
Not seeing any related error messages in /var/log/vmkernel or /var/log/nsx-syslog.log on the hosts.
Anything else I can check? Would be happy to provide any other output. Please help!!!
Tested TEP-to-TEP connectivity over the vxlan netstack with a 1600-byte payload and the DF bit set (-d), and it looks good:
[root@NSX02:~] ping ++netstack=vxlan 10.12.0.151 -s 1600 -d
PING 10.12.0.151 (10.12.0.151): 1600 data bytes
1608 bytes from 10.12.0.151: icmp_seq=0 ttl=64 time=0.271 ms
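For reference, here is the arithmetic behind that test as a minimal Python sketch. It assumes IPv4 headers without options and a Geneve header without options (NSX-T encapsulates with Geneve even though the netstack is named vxlan); the function names are just for illustration.

```python
# Header sizes in bytes. Assumptions: IPv4 without options, Geneve base header only.
ICMP_HEADER = 8
IPV4_HEADER = 20
UDP_HEADER = 8
GENEVE_BASE_HEADER = 8
INNER_ETHERNET_HEADER = 14


def ip_packet_size(icmp_payload):
    """Size of the IP packet produced by a ping with the given payload (-s)."""
    return icmp_payload + ICMP_HEADER + IPV4_HEADER


def geneve_outer_ip_size(inner_ip_mtu, geneve_options=0):
    """Outer IP packet size needed to carry one full-size inner IP packet."""
    return (inner_ip_mtu + INNER_ETHERNET_HEADER + GENEVE_BASE_HEADER
            + geneve_options + UDP_HEADER + IPV4_HEADER)


tested = ip_packet_size(1600)        # what "ping -s 1600 -d" proves can pass unfragmented
needed = geneve_outer_ip_size(1500)  # a 1500-byte guest packet after encapsulation
print(f"tested path MTU >= {tested}, needed >= {needed} (plus any Geneve options)")
```

Since the tested size is above the needed size, the underlay MTU between these two TEPs should not be what is keeping the tunnels down.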
Checked the logical switches on the hosts and they look good (the switch I'm using is called "test"):
NSX-Manager> get logical-switch
VNI    UUID                                  Name                                             Type
71688  5cce3073-c5c9-4cf6-9cad-8db50dd06b68  OV-WEB                                           DEFAULT
71689  8208b2cd-7d0c-407e-aacf-ee9297ef5cf2  OV-DB                                            DEFAULT
71691  fedd3ec3-d3e4-4d02-ac4f-cd94bde02fdf  transit-bp-2a5f80db-676d-41f4-b305-1e8591266f94  TRANSIT
71692  c9b96c71-ebff-4572-88a9-7639d2923743  transit-bp-8871e348-42da-447f-9193-70781b09730f  TRANSIT
71690  50db354a-bf9c-483f-9637-c397e78d05b7  transit-rl-8871e348-42da-447f-9193-70781b09730f  TRANSIT
71681  97655bd6-dd20-4746-8138-656a0c06e9b0  test                                             DEFAULT
71687  6fa865f8-4bb6-439a-a428-a94e27e02090  OV-APP                                           DEFAULT
[root@NSX02:~] nsxcli -c get logical-switch 71681 vtep-table
Logical Switch VTEP Table
-----------------------------------------------------------------------------------------------
Host Kernel Entry
===============================================================================================
Label     VTEP IP        Segment ID     Is MTEP   VTEP MAC             BFD count
124941    10.12.0.151    10.12.0.128    False     00:50:56:67:31:cb    0
LCP Remote Entry
===============================================================================================
Label     VTEP IP        Segment ID     VTEP MAC             DEVICE NAME
124941    10.12.0.151    10.12.0.128    00:50:56:67:31:cb    None
LCP Local Entry
===============================================================================================
Label     VTEP IP        Segment ID     VTEP MAC             DEVICE NAME
124942    10.12.0.152    10.12.0.128    00:50:56:63:b0:56    None
[root@NSX03:~] nsxcli -c get logical-switch 71681 vtep-table
Logical Switch VTEP Table
-----------------------------------------------------------------------------------------------
Host Kernel Entry
===============================================================================================
Label     VTEP IP        Segment ID     Is MTEP   VTEP MAC             BFD count
124942    10.12.0.152    10.12.0.128    False     00:50:56:63:b0:56    0
LCP Remote Entry
===============================================================================================
Label     VTEP IP        Segment ID     VTEP MAC             DEVICE NAME
124942    10.12.0.152    10.12.0.128    00:50:56:63:b0:56    None
LCP Local Entry
===============================================================================================
Label     VTEP IP        Segment ID     VTEP MAC             DEVICE NAME
124941    10.12.0.151    10.12.0.128    00:50:56:67:31:cb    None
Checked the BFD sessions: all three tunnels are down on both ends, with no diagnostic reported:
[root@NSX03:/var/log] net-vdl2 -M bfd -s nvds
BFD count: 3
===========================
Local IP: 10.12.0.151, Remote IP: 10.12.0.153, Local State: down, Remote State: down, Local Diag: No Diagnostic, Remote Diag: No Diagnostic, minRx: 1000, isDisabled: 0, l2SpanCount: 1, l3SpanCount: 1
Roundtrip Latency: NOT READY
VNI List: 71687
Routing Domain List: 8871e348-42da-447f-9193-70781b09730f
Local IP: 10.12.0.151, Remote IP: 10.12.0.200, Local State: down, Remote State: down, Local Diag: No Diagnostic, Remote Diag: No Diagnostic, minRx: 1000, isDisabled: 0, l2SpanCount: 3, l3SpanCount: 2
Roundtrip Latency: NOT READY
VNI List: 71690 71691 71692
Routing Domain List: 2a5f80db-676d-41f4-b305-1e8591266f94 8871e348-42da-447f-9193-70781b09730f
Local IP: 10.12.0.151, Remote IP: 10.12.0.152, Local State: down, Remote State: down, Local Diag: No Diagnostic, Remote Diag: No Diagnostic, minRx: 1000, isDisabled: 0, l2SpanCount: 2, l3SpanCount: 2
Roundtrip Latency: NOT READY
VNI List: 71681 71688
Routing Domain List: 2a5f80db-676d-41f4-b305-1e8591266f94 8871e348-42da-447f-9193-70781b09730f
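In case it helps anyone compare hosts, here is a rough Python sketch that summarizes BFD output like the above. It only assumes the text layout shown in this post (a "Local IP: ..." line of comma-separated "key: value" pairs, followed by a "VNI List:" line); the function names are hypothetical and this is not an NSX tool.

```python
def parse_bfd_sessions(text):
    """Parse pasted 'net-vdl2 -M bfd' output into a list of dicts, one per session."""
    sessions = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Local IP:"):
            fields = {}
            for part in line.split(", "):
                key, _, value = part.partition(": ")
                fields[key] = value
            sessions.append(fields)
        elif line.startswith("VNI List:") and sessions:
            # Attach the VNI list to the session that precedes it.
            sessions[-1]["VNI List"] = line.partition(":")[2].split()
    return sessions


def down_sessions(sessions):
    """Return sessions where either end is not reported as 'up'."""
    return [s for s in sessions
            if s.get("Local State") != "up" or s.get("Remote State") != "up"]


sample = """\
Local IP: 10.12.0.151, Remote IP: 10.12.0.153, Local State: down, Remote State: down, Local Diag: No Diagnostic, Remote Diag: No Diagnostic, minRx: 1000, isDisabled: 0, l2SpanCount: 1, l3SpanCount: 1
Roundtrip Latency: NOT READY
VNI List: 71687
Local IP: 10.12.0.151, Remote IP: 10.12.0.152, Local State: down, Remote State: down, Local Diag: No Diagnostic, Remote Diag: No Diagnostic, minRx: 1000, isDisabled: 0, l2SpanCount: 2, l3SpanCount: 2
Roundtrip Latency: NOT READY
VNI List: 71681 71688
"""

for s in down_sessions(parse_bfd_sessions(sample)):
    print(s["Local IP"], "->", s["Remote IP"], "VNIs:", " ".join(s["VNI List"]))
```

Running the same summary on each host makes it easy to confirm that every TEP pair reports the tunnel as down from both sides.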