This is a lab setup where I am working on some different designs. I have two Cisco switches connected with BGP peering, and two hypervisors, each with two uplinks for NSX, one uplink to each switch. I have an Edge gateway on each hypervisor. I am using the Edge workflow design where you pin each uplink to a switch VLAN that is not spanned to the other switch. Everything works, and my T0 shows all peers are good. This is a Federated environment, so I am stretching a T0 and a T1, but this is not a Federation issue. There are some other Cisco switches for the other "site" as well.
If I take one of the Edge VMs down, I only lose several pings to my VMs. My BGP hold and keepalive timers are 4 and 1 seconds.
If I just unplug an uplink from a switch, I also lose only a couple of pings. However, when I plug the uplink back into the switch, I lose pings for about 30 seconds, even to the second site, so my RTEPs are hosed too. I cannot understand why. On each Cisco switch I have also set the BGP advertisement interval to 5 seconds.
This is probably not enough information, but I am just learning. I am hoping someone can point me to where to look to understand why this happens when plugging an uplink back into a switch. Thank you.