We have two sites: our main site, which I'll refer to as SiteA, and a remote site, which I'll refer to as SiteB.
SiteA has a vCenter appliance running at IP 10.20.4.20. SiteA and SiteB are connected via a MetroE connection, and the route from SiteA to SiteB goes via 10.20.4.250. In addition, there is a VPN connection on the firewall for redundancy, which we currently fail over to manually.
SiteB's vSphere server IP is 10.30.4.10. The route from SiteB to SiteA over the MetroE connection goes via 10.30.4.250.
During normal operation, SiteB's 10.30.4.10 reaches 10.20.4.20 via the MetroE connection on 10.30.4.250. When we need to schedule maintenance on the MetroE connection, we fail over to the VPN. The VPN route is reachable via SiteB's default gateway, which is on the firewall at 10.30.4.1. The vSphere server at SiteB is also configured with a default gateway of 10.30.4.1. My problems begin with vSphere when we fail over to the VPN. vSphere is hardcoding a manual route of 10.20.4.20 255.255.255.255 10.30.4.250 vmk1 Manual, which can be seen by running the command esxcli network ip route ipv4 list.
The MetroE route might still be reachable, but it is not the path we want the traffic to take. When we fail over to the VPN, all other machines take the VPN route.
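To illustrate why that single host route matters, here is a minimal Python sketch of longest-prefix-match route selection, which is how IP routing tables generally choose a next hop. It is not ESXi's actual implementation. The table holds only the host route from the listing above and a default route via 10.30.4.1, and the names ROUTES, lookup and next_hop are hypothetical.

```python
import ipaddress

# Simplified route table as (destination network, gateway).
# The /32 entry is the manual host route from the post; the default
# route via 10.30.4.1 is the configured default gateway. Any other
# routes the host may have are left out (assumption for illustration).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.30.4.1"),
    (ipaddress.ip_network("10.20.4.20/32"), "10.30.4.250"),
]


def lookup(destination, routes):
    """Return the matching route with the longest prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [route for route in routes if addr in route[0]]
    if not matches:
        return None
    # The most specific route (largest prefix length) wins.
    return max(matches, key=lambda route: route[0].prefixlen)


def next_hop(destination, routes=ROUTES):
    """Return the gateway chosen for a destination, or None."""
    route = lookup(destination, routes)
    return route[1] if route else None


if __name__ == "__main__":
    # vCenter matches the /32 host route, so it keeps using MetroE.
    print(next_hop("10.20.4.20"))  # 10.30.4.250
    # Any other SiteA address falls through to the default gateway.
    print(next_hop("10.20.4.21"))  # 10.30.4.1
    # Without the host route, vCenter traffic would follow the default.
    default_only = [r for r in ROUTES if r[0].prefixlen == 0]
    print(next_hop("10.20.4.20", default_only))  # 10.30.4.1
```

As long as the /32 entry is present, the default gateway is never consulted for 10.20.4.20, which matches what we see when we fail over to the VPN.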
I reached out to VMware, who say the route is always added to the host's route table and does not disappear until it has been unreachable for an hour. As no other devices do this, I am wondering what is going on with vSphere. How can we get vSphere to behave like all our other devices, which always go to their default gateway to get the route? I can reproduce this issue with other sites as well.