VMware vSphere

 Networking question - vxlan and XVMotion

p0shkar posted Jul 24, 2024 07:23 AM

So our networking team would like to run LACP and EVPN-VXLAN for an active-active setup.

Question 1: Does anyone know whether vMotion (in this case enhanced vMotion, i.e. XVMotion, vMotion without shared storage) is supported over VXLAN?

We have set up a proof-of-concept lab with one HPE (Intel) host in each of two locations. The hosts are for testing only and are cleanly installed with ESXi 8.0.2. They have only local disks for storage, so vMotion without shared storage is the only kind we can test.

The physical switches are configured with VXLAN but not LACP yet. We don't have NSX or any VTEP on the ESXi side or on the vSwitches; all VXLAN is handled by the physical switches. I can vmkping over the vMotion vmkernel interface between the hosts, but XVMotion fails.

Interestingly, the hosts on each side give different error messages.

One side says "Failed waiting for data. Error 195887179. Connection reset by peer." and checking further logs we can see "2475: Could not find MemXferFS region for /vmfs/volumes/..."

The other side says "A fatal internal error occurred. See the virtual machine's log for more details." and in logs we can see "FSS: 7418: Failed to open file 'hpilo-d0ccb4'; Requested flags 0x5, world: 2099807 [amsd], (Existing flags 0x5, world: 2099684 [sut]): Busy"

Question 2: If the answer to question 1 is that it is supported, any idea what is missing?

p0shkar  Best Answer

This was solved. If anyone else has similar issues: it was due to a BGP MTU misconfiguration, which caused updates to come and go in VXLAN.
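
For anyone who wants to check the MTU side of a setup like this, here is a rough sketch of the arithmetic. The thread does not give the actual MTU values involved, so the numbers are generic: it assumes an IPv4 underlay with no extra tags (50 bytes of VXLAN overhead), and the interface name `vmk1` and peer address `192.0.2.10` are made-up placeholders. A default vmkping sends small packets, so it can succeed even when full-size frames are dropped; testing with the don't-fragment bit set (`-d`) and a full-size payload (`-s`) is more telling.

```python
# Sketch only: header sizes assume an IPv4 underlay with no extra tags.
INNER_ETHERNET = 14  # inner Ethernet header carried inside the tunnel
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20
VXLAN_OVERHEAD = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4  # 50 bytes

IPV4_HEADER = 20
ICMP_HEADER = 8


def required_underlay_mtu(vmk_mtu: int) -> int:
    """Smallest underlay IP MTU that carries a full-size overlay packet unfragmented."""
    return vmk_mtu + VXLAN_OVERHEAD


def max_ping_payload(vmk_mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet of vmk_mtu bytes."""
    return vmk_mtu - IPV4_HEADER - ICMP_HEADER


def vmkping_command(vmk_mtu: int, interface: str, peer: str) -> str:
    """Build a vmkping call with the DF bit set (-d) and a full-size payload (-s)."""
    return f"vmkping -I {interface} -d -s {max_ping_payload(vmk_mtu)} {peer}"


if __name__ == "__main__":
    # vmk1 and 192.0.2.10 are hypothetical placeholders, not values from this thread.
    for mtu in (1500, 9000):
        print(f"vmk MTU {mtu}: underlay needs at least {required_underlay_mtu(mtu)}")
        print("  test with:", vmkping_command(mtu, "vmk1", "192.0.2.10"))
```

If the vMotion vmkernel interface sits on the dedicated vMotion TCP/IP stack, vmkping also needs the netstack option (`-S vmotion`). If the full-size ping with `-d` fails while a plain vmkping works, something in the path has a smaller MTU than the overlay needs.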

JungHo Shin

Before proceeding with this PoC lab, is it correct to assume that there are no issues if we conduct it without using VXLAN?
I would also like to know about this. Please share the results once they are available.

p0shkar

Yes, it works when not using VXLAN, so that should rule out storage issues and ESXi issues.