VMware vSphere


 Network speed on external traffic

Marcus Erfurth posted Oct 01, 2024 03:08 PM

Hi folks,

I am having a strange issue with my network bandwidth.

This is my setup.

  • 2-node ESXi 8 U3 cluster.
  • Each node has multiple physical network cards going to different subnets (2x 10 Gbit network 1, 2x 10 Gbit network 2, 2x 100 Gbit network 3).
  • Everything is set up on a dvSwitch.
  • All guest VMs are Windows Server 2022 using VMXNET3 interfaces to each of my 3 physical networks.

Now I am performing network bandwidth tests using iperf3.

Those are my results:

Test 1: iperf directly on the ESXi CLI from ESXi 1 to ESXi 2 => 10 Gbit/s as expected

Test 2: iperf from VM 1 to VM 2 residing on the same host => 10 Gbit/s as expected

Test 3: iperf from VM 1 to VM 2 NOT residing on the same host => only 2.5 Gbit/s instead of the expected 10 Gbit/s

Test 4: iperf from VM 1 to ESXi 2 (again, the VM is not on this ESXi) => only 2.5 Gbit/s instead of the expected 10 Gbit/s

=> So as long as VM traffic stays within one ESXi host, I get the full 10 Gbit/s, but as soon as it leaves the host, I only get 2.5 Gbit/s. It is like a hidden boundary that does not let me go over 2.5 Gbit/s.
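
For reference, this is roughly how I ran the tests; the address is just a placeholder for whichever receiver I was testing against:

```
# On the receiving side (ESXi host or VM): start the iperf3 server
iperf3 -s

# On the sending side: a single TCP stream to the receiver for 30 seconds
iperf3 -c 192.168.10.22 -t 30
```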

I did several more tests, and they show:

=> It does not matter which VMs I try, it is always the same

=> It does not matter if I run iperf against another client in the network (outside my VMware cluster), again only 2.5 Gbit/s

=> Changed to a normal vSwitch, again only 2.5 Gbit/s

=> Even passed the physical 100 Gbit/s network card through to my VM, again only 2.5 Gbit/s

=> Increased the virtual speed of my VMXNET3 card in the vmx file, again only 2.5 Gbit/s

=> Enabled network reservation in the VM settings, again only 2.5 Gbit/s

What am I doing wrong? Is there some hidden setting that throttles my bandwidth as soon as network traffic leaves the VM and the ESXi host?

I did not change any settings in my cluster apart from enabling HA and DRS, no network reservations enabled, etc.

Thx,

Marcus

NickDaGeekUK

Hi Marcus,

You are not the first person to have reported this issue, and it started long before 8.x.

see here and here

but basically it is known but not discussed.

Re: Is the vmkernel network limited to 40% for backup traffic?

Post by Gostev » 

Sorry but we're bound by NDA on our communications with their development and product management.

Try asking on their forums about this issue, may be my counterpart there would be willing to share their plans.
This is a known limitation and trust me, they are well aware of it... they obviously do their own tests too!

Re: Is the vmkernel network limited to 40% for backup traffic?

Post by MAA » 

I have already asked but did not receive an answer
https://communities.vmware.com/t5/ESXi- ... -p/2879040 my limit is 20% for vmkernel network
One thought I have is to ask about your external physical network connections: is the 2x 100 Gbit network 3 also where your ESXi management NIC lives? Even if you have different IP subnets for your ESXi management traffic and VM traffic, and they are on different dvSwitches in the host, by default you only have one TCP/IP stack, and the default route is on your management NIC.

How are the two ESXi hosts connected to each other for data transfer (Layer 2 or Layer 3 switches externally), and are you using a LAG on your external switches? If so, what type, static or dynamic? I ask because I had issues that I believe were related to the way my external Layer 3 switches were routing traffic (due to the default TCP/IP stack in ESXi and inter-VLAN routing on our switches). Note: the hashing algorithm set on the LAG inside the vSwitches must match the external LAG on your physical switches if you are using a LAG. I believe a Cisco static LAG requires IP hash routing ("Route Based on IP Hash") on the vSwitch LAG.

What I did was to set up a custom TCP/IP stack and apply it to a VMkernel adapter on the VM switch, so that the default route of that stack used the external VLAN interface's static IP as the gateway. It seems to have resolved a very long delay in traceroute resolution of the second hop between hosts. Prior to that I am fairly certain the external Layer 3 switches were routing the VM traffic via the management default gateway, and the switches were loading up doing inter-VLAN routing they didn't need to do.
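
If you want to try something similar, this is roughly the esxcli sequence involved; the stack name, vmk number, port group and addresses below are only examples, and a VMkernel port on a dvSwitch needs slightly different options:

```
# Create a custom TCP/IP stack (name is just an example)
esxcli network ip netstack add -N vmTrafficStack

# Add a new VMkernel interface on that stack, attached to the VM-traffic port group
esxcli network ip interface add -i vmk2 -p "VM-Traffic" -N vmTrafficStack

# Give it a static address, then point the default route of that stack at the external VLAN gateway
esxcli network ip interface ipv4 set -i vmk2 -t static -I 192.168.20.11 -N 255.255.255.0
esxcli network ip route ipv4 add -n default -g 192.168.20.1 -N vmTrafficStack
```
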
mviel

We have seen similar results to yours. Maybe test the speed between two VMs at the OS layer with a non-Windows guest.

Windows seems to be the limitation here, even in the 2022 version. I have seen some solutions via Google but can't give you concrete advice. If your environment is RDMA-ready, that might be a solution (it needs additional configuration, though).

Otherwise I would look on the Windows side for solutions on fast network cards and performance tuning:

https://learn.microsoft.com/en-us/windows-server/networking/technologies/network-subsystem/net-sub-performance-tuning-nics
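
For example, one of the things that guide covers is Receive Side Scaling; you could check in the guest whether it is actually enabled on the VMXNET3 adapter (the adapter name below is just an example):

```
# Show whether RSS is enabled on the adapter and how receive work is spread across processors
Get-NetAdapterRss -Name "Ethernet0"

# Enable it if it turns out to be disabled
Enable-NetAdapterRss -Name "Ethernet0"
```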



NickDaGeekUK

Hi Marcus,

I would tend to agree with mviel that Windows-side tuning of the VMXNET3 adapter is a good idea. Have a look at these two articles:

VMXNET3 RX Ring Buffer Exhaustion and Packet Loss – vswitchzero

windows - VMXNET3 receive buffer sizing and memory usage - Server Fault

Marcus Erfurth

Hey guys,

thx for your responses.

If Windows were the bottleneck, then why does my Test 2 (VM 1 to VM 2 on the same host) hit the expected 10 Gbit/s? That wouldn't make sense to me, but maybe I am overlooking something.

Also, I played around a bit more with iperf, and when I use it with multiple streams (e.g. -P 5) I do get the full 10 Gbit/s bandwidth.
To be honest, I don't know what to make of this. Is it somehow connected to my CPU and vCPUs? Maybe one vCPU can only handle 2.5 Gbit/s of network traffic?
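
For completeness, this is the comparison I ran (the address is a placeholder for the receiving VM):

```
# Single TCP stream: stuck at about 2.5 Gbit/s when crossing hosts
iperf3 -c 192.168.10.22 -t 30

# Five parallel streams: the combined throughput reaches the full 10 Gbit/s
iperf3 -c 192.168.10.22 -t 30 -P 5
```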

Also, it is all Layer 2 networking, no routing in between. All port groups are on the same vDS, but as I said, I also tried standard switches and that made no difference.

Marcus

NickDaGeekUK

Hi Marcus,

Thanks for the updated information. 

With regard to your statement that it is all Layer 2 networking: I understand that for your vSwitches and the networking internal to your hosts. However, you did not say whether that includes your external physical switch (or switches, if more than one), and whether you are using VLANs on your external switches. I ask because even at Layer 2 it is possible to set up VLANs, and they will effectively isolate your vSwitch traffic externally. I suspect you are not using external VLANs, but I could be wrong.

OK, the 10 Gbit/s internally (between VMs on the same vSwitch and port group) is to be expected; that traffic uses an internal virtual link, and VMXNET3 adapters report a 10 Gbit/s link by default.

The results you get with iperf when using multiple streams are also very interesting. It makes me think there is an efficiency or latency issue in your network.

The results you are getting indicate that you only hit the limit when traffic exits the vSwitch through the physical NIC onto your external switch. The fact that you can override that limitation by using multiple streams is important.

Can you explain what configuration you are using for the physical NICs connected to your vSwitches? Are you using teaming or failover in ESXi, and if you are teaming the vSwitch physical NICs, which load balancing are you using: Route Based on IP Hash, Route Based on Source MAC Hash, Route Based on Originating Virtual Port, or Explicit Failover Order? If you are using teaming internally and a LAG externally, is the external LAG static or dynamic, and do you have stacked external switches? All of these factors are related to each other. For example, if you use teaming internally in ESXi and an external LAG, then best practice is Route Based on IP Hash as the load balancing towards a static external LAG, and if you are using stacked switches, the stacking link must support the LAG.
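
A quick way to read the current policy from the host itself is shown below; the vSwitch and port group names are just examples, and for a dvSwitch the teaming policy is visible in vCenter instead:

```
# Teaming / load-balancing policy of a standard vSwitch
esxcli network vswitch standard policy failover get -v vSwitch0

# Teaming policy of a specific port group (overrides the vSwitch policy if set)
esxcli network vswitch standard portgroup policy failover get -p "VM Network"
```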

The fact that multiple streams reach full bandwidth in iperf suggests a load balancing or hashing issue relative to the external switch configuration.

Also, I find there is no harm in tweaking the Rx Ring #1 and Small Rx Buffers settings of the VMXNET3 adapter in Windows. It improves performance considerably; as you suggest, it has a small impact on memory and vCPU, but I have found the benefits outweigh the cost. The vswitchzero link I sent you has more information and is worth reading.
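
Roughly what that looks like from an elevated PowerShell prompt inside the guest; the adapter name is an example and the values are the maximums the vmxnet3 driver exposes, so adjust them to your environment (applying the change briefly resets the adapter):

```
# Show the current VMXNET3 advanced properties related to receive buffers
Get-NetAdapterAdvancedProperty -Name "Ethernet0" |
    Where-Object { $_.DisplayName -match "Rx Ring|Buffers" }

# Increase the first receive ring and the small receive buffers
Set-NetAdapterAdvancedProperty -Name "Ethernet0" -DisplayName "Rx Ring #1 Size" -DisplayValue 4096
Set-NetAdapterAdvancedProperty -Name "Ethernet0" -DisplayName "Small Rx Buffers" -DisplayValue 8192
```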

Marcus Erfurth

Hi Nick,

There are VLANs in place on the external switches, but my physical network cards are connected to access ports (no VLAN tag).

No LACP (LAG) in place, only the load balancing policy "Route Based on Originating Virtual Port".

Also, I don't believe it is my external network setup that causes this, because iperf from ESXi 1 to ESXi 2 over the same network reaches 10 Gbit/s straight away, without the need for multiple iperf streams.

Anyway, I will try to set up a Linux-based VM to see the outcome there...

Marcus Erfurth

So, I quickly installed a CentOS Linux VM and ran iperf, and what can I say: I get 10 Gbit/s out of the box.
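
For reference, nothing special was needed on the Linux side; the address is a placeholder for the receiving VM:

```
# Install iperf3 on both CentOS VMs (dnf on CentOS 8/Stream, yum on CentOS 7)
dnf install -y iperf3

# Receiving VM
iperf3 -s

# Sending VM: a single stream already reaches ~10 Gbit/s
iperf3 -c 192.168.10.31 -t 30
```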

So it really is Windows causing the problem here.

But I still don't understand it, because when running iperf between two Windows VMs on the same host, I get 10 Gbit/s; only when they reside on two different hosts do I get 2.5 Gbit/s.

So confusing...