Hi,
We have a strange problem with storage latency.
Here is roughly our system:
Host:
- ESXi 7.0.3, build 21424296
- Intel Ethernet Controller 10 Gigabit X540-AT2
- ixgben 1.15.1.0
Storage:
- Infortrend DS 1024REB2
- 9x SSD, RAID 6
Connected via iSCSI, one LUN, jumbo frames not enabled.
The storage itself is connected to a switch with a 20 Gbit port channel (2x 10 Gbit).
The host was also meant to be connected to the switch with a 20 Gbit port channel, but the Essentials license does not allow port aggregation (surprise!), so two separate 10 Gbit links are used on the host side instead.
After initial tests we migrated one VM to the new storage and noticed the latency for the first time.
I tried to copy files from a share to the VM and noticed severe lag.
I tried to copy files directly via the host (WinSCP) to the new storage volume and again saw severe lag.
I then added a new drive to the VM on the new storage and copied some 56 GB of data there.
Then I copied the 56 GB of data to a separate drive on the same VM, and something really strange (to me) happened:
The copy ran fine, at around 800 MB/s and 6000 IOPS combined on the storage.
But then _after_ the copy had finished, severe lag occurred, lasting many minutes. During this period the VM was nearly unusable.
I repeated that test a few times, same result.
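As a rough sanity check on those figures (assuming decimal units, and that the 800 MB/s and 6000 IOPS held for the whole copy), the copy itself should only take a bit over a minute, so a lag lasting many minutes is considerably longer than the transfer that caused it:

```python
# Figures quoted above; decimal units assumed (1 GB = 1000 MB, 1 MB = 1000 KB).
data_gb = 56
throughput_mb_s = 800
iops = 6000

# Time the copy should take at the observed throughput.
copy_seconds = data_gb * 1000 / throughput_mb_s

# Average size of a single I/O implied by throughput and IOPS.
avg_io_kb = throughput_mb_s * 1000 / iops

print(f"expected copy time: {copy_seconds:.0f} s")   # 70 s
print(f"average I/O size:   {avg_io_kb:.0f} KB")     # 133 KB
```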
See picture:
I checked with esxtop, and after the copy action, the DAVG values spiked too.
Looks like an "echo" of the previous operation.
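In case someone wants to look at the same data: a minimal sketch for pulling the DAVG spikes out of an esxtop batch capture (e.g. taken with `esxtop -b -d 2 -n 300 > esxtop.csv`). It assumes DAVG shows up as columns whose name ends in "Average Driver MilliSec/Command"; the function name and the 20 ms threshold are just placeholders for illustration.

```python
import csv
import io

# Assumption: DAVG appears in esxtop batch output under this counter label.
DAVG_LABEL = "Average Driver MilliSec/Command"


def davg_spikes(csv_text, threshold_ms=20.0):
    """Return (timestamp, column, value) for each DAVG sample above threshold_ms.

    threshold_ms is an arbitrary illustrative default, not a recommended limit.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return []

    # Indices of all columns that carry a DAVG counter.
    davg_cols = [i for i, name in enumerate(header) if name.endswith(DAVG_LABEL)]

    spikes = []
    for row in reader:
        for i in davg_cols:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip empty or malformed samples
            if value > threshold_ms:
                # The first column of each row is the sample timestamp.
                spikes.append((row[0], header[i], value))
    return spikes
```

Reading the capture with `open("esxtop.csv").read()` and passing the text to `davg_spikes` lists when, and on which adapter or device, the latency stayed high after the copy had ended.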
What is going on?
And how do I fix it?
Thanks for your help.