Hi experts,
I have a problem with latencies and need some help.
My setup:
I have a blade chassis with several blade servers from Fujitsu. The chassis contains two 10GbE switches with multiple external ports.
For the ESXi 5 hosts this means that each server has two 10GbE NICs.
Each blade switch has one 10 Gbit uplink to our LAN (each to a different Cisco switch, in a stack configuration).
As shared storage we have an EMC CX4 120 (latest FLARE code). Each storage processor (SP) has two ports and is connected via iSCSI to both blade switches: port A goes to blade switch 1 and port B goes to blade switch 2. So overall it's a fully redundant setup, and everything is working fine so far.
I think I can describe my problem best with our backup scenario (using Veeam B&R 6).
Veeam is running as a VM and has some dedicated RAID groups (SATA storage) as its backup repository on the EMC system.
Now when I run a job that backs up a VM located on "LUN-A" (FC 15k storage) to "LUN-C" (SATA storage - backup repository), I see high write latency on LUN-C. Of course that's normal, because the slow SATA storage is the bottleneck in this setup, so its write performance is used at nearly 100% all the time (which is not a problem for us, because all the jobs are still fast enough for our environment).
Now, what I don't really get is why I also see high write latencies on all my other (around 20) LUNs connected to the ESXi hosts. The latency there is not as high as on the backup LUN (~100 ms avg), but it still increases to ~30-40 ms avg (with much higher peaks). These LUNs are physically completely separate (different RAID groups, different spindles).
Given that I am only getting a write throughput of 40 MB/s on my slow SATA RAID 6, I don't understand why the whole environment is affected. Any ideas where I can start troubleshooting?
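To put that number in perspective, here is a rough back-of-the-envelope check. It is only a sketch: it assumes the nominal 10 Gbit/s line rate, decimal megabytes, and no iSCSI/TCP overhead. It shows what share of a single 10GbE link the backup stream would occupy:

```python
# Rough check: what share of one 10GbE link does the backup stream use?
# Assumptions (not measured): nominal line rate, decimal units, no protocol overhead.
LINK_GBIT_PER_S = 10   # nominal 10GbE line rate
BACKUP_MB_PER_S = 40   # observed write throughput to the SATA RAID 6 LUN

# Convert Gbit/s to MB/s: 10 Gbit/s * 1000 Mbit/Gbit / 8 bit/byte = 1250 MB/s
link_mb_per_s = LINK_GBIT_PER_S * 1000 / 8
utilization = BACKUP_MB_PER_S / link_mb_per_s

print(f"Link capacity: {link_mb_per_s:.0f} MB/s")
print(f"Backup stream uses {utilization:.1%} of one link")
```

So on paper the backup traffic is only a few percent of one link, which is why I don't expect raw network bandwidth to be the limit.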
I already checked the load on the storage processors, and they are far from being overloaded.
I see the same behaviour when I clone a VM, for example. There, too, I see high latency on all my LUNs and not just on the LUNs involved in the clone process (source and target).
Thanks for any tips!
Regards,
Peter