vSphere Storage Appliance

  • 1.  vSphere 4 host nfs mount very slow performance

    Posted Aug 09, 2010 11:06 PM


    In order to resolve a reported slowness issue, we've been running some tests on the environment. Copying a 75 MB file locally (i.e. on the vmdk) or remotely (scp) on a Linux VM takes 2 seconds. However, copying locally or remotely to an NFS-mounted volume (from a NetApp filer) on the same Linux VM takes 45 seconds. Copying from another physical host to the same NFS-mounted volume on the same host takes around 3 seconds. We ran strace and a sniffer on the host side and found that the transfer is very fast initially (2 seconds), but once it reaches 99% it waits for the close/acknowledge for the remaining time.

    Has anyone experienced this before, or does anyone know what's going on? Any further advice or solutions we could try?


  • 2.  RE: vSphere 4 host nfs mount very slow performance

    Posted Aug 10, 2010 10:08 AM

    Check that the NFS export is set to "async", not "sync", if that setting is available on the NetApp box.
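    On a Linux NFS server the async flag lives in /etc/exports; a NetApp filer uses its own export syntax, so treat this as a generic sketch (the path and network below are hypothetical):

    ```shell
    # /etc/exports entry on a hypothetical Linux NFS server.
    # "async" lets the server acknowledge writes before they reach stable
    # storage, trading crash safety for throughput.
    echo '/vol/data 192.168.1.0/24(rw,async,no_subtree_check)' >> /etc/exports
    exportfs -ra               # re-read /etc/exports
    exportfs -v | grep async   # confirm the export is actually async
    ```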

    Do you use the same vSwitch and physical NIC for the virtual machine (Linux) where the data comes in and for the NFS connection?


    VCP 3 & 4



  • 3.  RE: vSphere 4 host nfs mount very slow performance

    Posted Aug 15, 2010 03:37 AM

    NFS has always been slow on ESX, usually unbearably so. 20–22 MB/s is usually the best you can expect (even after enabling 'async' on the NFS server in some cases). I believe the fastest I've got it to work is around 35 MB/s. Right now, it's back to 22 MB/s, even with async transfers.

    Whenever I run into NFS issues on Linux, I set 'timeo' to 10 (timeo=10); timeo is in tenths of a second, so that's a 1-second retransmit timeout. That usually brings the speed up to 75–90 MB/s, from a typical 45 MB/s on NFSv4. I've played around with low and high timeo settings and 10 seems to work best. Unfortunately, timeo is a client-side mount option, so I don't think it can be used with ESX.
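    A minimal sketch of that client-side mount (server name and paths are hypothetical):

    ```shell
    # Mount an NFS export with an aggressive retransmit timeout.
    # timeo=10  -> retransmit after 1.0 s instead of the TCP default of
    #              600 (60 s), per nfs(5)
    # retrans=3 -> retry three times before reporting "server not responding"
    mount -t nfs -o rw,hard,timeo=10,retrans=3 filer:/vol/data /mnt/data

    # Verify the options the kernel actually applied:
    grep ' /mnt/data ' /proc/mounts
    ```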

    More info on timeo (second paragraph):

    Btw, I've also experienced high initial transfers for the first few seconds, then a general slowdown due to retransmissions. That's what timeo=10 tends to fix.

    UPDATE: I just ran some tests and I'm getting 50.1 MB/s (dd reports decimal megabytes; in base 2 that's 47.8 MiB/s). CIFS is 67.7 MB/s (64.6 MiB/s in base 2).
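    The decimal-to-binary unit conversion above is just MB × 10^6 / 2^20; a quick check:

    ```shell
    # dd reports decimal MB/s; convert to base-2 MiB/s (1 MiB = 1048576 bytes)
    awk 'BEGIN { printf "%.1f\n", 50.1 * 1000000 / 1048576 }'   # NFS figure
    awk 'BEGIN { printf "%.1f\n", 67.7 * 1000000 / 1048576 }'   # CIFS figure
    ```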