ESXi

  • 1.  VMware ESXi network dropout

    Posted Nov 17, 2013 01:57 PM

    I am running a latency-sensitive application on a VM in vSphere 5.1. The TCP/IP transfer rate is 124 Mbit/s. I have an intermittent problem where the network throughput drops and then slowly climbs back to full speed.

    I have attached a diagram of performance data gathered from the ESXi host. The network speed drops from 124 Mbit/s to about 100 Mbit/s.

    I have followed this guide: http://www.vmware.com/files/pdf/techpaper/VMW-Tuning-Latency-Sensitive-Workloads.pdf

    Changing the network adapter from e1000 to VMXNET3 improved the performance. With e1000 the problem occurred every time; now it shows up only about 1 time in 10.

    I have also disabled virtual interrupt coalescing and LRO.



  • 2.  RE: VMware ESXi network dropout

    Posted Nov 17, 2013 02:53 PM

    Is VMware Tools itself up to date? Any errors on the physical switch side? This might also be a valid use case for VM DirectPath I/O, where you map a PCI device directly to a VM, but you lose some vMotion (and other advanced) functionality.



  • 3.  RE: VMware ESXi network dropout

    Posted Nov 17, 2013 04:27 PM

    Hi,

    Yes, I believe it is the latest version of VMware Tools. Initially I was running with a straight twisted-pair cable connected directly to the equipment.
    At the moment I have a small unmanaged switch between the equipment and the server.

    The ESXi host is installed as a standalone server and is not part of a data center, so vMotion is not needed in this case.

    I noticed also that

    Here is another example. The marker is placed where the red circle is.

    This diagram shows the data receive rate.

    This diagram shows the receive packet drops. It is interesting that the data rate drops at the same time as the packet drops occur.
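
    For anyone who wants to check this kind of coincidence on exported counters rather than by eye, here is a minimal sketch. The sample values are made up for illustration and are not taken from the diagrams above; the function name and the 5% tolerance are assumptions as well.

```python
# Hypothetical per-interval samples, NOT the values from the attached diagrams.
rx_rate_mbit = [124, 124, 101, 108, 117, 124]  # receive rate per sample, Mbit/s
rx_dropped = [0, 0, 35, 4, 0, 0]               # receive packets dropped per sample


def coinciding_dips(rates, drops, nominal, tolerance=0.05):
    """Return sample indices where the rate dipped below nominal AND packets were dropped."""
    threshold = nominal * (1 - tolerance)
    return [
        i
        for i, (rate, dropped) in enumerate(zip(rates, drops))
        if rate < threshold and dropped > 0
    ]


dips = coinciding_dips(rx_rate_mbit, rx_dropped, nominal=124)
print(dips)
```

    If most rate dips fall in the same samples as the drops, that supports the idea that the two are related, though it does not say which one causes the other.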

    Here is the disk latency diagram.

    I do have all the data collected from the net, mem, disk, cpu and system parts of the ESXi performance.

    The CPU peaks at about 360 MHz per core.

    It seems to me that the network is the problem, but I could be wrong.



  • 4.  RE: VMware ESXi network dropout

    Posted Nov 19, 2013 03:35 PM

    I was unaware of the VM DirectPath I/O option. If increasing the RX ring buffer on the VM does not help, this will be my next step.



  • 5.  RE: VMware ESXi network dropout

    Posted Nov 19, 2013 03:39 PM

    It could be a lifesaver in your case, where low latency and high throughput are requirements. You map an entire vmnic to a VM, so make sure your host has at least 2 NICs so that you can keep managing the host itself.



  • 6.  RE: VMware ESXi network dropout

    Posted Nov 18, 2013 12:55 AM

    There are many ESXi/ESX host components that can contribute to network performance.

    Validate that each troubleshooting step below is true for your environment. The steps provide instructions or a link to a document, for validating the step and taking corrective action as necessary. The steps are ordered in the most appropriate sequence to isolate the issue and identify the proper resolution. Please do not skip a step.

    1. Verify that the latest version of VMware Tools is installed in the virtual machines. For more information, see Verifying a VMware Tools build version (1003947) and Overview of VMware Tools (340).

    2. VMware recommends using multiple NICs on the associated virtual switch to increase the overall network capacity for portgroups that contain many virtual machines or several virtual machines that are very active on the network. For more information, see NIC teaming in ESXi and ESX (1004088).

    3. Verify the speed and duplex settings of the installed network adapters. For more information, see Configuring the speed and duplex of an ESX / ESXi host network adapter (1004089).

    4. Verify that the portgroup and virtual switch are not configured for promiscuous mode. For more information, see Configuring promiscuous mode on a virtual switch or portgroup (1004099).

    5. Verify the integrity of the physical network adapters. For more information, see Verifying the integrity of the physical network adapter (1003686).

    6. Verify that your host is not overloaded. Networking relies on available processor resources. If the CPUs on the host are being used at capacity, network performance suffers.

    7. Verify that you have chosen the appropriate network driver for your virtual machine based on your needs. For more information, see Choosing a network adapter for your virtual machine (1001805).

    If your problem still exists after trying the steps in this article:

    1. Gather the VMware Support Script Data. For more information, see Collecting diagnostic information for the vSphere Client or VMware Infrastructure Client (1003687).
    2. File a support request with VMware Technical Support and note this KB Article ID (1004087) in the problem description. For more information, see How to Submit a Support Request.

    Best regards



  • 7.  RE: VMware ESXi network dropout

    Posted Nov 18, 2013 08:39 PM

    Hi,

    I have performed steps 1-4. There has been an improvement, at least in the diagram: the transfer rate looks much more stable.

    In addition to NIC teaming I am also running dual vNICs. I read about that in this article: http://www.confio.com/vm-resources/vmware-tips/vmware-host-dropped-packets/

    In the diagram below you can see a normal fully working transmission to the left and a faulty one to the right.

    My next step is to try to increase the RX ring buffer in the Linux guest: "ethtool -G <interface> rx 4096"
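
    A minimal sketch of how that change could be scripted inside the guest, assuming ethtool is installed and the script runs as root. The helper names and the interface name "eth0" are placeholders, not something from this thread; `ethtool -g` reads the current ring settings and `ethtool -G` sets them.

```python
import subprocess


def ring_query_cmd(interface):
    """Build the ethtool command that shows current and maximum ring sizes."""
    return ["ethtool", "-g", interface]


def ring_set_cmd(interface, rx_entries):
    """Build the ethtool command that sets the RX ring size."""
    return ["ethtool", "-G", interface, "rx", str(rx_entries)]


def resize_rx_ring(interface, rx_entries):
    """Set the RX ring size, then return the ring settings ethtool reports afterwards."""
    subprocess.run(ring_set_cmd(interface, rx_entries), check=True)
    result = subprocess.run(
        ring_query_cmd(interface), check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the guest's actual interface name.
    print(resize_rx_ring("eth0", 4096))
```

    The driver rejects values above the pre-set maximum shown by `ethtool -g`, and a setting made this way typically does not survive a reboot unless it is also added to the guest's network configuration.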

    I will also try VisualEsxtop to see whether I can capture any DRPRX counts.



  • 8.  RE: VMware ESXi network dropout

    Posted Nov 19, 2013 12:11 AM

    I'm glad I could help. VisualEsxtop can help you a lot as well.

    Good luck, and let me know if you need any more help.

    Best regards

    Yours, Oscar



  • 9.  RE: VMware ESXi network dropout

    Posted Nov 19, 2013 06:03 AM

    The problem with those data drops is that the server does not manage to empty the equipment's ring buffer in time, so unread data is overwritten and lost.

    The ring buffer only lasts for 2 seconds in the equipment.
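
    As a rough sanity check on what that means, here is a small calculation. It assumes the equipment fills its buffer at the same 124 Mbit/s mentioned at the start of the thread, which is an assumption and not a measured figure; the function name is just for illustration.

```python
def buffered_bytes(rate_mbit_per_s, seconds):
    """Bytes produced in `seconds` at `rate_mbit_per_s` (1 Mbit = 1,000,000 bits)."""
    return rate_mbit_per_s * 1_000_000 * seconds / 8


# Assumed fill rate of 124 Mbit/s and the stated 2 seconds of buffering.
size = buffered_bytes(124, 2)
print(f"{size / 1_000_000:.0f} MB")
```

    Under that assumption the buffer holds about 31 MB, and any stall on the receiving side that lasts longer than 2 seconds will lose data.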