
IOMeter Performance and PS6010 10Gb

  • 1.  IOMeter Performance and PS6010 10Gb

    Posted Feb 21, 2011 03:14 AM

    Hi,

    We have a PS6010vx, RAID 50, 16 x 600 GB 15K SAS, connected to 2 x Dell 8024F 10Gb switches, and 4 x Dell R710 ESXi 4.1 hosts running dual-port Broadcom 57711 NICs. Over the last month or so we experienced a bit of pain due to the drivers for the NICs. That has been resolved and the system is stable, but I am not getting the IOMeter numbers I would expect. I am a bit reluctant to use MEM until I am convinced there are no issues with the system or the network.

    Just confirming: jumbo frames are enabled all the way through, including the LAG between the switches; all iSCSI optimisation is disabled; flow control is on; PortFast is enabled; storm control is disabled. (We are using the software initiator due to a Broadcom issue when running jumbo frames and TOE together.)
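
    On the ESXi side, the quick way I know of to double-check the jumbo frame settings from the Tech Support shell is roughly the following (list commands only, nothing is changed):

    esxcfg-vswitch -l    # shows the MTU configured on each vSwitch
    esxcfg-vmknic -l     # shows the MTU on each vmkernel port (the iSCSI ones should read 9000)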

    Currently the ESXi servers are running Round Robin. Using IOMeter 2006 we are getting approx 190 MB/s for 100% sequential read at 32K blocks and around 300 MB/s for 50% read/write. We are achieving around 9,000 IOPS and latency is around 4-5 ms.

    We have had various people look at the system but no real answers at the moment.

    I have compared it to other SANs I have installed at client sites. The IOPS and latency are fine, but the throughput is where I would expect more.

    E.g. an IBM 3512 split into 6-HDD and 5-HDD RAID 5 arrays over 8 Gb Fibre Channel: I was getting around 600 MB/s for the same IOMeter test. The increase for the 50% read/write sequential test wasn't as dramatic, around 400 MB/s compared to 300 MB/s. I have had similar results with EVA 4100 and 4400 arrays.

    I know that for the EVA's RR implementation the best practice was to change the RR IOPS value:
    esxcli nmp psp setconfig --device <device> --config "policy=iops;iops=1"
    There may be a similar setting for the 6010; I have seen the IOPS setting changed to 9 unofficially, but nothing official.
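
    If anyone wants to experiment, the per-device form I am aware of on ESXi 4.1 looks something like the following (a sketch only; the naa ID is a placeholder for one of your own EqualLogic volumes, and I have seen both 3 and 9 suggested as the IOPS value):

    esxcli nmp device list    # find the EqualLogic volume IDs (they start with naa.6090a0 here)
    esxcli nmp roundrobin setconfig --device naa.6090a0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx --iops 3 --type iops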

    It would be great if anyone else could post their IOMeter results using a PS6010vx or a similar 10Gb environment, with and without MEM.

    Regards,

    Joe



  • 2.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 21, 2011 08:37 AM

    Please post some info on the methodology, for example the guest OS used, the VMDK config (thin/thick/size/SCSI type), the EqualLogic provisioning type, and the IOMeter configs for the random and sequential tests (outstanding IOs, IO size, test size).

    To get some comfort on the network config, have a look in SAN HQ, as there is a TCP retransmit graph in there.

    Just covering all bases here, but is the LAG also running 10GbE?



  • 3.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 21, 2011 08:40 AM

    I would actually think the MEM is going to help with performance, as it offloads some of the overhead to the SAN and the NICs versus ESX doing the work.

    I just implemented a PS4000xv this weekend with 16 x 300 GB 15K SAS drives, and on a 1Gb link I am pushing 100 MB/sec with a single NIC doing the work (as I am Storage vMotioning a server to the SAN). These boxes are also NOT using the MEM yet (as I need to get the VMs onto the SAN, then upgrade them to ESX 4.1, and then put on the MEM).

    I ran this command to change my IOPS setting to 9 (although from what I've read I think 3 is recommended):

    esxcli nmp roundrobin setconfig -d naa.6090a09820f874e7f3d5c4000000e0bd --iops 9 --type iops


    naa.6090a09820f8f4eaf3d5f4000000003b
        Device Display Name: EQLOGIC iSCSI Disk (naa.6090a09820f8f4eaf3d5f4000000003b)
        Storage Array Type: VMW_SATP_EQL
        Storage Array Type Device Config:
        Path Selection Policy: VMW_PSP_RR
        Path Selection Policy Device Config: {policy=iops,iops=9,bytes=10485760,useANO=0;lastPathIndex=0: NumIOsPending=32,numBytesPending=2097152}
        Working Paths: vmhba35:C0:T2:L0
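
    (For reference, that per-device output is what you get from the command below, which is a quick way to confirm the policy/iops values actually stuck on each volume:)

    esxcli nmp device list    # lists every LUN with its current PSP and Round Robin settings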

    Again, I haven't pushed it too hard yet, but I know that sending another Storage vMotion to a different LUN jumped me to about 1,400 (from 1,200 with a single Storage vMotion happening). I am only using about 40% of the storage network at the moment, but 80% of that traffic is a single NIC getting absolutely hammered by this Storage vMotion (which is taking forever at 350 GB).



  • 4.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 21, 2011 11:48 AM
    Switches 8024f
    • The uplink between the switches is 10Gb with MTU set to 9216, as are all the iSCSI connections
    • No Errors appear on any ports
    • Running Latest firmware
    • No Storm Control
    • Flow Control On
    • PortFast enabled on all iSCSI ports
    VM1
    • Windows 2008 R2
    • Single Vdisk 40GB in size
    • Thick provisioned, virtual hardware version 7
    • 1 vCPU
    • 4GB Ram
    • 1 x vNIC E1000
    VM2
    • Windows 2003
    • Single Vdisk 20GB in size
    • Thick provisioned, virtual hardware version 7
    • 1 vCPU
    • 2GB Ram
    • 1 x vNIC
    Physical Servers x 4
    • Dell R710
    • 2 x Broadcom 57711 (dual port, 1 port from each connecting to the iSCSI fabric) - running the latest drivers from the VMware website, 1.60...
    • Using software iSCSI
    • 2 x quad-core 5540
    • 60 GB RAM
    • ESXi 4.1 with all the latest updates
    • Running RR
    • MTU set to 9000
    San
    • EqualLogic PS6010vx
    • Dual controller - connections separated between the switches
    • All ports connected at 10Gb full duplex
    • Five LUNs created
    • 16 x 600 GB 15K, RAID 50 (2 hot spares)
    • Firmware 5.03 (latest version)
    • Confirmed all connections are visible on the SAN
    • Currently using SAN HQ to observe the speed and connections (all traffic appears balanced between NICs)
    • We used SAN HQ to pin down the driver issues that were causing the iSCSI disconnections
    IOMeter 2006 and 2008
    Downloaded the template used on this forum
    Test: 100% Read, 32K, 100% Seq, for 5 minutes
    • Read IOPS - 6,090
    • Write IOPS - 0
    • Total IOPS - 6,090
    • Write MB/s - 0
    • Read MB/s - 190
    • Total MB/s - 190
    Test: 50% Read/Write, 32K, 100% Seq, for 5 minutes
    • Read IOPS - 5,056
    • Write IOPS - 5,059
    • Total IOPS - 10,115
    • Write MB/s - 158
    • Read MB/s - 158
    • Total MB/s - 316
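    As a rough sanity check on those numbers (taking 32K as 32,768 bytes): 6,090 IOPS x 32,768 bytes ≈ 199,560,000 bytes/s, which works out to the ~190 MB/s reported (IOMeter appears to count 1 MB as 1,048,576 bytes), so the IOPS and throughput figures are at least consistent with each other.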
    Tests were taken when no other VMs were running.
    Ran the test on different servers to eliminate the server itself.
    I am looking to set up a 2 x 10Gb uplink to see if this makes a difference. I am leaning towards either the Broadcom drivers or the cards themselves.


  • 5.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 21, 2011 08:44 PM

    I just reviewed my environment after some sleep and realized I was only showing a single path to the storage (which is about 100 MB/sec).

    What I had pretty much missed... was, well, everything, I think:

    #1 - needed to add a second vmkernel port with jumbo frames

    #2 - ensure that only one vmnic is bound (active) on each vmkernel port

    #3 - ensure the opposite adapter is set to unused for each vmkernel port

    #4 - bind the vmkernel ports to the software iSCSI adapter

    It was easy to get up and running, but I missed the 1:1 multipathing mapping... attached is the latest TR that was sent to me an hour ago from Dell.
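
    For anyone setting this up from the command line, the rough ESXi 4.1 equivalent of those steps is something like the following (a sketch only: the vSwitch/port group names, IPs and the vmhba number are placeholders, and the active/unused override in step 3 is done per port group in the vSphere Client under NIC Teaming):

    esxcfg-vswitch -m 9000 vSwitch1                                    # jumbo frames on the iSCSI vSwitch
    esxcfg-vswitch -A iSCSI01 vSwitch1                                 # one port group per physical NIC
    esxcfg-vswitch -A iSCSI02 vSwitch1
    esxcfg-vmknic -a -i 10.0.99.11 -n 255.255.255.0 -m 9000 iSCSI01   # vmkernel port with MTU 9000
    esxcfg-vmknic -a -i 10.0.99.12 -n 255.255.255.0 -m 9000 iSCSI02
    esxcli swiscsi nic add -n vmk1 -d vmhba35                          # bind each vmknic to the software iSCSI adapter
    esxcli swiscsi nic add -n vmk2 -d vmhba35
    esxcli swiscsi nic list -d vmhba35                                 # verify both vmknics show as bound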



  • 6.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 22, 2011 11:30 AM

    Thanks Rumple,

    We have a slightly different network setup to what the article indicates; we had used another Dell article for the PS6010, as shown below.

    iSCSI 1 - 1 x vSwitch (switch01), 1 x VMKernel Port(iSCSI01) and 1 x PNic

    iSCSI 2 - 1 x vSwitch (switch02), 1 x VMKernel Port(iSCSI02) and 1 x PNic

    I will make the changes to one of the ESXi servers and verify the performance.

    Thanks for your help so far.

    Joe



  • 7.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 23, 2011 01:58 PM

    Just an update: I have redone the iSCSI network on one of the servers according to the Dell documentation, i.e. moved both vmkernel ports onto one vSwitch. The results were still the same. After some further testing, I decided to use just one pNIC and confirm the speed. The result was almost double what I was getting with dual iSCSI NICs in Round Robin, though still on the slow side. I suspect the problem I am facing revolves around the way the physical switches are working. I will have to check the uplinks again and confirm no issues are occurring.

    I have confirmed I am not getting any dropped packets when pinging from within ESXi to the SAN at the largest jumbo packet size.

    E.g. ping -d -s 8972 x.x.x.x
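
    (The vmkernel-level version of the same test, for anyone repeating it, is along these lines; -d sets don't-fragment so an MTU mismatch anywhere in the path shows up as a failure rather than silent fragmentation:)

    vmkping -d -s 8972 x.x.x.x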

    Joe



  • 8.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 23, 2011 02:43 PM

    Make sure the ports are not in a channel-group but just configured as individual interfaces.



  • 9.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 23, 2011 03:16 PM

    All iSCSI ports are configured exactly as shown below

    interface ethernet 1/xg1

    description 'Server01-iscsi'
    spanning-tree cost 2000
    spanning-tree portfast
    spanning-tree mst 0 external-cost 2000
    mtu 9216
    switchport access vlan 99
    exit
    We have a trunk between the switches (in the middle of creating a 2 x 10Gb LACP trunk just for the iSCSI VLAN; waiting on cables):
    interface ethernet 1/xg24
    description 'TRUNK_between_Switch1_Switch2'
    mtu 9216
    switchport mode trunk
    switchport trunk allowed vlan add 1,99,101-103,110-118
    exit


  • 10.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 23, 2011 03:33 PM

    I think they recommend spanning tree be turned off for the ports facing endpoint devices, but leaving spanning-tree portfast on...

    Usually spanning tree is only really needed on uplinks to other switches... but it shouldn't really matter, I don't think...



  • 11.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 24, 2011 07:16 PM

    I am jumping in on this discussion because we have the same configuration and the same readings.

    I would be very interested in seeing how this gets resolved.

    See attached screenshot for my results



  • 12.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 24, 2011 11:07 PM

    I am currently working with Dell and Equallogic on this.

    Does anyone have a standard configuration for 2 x 8024F switches in an iSCSI environment that I can compare against? I need to make sure that I haven't missed anything.

    Regards,

    Joe



  • 13.  RE: IOMeter Performance and PS6010 10Gb

    Posted Feb 25, 2011 08:33 PM

    Here is what I did:

    1. Updated the drivers on the Broadcom Netextreme II 57711 10GB nic cards from the VMware website

    2. Changed the multipathing to RR from Fixed

    3. Checked that the vmkernel networking was set up as recommended (1 NIC per vmkernel port)

    Still, the slowness persisted. I tested deploying the same image on the Dell storage and on an EMC SAN; the results were 58 minutes on the EqualLogic compared to 15 minutes on the EMC.

    I then went ahead and:

    4. Installed the Dell MEM following the instructions and verified that the installation was successful, the MEM was enabled, and the default multipathing was set to DELL

    5. Retested the template deployment

    VOILA... the same template now takes 15 minutes to complete on the EqualLogic.

    I ran an IOMeter test with 32K 100% read and 32K 50% read/write, with 2 workers and 5 threads per worker, and got a much higher result compared to the one I posted earlier.

    I might be looking at the wrong numbers, BUT the MBps is way higher here. Please do let me know if I am reading it wrong.
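
    (For anyone else checking step 4, the quick way I know of to confirm the MEM actually took effect from the ESXi shell is something like the following; the exact PSP name is from memory, so check the MEM release notes for your version:)

    esxcli nmp psp list       # should now include the Dell PSP (DELL_PSP_EQL_ROUTED)
    esxcli nmp satp list      # the default PSP for VMW_SATP_EQL should point at the Dell PSP
    esxcli nmp device list    # each EqualLogic volume should show the Dell PSP as its Path Selection Policy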

    Thanks

    A



  • 14.  RE: IOMeter Performance and PS6010 10Gb

    Posted Mar 16, 2011 12:43 PM

    Hi,


    Just an update on the current situation: due to our current production environment we have purchased another SAN (a PS6010), 2 more 8024F switches, and two more servers (Dell R610s). This time, though, I have ordered Intel X520 10Gb cards instead of the Broadcom. The plan is to dedicate these switches to the iSCSI network and move the existing production switches to the data LAN only. (Currently the iSCSI and data traffic share the switches.)

    So we set up the lab environment and configured the switches per the Dell/EqualLogic support guidelines, and noticed that the speed issue still existed. After some troubleshooting we found that an MTU setting had been left off the trunk: it was applied to the member ports but not to the channel group (port-channel) itself. As soon as we added it, we saw a massive speed difference, up to 690 MB/s from the previous 200-300 MB/s, running the IOMeter 100% read test.


    The following is the switch configuration for the portchannel and interface configuration

    interface ethernet 1/xg1
    description 'san01-controller0-iscsi'
    spanning-tree cost 2000
    spanning-tree portfast
    spanning-tree mst 0 external-cost 2000
    mtu 9216
    switchport access vlan 99
    exit
    !
    interface ethernet 1/xg2
    description 'san01-controller1-iscsi'
    spanning-tree cost 2000
    spanning-tree portfast
    spanning-tree mst 0 external-cost 2000
    mtu 9216
    switchport access vlan 99
    exit
    !
    interface ethernet 1/xg19
    channel-group 1 mode auto
    spanning-tree portfast
    mtu 9216
    switchport mode trunk
    switchport trunk allowed vlan add 99
    exit
    !
    interface ethernet 1/xg20
    channel-group 1 mode auto
    spanning-tree portfast
    mtu 9216
    switchport mode trunk
    switchport trunk allowed vlan add 99
    exit
    !
    interface port-channel 1
    switchport mode trunk
    switchport trunk allowed vlan add 99
    mtu 9216
    exit

    From here we ran through various scenarios, tested failover, and logged the IOMeter readings to confirm no issues existed with the configuration for the following:

    1. Controller failure - ran the restart command.

    2. Switch failure - removed the power from one switch at a time.

    3. iSCSI NIC failure - disabled the NIC port on the switch.

    All of these passed without a single dropped packet to the VM, consistently running between 660 MB/s and 670 MB/s. IOMeter slowed down during the failover, levelled off, and then picked up again.

    The next test was to see whether MEM would provide any better results than what we were currently seeing. After installing the plugin and running through exactly the same IOMeter tests, the results were actually slower than not running MEM at all. I found this bizarre and ran through it several times with the same results.

    Moving forward, the next step was to test the R710 servers we had put into production; the real difference between the lab servers and the production servers was the Broadcom NICs. We removed one of the servers from the cluster and moved it into the lab environment. Immediately we saw a difference in performance compared to the Intel cards: IOMeter was running at approximately 220-250 MB/s. We then plugged the server directly into the SAN to eliminate any switch configuration issues and received the same result of around 250 MB/s. The VMware iSCSI environment was set up exactly the same way.

    My colleague then noticed that VMware had released a new Broadcom driver specifically for the 10Gb cards. (Note we had tried all previous drivers from VMware and Broadcom and none were successful; we had spoken to tech support from EqualLogic and Dell and had escalated it, all with no luck.) We applied the latest one, 1.62..., released 7 days ago, and presto: 662 MB/s. We have yet to go through all the test scenarios, but it appears to be running slightly slower than the Intel cards, yet faster than with MEM.
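
    (If anyone wants to check which bnx2x driver build their hosts are actually running before and after the update, something like the following from the ESXi Tech Support shell does it; vmnic2 is just a placeholder for whichever port sits on the iSCSI fabric:)

    esxcfg-nics -l      # lists each vmnic with its driver name (bnx2x for the 57711) and link state
    ethtool -i vmnic2   # shows the driver and firmware version for that particular port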


    I have attached baseline IOMeter readings for the Intel cards with and without MEM, and for the Broadcom cards without MEM.


    From this experience I will lean towards Intel over Broadcom more than ever; Broadcom has caused a lot of headaches for my company and our clients.

    Joe



  • 15.  RE: IOMeter Performance and PS6010 10Gb

    Posted Mar 28, 2013 06:14 AM

    "The IOPs and the latency is fine but the throughput is where I would expect more."

    I suspect that you are not quite seeing what IOMeter is telling you.

    If you got 9,000 IOPS on a test with a 32K transfer size, then the throughput is:

    32,768 bytes x 9,000 IOPS = 294,912,000 bytes/s

    ≈ 300 MB/s! (It can't be anything else unless one of the numbers above changes.)

    Your SAN can't push the throughput any higher because the disks are maxed out at that point. 9,000 IOPS is a respectable figure for the hardware you are running this on.

    If you want to see higher throughput, increase the transfer size to 1 MB and run the same test again: you will get much lower IOPS but push much higher throughput.

    Perhaps I'm misreading your problem, but from what I did read, you don't have a problem; it's working fine.

    Paul