VMware vSphere


Aggregate NIC & Trunking of HP Procurve Switch

  • 1.  Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 19, 2010 08:04 PM

    Hi all.

    I have configured a couple of NICs (1 Gb) in ESXi 4 under Host > Configuration > Networking.

    This is my NFS host network.

    vmnic5 & vmnic4. Both NICs appear to see different networks? I have configured those two switch ports as a TRUNK (HP ProCurve), i.e. FEC/EtherChannel in Cisco terms. I can't actually tell if it's working correctly, because each NIC shows a different "observed network". The vSwitch is set to Route based on IP hash.
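
    In case it helps, the uplink and vSwitch layout can be pulled from the host console with the same commands whose output appears later in this thread:

    ~ # esxcfg-nics -l
    ~ # esxcfg-vswitch -l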



  • 2.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 23, 2010 07:50 PM

    Any ideas? Or perhaps I didn't give enough information?

    VMHOST <===> PROCURVE SWITCH <===> NFS TARGET

    Both the HOST & TARGET are set to 9000 MTU

    The Switch is set to allow JUMBO frames

    The ports for all four 1 Gb connections are in their own VLAN.
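
    A quick way to verify that jumbo frames actually pass end to end is a don't-fragment ping from the host console (assuming the vmkping on your build supports the -d and -s flags). The payload size here is 8972 bytes (9000 minus 20 bytes IP header and 8 bytes ICMP header), and the target is one of the NFS servers:

    ~ # vmkping -d -s 8972 192.168.150.200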



  • 3.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 23, 2010 09:49 PM

    What load balancing policy do you have set for the NICs on the ESX server (Virtual Port ID or IP hash)?

    -J



  • 4.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 24, 2010 01:28 PM

    Via the vSphere client, vSwitch1 and the VMotion & IP Storage port group are set to IP hash.



  • 5.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 24, 2010 03:23 PM

    Please be aware that ESXi 4.0 and older cannot really load balance in this case. The best you can hope for with 4.0 is to have different volumes go out different NICs by setting the policy to "port and IP" and hoping you are lucky and the volumes end up on different NICs.

    I am unsure if ESXi 4.1 can do real NIC teaming where packets go out whichever physical NIC is least busy and the switch combines the traffic for higher bandwidth (up to the speed of the switch "backplane"), which is what would really benefit an iSCSI SAN.



  • 6.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 24, 2010 06:16 PM

    Are you saying it can't be done, period? I see many blogs stating otherwise.



  • 7.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 24, 2010 06:33 PM

    I wouldn't put it that strongly, but from what I have read it seems that, at least for ESXi 4.0, NIC teaming is limited to various algorithms for spreading many-to-many traffic over multiple NICs, which is not that useful for VMkernel-to-iSCSI-box traffic.

    I guess there is no problem in ESX (no "i"), as that could use the NIC teaming in the full Linux kernel, and I have heard some rumors about 4.1.

    And of course I could be mistaken, but look closely at the blog posts you find to check whether they really talk about ESXi (not ESX) and about optimizing traffic to a single iSCSI LUN.

    I certainly was disappointed when I had to give up on using the two NICs on my storage network as a 2 Gb connection, but then I found that my choice of SAN hardware also had that limitation, so I did not complete my own research.



  • 8.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 24, 2010 06:41 PM

    Understood. We're implementing NFS, but I assume the same holds true with iSCSI. I guess there's too much FUD out there and no definitive answer.



  • 9.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 10:45 AM

    Well, congratulations. From what I have heard, NFS spreads its load over more ports than iSCSI, so "port based" load balancing might work better with NFS than with iSCSI. But since your VMware kernel has only one relevant IP and your NAS box has one relevant IP, it is important to set the balancing to "port based", not "IP based", as the latter would force all the traffic onto a single link no matter how many UDP/TCP ports it is spread over.

    P.S.

    I guess you meant "hype" not "FUD".



  • 10.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 24, 2010 10:12 PM

    I can't actually tell if it's working correctly, because each NIC shows a different "observed network". The vSwitch is set to Route based on IP hash.

    You should probably not see different networks if it is configured correctly, as the aggregated links should behave as one logical link.

    The Switch is set to allow JUMBO frames

    The ports for all four 1 Gb connections are in their own VLAN.

    Which model of Procurve switch are you using?

    Have you set the four ports on the Physical Switch to different VLANs? Or the trunk link?

    Is it the physical or the virtual switch that is configured for jumbo frames?

    Understood. We're implementing NFS, but I assume the same holds true with iSCSI. I guess there's too much FUD out there and no definitive answer.

    It is correct that you most likely will be using only one of the physical links in this setup. It has nothing to do with VMware; it is a network standard (802.3ad / 802.1AX) that defines this. It means that all communication between two specific endpoints will go over one specific link. With just two endpoints here, it will always travel the same way. If you had more hosts on this network, the traffic would spread more.
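
    As a rough illustration: VMware's IP-hash policy is commonly described as XORing the low-order bytes of the source and destination IPs and taking the result modulo the number of uplinks. With a host at x.x.x.100 talking to a target at x.x.x.200, 100 XOR 200 = 172 and 172 mod 2 = 0, so that pair always lands on uplink 0. A second target at x.x.x.205 gives 100 XOR 205 = 169 and 169 mod 2 = 1, the other uplink. One host talking to one target therefore never uses more than one link, however busy that link is.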



  • 11.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 10:50 AM

    Hmm, I don't have a list of the 802 series standards handy, but I definitely seem to recall that one of the standards specifies bandwidth aggregation on switch-to-switch links by just plugging in multiple patch cables and setting both switches to recognize the situation by means of their STP topology detection. The problem is that, at least in ESXi 4.0u1, the VMware vSwitch does not implement this feature.



  • 12.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 12:00 PM

    From what I have heard, NFS spreads its load over more ports than iSCSI, so "port based" load balancing might work better with NFS than with iSCSI. But since your VMware kernel has only one relevant IP and your NAS box has one relevant IP, it is important to set the balancing to "port based", not "IP based", as the latter would force all the traffic onto a single link no matter how many UDP/TCP ports it is spread over.

    I do not really know whether NFS uses more TCP ports than iSCSI does. I would guess that it too is a typical client-server application with a dynamic client TCP port and a well-known server port (TCP/2049), but I do not know for sure.

    However, that does not really matter. I am afraid that what you write above is not correct, as the "port" in "port based" is not a TCP/UDP port. It is just a virtual port on the virtual switch. Choosing port-based load balancing will only make all traffic from one VM / vmkernel interface always go over the same NIC. That is really no load balancing at all if you look at a single VM / vmk; however, the load will spread somewhat evenly across all VMs.

    It will unfortunately be of no help for an iSCSI or NFS vmkernel interface.

    Some physical switches with EtherChannel support (Cisco) have the option to select between MAC-, IP-, or TCP-based hashes for distributing frames over an EtherChannel, but that is not available in VMware virtual switches.

    Hmm, I don't have a list of the 802 series standards handy, but I definitely seem to recall that one of the standards specifies bandwidth aggregation on switch-to-switch links by just plugging in multiple patch cables and setting both switches to recognize the situation by means of their STP topology detection.

    I do not think that is correct. With STP (or RSTP) there could be said to be native support for multiple switch-to-switch links, BUT that would only mean that Spanning Tree shuts down all but one of the links and only opens it again if there is a failure on the first.

    What I guess you are thinking of is the 802.3ad / 802.1AX standards, which describe link aggregation. Aggregation can be static (which VMware supports) or dynamic, through LACP, but it still needs to be configured on the switch. If you define that a couple of ports on both switches belong together in an aggregation group (called a "port channel" on Cisco and a "trunk" on HP ProCurve), then Spanning Tree will allow this, since all the links belong together as one logical link.

    Note that this will also most likely mean that traffic between two hosts always goes over one of the links, so the bandwidth is no greater than that of one single port. It depends on the capabilities of the switch, however; if it can distribute on TCP/UDP port numbers you can get past this limit, but the application (e.g. NFS) must then also really use multiple ports at the same time.

    If you use the "IP based" load balancing on the VMware virtual switch, then you must define a port channel/trunk on the physical switch.



  • 13.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 04:18 PM

    ricnob:

    You should probably not see different networks if it is configured correctly, as the aggregated links should behave as one logical link.

    Correct, which is why I'm a little confused. Those two NICs are trunked on the ProCurve, ports 13 & 14, per the screenshot. I read in another post here about using TRUNK and not LACP trunks on a ProCurve. I'm not looking for theory though, just the fact of whether it does or doesn't work...



  • 14.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 06:38 PM

    LACP is not supported on the ESX side, so you must use static trunking (as HP calls it) on the Procurve side.
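
    For what it's worth, on ProCurve models that have a CLI (the web-managed 1800 series discussed in this thread does not, so this is only an illustration of the same setting), a static two-port trunk would be defined along these lines:

    ProCurve(config)# trunk 13-14 trk1 trunk

    The trailing keyword picks the static "trunk" mode rather than "lacp", which is what a vSwitch set to IP hash expects.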

    Can you post a screenshot of the trunk setup?



  • 15.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 06:45 PM

    Here are two screenshots. I have verified twice that the NICs I have in the switch are indeed on ports 13 & 14, but VMware shows a different "observed network"? You'll notice it's simply a trunk, no LACP enabled. All three trunks (T1, T2, T3) are in the same VLAN (150). I have two NFS servers and one VMware host.



  • 16.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 07:44 PM

    It seems correct, as far as I can see. Which of the two IP networks, 192.168.9.* or 192.168.150.*, are you expecting to see on this VLAN?

    Can you post a screenshot of the VLAN config of the trunk links?

    Which firmware version do you have on the 1824?



  • 17.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 08:19 PM

    The VLAN shown is VLAN 150 (192.168.150.100 = VMware Host, 192.168.150.200 = SAN02, 192.168.150.205 = SAN03)



  • 18.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 08:26 PM

    I don't really remember the config of the 1800 switches through the GUI, but on all ProCurve switches with a command line interface you always assign the VLANs to the trunk link and not to the ports. As soon as the ports have entered a "trunk group" they should have no config of their own, only the trunk link.
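
    On the CLI-based models that would look roughly like this, with VLAN 150 and a trunk named trk1 as examples:

    ProCurve(config)# vlan 150
    ProCurve(vlan-150)# untagged trk1

    (Use "tagged trk1" instead if you let the vSwitch tag the traffic.) On the web-managed 1800 the same membership is set per VLAN in the GUI, with the trunk listed after the physical ports.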

    Can you check if there is anything a little bit further down? How does it look for Trunk3?



  • 19.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 08:41 PM

    Yes, once I set the trunk, the port configurations are mostly disabled...

    Check out the stats for the two ports, 13 & 14, also included below...



  • 20.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 08:22 PM

    Both the HOST & TARGET are set to 9000 MTU

    Also, by the way, have you enabled jumbo frames on the vmkernel interface and the virtual switches through the Service Console commands?
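
    For reference, on ESX(i) 4 that is usually done along these lines. The vmkernel NIC has to be deleted and re-created, since its MTU cannot be changed in place; the portgroup name and addresses below are taken from later in this thread, so adjust them to your own setup:

    ~ # esxcfg-vswitch -m 9000 vSwitch1
    ~ # esxcfg-vmknic -d "NFS-NETWORK"
    ~ # esxcfg-vmknic -a -i 192.168.150.100 -n 255.255.255.0 -m 9000 "NFS-NETWORK"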



  • 21.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 08:34 PM

    Yes, the MTU was set to 9000 via the CLI on the ESXi host...

    ~ # esxcfg-nics -l

    Name PCI Driver Link Speed Duplex MAC Address MTU Description

    vmnic0 01:00.00 bnx2 Up 1000Mbps Full 00:26:b9:52:a6:f1 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T

    vmnic1 01:00.01 bnx2 Up 1000Mbps Full 00:26:b9:52:a6:f3 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T

    vmnic2 02:00.00 bnx2 Up 1000Mbps Full 00:26:b9:52:a6:f5 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T

    vmnic3 02:00.01 bnx2 Up 1000Mbps Full 00:26:b9:52:a6:f7 1500 Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T

    vmnic4 05:00.00 igb Up 1000Mbps Full 00:1b:21:48:77:3c 9000 Intel Corporation 82576 Gigabit Network Connection

    vmnic5 05:00.01 igb Up 1000Mbps Full 00:1b:21:48:77:3d 9000 Intel Corporation 82576 Gigabit Network Connection

    ~ #



  • 22.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 08:37 PM

    Yes, the MTU was set to 9000 via the CLI on the ESXi host...

    And of course on the vSwitch also?

    One other thing: does VLAN 1 have the IP range 192.168.9.* on your network?

    And do you know the firmware version of the switch?



  • 23.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 25, 2010 08:45 PM

    I do appreciate all the help :) I do notice something below: the VLAN for "NFS-NETWORK" is VLAN 0?

    ~ # esxcfg-vswitch --list

    Switch Name Num Ports Used Ports Configured Ports MTU Uplinks

    vSwitch0 64 5 64 1500 vmnic0

    PortGroup Name VLAN ID Used Ports Uplinks

    Management 0 2 vmnic0

    Management Network 0 1 vmnic0

    Switch Name Num Ports Used Ports Configured Ports MTU Uplinks

    vSwitch1 64 4 64 9000 vmnic4,vmnic5

    PortGroup Name VLAN ID Used Ports Uplinks

    NFS-NETWORK 0 1 vmnic4,vmnic5



  • 24.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Aug 26, 2010 08:16 AM

    I do appreciate all the help :) I do notice something below: the VLAN for "NFS-NETWORK" is VLAN 0?

    Switch Name Num Ports Used Ports Configured Ports MTU Uplinks

    vSwitch1 64 4 64 9000 vmnic4,vmnic5

    PortGroup Name VLAN ID Used Ports Uplinks

    NFS-NETWORK 0 1 vmnic4,vmnic5

    It seems like there is no VLAN tagging being done from this portgroup. How you want to do it is up to you of course, but I prefer to let the virtual switch tag the traffic on its way out. I would make sure that the portgroup is in VLAN 150 on your vSwitch1.
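
    From the host console that could be done with something like the following (portgroup and vSwitch names taken from your output above):

    ~ # esxcfg-vswitch -p "NFS-NETWORK" -v 150 vSwitch1

    Remember that if the vSwitch tags with VLAN 150, the trunk on the physical switch must then carry VLAN 150 tagged.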

    As for the physical switch setup, I would like you to check some of the configuration. The ProCurve 1800 has a bit of a strange web GUI, in my opinion.

    First, could you please check SYSTEM - Information for the software version? I just want to know that the firmware is not outdated.

    Then verify the trunk link, T3, and its VLAN membership. It is quite easy to have a port/trunk be a member of more VLANs than intended.

    Go to VLAN setup, go into every VLAN through "Modify", and make sure that the trunk T3 is not a member of any VLAN other than 150.

    Also make really sure that it actually is a member of VLAN 150; that is, scroll down past all the ports until you see the trunk name.

    In my images above I have ports 19 and 20 bundled into Trunk1; notice that the trunk is a member of VLAN 1 too, which might not be expected.



  • 25.  RE: Aggregate NIC & Trunking of HP Procurve Switch

    Posted Sep 01, 2010 06:24 PM

    ricnob, sorry for the delay, I had to fly out of town for business.

    My switch (1800-24G) is up to date; the firmware was one revision off, but I have since upgraded to the latest available. I will set vSwitch1 to VLAN 150 after hours. I'll need to shut down all the running VMs in case of an issue.