VMware vSphere


Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

  • 1.  Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Nov 19, 2013 04:56 PM

    I have installed ESXi 5.5 on a server with Emulex OneConnect 10Gb NICs.

    I have installed the latest driver for this NIC: elxnet-10.0.575.9-1OEM.550.0.0.1331820.x86_64.vib.

    After some network activity from the virtual machines, the interfaces go down, even though the switch ports are up.

    vmnic4  0000:05:00.00 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:16 9000   Emulex Corporation OneConnect 10Gb NIC

    vmnic5  0000:05:00.01 elxnet      Down 0Mbps     Half   00:00:c9:e4:13:18 9000   Emulex Corporation OneConnect 10Gb NIC

    Here are the logs:

    2013-11-19T15:49:12.395Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!

    2013-11-19T15:49:12.396Z cpu2:33376)elxnet: elxnet_detectDumpUe:249: 0000:005:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2013-11-19T15:49:12.396Z cpu2:33376)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.0: UE lo: MPU bit set

    2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:238: 0000:005:00.1: UE Detected!!

    2013-11-19T15:49:12.892Z cpu5:33377)elxnet: elxnet_detectDumpUe:249: 0000:005:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2013-11-19T15:49:12.892Z cpu5:33377)WARNING: elxnet: elxnet_detectDumpUe:257: 0000:005:00.1: UE lo: MPU bit set
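    For anyone triaging the same messages, the log lines above can be grouped per PCI device with a short script. This is a minimal Python sketch, assuming the vmkernel log layout shown above; the function name `summarize_ue` and the regular expressions are illustrative and are not part of any VMware or Emulex tool.

```python
import re

# PCI address as it appears in these logs, e.g. 0000:005:00.0
PCI = r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2,3}:[0-9a-fA-F]{2}\.[0-9a-fA-F])"

# "... elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!"
UE_DETECTED = re.compile(r"elxnet_detectDumpUe:\d+: " + PCI + r": UE Detected")

# "... elxnet_detectDumpUe:257: 0000:005:00.0: UE lo: MPU bit set"
UE_BIT = re.compile(
    r"elxnet_detectDumpUe:\d+: " + PCI + r": UE (?P<half>lo|hi): (?P<bit>\S+) bit set"
)


def summarize_ue(lines):
    """Return {pci_address: sorted list of 'lo:BIT' / 'hi:BIT' strings}."""
    summary = {}
    for line in lines:
        detected = UE_DETECTED.search(line)
        if detected:
            # Register the device even if no "bit set" line follows.
            summary.setdefault(detected.group("pci"), set())
            continue
        bit = UE_BIT.search(line)
        if bit:
            entry = summary.setdefault(bit.group("pci"), set())
            entry.add(bit.group("half") + ":" + bit.group("bit"))
    return {pci: sorted(bits) for pci, bits in summary.items()}
```

    Feeding it the lines of a saved vmkernel log (for example, the result of `open(path).read().splitlines()`) gives one entry per affected NIC port, which makes it easier to compare hosts.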

    Anyone have a similiar trouble?



  • 2.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 17, 2013 04:37 AM

    I have the same problem with the Emulex NIC here in my HS23 servers. I updated the Emulex firmware and the native Emulex NIC driver, but the problems continued.

    I then had the idea to update the legacy Emulex NIC driver, disable the native driver and enable the legacy driver. I ran the test again and the problem stopped.

    Run the commands to disable the native driver and enable the legacy driver:

    esxcli system module set --enabled=false --module=elxnet

    esxcli system module set --enabled=true --module=be2net

    Then reboot the host.
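    If you need to do this on several hosts, or want the rollback commands ready in case the legacy driver misbehaves, the two steps can be generated rather than typed. A minimal Python sketch, assuming only the two esxcli commands quoted above; the helper name `driver_switch_commands` is hypothetical, and the script only prints the commands, it does not run them.

```python
def driver_switch_commands(disable, enable):
    """Build the two esxcli commands that disable one module and enable another."""
    return [
        "esxcli system module set --enabled=false --module=" + disable,
        "esxcli system module set --enabled=true --module=" + enable,
    ]


# Switch from the native driver to the legacy one, as described in the post.
to_legacy = driver_switch_commands(disable="elxnet", enable="be2net")

# Rollback: the same two commands with the module names swapped.
to_native = driver_switch_commands(disable="be2net", enable="elxnet")

for command in to_legacy:
    print(command)
```

    Either way, the host still has to be rebooted afterwards for the other driver to claim the NICs.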



  • 3.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 07, 2014 10:32 PM

    I have the same problem on a number of HP BL460c G7. I have not tried this on the BL460c Gen8s, which use the same driver/firmware. My scenario right now is:

    8 BL460c Gen8 running ESXi 5.1 with a Cisco Nexus 1000V dvs. Trying to add 6 BL460c G7 with ESXi 5.5 to this dvs leads to an unrecoverable error in the elxnet driver. Adding the G7s to a dvs which is not Cisco based does not lead to the unrecoverable error.

    I have engaged Emulex and they are currently trying to reproduce it.



  • 4.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 20, 2014 06:33 AM

    Hey marcomcpap - I had a similar issue and was able to fix it with your esxcli commands. It worked like a charm, and after a reboot all 16 of my NICs started showing up. Thanks for the tip, mate.



  • 5.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 08, 2014 05:04 AM

    I faced the same issue on my ESX 4.0 host. After I updated the NIC firmware and installed the latest driver version, the issue was fixed for me.



  • 6.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 08, 2014 05:21 AM

    I am already running the latest firmware version (as posted by HP) and the latest driver (as posted on VMware.com). I have actually tried 3 different versions of the driver.



  • 7.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 08, 2014 04:31 PM

    This is a known bug: Cisco bug ID CSCuj81943.

    - The issue is that N1Kv is not able to set the NIC speed correctly for the interface with [ethtool -S ] due to the new driver on the Emulex card.

    - As a workaround you can either try ESXi 5.1 or try to use the older be2net driver on the Emulex card.

    - The fix for N1Kv will come out in the next release; the ETA is not determined.



  • 8.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 08, 2014 05:12 PM

    Thanks for the information



  • 9.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 09, 2014 09:27 PM

    FWIW, I had the same problem with this NIC in an IBM HS23 blade.  The three blades in this chassis I was installing shipped with esxi 5.1 pre-installed.  The customer ordered the blades with the advanced software upgrade that enables vnics/IO virtualization (forget what they call it exactly).  I went into the UEFI on the blades and enabled multichannel switch independent mode, NIC personality.  I rebooted the blades and they came back up showing 6x 10GE + 2x 1GE NICs all connected and functioning correctly.  Unfortunately for me the customer wanted to run vmware 5.5 on their new blade center.  I did a clean install of 5.5 on the HS23s with the IBM custom vmware ISO.  Despite all the firmware and driver updates I could find, plus support calls to vmware and IBM, we were not able to get the vnics to work with vmware 5.5.  I finally gave up and disabled multichannel on the nics (set them back to physical mode).  This particular blade center had Cisco Nexus 4001i switches.  The weird thing was that even though on the network adapters tab in vmware, the links showed down as with the OP, on the network tab, I was still able to see CDP stats from a vswitch uplink port.  I confirmed this wasn't cached by disabling the internal ports on the 4Ks and verifying the CDP information went away.

    BTW, for fun I used the IBM vmware ISO to downgrade one of the blades back to 5.1 and still the 10GE links would not come up.  My guess is IBM has some secret driver that they ship on a pre-installed 5.1 USB key that isn't in their custom ISO for 5.5 or 5.1. 

    Going to follow this thread as I would like to see how this turns out.

    Seth



  • 10.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 22, 2014 11:12 PM

    FYI, we have this problem on both versions of Cisco Nexus 1000v, 4.2(1)SV2(2.1) and 4.2(1)SV2(2.1a). Even the latest elxnet driver (posted Dec 24 2013 on vmware.com) does not fix this problem. Cisco Nexus is triggering it, but the actual bug is in the driver or firmware of the Emulex chip.




  • 11.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 23, 2014 01:11 AM

    I set up another blade center today that again came preloaded with the vmware 5.1 image from IBM.  I decided to take a gamble and try vnics again.  The nics are showing connected, but I'm having some other network problems that I think are caused by this NIC.

    Here are the drivers showing:

    Name    PCI Device     Driver  Link  Speed  Duplex  MAC Address        MTU   Description
    ------  -------------  ------  ----  -----  ------  -----------------  ----  --------------------------------------------
    vmnic0  0000:016:00.0  be2net  Up    10000  Full    34:40:b5:c8:f0:e8  1500  Emulex Corporation OneConnect 10Gb NIC (be3)
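    When comparing many hosts, it helps to turn NIC listing rows like the one above into structured records so that "which ports are down, and on which driver" can be checked quickly. A minimal Python sketch, assuming whitespace-separated rows with the nine columns shown in this thread (Name, PCI Device, Driver, Link, Speed, Duplex, MAC Address, MTU, Description); the names `Nic`, `parse_nic_row` and `down_nics` are illustrative only.

```python
from typing import NamedTuple


class Nic(NamedTuple):
    name: str
    pci: str
    driver: str
    link: str
    speed_mbps: int
    duplex: str
    mac: str
    mtu: int
    description: str


def parse_nic_row(row):
    """Parse one data row; the description keeps its internal spaces."""
    fields = row.split(None, 8)
    if len(fields) != 9:
        raise ValueError("expected 9 columns, got %d" % len(fields))
    name, pci, driver, link, speed, duplex, mac, mtu, description = fields
    # Speed appears either as "10000" or as "0Mbps" in the listings in this thread.
    speed_mbps = int(speed.replace("Mbps", ""))
    return Nic(name, pci, driver, link, speed_mbps, duplex, mac, int(mtu), description.strip())


def down_nics(rows, driver=None):
    """Return names of NICs whose link is Down, optionally filtered by driver."""
    nics = [parse_nic_row(r) for r in rows if r.strip()]
    return [n.name for n in nics if n.link == "Down" and (driver is None or n.driver == driver)]
```

    Header and separator rows are not handled here and should be skipped before calling `parse_nic_row`.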



  • 12.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jun 20, 2014 01:08 AM

    This issue has been addressed by N1k in the release 4.2(1)SV2(2.2) that was released in Jan 2014. It is documented in the bug report CSCuj81943. We addressed it by using a different API  to retrieve the NIC speed in ESX 5.5.





  • 13.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 18, 2014 03:03 PM

    We have the same problem with an Emulex 554FLB 10Gb without a Cisco Nexus 1000v.

    The switch side is showing everything up and connected. The vSphere Client shows a physical disconnect on both adapters.

    Setting the speed for both NICs through the CLI gives a connected state, but after rebooting the ESXi host the vSphere Client shows both NICs disconnected again. The strange thing is: when the NICs show a disconnected state, the host is still pingable on its management IP.

    Emulex firmware v4.9.416.0

    ESX 5.5.0.

    Message was edited by: Nevets01

    VMware driver info:
    Bus Info:
    Driver: elxnet
    Firmware Version: 4.9.416.0
    Version: 10.2.298.5

    After some testing I got both NICs connected (through the CLI) at 10000 Mb, but one NIC is showing observed IP ranges and the other NIC isn't. When I disable the NIC with the observed IP ranges, network connectivity is lost. I am 100% sure that it isn't the Cisco switch-side config/port channel. The vSwitch config is also OK (Route based on IP hash / Link status only / Notify / Failback).



  • 14.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 03, 2014 01:04 PM

    same problem here!

    using HP ProLiant BL460c G7 servers with embedded NC553i (10Gb 2-port FlexFabric Converged Network Adapter) as esxi hosts


    migrated to vsphere v5.5 u1 some days ago. yesterday one of our esxi hosts got disconnected from the cluster. found the following in the logs:


    2014-09-01T13:17:51.499Z cpu13:32852)Uplink: 6530: enabled port 0x2000002 with mac e4:11:5b:e0:1d:d8

    2014-09-01T13:17:52.499Z cpu13:32852)NetPort: 1632: disabled port 0x2000002

    2014-09-01T13:17:56.169Z cpu14:33444)WARNING: elxnet: elxnet_detectDumpUe:274: 0000:002:00.1: UE Detected!!

    2014-09-01T13:17:56.172Z cpu14:33444)elxnet: elxnet_detectDumpUe:285: 0000:002:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2014-09-01T13:17:56.172Z cpu14:33444)WARNING: elxnet: elxnet_detectDumpUe:302: 0000:002:00.1: UE lo: MPU bit set

    2014-09-01T13:17:56.172Z cpu14:33444)WARNING: elxnet: elxnet_detectDumpUe:312: 0000:002:00.1: UE hi: PMEM bit set

    2014-09-01T13:17:56.499Z cpu13:32852)NetPort: 1632: disabled port 0x2000002

    2014-09-01T13:17:56.499Z cpu13:32852)Uplink: 6530: enabled port 0x2000002 with mac e4:11:5b:e0:1d:d8

    2014-09-01T13:17:56.532Z cpu2:33443)WARNING: elxnet: elxnet_detectDumpUe:274: 0000:002:00.0: UE Detected!!

    2014-09-01T13:17:56.532Z cpu2:33443)elxnet: elxnet_detectDumpUe:285: 0000:002:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2014-09-01T13:17:56.532Z cpu2:33443)WARNING: elxnet: elxnet_detectDumpUe:302: 0000:002:00.0: UE lo: MPU bit set

    2014-09-01T13:17:56.532Z cpu2:33443)WARNING: elxnet: elxnet_detectDumpUe:312: 0000:002:00.0: UE hi: PMEM bit set

    today the same issue on another esxi host.

    opened a case with HP and also with vmware! no solution so far. as far as i was told, this was already observed at other customers. vmware support came up with the recommendation to use legacy drivers as described a few posts above:

    "Run the commands to disable the native driver and enable the legacy driver:

    esxcli system module set --enabled=false --module=elxnet

    esxcli system module set --enabled=true --module=be2net

    reboot the host."

    will wait for a response from HP before i switch to legacy drivers on all esxi hosts...

    does anybody have recommendations other than switching to legacy drivers? will the switch to legacy drivers using the above commands be persistent across reboots?



  • 15.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 05, 2014 11:59 AM

    I have a customer who is experiencing the same problem that fgw described. The hardware is HP G7 with the Emulex NC553i (10Gb 2-port FlexFabric Converged Network Adapter). Please let us know if you have an update on this issue.



  • 16.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 06, 2014 02:51 AM

    The following commands worked for me, not on 1 but on 7 hosts, so I'm sure this can work for you as well. Try these:

    esxcli system module set --enabled=false --module=elxnet

    esxcli system module set --enabled=true --module=be2net

    VMSavvy



  • 17.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 10, 2014 02:15 PM

    Did not want to start a new thread since we are having very similar problems.


    Running HP C7000 enclosure - Blade460c G8 with HP Flexfabric 10Gb 2-port 554FLB nics.

    If I put an OS on the blade the NIC works fine, so it's a VMware issue.

    Tried using ESXi 5.5 Update 1 and 2 using the HP custom images.

    Also tried the ESXi 5.5 U2 driver rollup and non-rollup; always the same problem, no network.

    Link is up (see attachment). Tried changing to the legacy driver; that didn't work.

    Any ideas?



  • 18.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 10, 2014 02:53 PM

    Make sure your switching side switch port speeds match with the selected speed on the NIC in VMWare.

    Make sure that your switching side ports aren't SUSPENDED.

    I downloaded the latest drivers, configured the network adapter on Auto negotiate, Cisco switching side ports on Auto Negotiate and made sure the ports aren't suspended and the channel configured correctly.

    After rebooting the server everything worked fine.

    Remark: with the latest drivers it's only possible to configure the speed on 10000Mb full duplex or Auto Negotiate (use the vSphere Client GUI)



  • 19.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 10, 2014 05:18 PM

    Thanks, that made me think and talk to our network guy.

    This is how the port was configured when there was no network access:

    interface GigabitEthernet0/12
     switchport access vlan 250
     speed 1000
     spanning-tree portfast

    Then we changed it to this and now it works:

    interface GigabitEthernet0/12
     switchport trunk encapsulation dot1q
     switchport trunk allowed vlan 250
     switchport mode trunk

    Does the port need to be trunked to work with VMware?



  • 20.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 20, 2015 04:13 AM

    Hi,

    We had a similar problem in ESXi 5.5. We upgraded the firmware and the network adapter driver. No luck.

    Finally we called the hardware vendor and replaced the network card. It works now.

    Might be helpful

    Thanks.



  • 21.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Nov 02, 2014 02:35 AM

    I have exactly the same problem on a BL460c G7 with ESXi 5.5 U1 (NC553i).

    Firmware and drivers are up to date. The NIC driver version is elxnet 10.2.298.5. The NIC firmware version is 10.2.340.19.

    If your VC version is 4.01 or later, you may see the NIC speed reported as 10Gbps for all NICs. You have to change it by following this article:

    http://h20565.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/mostViewedDisplay?javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken&javax.portlet.prp_efb5c0793523e51970c8fa22b053ce01=wsrp-navigationalState%3DdocId%253Dmmr_kc-0108859-15%257CdocLocale%253Den_US&javax.portlet.tpst=efb5c0793523e51970c8fa22b053ce01&sp4ts.oid=5288507&ac.admitted=1414895439125.876444892.492883150

    I have changed the VC network setting to allocate the correct speed.

    The esxcli network nic list command now gives the correct NIC speed.

    But the issue is still there. This looks more like an Emulex problem.



  • 22.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 01, 2014 04:36 AM

    Has anybody experienced the same problem? Did you get it fixed?



  • 23.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 02, 2014 10:14 AM

    In our case upgrading the Emulex card firmware fixed this issue.



  • 24.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 15, 2014 01:13 PM

    We had the same issue in our environment a few times as well. On a few hosts, the firmware and driver upgrade helped to alleviate this issue - a "network stress test" consisting of copying a larger VMDK to another ESXi host from the SSH shell revealed whether the issue was remedied or not (usually the UE happened within 15 minutes). But on one server it was indeed a hardware error and since we had the NIC replaced, these stopped appearing.



  • 25.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 04, 2014 02:32 PM

    HP support sent us this link as a workaround for issues with our BL460c G7 running driver 10.2.298.5 and firmware 10.2.340.19 on the OneConnect 10Gb Emulex NC553i:

    VMware KB: Emulex OneConnect network cards missing with elxnet driver 10.0.725.2 and later in ESXi 5.5



  • 26.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 14, 2014 11:23 AM

    So in this case, is there any update on this problem?



  • 27.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 15, 2014 12:39 AM

    In my case, HP told me that Emulex found something in the OneCapture logs. They are working on a solution now.

    It's been 3 months since the first outage.



  • 28.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 15, 2014 12:52 AM

    Wilber, thanks for the update.

    So let us know here what will be the patch or the update to be applied.



  • 29.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 15, 2015 09:34 AM

    HP sent me a debug driver to collect additional logs if the issue happens again.

    But somehow neither the debug driver nor the problem driver triggers the issue any more. I'm still trying to reproduce it.



  • 30.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 16, 2015 02:22 PM

    Sorry to dig up an old thread, but did you ever receive a resolution to this issue?

    I recently pushed out the recommended firmware and drivers listed on the HCL to a number of blades in my estate and I've encountered the same issue. I've had the same fault across a number of blades so I fail to see this as a hardware fault, and the very same blades remain stable if we use the unsupported be2net driver instead.

    Thanks,

    Martyn



  • 31.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 17, 2015 09:10 AM

    Let us know the result here wilber822



  • 32.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 16, 2015 04:23 PM

    I have a slightly different problem with the Emulex 10Gb NIC.  I'm running HP BL460C G7s in the C7000 Chassis with the Emulex 10GB Enet cards.  My esxi 5.5 hosts stay connected fine but I have virtual machines on several different vlans and the virtual machines appear to lose the ability to talk at random times.  Sometimes I have to disconnect their nic and reconnect it to fix the problem, other times I can only get them to talk by migrating to a different host.  I was using the be2net but HP had me change it to the elxnet and update driver to 10.2.298.5 and firmware to 10.2.340.19.  This did not resolve my problems.  VMware hasn't been able to help either.  Anyone else experiencing this or has and knows a solution?



  • 33.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 16, 2015 04:31 PM

    I have that issue too, although ironically I don't see the issue at my other sites using identical hardware, firmware and software builds.



  • 34.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Nov 27, 2017 04:00 PM

    We had a similar issue in our environment a couple of times as well. On a couple of hosts, the firmware and driver upgrade reduced this issue. A "network stress test" consisting of copying a larger VMDK to another ESXi host from the SSH shell revealed whether the issue was remedied or not (usually the UE happened within 15 minutes). But on one server it was definitely a hardware fault, and since we had the NIC replaced, the errors have stopped showing up. IBM drivers may also help, since IBM hardware is closely tied to this problem.



  • 35.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 14, 2015 05:03 PM

    I am using elxnet with driver 10.2.298.5 and firmware 10.2.340.19.  I seem to be getting disconnects and dropped packets under high load such as vMotions.  Sometimes vMotions will fail or take an extremely long time to complete.  We are running 5.5 Update 2.  I opened a case with HP and VMware for help on this, and I will post any response I get.



  • 36.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 15, 2015 09:03 AM

    Ryanotown22, yes, me too. I'm having a VM network problem where TCP retransmissions and TCP resets are intermittently quite high.

    I'm running ESXi 5.1 Update 1 on my HP BL465c G7 and Gen8 blades. Please let us know how you get on with the case you logged with VMware, as I'm keen to know what the culprit could be.



  • 37.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 17, 2015 12:21 AM

    Guys, Emulex confirmed that it's a driver issue. They gave me a beta driver that contains a possible fix, but I'm not able to reproduce the issue any more, even with the original driver. So I can't say it's fixed...



  • 38.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 18, 2015 03:11 AM

    Hey wilber,

    please let us know how it goes,

    im having a similar issue,

    of the 2 copper internal interfaces of the HS23, one connects correctly, the other one never connects. even though it gets link and negotiates speed, it never gets a DHCP response, nor does it have connectivity with a static IP.

    checked all switches, even replaced switch modules with base config, also updated all firmware (blades, chassis, switches, san switches, etc.), to no avail.

    running on ESXi 5.5u2 fully patched.



  • 39.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 22, 2015 02:56 AM

    Hi LordChares,

    I don't think your issue is the same as mine. :-)



  • 40.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 19, 2015 02:15 PM

    I've tried it on our current test build of 5.5 u1 and I've tried it on 5.5 u2 with the suggested drivers and firmware that HP gave me.  No luck.  I've even tried running the be2net drivers and I haven't had any luck.  If I can't get this fixed soon it will be a deal breaker and I'll have to reload our entire environment with 5.1...  I'm already running a small 5.1 environment for our other network and, interestingly enough, I don't have this problem there even though I'm using the exact same equipment...go figure.

    Hoping you guys can help me figure it out soon...........  I've had tickets opened up since August



  • 41.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 22, 2015 02:58 AM

    Hi jhwagner,

    I'm considering rolling back to ESXi 5.1 U1, since we have also had a lot of other network/storage problems after upgrading to ESXi 5.5 U1. That's definitely an unstable version.



  • 42.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 20, 2015 05:48 PM

    Hey Guys,

    any updates?

    has anybody had the same problem as me? one interface works; the other one gets link and negotiates, but doesn't get an IP and can't communicate.

    Cheers



  • 43.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 21, 2015 05:10 PM

    I'm really struggling with this at the moment, there appears to be so much conflict in the supported/recommended firmware and driver combinations.

    HP are saying the following:

    Elxnet Driver version: 10.2.298.5

    Firmware version: 10.2.340.19

    According to http://vibsdepot.hp.com/hpq/recipes/HP-VMware-Recipe.pdf

    VMware are saying, use the following:

    Elxnet Driver version: 10.2.298.5

    Firmware version: 10.2.298.21

    According to the HCL: VMware Compatibility Guide: I/O Device Search


    On the other hand, they're also saying use this:

    Elxnet Driver version: 10.2.298.5

    Firmware version: 10.2.340.10

    According to the HP Flex-10 / Flex-Fabric Doc: http://partnerweb.vmware.com/programs/hcl/ESX_Flex_config.pdf

    Emulex are claiming that the following are recommended:

    Firmware version: 10.2.323.39 or 10.2.340.19

    According to VMware Recommended Software Matrix
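    To make the conflict easier to see, the four recommendations above can be written down as data and checked against what is actually installed. A minimal Python sketch using only the version numbers quoted in this post; the source labels and the function name `sources_matching` are illustrative, and a driver of `None` is an assumption meaning "no driver version stated by that source".

```python
# (source label, recommended driver or None if not stated, recommended firmware versions)
RECOMMENDATIONS = [
    ("HP VMware Recipe", "10.2.298.5", {"10.2.340.19"}),
    ("VMware HCL", "10.2.298.5", {"10.2.298.21"}),
    ("HP Flex-10 / Flex-Fabric doc", "10.2.298.5", {"10.2.340.10"}),
    ("Emulex recommended matrix", None, {"10.2.323.39", "10.2.340.19"}),
]


def sources_matching(driver, firmware):
    """Return the sources whose recommendation matches the installed pair."""
    matches = []
    for source, rec_driver, rec_firmware in RECOMMENDATIONS:
        # A source that states no driver version is matched on firmware alone.
        driver_ok = rec_driver is None or rec_driver == driver
        if driver_ok and firmware in rec_firmware:
            matches.append(source)
    return matches


# The combination installed on the blades discussed in this thread.
print(sources_matching("10.2.298.5", "10.2.340.19"))
```

    With the installed pair of driver 10.2.298.5 and firmware 10.2.340.19, only two of the four sources agree, which is the conflict described above.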

    Emulex have refused to help me as my device is not a true Emulex product and have referred me to HP. HP are struggling to understand the problem and keep pointing me to firmware I already have, and VMware want me to try the firmware recommended on their HCL; however, I can't seem to find that firmware in order to try it.

    Does anyone have either firmware 10.2.298.21 or 10.2.340.10?

    Cheers,

    Martyn



  • 44.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 22, 2015 03:03 AM

    Hi MartynThomas,

    I used the combination below and got the problem.

    Elxnet Driver version: 10.2.298.5

    Firmware version: 10.2.340.19

    HP told me it's a driver issue, not a firmware issue, after they worked with Emulex.

    Now I'm testing driver version 10.2.261.6251-1OEM.550.0.0.1331820, which HP provided to me with DEBUG options and a possible fix. They told me I need to collect a vm-support bundle if the issue occurs again, as the logs will contain additional info.

    Unfortunately, I cannot reproduce the issue.

    Then I re-installed driver 10.2.298.5, but I still cannot reproduce the issue. I simulated high network utilization on the VM network and vmk port but had no luck. :-(



  • 45.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 22, 2015 04:27 PM

    I reverted back to the 10.2.298.5 ELXNet driver on a single blade and ran it for an hour in production, it failed about 30 mins after with the same UE detected fault. Before bringing it into production I installed the OneConnect vCenter plug-in and OCE CIM provider to enable me to pull the dumps from the NICs.

    I've supplied the dumps to HP and VMware to see if they can see anything strange!

    Considering how common the Emulex OC11xx / HP NC553 NICs are, I'm really shocked this has been dragging on so long without a proper resolution.



  • 46.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 22, 2015 04:38 PM

    Hi MartynThomas,

    May I know your NIC model and error message? I want to see whether it is exactly the same as mine.



  • 47.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 22, 2015 05:44 PM

    Hi Wilber,

    I'm using HP BL460c and BL490c G7s with the HP NC553i onboard and HP NC552m Mezzanine cards.

    Each device is reported by esxcfg-info (snipped) as the following:


    VMNIC0

    |----Vendor Id.......................................0x19a2

    |----Device Id.......................................0x0710

    |----Sub-Vendor Id...................................0x103c

    |----Sub-Device Id...................................0x3315

    |----Vendor Name.....................................Emulex Corporation

    |----Device Name.....................................HP NC553i Dual Port FlexFabric 10Gb Converged Network Adapter

    |----Device Class....................................512

    |----Device Class Name...............................Ethernet controller

    |----VmKernel Device Name............................vmnic0

    VMNIC1

    |----Vendor Id.......................................0x19a2

    |----Device Id.......................................0x0710

    |----Sub-Vendor Id...................................0x103c

    |----Sub-Device Id...................................0x3315

    |----Vendor Name.....................................Emulex Corporation

    |----Device Name.....................................HP NC553i Dual Port FlexFabric 10Gb Converged Network Adapter

    |----Device Class....................................512

    |----Device Class Name...............................Ethernet controller

    |----VmKernel Device Name............................vmnic1

    Reported in the HCL as:

    Model: (HP NC553i) Emulex OneConnect OCe11102 10GbE NIC CNA for HP ProLiant Intel G7 BladeSystems
    Device Type: Network
    DID: 0710
    Brand Name: HP
    SVID: 103c
    Number of Ports: 2
    SSID: 3315
    VID: 19a2


    VMNIC2

    |----Vendor Id.......................................0x19a2

    |----Device Id.......................................0x0710

    |----Sub-Vendor Id...................................0x103c

    |----Sub-Device Id...................................0x3341

    |----Vendor Name.....................................Emulex Corporation

    |----Device Name.....................................HP NC552m Dual Port Flex-10 10Gbe BL-c Adapter

    |----Device Class....................................512

    |----Device Class Name...............................Ethernet controller

    |----VmKernel Device Name............................vmnic2

    VMNIC3

    |----Vendor Id.......................................0x19a2

    |----Device Id.......................................0x0710

    |----Sub-Vendor Id...................................0x103c

    |----Sub-Device Id...................................0x3341

    |----Vendor Name.....................................Emulex Corporation

    |----Device Name.....................................HP NC552m Dual Port Flex-10 10Gbe BL-c Adapter

    |----Device Class....................................512

    |----Device Class Name...............................Ethernet controller

    |----VmKernel Device Name............................vmnic3

    Model: HP NC552m
    Device Type: Network
    DID: 0710
    Brand Name: HP
    SVID: 103c
    Number of Ports: 2
    SSID: 3341
    VID: 19a2

    My cards are running firmware: 10.2.340.19 and the driver: 10.2.298.5.

    Error logged in the VMKernel log is as below, which in turn causes host isolation:

    2015-01-15T13:36:07.669Z cpu15:33448)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:002:00.0: UE Detected!!

    2015-01-15T13:36:07.669Z cpu15:33448)elxnet: elxnet_detectDumpUe:368: 0000:002:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-01-15T13:36:07.669Z cpu15:33448)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:002:00.0: UE lo: MPU bit set

    2015-01-15T13:36:07.669Z cpu15:33448)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.0: UE hi: PMEM bit set

    2015-01-15T13:36:07.669Z cpu15:33448)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.0: UE hi: NETCUnknown bit set

    2015-01-15T13:36:07.932Z cpu18:33450)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:002:00.1: UE Detected!!

    2015-01-15T13:36:07.932Z cpu18:33450)elxnet: elxnet_detectDumpUe:368: 0000:002:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-01-15T13:36:07.932Z cpu18:33450)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:002:00.1: UE lo: MPU bit set

    2015-01-15T13:36:07.932Z cpu18:33450)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.1: UE hi: PMEM bit set

    2015-01-15T13:36:07.932Z cpu18:33450)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.1: UE hi: NETCUnknown bit set

    2015-01-15T13:36:12.072Z cpu0:32852)WARNING: elxnet: elxnet_asyncWorldWait:3592: 0000:002:00.0: GetStats Checkpoint 1 (12 sec) No resp for MCC cmd opcode: 0x4, subsystem:0x3, timeout:0, req_len:4080

    2015-01-15T13:36:22.309Z cpu13:33454)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:006:00.1: UE Detected!!

    2015-01-15T13:36:22.310Z cpu13:33454)elxnet: elxnet_detectDumpUe:368: 0000:006:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-01-15T13:36:22.310Z cpu13:33454)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:006:00.1: UE lo: MPU bit set

    2015-01-15T13:36:22.310Z cpu13:33454)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.1: UE hi: NETCUnknown bit set

    2015-01-15T13:36:23.221Z cpu0:33452)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:006:00.0: UE Detected!!

    2015-01-15T13:36:23.221Z cpu0:33452)elxnet: elxnet_detectDumpUe:368: 0000:006:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-01-15T13:36:23.221Z cpu0:33452)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:006:00.0: UE lo: MPU bit set

    2015-01-15T13:36:23.221Z cpu0:33452)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.0: UE hi: PMEM bit set

    2015-01-15T13:36:23.221Z cpu0:33452)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.0: UE hi: NETCUnknown bit set

    2015-01-15T13:36:24.074Z cpu0:32852)WARNING: elxnet: elxnet_asyncWorldWait:3592: 0000:002:00.0: GetStats Checkpoint 2 (24 sec) No resp for MCC cmd opcode: 0x4, subsystem:0x3, timeout:0, req_len:4080

    2015-01-15T13:36:36.076Z cpu0:32852)WARNING: elxnet: elxnet_asyncWorldWait:3592: 0000:002:00.0: GetStats Checkpoint 3 (36 sec) No resp for MCC cmd opcode: 0x4, subsystem:0x3, timeout:0, req_len:4080

    2015-01-15T13:36:36.076Z cpu0:32852)WARNING: elxnet: elxnet_asyncWorldWait:3611: 0000:002:00.0: GetStats MCC cmd timed out. opcode: 0x4, subsystem:0x3, timeout:0, req_len:4080

    2015-01-15T13:36:36.076Z cpu0:32852)WARNING: elxnet: elxnet_generateUE:55: 0000:002:00.0: Injecting fatal error for post-mortem dump

    Cheers,

    Martyn



  • 48.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 23, 2015 02:37 AM

    We have the same NIC model and roughly 90% similar error logs.

    Emulex asked me to install OneCapture to dump some logs when the issue reproduced; HP then gave me the debug driver after one month.

    It looks like you also have a case open with HP and VMware, is that right?

    Would you be willing to share the case number with me so I can ask the HP and VMware BCS teams to check whether we can help each other?



  • 49.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 23, 2015 09:21 AM

    I can't send PMs yet as I don't have enough points but I'm more than happy to share my SR numbers.

    Cheers,

    Martyn



  • 50.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 28, 2015 04:45 PM

    Hi MartynThomas,

    Could you check whether jumbo frames are enabled on the virtual machines on the problem host?

    I cannot reproduce the issue with the beta driver, but I do see some errors:

    2015-01-28T06:35:00.490Z cpu10:4273238)WARNING: elxnet: elxnet_dumpPkt:4892: P0 :: vmnic2-q0 Failure reason: "9k without TSO"

    2015-01-28T06:35:00.490Z cpu10:4273238)WARNING: elxnet: elxnet_dumpPkt:4895: P0 ::  pkt_len:11241, must_tso:0x0, tso_mss:0, num_frags: 4
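
    The second line above carries its details as key:value pairs. A minimal sketch of pulling them out, assuming the field layout shown in that line. The 9000-byte limit in oversized_without_tso is a guess at what "9k without TSO" means, not something confirmed in this thread, and both function names are hypothetical:

```python
import re

# key:value pairs such as "pkt_len:11241" or "must_tso:0x0" (decimal or hex values)
FIELD = re.compile(r"(\w+):\s*(0x[0-9a-fA-F]+|\d+)")

def parse_dump_pkt(line):
    """Extract the numeric fields that follow '::' in an elxnet_dumpPkt line."""
    _, _, tail = line.partition("::")
    return {key: int(value, 0) for key, value in FIELD.findall(tail)}

def oversized_without_tso(fields, limit=9000):
    # Assumption: "9k" refers to a 9000-byte limit; the driver's real check is not shown here.
    return fields.get("pkt_len", 0) > limit and fields.get("must_tso", 0) == 0

LINE = ('2015-01-28T06:35:00.490Z cpu10:4273238)WARNING: elxnet: elxnet_dumpPkt:4895: '
        'P0 ::  pkt_len:11241, must_tso:0x0, tso_mss:0, num_frags: 4')

fields = parse_dump_pkt(LINE)
print(fields, oversized_without_tso(fields))
```

    Counting how many such lines show pkt_len above the limit would indicate whether the warning is tied to oversized frames or also fires for ordinary traffic.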



  • 51.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 28, 2015 04:58 PM

    I don't have jumbo frames enabled within any VMs, nor do I have jumbo frames enabled on any of my 4500x switches, VMNICs or dvswitches.



  • 52.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 03, 2015 03:43 PM

    HP have suggested I try driver version 10.2.445.0 from the Emulex site. Needless to say, shortly after installation I encountered the usual host isolation issue, albeit with a very slightly different error:

    2015-02-03T15:06:30.634Z cpu14:33448)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:002:00.0: UE Detected!!

    2015-02-03T15:06:30.634Z cpu14:33448)elxnet: elxnet_detectDumpUe:368: 0000:002:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-02-03T15:06:30.634Z cpu14:33448)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:002:00.0: UE lo: MPU bit set

    2015-02-03T15:06:30.634Z cpu14:33448)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.0: UE hi: PMEM bit set

    2015-02-03T15:06:30.634Z cpu14:33448)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.0: UE hi: NETCUnknown bit set

    2015-02-03T15:06:30.651Z cpu1:33450)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:002:00.1: UE Detected!!

    2015-02-03T15:06:30.652Z cpu1:33450)elxnet: elxnet_detectDumpUe:368: 0000:002:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-02-03T15:06:30.652Z cpu2:34113)World: 14296: VC opID hostd-6611 maps to vmkernel opID 8e89d881

    2015-02-03T15:06:30.652Z cpu1:33450)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:002:00.1: UE lo: MPU bit set

    2015-02-03T15:06:30.652Z cpu1:33450)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.1: UE hi: PMEM bit set

    2015-02-03T15:06:30.652Z cpu1:33450)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:002:00.1: UE hi: NETCUnknown bit set

    2015-02-03T15:06:32.078Z cpu11:32852)WARNING: elxnet: elxnet_asyncWorldWait:3586: 0000:002:00.0: GetDieTemperature Checkpoint 1 (12 sec) No resp for MCC cmd opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:06:40.427Z cpu5:39026)NetSched: 626: 0x2000004: received a force quiesce for port 0x200000c, dropped 46 pkts

    2015-02-03T15:06:44.081Z cpu11:32852)WARNING: elxnet: elxnet_asyncWorldWait:3586: 0000:002:00.0: GetDieTemperature Checkpoint 2 (24 sec) No resp for MCC cmd opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:06:46.464Z cpu5:39026)NetSched: 626: 0x2000004: received a force quiesce for port 0x200000c, dropped 3 pkts

    2015-02-03T15:06:56.083Z cpu11:32852)WARNING: elxnet: elxnet_asyncWorldWait:3586: 0000:002:00.0: GetDieTemperature Checkpoint 3 (36 sec) No resp for MCC cmd opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:06:56.083Z cpu11:32852)WARNING: elxnet: elxnet_asyncWorldWait:3605: 0000:002:00.0: GetDieTemperature MCC cmd timed out. opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:06:56.083Z cpu11:32852)WARNING: elxnet: elxnet_generateUE:55: 0000:002:00.0: Injecting fatal error for post-mortem dump

    2015-02-03T15:06:56.286Z cpu11:32852)WARNING: elxnet: elxnet_txComplClean:3799: 0000:002:00.0: 2018 pending tx-completions

    2015-02-03T15:06:56.286Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:002:00.0

    2015-02-03T15:06:56.287Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:002:00.0 rxcq-3: did not receive flush compl

    2015-02-03T15:06:56.287Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:002:00.0

    2015-02-03T15:06:56.288Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:002:00.0 rxcq-2: did not receive flush compl

    2015-02-03T15:06:56.288Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:002:00.0

    2015-02-03T15:06:56.289Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:002:00.0 rxcq-1: did not receive flush compl

    2015-02-03T15:06:56.289Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:002:00.0

    2015-02-03T15:06:56.290Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:002:00.0 rxcq-0: did not receive flush compl

    2015-02-03T15:06:56.291Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:06:56.291Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:06:56.291Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:06:56.291Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:06:56.299Z cpu11:32852)WARNING: elxnet: elxnet_wrbFromMbox:2206: 0000:002:00.0: Error in Card Detected! Cannot allocate WRB from Mail box

    2015-02-03T15:06:56.299Z cpu11:32852)WARNING: elxnet: elxnet_wrbFromMbox:2206: 0000:002:00.0: Error in Card Detected! Cannot allocate WRB from Mail box

    2015-02-03T15:06:56.299Z cpu11:32852)WARNING: elxnet: elxnet_wrbFromMbox:2206: 0000:002:00.0: Error in Card Detected! Cannot allocate WRB from Mail box

    2015-02-03T15:06:56.299Z cpu11:32852)WARNING: elxnet: elxnet_uplinkReset:2332: 0000:002:00.0: f/w init failed

    2015-02-03T15:06:56.299Z cpu11:32852)lacp: LACPDisableDVPort:4275: LACP is not enabled on portset DvsPortset-0

    2015-02-03T15:06:56.299Z cpu11:32852)NetPort: 1632: disabled port 0x2000004

    2015-02-03T15:06:56.300Z cpu11:32852)NetPort: 2903: resuming traffic on DV port 2022

    2015-02-03T15:06:56.300Z cpu11:32852)Uplink: 6530: enabled port 0x2000004 with mac b4:99:ba:fb:f4:d0

    2015-02-03T15:06:56.504Z cpu11:32852)WARNING: elxnet: elxnet_txComplClean:3799: 0000:006:00.0: 2018 pending tx-completions

    2015-02-03T15:07:02.094Z cpu22:33452)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:006:00.0: UE Detected!!

    2015-02-03T15:07:02.094Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:006:00.0

    2015-02-03T15:07:02.094Z cpu22:33452)elxnet: elxnet_detectDumpUe:368: 0000:006:00.0: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-02-03T15:07:02.094Z cpu22:33452)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:006:00.0: UE lo: MPU bit set

    2015-02-03T15:07:02.094Z cpu22:33452)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.0: UE hi: PMEM bit set

    2015-02-03T15:07:02.094Z cpu22:33452)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.0: UE hi: NETCUnknown bit set

    2015-02-03T15:07:02.095Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:006:00.0 rxcq-3: did not receive flush compl

    2015-02-03T15:07:02.095Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:006:00.0

    2015-02-03T15:07:02.096Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:006:00.0 rxcq-2: did not receive flush compl

    2015-02-03T15:07:02.096Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:006:00.0

    2015-02-03T15:07:02.097Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:006:00.0 rxcq-1: did not receive flush compl

    2015-02-03T15:07:02.097Z cpu11:32852)WARNING: elxnet: elxnet_rxQueuesDestroy:2289: elxnet_cmdRxqDestroy failed for 0000:006:00.0

    2015-02-03T15:07:02.098Z cpu11:32852)WARNING: elxnet: elxnet_rxCQClean:2230: 0000:006:00.0 rxcq-0: did not receive flush compl

    2015-02-03T15:07:02.098Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:07:02.098Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:07:02.098Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:07:02.098Z cpu11:32852)elxnet: elxnet_quiesceIO:2122: Unarming EQ

    2015-02-03T15:07:02.106Z cpu11:32852)WARNING: elxnet: elxnet_wrbFromMbox:2206: 0000:006:00.0: Error in Card Detected! Cannot allocate WRB from Mail box

    2015-02-03T15:07:02.106Z cpu11:32852)WARNING: elxnet: elxnet_wrbFromMbox:2206: 0000:006:00.0: Error in Card Detected! Cannot allocate WRB from Mail box

    2015-02-03T15:07:02.106Z cpu11:32852)WARNING: elxnet: elxnet_wrbFromMbox:2206: 0000:006:00.0: Error in Card Detected! Cannot allocate WRB from Mail box

    2015-02-03T15:07:02.106Z cpu11:32852)WARNING: elxnet: elxnet_uplinkReset:2332: 0000:006:00.0: f/w init failed

    2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:006:00.1: UE Detected!!

    2015-02-03T15:07:02.188Z cpu2:33454)elxnet: elxnet_detectDumpUe:368: 0000:006:00.1: Forcing Link Down as Unrecoverable Error detected in chip/fw.

    2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:006:00.1: UE lo: MPU bit set

    2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.1: UE hi: PMEM bit set

    2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.1: UE hi: NETCUnknown bit set

    2015-02-03T15:07:02.188Z cpu6:39030)World: 14296: VC opID hostd-7e68 maps to vmkernel opID 65326fa

    2015-02-03T15:07:14.108Z cpu11:32852)WARNING: elxnet: elxnet_asyncWorldWait:3586: 0000:006:00.1: GetDieTemperature Checkpoint 1 (12 sec) No resp for MCC cmd opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:07:20.269Z cpu15:35856)World: 14296: VC opID hostd-c4a9 maps to vmkernel opID 1750f6e4

    2015-02-03T15:07:24.655Z cpu14:39945)World: 14296: VC opID hostd-afb3 maps to vmkernel opID 17159c6b

    2015-02-03T15:07:26.109Z cpu2:32852)WARNING: elxnet: elxnet_asyncWorldWait:3586: 0000:006:00.1: GetDieTemperature Checkpoint 2 (24 sec) No resp for MCC cmd opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:07:38.110Z cpu2:32852)WARNING: elxnet: elxnet_asyncWorldWait:3586: 0000:006:00.1: GetDieTemperature Checkpoint 3 (36 sec) No resp for MCC cmd opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:07:38.110Z cpu2:32852)WARNING: elxnet: elxnet_asyncWorldWait:3605: 0000:006:00.1: GetDieTemperature MCC cmd timed out. opcode: 0x79, subsystem:0x1, timeout:0, req_len:8

    2015-02-03T15:07:38.110Z cpu2:32852)WARNING: elxnet: elxnet_generateUE:55: 0000:006:00.1: Injecting fatal error for post-mortem dump
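
    When comparing hosts it helps to reduce a log like this to "which PCI function reported which UE bits". A minimal sketch, assuming the log line layout shown above; summarize_ue is a hypothetical helper, and the sample lines are copied from this post:

```python
import re
from collections import defaultdict

# Matches the "UE lo/hi: <name> bit set" lines logged by elxnet_detectDumpUe.
UE_BIT = re.compile(
    r"^(?P<ts>\S+) cpu\d+:\d+\)WARNING: elxnet: elxnet_detectDumpUe:\d+: "
    r"(?P<dev>[0-9a-f:.]+): UE (?P<half>lo|hi): (?P<bit>\S+) bit set"
)

def summarize_ue(lines):
    """Return {pci_device: sorted list of 'lo:NAME' / 'hi:NAME' flags seen}."""
    flags = defaultdict(set)
    for line in lines:
        m = UE_BIT.match(line.strip())
        if m:
            flags[m.group("dev")].add(f"{m.group('half')}:{m.group('bit')}")
    return {dev: sorted(bits) for dev, bits in flags.items()}

SAMPLE = """\
2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:357: 0000:006:00.1: UE Detected!!
2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:385: 0000:006:00.1: UE lo: MPU bit set
2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.1: UE hi: PMEM bit set
2015-02-03T15:07:02.188Z cpu2:33454)WARNING: elxnet: elxnet_detectDumpUe:395: 0000:006:00.1: UE hi: NETCUnknown bit set
""".splitlines()

for dev, bits in summarize_ue(SAMPLE).items():
    print(dev, ", ".join(bits))
```

    Feeding the whole of vmkernel.log through the same function gives one entry per affected port, which makes it easier to see whether every port reports the same bits.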

    Wilber, any chance you could share the debug driver you have? I can crash my host *almost* on demand so it would be interesting to see what it reveals.

    Cheers,

    Martyn



  • 53.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 03, 2015 03:53 PM

    I sent you a PM



  • 54.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 03, 2015 03:59 PM

    Thanks, I've dropped you a mail with my contact details :)

    Cheers,

    Martyn



  • 55.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 15, 2015 09:10 AM

    wilber822, have you tried the driver be2net-10.2.293.0-1869542.zip?

    I was seeing similar symptoms: random, intermittent ESXi host disconnects, plus other NIC-related errors that caused the VMs to disconnect and left the host in a stuck state.

    This is what I found from the logs.

    VMware ESXi 5.1.0 Update 1

    *** vmkernel.log ***

    2015-02-12T12:06:16.604Z cpu12:33621979)vmnic6: UE happened ...   <---  Unrecoverable Error on the NIC, this has caused the NICs to crash

    2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_low: 0x20

    2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_hi: 0x0

    2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_low_mask: 0x4000140

    2015-02-12T12:06:16.604Z cpu12:33621979)ue_status_hi_mask: 0x0
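
    The four register values above can be broken into bit positions. A minimal sketch; the assumption (not confirmed anywhere in this thread) is that a bit set in the mask means the matching status bit is ignored, and set_bits and unmasked are hypothetical helper names:

```python
def set_bits(value):
    """Return the positions of the bits set in a 32-bit register value."""
    return [i for i in range(32) if value & (1 << i)]

def unmasked(status, mask):
    # Assumption: mask bits suppress the corresponding status bits.
    return status & ~mask & 0xFFFFFFFF

ue_status_low = 0x20            # from the log above
ue_status_low_mask = 0x4000140  # from the log above

print(set_bits(ue_status_low))
print(set_bits(ue_status_low_mask))
print(hex(unmasked(ue_status_low, ue_status_low_mask)))
```

    Under that assumption the single low-word status bit is not covered by the mask, so it would still count as a reported error.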



  • 56.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 17, 2015 11:50 AM

    Hi AlbertWT

    My understanding is that whenever you see an Unrecoverable Error on the NIC, it is similar to my issue. You should also observe a high InPauseFrame count on the Virtual Connect module if you are using an HP blade system.

    The UE error has not yet been fixed by Emulex for this particular NIC module.



  • 57.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 17, 2015 04:10 PM

    Could any of you guys who have the same UE issue with the NC55x cards let me know what version of OA and VC firmware you're running?

    Cheers,

    Martyn



  • 58.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 03, 2015 05:10 PM

    Just pushed the latest Emulex firmware (10.2.470.14) to one of my test hosts and within about 15 mins it's failed again. This isn't looking promising!



  • 59.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 04, 2015 03:09 AM

    Hi Martyn,

    I've sent you an email.



  • 60.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 04, 2015 03:06 PM

    Just to keep everyone else in the loop, I've installed and tested the development driver (10.2.261.6251) and the host has remained stable so far.

    However I am seeing the following logged in the VMKernel.log, the same as Wilber822:

    TSO (TCP Segmentation Offload) is enabled by default in ESXi if a supported NIC is used; this can be confirmed by running the following:

    esxcli system settings advanced list -o /Net/UseHwTSO

    If the above returns 1, TSO is enabled; 0 means disabled.
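
    To run the same check across several hosts, the command output can be captured and parsed. A minimal sketch, assuming the command prints indented "Key: Value" lines including an "Int Value" field; the SAMPLE text is an illustrative reconstruction of that layout rather than output captured from this host, and both function names are hypothetical:

```python
def parse_advanced_setting(text):
    """Parse the 'Key: Value' lines printed for one advanced setting."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def hw_tso_enabled(text):
    """True when the current integer value of the setting is 1."""
    return parse_advanced_setting(text).get("Int Value") == "1"

# Illustrative layout only (assumption), not captured output.
SAMPLE = """\
   Path: /Net/UseHwTSO
   Type: integer
   Int Value: 1
   Default Int Value: 1
"""

print(hw_tso_enabled(SAMPLE))
```

    Reading "Int Value" rather than "Default Int Value" matters here, because the default stays at its original value after the setting is changed.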

    I'm tempted to turn off TSO to see if the errors are still logged.

    Cheers,

    Martyn



  • 61.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 04, 2015 03:43 PM

    Disabling TSO makes no difference, errors are still logged in the VMKernel log relating to '9k without TSO'.

    Cheers,

    Martyn



  • 62.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 06, 2015 01:42 PM

    Hey guys and gals

    I'm somewhat hesitant to chime in here as I've not seen precisely the issue you have.

    We do have the 554FLB in our Gen8 Blades.

    When this environment was first commissioned we experienced an issue where we could not see all 8 available paths in OneView or via the OA.

    Without giving a very lengthy explanation the solution was quite obscure yet simple.

    The 554FLB was installed with a firmware version of something like 4.9.006 (if I recall).

    When we saw this odd behaviour we updated the firmware to 4.10.xxx, which did not resolve the issue.

    After about 2 weeks on this, an HP engineer who was also working on it found that if he downgraded the firmware to a much earlier version, rebooted, and then upgraded to version 4.9.416, the issue was resolved.

    He was correct.

    I'm sorry to say I cannot confirm the firmware version the 554FLB was supplied with or the version we downgraded to, but I can confirm we are still running firmware version 4.9.416 on the 554FLB and our 552SFPs.

    These have remained stable with this version.

    Oh, another tidbit. HP made reference to the HP VMware FW and Software Recipe. This is apparently a list of proven firmware / software versions that the HP engineers follow when on site performing customer installs etc. It does appear to have some credibility in that people who do the job have formulated this list (rather than someone who sits on the phone and is not sure what a blade actually looks like).

    I've found this URL: http://vibsdepot.hp.com/hpq/recipes/HP-VMware-Recipe.pdf



  • 63.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 06, 2015 03:54 PM

    Hi Mark,

    Thanks for taking the time to respond.

    My problem is that I have tried all of the 'supported' and 'unsupported' combinations from Emulex, HP and VMware, but I'm still without a stable platform.

    Have a look at post number 36 on this thread :)

    Cheers,

    Martyn



  • 64.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 06, 2015 04:28 PM

    I wonder why the HP VMware FW and Software Recipe hasn't been updated with the 4.9.416.4 firmware in place of 4.9.416.2.

    http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04326096



  • 65.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 09, 2015 10:31 AM

    Just to keep everyone updated, my test host has been rock solid on version 10.2.261.6251 of the elxnet driver. I rolled the driver back to 10.2.298.5 this morning and again, within 15 minutes, the host failed.

    This is definitely a badly written firmware/driver issue!



  • 66.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 09, 2015 11:59 PM

    Hi Martyn,

    You are correct. HP has confirmed that it's a problem in the driver. I'm glad to know the driver fixed the problem for you.

    I have passed this feedback on to HP. Hopefully they will release a new driver soon.

    In my environment the issue is hard to reproduce, even after reverting to the original driver, but I have confirmed with VMware and HP that your case is similar to mine.

    Thanks a lot for your help.



  • 67.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 13, 2015 06:47 PM

    We've been having issues with our Emulex NC553i (OCe11102) based NICs and 5.5. Since our datastores are mounted via NFS, we've been getting NFS APDs lasting long enough to cause VMs to go offline. We were having this issue with HP's published recipe of firmware & driver combo for 5.5 U1 and U2. The esxcli network nic stats output shows tons of receive packet drops on the vmnics. What's interesting is that in the same host with both Broadcom and Emulex based NICs, only the Emulex vmnics recorded any packet drops.
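
    Comparing drop counters across vmnics is easy to script. A minimal sketch, assuming the per-NIC output of "esxcli network nic stats get -n vmnicX" has been captured as text and contains a "Receive packets dropped" line; the vmnic names and counter values below are hypothetical, not figures from this environment, and both function names are illustrative:

```python
def parse_nic_stats(text):
    """Turn 'Name: 123' lines into {name: int}, skipping non-numeric lines."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value)
    return stats

def nics_with_rx_drops(outputs):
    """outputs maps vmnic name -> captured text; returns the names that show drops."""
    return sorted(
        nic for nic, text in outputs.items()
        if parse_nic_stats(text).get("Receive packets dropped", 0) > 0
    )

# Hypothetical captured output for two uplinks (values invented for illustration).
SAMPLE_OUTPUTS = {
    "vmnic0": "NIC statistics for vmnic0\n   Packets received: 1200345\n   Receive packets dropped: 0\n",
    "vmnic4": "NIC statistics for vmnic4\n   Packets received: 980112\n   Receive packets dropped: 5120\n",
}

print(nics_with_rx_drops(SAMPLE_OUTPUTS))
```

    Running this before and after a driver or firmware change gives a quick way to confirm whether the drops have actually stopped.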

    Anyway, here are a few links of interest to anyone suffering from similar issues.

    Packet drops and connectivity issues when using Emulex elxnet Driver version 10.2.298.5 or earlier on OCe10102 and OCe11102 adapters or OEM equivalents

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2091192

    HP revised their advisory as well - Advisory: (Revision) HP NC550x and NC551x Network Adapters - Only 4 of 8 Flex Ports May Be Functional When Using the Flex NIC Option in Virtual Connect

    http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04321064

    Our symptoms are different from what's published in the advisory (but match the VMware KB perfectly). HP advised us to downgrade both firmware and driver, and to switch from native to legacy mode (due to the downrev driver). HP just published the February recipe, but it still has the same U2 driver & firmware combo for Emulex. Hopefully they are working on certifying the new Emulex drivers that are supposed to fix these issues.

    We are on driver 4.9.288.0 and firmware 4.9.416.0 now.  The packet drops have stopped, but we are still monitoring for issues.  Hopefully they do not return...



  • 68.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 15, 2015 09:13 AM

    PeteSu, let us know how you go with the new ESXi network driver and which firmware version turns out to be stable :-)



  • 69.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 17, 2015 10:41 AM

    AlbertWT, so far, driver 4.9.288.0 and firmware 4.9.416.0 in legacy mode on our NC553i seem to be stable. No receive packet losses compared to the native mode 10.0.725.2 and 10.2.298.5 drivers.

    Emulex has 10.2.445.0 posted on their website, and others in this thread have reported some success with development driver 10.2.261.6251.

    I haven't been able to get HP to provide an ETA on when they'll have the new Emulex drivers certified. Considering that we've had issues with both the 10.0.x and 10.2.x drivers in native mode, if legacy mode and 4.9.x provide more stability in our environment, then we'll stick with that for now.

    Of course, your mileage may vary, so you should test in a development host before applying to the rest of your environment.  If you haven't already done so, I would also recommend opening a support case with HP.  With more people reporting issues with Emulex based VCs, maybe they'll hurry up and certify the new drivers (and hopefully test thoroughly this time).

    There are 2 HP advisories for the 4.9.416 firmware, so you should check to see if they apply for your VC model.

    http://h20566.www2.hp.com/hpsc/doc/public/display?sp4ts.oid=4145106&docId=emr_na-c04218016&docLocale=en_US

    http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04326096



  • 70.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 17, 2015 11:47 AM

    Hi PeteSu,

    Thanks for sharing your case.

    The debug driver we used is for the problem "Emulex NIC lost network connectivity on ESXi 5.5". It has been running stably in our environment.

    I have to give HP support negative feedback: I asked them about a week ago when they will have a GA release for my issue, and there has been no response so far.

    Hopefully they will give you an ETA soon.

    On the other hand, it looks like a lot of people have problems with the Emulex 10.x drivers on ESXi 5.5, and I think it significantly impacts production.



  • 71.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 17, 2015 10:23 PM

    Pete,

    Thanks for the reply. The HP advisory page that you sent over to me http://h20566.www2.hp.com/hpsc/doc/public/display?sp4ts.oid=4145106&docId=emr_na-c04218016&docLocale=en_US suggests that there is a problem with the Emulex be2net firmware version 4.9.416.2

    but according to the latest February 2015 HP-VMware Recipe book page 17:

    it shows that the stable version is 4.9.416.2

    Source: Recommended Firmware and Driver for HP http://vibsdepot.hp.com/hpq/recipes/ (Feb2015VMwareRecipeSPP201409_16.4.pdf)



  • 72.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 16, 2015 06:16 PM

    Yes, I am another customer who is having problems with this exact same setup. During the evening we get paged because vCenter sees hosts as disconnected, but in reality packets are getting dropped. When I run a ping test from vCenter to any host, the packet loss averages about 14%.

    I have vCenter 5.5 U2d running with ESXi 5.5 hosts on the HP 5.5 build 2302651.

    The drivers/firmware we had running were:

    firmware: 10.2.340.19

    driver: 10.2.298.5

    We downgraded the driver to 10.0.783.13 but still experienced the same issue. I am trying to find firmware 4.9.416.01 but can't find the download anymore, and HP's site is down for that 554FLB card.



  • 73.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 17, 2015 10:05 PM

    Hi Jess,

    Try this site: ftp://ftp.hp.com/pub/softlib2/software1/pubsw-generic/p520687518/v101011 and let us know how you go with the firmware downgrade.



  • 74.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 18, 2015 07:39 PM

    Thanks everyone for chiming in.

    So far we have been running error free for 2 days now running the following on the Emulex 554FLB.

    elxnet driver: 10.0.783.13

    elxnet device firmware: 4.9.416.0

    ...obviously not running legacy mode.



  • 75.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 19, 2015 03:00 AM

    Thanks jessem for the detailed reply. However my HP BL 465c G7 Emulex model is NC 551i not 554 FLB as in my HP BL 465c G8.

    The HP support engineer working on my case suggested that I install the following versions, based on the HP-VMware recipe PDF for February 2015:

    firmware version 4.9.416.2

    driver version 10.2.293.0

    Here's the version that is currently running on the HP BL 465c G8 using 554FLB:

    ~ # ethtool -i vmnic0

    driver: be2net

    version: 10.2.293.0

    firmware-version: 10.2.340.19

    bus-info: 0000:04:00.0
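
    The ethtool output above can be checked against the recommended versions mechanically. A minimal sketch using the values quoted in this post; parse_ethtool_info and mismatches are hypothetical helper names:

```python
def parse_ethtool_info(text):
    """Parse 'key: value' lines from 'ethtool -i <nic>' into a dict."""
    return {
        key.strip(): value.strip()
        for key, sep, value in (line.strip().partition(":") for line in text.splitlines())
        if sep
    }

def mismatches(info, expected):
    """Return {field: (actual, expected)} for every field that differs."""
    return {k: (info.get(k), v) for k, v in expected.items() if info.get(k) != v}

ETHTOOL_OUTPUT = """\
driver: be2net
version: 10.2.293.0
firmware-version: 10.2.340.19
bus-info: 0000:04:00.0
"""

# Versions suggested by the HP support engineer, as quoted above.
RECIPE = {"version": "10.2.293.0", "firmware-version": "4.9.416.2"}

print(mismatches(parse_ethtool_info(ETHTOOL_OUTPUT), RECIPE))
```

    On this host the driver already matches the suggestion and only the firmware differs.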

    Cheers,



  • 76.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 19, 2015 04:22 AM

    Are you running 5.5 with that driver on your G8?



  • 77.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 19, 2015 06:29 AM

    Hi Jesse,

    Not yet, my ESXi hosts are all on 5.1 Update 2.



  • 78.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 19, 2015 04:26 AM

    All,

    Unfortunately this package didn't work for us, so now I guess we will downgrade the driver one more level.



  • 79.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 19, 2015 06:29 AM

    Let us know how it goes, mate, after you've downgraded the driver and firmware sets.



  • 80.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 18, 2015 07:26 PM

    I have been moved on to level 2 support and am waiting for an update from them. Hosts keep pausing/freezing, with network RX drops, stuck vMotions and All Paths Down to our NFS storage. I am on the latest firmware/drivers provided by HP on the HP ESXi 5.5 Update 2 disk and the HP SPP disk. These are all elxnet.

    HP DL360p G8
    Emulex HP NC552SFP Dual Port 10GBE
    Firmware 10.2.340.19
    Driver 10.2.298.5

    Emulex HP FlexFabric 10GB 2 Port 554FLR-SFP+
    Firmware 10.2.340.19
    Driver 10.2.298.5

    BL 460c G8
    Emulex HP FlexFabric 554FLB 10GB 2 Port
    Firmware 10.2.340.19
    Driver 10.2.298.5

    BL 460c G7
    Emulex HP NC553i Dual Port FlexFabric 10GB Converged Network Adapter
    Firmware 10.2.340.19
    Driver 10.2.298.5



  • 81.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 18, 2015 08:32 PM


    I was just told to downgrade to the versions below. HP support said they had another customer downgrade to these with no issues. They also said that in the next few weeks Emulex is releasing new firmware that will fix this, and that it will be included in the March recipe book for HP. I am going to try this now and check stability.

    Firmware 4.9.416.0

    Firmware:   http://h20564.www2.hp.com/hpsc/swd/public/detail?sp4ts.oid=5215387&swItemId=co_131997_1&swEnvOid=54

    Driver 10.0.725.2

    Driver: https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI55-EMULEX-ELXNET-1007252&productId=353



  • 82.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Feb 24, 2015 05:34 PM

    I downgraded, but recently had an odd issue where Linux VMs with NFS mounted internally had their mounts disconnected. I'm not sure if it was related, and there seemed to be nothing in the logs about it.



  • 83.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Mar 03, 2015 10:22 PM

    I am experiencing very similar issues running 5.1 update 3.  I have an Emulex OCm10102-n-x running on 8 hosts.  Only on one host the adapters keep disconnecting and going offline.  I have tried all combinations of new and old firmware.  Anyone have any other ideas?



  • 84.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Mar 05, 2015 03:08 PM

    It's been about 2 weeks now with no issues running the following:

    ESXI 5.5 HP BLADE DRIVERS/FIRMWARE

    EMULEX 554FLB CARD

    ELXNET (DRIVER FOR 5.5 - NEEDED TO BE DOWNGRADED)

    DRIVER:   10.0.725.2

    FIRMWARE: 4.9.416.0



  • 85.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Mar 27, 2015 08:30 AM

    We have exactly the same setup, but 1 host became unresponsive last night. What versions of OA and VC do you use? We have 4.30 Jul 08 2014.



  • 86.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 08, 2015 10:30 AM

    Hi Guys,

    I have had a case open with HP since the end of October about the following issue. After a platform upgrade (blade BIOS, iLO, HP Virtual Connect, and ESXi to the HP customized 5.1 Update 2 image), half of our uplinks would die anywhere from 5 minutes to 1 hour after booting. After many hours back and forth with HP (recreating server profiles, trying different iLO versions, ROM, Emulex firmware, be2net drivers, etc.), we narrowed the issue down to Gen8 blades with Emulex CNAs (554FLB).

    The issue did not affect the mezzanine cards (also 554s). We were on be2net driver version 10.2.293.0 and firmware 10.2.340.19. Everything was as per the HP recipe for VMware (HP_Service_Pack_for_ProLiant_2014.09.0_792934_001_spp & VMware-ESXi-5.1.0-Update2-2000251-HP-5.68.30-Sep2014.iso).

    The issue was that on the FLB cards the advanced mode was enabled in the Emulex BIOS. As soon as we disabled this all issues disappeared.

    I wonder if anyone with these network issues has had a look at the settings in the Emulex BIOS.

    To disable Advanced Mode Support through PXE Select:

    1. After the BIOS initializes and you have selected your controller, the Controller Configuration screen appears. Select Advanced Mode Support from the drop-down menu. The Controller Configuration Advanced Mode Support dialog box appears.

    2. From the drop-down menu, select Disabled and press Enter.

    3. Select Save and press Enter.

    4. After enabling Advanced Mode Support, the Port Selection screen appears. Select the port you want to configure and press Enter. Continue to configure your controller. (At this point you can exit out of the BIOS and reboot.)

    I'd be interested to know if this solves issues for anyone else. HP have been working with Emulex on this issue, and we have tried 2 test drivers for them. So far they have been unable to reproduce the issue in their labs, even though they have sourced and tested the exact same batch of blades that we have. When it happened to us we had 9 of the same blades in each chassis (so 18 in total, with 1 chassis per DC).

    The issue happened on all 18 blades but was solved by disabling the advanced mode. Issue never happened for the remaining Gen7 and older version of Gen8s in the chassis - on all of these Advanced mode was disabled by default on the FLBs and Mezz cards.

    regards

    Ciarán

    VC 4.30 Oa 4.30



  • 87.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 09, 2015 01:21 PM

    Hi Ciarán,

    We saw something very similar with driver version 10.2.298.5 (elxnet) and firmware 10.2.340.19, but on the rackmount FlexLOM, the 554FLR-SFP+. We were using elxnet rather than be2net because in ESXi 5.5 Emulex went over to the native mode driver, but the symptoms sound the same. The host came online and appeared to be passing traffic until a number of VMs were on it, when vmnic0 stopped passing traffic (but the link status stayed up), followed shortly afterwards by vmnic1. Strangely, in the case we had open with VMware (where they were working with Emulex), we were told to turn advanced mode on, which sounds like the opposite of the advice you had. At the moment we're staying on an older set of drivers and firmware until we have a definitive answer.

    I see Emulex have just put out a new version on http://www.emulex.com/downloads/emulex/drivers/vmware/vsphere-55/drivers/ (I appreciate you're using ESXi 5.1 but it may still be of interest) but there's no indication that the issue has been addressed in the release notes.

    http://www-dl.emulex.com/support/elx/rt10.4.0/ga/Docs/final/vmware/vmware_relnotes_elx.pdf

    The only thing I did notice was under ESXi 5.5 known issues point 15:

    On OCe11100-series adapters if you update the driver and firmware, ESXi 5.5 hosts may report large numbers of packet loss and errors in the vmkernel logs. Throughput is not effected, but errors may fill management software logs.

    Workaround

    None.

    However I'm not sure that it matches, since our throughput was affected, to the point of taking the hosts out of service. I'd be very interested to hear how your case goes.

    Regards,

    Jason



  • 88.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 10, 2015 04:14 PM

    Hi Jason,

    After today's discussion with HP I decided not to pursue the point of finding out why the advanced mode was enabled on the FLB for a particular set of Gen8 blades (although they said they will share a document that explains different scenarios for when advanced mode is disabled or enabled).

    They will keep looking with Emulex but we are not going to proceed with further attempts to reproduce the issue. In our case it was definitely turning advanced mode off that solved the issue.

    The latest SPP for ProLiant servers is planned for release next week. There is also a new version of the customized HP ESXi (30/03/2015) http://www8.hp.com/us/en/products/servers/solutions.html?compURI=1499005#tab=TAB4 which apparently will have fixes for driver bugs (related to the ELXNET drivers for 5.5). The firmware is still version 10.2.340.19 but HP mentioned this will also be updated.

    We will start testing as soon as everything is available, and with a bit of luck plan for a migration to vSphere 6 in July. Fingers crossed there are no issues with the Emulex cards.

    I will post if we get any further updates.

    Regards

    Ciarán



  • 89.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 21, 2015 10:53 AM

    For those still with issues, after working with HP over the last few months, I now have what appears to be a stable platform.

    I'm using the following combination which were released a few weeks ago:

    Driver: VMW-ESX-5.5.0-elxnet-10.4.255.13-2555693.zip

    https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI55-EMULEX-ELXNET-10425513&productId=353


    Firmware: 10.2.477.23

    http://h20564.www2.hp.com/hpsc/swd/public/detail?sp4ts.oid=4324631&swItemId=MTX_00b06590d26c4222a0a96da87b&swEnvOid=4166#tab5


    Everything appears to be stable at the moment. I've introduced the same loads which caused the issues I mentioned previously.


    Thanks,


    Martyn



  • 90.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 24, 2015 03:19 PM

    Martyn,

    Has 10.4.255.13 been certified by HP? I can't seem to find it on HP's site.



  • 91.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 24, 2015 03:22 PM

    I'm not sure to be honest, those 2 links were provided to me directly by HP.

    Cheers,

    Martyn



  • 92.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 24, 2015 03:36 PM

    Ok, well I'll take it as they are since HP support provided those links.



  • 93.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 24, 2015 06:02 PM

    @MartynThomas --  Thanks for your post.

    We recently experienced a similar issue.  In our case our BL460G8 blades abruptly disconnected both ethernet and storage networks serviced by the Emulex OneConnect chipset (NC554FLB). 


    Our environment regained stability after applying the following:

    esxcli software vib install -d /tmp/VMW-ESX-5.5.0-elxnet-10.4.255.13-offline_bundle-2555693.zip  ## Emulex OneConnect Network Driver v10.4.255.13

    esxcli software vib install -d /tmp/VMW-ESX-5.5.0-lpfc-10.2.455.0-offline_bundle-2254453.zip  ## Emulex OneConnect FC Driver v10.2.455.0

    esxcli software vib install -d /tmp/hp-esxi5.5uX-bundle-2.2-17.zip ## HP ESXi5.5 bundle

    (reboot)

    Then

    ./CP025747.scexe    ## Emulex OneConnect Firmware v10.2.477.10

    (reboot)

    NOTE:   The HP prescribed ESX5.5 recipe specifies the elxnet v10.2.445.0 driver.   In our experience this has not resolved the issue.   We are stable with elxnet 10.4.255.13.

    We're in dialog with L3 (VMW, HDS, HP) to triage further.

    Hope this helps someone.

    Adam Kupsta.



  • 94.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 07, 2015 04:51 AM

    Hey guys and gals,

    Would anyone have a current link to the firmware version 10.2.477.23?

    HP seem to have lost most of their web environment in their transition.



  • 95.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 26, 2015 01:09 PM

    That helped me pretty much, thanks for the thread.

    Br,

    Anthony J

    AraqueFotos



  • 96.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 27, 2015 12:37 PM

    Anthony / Anyone else experiencing this issue,

    I'm trying to correlate root cause.   Are you able to share a few details about your storage config? 

    - FC/iSCSI?

    - Both network/storage attachment serviced by same Emulex interface?

    - Array subscribed vs. allocation levels (%)

    - Any array tiering policies enabled?

    Thanks,

    Adam.



  • 97.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 29, 2015 06:16 AM

    We are having an issue related to this, I suspect.

    Our setup is:

    - IBM PureFlex x240 nodes.

    - FC storage

    - FW: 10.4.255.25

    - Driver: elxnet: 10.4.255.13 lpfc: 10.4.245.0

    Both network and storage go through the same hardware adapters.

    We do overprovision datastores. (But it would take quite some time to get any specific number here.)

    We are using array tiering on some datastores.

    After upgrading to 5.5 U2 the network started reporting drops. (We have only upgraded one host.)



  • 98.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 30, 2015 01:41 PM

    Dear Colleagues,

    The 10Gb network ports going up and down, as experienced by our customer, was solved by using STP (shielded) Cat.7 cables instead of UTP (unshielded) Cat.6e.

    The issue was caused by crosstalk (see Wikipedia).

    We have solved the issue of ports going up/down on the switches, but we continue to have issues with vSphere 5.1 and 5.5.

    We cannot get iSCSI throughput above 460 MByte/sec, whereas a blade host running Windows 2012 R2 with the same configuration reaches 1.6 GByte/sec.
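
    To put those two figures next to the nominal rate of a single 10Gb port, here is a minimal Python sketch of the unit conversion. It assumes decimal units (1 GByte = 1000 MByte) and ignores protocol overhead; the function and constant names are only illustrative.

```python
def mbyte_per_s_to_gbit_per_s(mbyte_per_s):
    """Convert decimal megabytes per second to gigabits per second."""
    return mbyte_per_s * 8 / 1000


# Nominal rate of one 10Gb port, before any protocol overhead (assumption).
LINE_RATE_GBIT = 10.0

esxi = mbyte_per_s_to_gbit_per_s(460)      # 460 MByte/sec reported for ESXi
windows = mbyte_per_s_to_gbit_per_s(1600)  # 1.6 GByte/sec reported for Windows

print(f"ESXi:    {esxi:.2f} Gbit/s ({esxi / LINE_RATE_GBIT:.0%} of one port)")
print(f"Windows: {windows:.2f} Gbit/s ({windows / LINE_RATE_GBIT:.0%} of one port)")
```

    Under these assumptions the Windows figure works out to more than one 10Gb port can carry, so that test presumably used more than one port or path; the post does not say.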

    Giovanni Coa



  • 99.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted May 08, 2015 12:08 PM

    I can't agree, because we are using 10Gb twinax, not UTP, and we have a lot of problems with Emulex on ESXi 5.5.



  • 100.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted May 08, 2015 12:05 PM

    The latest reliable combination in our environment (with vmware vDS) seems to be (for 554FLB):

    # esxcli network nic get -n vmnic0

       Driver Info:

             Bus Info: 0000:04:00:0

             Driver: elxnet

             Firmware Version: 10.2.477.10

             Version: 10.2.445.0

    I don't have the courage to test it with Nexus 1000v in a production environment.

    If I try to install the VIB VMW-ESX-5.5.0-elxnet-10.4.255.13-offline_bundle-2555693.zip, the hypervisor ends in a PSOD after reboot! Strange....

    With Nexus 1000v the Emulex driver is absolutely unusable, and when I switch the driver from Emulex to be2net it leads to corrupted packets.

    So in one of our clusters we changed the hypervisors from the 554FLB NIC to the 534FLB (Broadcom), and they run without any issues.

    We are using the NICs for Ethernet communication only: no FC, no iSCSI. The driver must support VXLANs.

    The suggestion is: don't use Emulex, use Broadcom :)

    But at this time I have no idea how to upgrade from 5.1 to 5.5...



  • 101.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jul 18, 2015 07:41 PM

    I work for a large company that purchased ~2,500 Gen 8 blades with 554FLB CNAs last October (manufactured all at once) for a worldwide deployment we are just now completing. We are using a single ESX 5.5 autodeploy image which is compliant with the HP recipe - 10.2.340.19 Emulex firmware & 10.2.298.5 driver. All hardware is the same everywhere, same total stack of firmware across all hardware in all locations everywhere, same gold image, etc.

    With all of the original hardware no issues. Thousands of identical stateless ESX 5.5 OSs happily PXE booting 100% of the time. With a deployment this large we did however have some DOA hardware and had to replace about a dozen or so 554FLBs across the world. Guess what? With just about all of the replacement 554FLBs we started seeing very strange issues 95% consistent with the litany of problems described throughout this thread. I found this thread by accident after googling the error messages found in the ESX debug screen. The discussion here seems to have revolved around finding the perfect combination of drivers, settings, etc (everyone is assuming their hardware is good). My simple theory is that a bad batch of 554FLBs is floating around or perhaps the firmware on these or similar adapters isn't getting applied properly at the factory nor can a user re-apply the firmware with success. In my case so far I have had to ship known working / good 554FLBs from my lab to the field to replace the replacements. 100% of the time this has worked so far.


    Now I need your help. If you are still having these issues in your shop or had them and found a root cause please send me a PM with any information you can share like HP case #s. I am working with HP to find the root cause.


    Thanks



  • 102.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 05, 2015 03:23 AM

    Hi domenic10

    We have a similar case. I worked with MartynThomas, HP and VMware to figure out which combination was stable.

    I would suggest you don't waste time on that; they (HP and Emulex) have tried for about 1 year but there is no final RCA.

    We are using both Cisco and HP; we feel Cisco is very stable, but HP is not. I intend to replace HP with Dell.

    Sorry, my post may not be related to this thread. Just FYI.



  • 103.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 30, 2015 03:49 PM

    Our customer found that crosstalk was the issue.

    Crosstalk causes problems when multiple UTP (unshielded) cables running at 10 Gbit/s interfere with each other.

    Using Cat.7 STP (shielded) cables solved the up/down issue.

    Many other issues are now caused by the Emulex elxnet drivers on ESXi (vSphere) 5.5 U2.

    We can't get more than 460 MByte/sec, compared with 1.6 GByte/sec on Windows 2012 R2 in the same configuration.

    Giovanni Coa



  • 104.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted May 12, 2015 09:28 PM

    Having a real similar issue - been using these BL460Gen8 blades w/554 Emulex LOMs for a year fine now and decided to update to the latest SPP (2015.04.0).

    Now I have the 2 hosts that got upgraded drop off 1hr-3hrs after reboots - (first time had an outage with VMs on them.)

    I found this thread, and disabled Advanced Mode in the Emulex BIOS and that seems to have fixed the issue, but I'm not willing to put the hosts back into production yet.

    Strangely, I spot checked the other blades and they have Advanced set to 'Enabled' by default and have worked fine with the older firmware and drivers.

    Does anybody know if there's a problem with disabling Advanced in the Emulex BIOS?  Is there some functionality I will be missing?

    We are still on 5.1U3 and looking to move to 5.5U3 soon.

    This combo has worked for a year, but is very old:

    be2net driver:          4.9.288.0

    be2net firmware:     10.2.340.19

    This combo which the latest 2015.04.0 SPP upgraded to broke everything:

    be2net driver:          10.2.477.10

    be2net firmware:     10.2.453.0

    Anybody else have any new findings on this awful problem?



  • 105.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted May 19, 2015 10:33 AM

    Hi,

    There is no problem in disabling the advanced mode in the Emulex BIOS. We worked extensively with HP to try and find the root of this problem.

    Eventually after months they published the following customer advisory with the workaround: http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04608235

    They claim that the latest version of the driver resolves this issue: 10.2.477.20 (this is not available with the latest SPP and you will need to download it separately): ftp://ftp.hp.com/pub/softlib2/software1/sc-linux-fw-sys/p930408510/v105241 or https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI51-EMULEX-BE2NET-10247720&productId=285

    For some background on the cause of this issue (why some cards have advanced mode enabled by default and others do not), here is the feedback from HP on how the Emulex firmware relates to Advanced Mode / SR-IOV being enabled or disabled: With 4.2.x.x FW and previous, customers had the ability to manually set the state of SR-IOV in the NIC BIOS. In 4.6.x.x firmware, this ability was removed and the customer could no longer toggle the state of SR-IOV manually. This created major issues due to known compatibility issues with SR-IOV and certain OSes. The ability to disable SR-IOV needed to be given back to the end user in the NIC BIOS to resolve that. This was accomplished by tying the SR-IOV state to the Advanced Mode Support state with FW 4.9.x.x and higher. So the firmware version that was initially installed on the NIC determines whether Advanced Mode and SR-IOV are enabled or not.

    Regards

    Ciarán



  • 106.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted May 19, 2015 03:19 PM

    So, after opening a case with HP on this, they advised against disabling Advanced mode in the Emulex BIOS.  They explained they're aware that that may fix this issue, but mentioned there could be a performance impact - that they would only recommend it in very particular cases.

    They also supplied me with the 10.2.477.20 VMware driver which I have applied to 3 hosts running 10.2.477.10 554FLB firmware.

    I have yet to put them back into production (I'll wait a week) but they have not dropped off where before they'd disappear within 1-3 hours, consistently.

    So, it appears this latest 10.2.477.20 VMware Emulex driver has fixed the issue while still allowing Advanced mode to stay enabled.




  • 107.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 05, 2015 06:14 AM

    Emulex NICs are nothing but trouble - after fruitless calls between HP & Cisco we moved to Broadcom for our new ESXi deployments and have been happy ever since.



  • 108.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 05, 2015 07:01 AM

    Hi all,

    In my experience, the working combination of Emulex driver and firmware, tested on 5.5, is:

    # esxcli network nic get -n vmnic0

       Driver Info:

             Driver: elxnet

             Firmware Version: 10.5.65.21

             Version: 10.5.65.4

    This combination is described in http://vibsdepot.hp.com/hpq/recipes/HP-VMware-Recipe.pdf

    and looks like a good combination.

    I haven't run all my tests yet, but the combination described above looks stable, VXLANs are OK, and so on.
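
    For anyone checking many hosts against a combination like the one above, here is a minimal Python sketch that pulls the driver and firmware fields out of pasted `esxcli network nic get` output. It assumes the "Key: value" layout shown in this thread; the function name and the sample text are only illustrative.

```python
# Sample text copied from the output layout posted in this thread.
SAMPLE_OUTPUT = """\
   Driver Info:
         Driver: elxnet
         Firmware Version: 10.5.65.21
         Version: 10.5.65.4
"""

# Field names as they appear in the Driver Info block shown above.
WANTED_KEYS = {"Bus Info", "Driver", "Firmware Version", "Version"}


def parse_nic_driver_info(text):
    """Return the wanted 'Key: value' pairs found in the given text."""
    info = {}
    for line in text.splitlines():
        # Split at the first colon only, so values such as a PCI
        # address (which contains colons itself) stay intact.
        key, sep, value = line.strip().partition(":")
        if sep and key in WANTED_KEYS and value.strip():
            info[key] = value.strip()
    return info


if __name__ == "__main__":
    for key, value in parse_nic_driver_info(SAMPLE_OUTPUT).items():
        print(f"{key}: {value}")
```

    The "Driver Info:" header line itself is skipped because it has no value after the colon and is not one of the wanted keys.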



  • 109.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 28, 2015 02:20 AM

    In case anyone is still following this lengthy thread, VMware recently updated the following article listing elxnet 10.5.65.4 as the solution:

    Packet drops and connectivity issues when using Emulex elxnet Driver version 10.2.298.5 or earlier on OCe10102 and OCe11102 adapters or OEM equivalents

    http://kb.vmware.com/kb/2091192

    The interesting thing is that, previously, the article said this was fixed with elxnet 10.2.445.0.
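
    Since the article title gives a version threshold ("10.2.298.5 or earlier"), here is a minimal Python sketch for comparing dotted driver versions numerically. The helper names are made up for illustration, and the check only says whether a version is at or below the threshold in the title, not whether a later version is actually free of the problem (reports in this thread disagree on that).

```python
def version_tuple(version):
    """Turn a dotted version such as '10.2.298.5' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def at_or_below(installed, threshold="10.2.298.5"):
    """True if `installed` is the threshold version or an earlier one."""
    return version_tuple(installed) <= version_tuple(threshold)


if __name__ == "__main__":
    # Driver versions mentioned in this thread.
    for version in ("10.2.298.5", "10.2.445.0", "10.4.255.13", "10.5.65.4"):
        print(version, at_or_below(version))
```

    Comparing integer tuples rather than raw strings matters because a plain string comparison would, for example, sort "10.10.0.0" before "10.5.65.4".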



  • 110.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 29, 2015 05:26 AM

    Hi,

    We had a similar problem on ESXi 5.5. We upgraded the firmware and the network adapter driver. No luck.

    Finally we called the hardware vendor and replaced the network card. It works now.

    Might be helpful.

    Thanks.



  • 111.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 06, 2015 02:04 AM

    Hi Guys,

    Just to let you guys know.

    HP told me the issue was fixed by driver elxnet 10.4.255.13 but unfortunately it's not.

    I slowly deployed driver elxnet 10.4.255.13 and firmware 10.2.477.10 in our environment. We didn't observe any issues until today.

    It happened again. Same problem. I have deployed it on 50 - 80 hosts with 1000+ VMs on them, and now it happens again!!!! I'm tired.

    I have to say again, don't use HP.



  • 112.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Oct 07, 2015 04:58 AM

    Same issue upgrading from v5.1 to v5.5U3 on BL685cG7 blades.

    Fix worked as described here.



  • 113.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Oct 15, 2015 04:19 PM

    Which fix exactly do you mean?



  • 114.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Oct 21, 2015 09:50 AM

    Please:    which fix do you mean ??



  • 115.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Oct 21, 2015 10:20 AM

    Hi

    If you are running ESXi 5.5 / 6 use the following combination:

    Driver: elxnet

    Firmware Version: 10.5.65.21

    Version: 10.5.65.4

    If you are using 5.1 use:

    driver: be2net

    version: 10.5.65.4

    firmware-version: 10.5.65.21

    You can check the current versions with the following command: ethtool -i vmnic0

    They will work fine.

    Ciarán



  • 116.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jun 27, 2016 09:48 AM

    having the same issue here with 5.5 update 3a

    server 5.5 update 3d

    This message has repeated 21504 times: 0000:081:00.1: Error in Card Detected! Cannot allocate WRBs hw_error:1|fw_timeout:0

    2016-06-27T07:54:20.533Z cpu63:33571)BC: 3423: Pool 0: Blocking due to no free buffers. nDirty = 271 nWaiters = 1


    NC553i


    Driver Info:

             Driver: elxnet

             Firmware Version: 10.7.110.31

             Version: 10.7.110.13

    According to the HP recipe book these are the latest and greatest drivers/firmware for the NC553i.

    I can't find a solution to this...

    We have the exact same config in our other DC and this error does not exist there.

    I'm concerned about whether this is a major issue or not :/



  • 117.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 28, 2016 12:01 AM

    dingo and djciaro, are you guys doing the steps below, assuming your server is running on HP hardware:

    1. Update the firmware using the last HP SPP 2016.04 (864794_001_spp-2016.04.0-SPP2016040.2016_0317.20.iso) ?

    2. Upgrade or install the ESXi using the HPE latest ESXi (VMware-ESXi-5.5.0-Update3-3568722-HPE-550.9.5.0.33-Apr2016.iso) ?



  • 118.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Apr 20, 2017 11:37 PM

    I had a similar problem and it was the "Personality" in the HPE BIOS.  It defaulted the NICs to iSCSI and FCoE.  I had to change them to NIC and all the links came up.  HPE Support was worthless and no help at all. 



  • 119.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Sep 27, 2016 08:23 PM

    I ran into a similar issue. running:

    ESXi 5.5U3b

    Emulex OCe11102-NM

    updated the FW to 11.1.38.57 and VIC to 11.1.145.0 and then the NIC dropped off the network but was showing as Enabled and UP but no observed IP ranges. It could vmkping itself but not other hosts vmk vmotion ports.

    Banged my head on it for hours going over every setting AND Rebooting several times, uninstalling drivers etc. finally, wild shot in the dark, changed the NIC speed from 10000 to 1000 and then back and everything came back up all good.



  • 120.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Dec 13, 2017 11:20 AM
    we have the same problem on an IBM HS Blade
    with the 7875 and 8038 models. esxi 5.5u2
    EMULEX CORPORATION ONE CONNECT 10Gb NIC (be3)
    Driver version: 4.6.100.0v
    Firmware Version: 11.2.1193.34
    the network cards should be 6 but only 4 show up



  • 121.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Jan 29, 2018 01:52 PM

    Hello,

    Did someone test with 6.0 or 6.5 build with IBM HS23 blade? Any progress? We still have network glitch with ESXI build 5.5



  • 122.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 16, 2018 07:39 AM

    Anyone got a solution for this?

    We get the same errors on Fujitsu RX2540M4 with Emulex OneConnect OCe14102 NIC, current firmware and driver running ESX6.5U2b.

    So everything is up-to-date and on the HCL but still some times after some network activity the "UE Detected!!" pops up in the logs and both ports of the card go down.

    Driver          : elxnet

    Version         : 11.2.1149.0

    FirmwareVersion : 11.2.1194.30

    OPROM Version 11.2.1194.30

    2018-08-15T22:12:59.156Z  vmkwarning: cpu42:66241)WARNING: elxnet: elxnet_detectDumpUe:343: [vmnic2] UE Detected!!

    2018-08-15T22:12:59Z  vmkernel: ue_status_lo=0x20  ue_status_hi=0x0

    2018-08-15T22:12:59.156Z  vmkwarning: cpu42:66241)WARNING: elxnet: elxnet_detectDumpUe:351: [vmnic2] UE lo: MPU bit set

    2018-08-15T22:12:59Z  dcbd: [info]     Ignoring vmnic2 link state change, no port found

    2018-08-15T22:12:59.156Z  vobd:  [netCorrelator] 825442035413us: [vob.net.vmnic.linkstate.down] vmnic vmnic2 linkstate down

    2018-08-15T22:12:59.385Z  vmkwarning: cpu5:66260)WARNING: elxnet: elxnet_detectDumpUe:343: [vmnic3] UE Detected!!

    2018-08-15T22:12:59Z  vmkernel: ue_status_lo=0x20  ue_status_hi=0x0

    2018-08-15T22:12:59.385Z  vmkwarning: cpu5:66260)WARNING: elxnet: elxnet_detectDumpUe:351: [vmnic3] UE lo: MPU bit set

    2018-08-15T22:12:59Z  dcbd: [info]     Ignoring vmnic3 link state change, no port found

    2018-08-15T22:12:59.385Z  vobd:  [netCorrelator] 825442264656us: [vob.net.vmnic.linkstate.down] vmnic vmnic3 linkstate down

    2018-08-15T22:13:00.000Z  vobd:  [netCorrelator] 825436878471us: [esx.problem.net.vmnic.linkstate.down] Physical NIC vmnic2 linkstate is down

    2018-08-15T22:13:00.000Z  vobd:  [netCorrelator] 825436878617us: [esx.problem.net.vmnic.linkstate.down] Physical NIC vmnic3 linkstate is down
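
    To spot these events across a large log, here is a minimal Python sketch that scans vmkernel log text for the "UE Detected!!" lines and reports the timestamp and adapter. The regular expression is an assumption based only on the two line layouts posted in this thread (the `[vmnicN]` form above and the older PCI-address form from the first post); other driver versions may log differently.

```python
import re

# Matches both layouts seen in this thread:
#   ... elxnet_detectDumpUe:343: [vmnic2] UE Detected!!
#   ... elxnet_detectDumpUe:238: 0000:005:00.0: UE Detected!!
UE_RE = re.compile(
    r"elxnet_detectDumpUe:\d+:\s*"
    r"(?:\[(?P<nic>[^\]]+)\]|(?P<pci>[0-9A-Fa-f:.]+):)\s*UE Detected!!"
)

# Sample lines copied from the log excerpt above.
SAMPLE_LOG = """\
2018-08-15T22:12:59.156Z  vmkwarning: cpu42:66241)WARNING: elxnet: elxnet_detectDumpUe:343: [vmnic2] UE Detected!!
2018-08-15T22:12:59.156Z  vmkwarning: cpu42:66241)WARNING: elxnet: elxnet_detectDumpUe:351: [vmnic2] UE lo: MPU bit set
2018-08-15T22:12:59.385Z  vmkwarning: cpu5:66260)WARNING: elxnet: elxnet_detectDumpUe:343: [vmnic3] UE Detected!!
"""


def find_ue_events(log_text):
    """Return (timestamp, adapter) pairs for every 'UE Detected!!' line."""
    events = []
    for line in log_text.splitlines():
        match = UE_RE.search(line)
        if match:
            timestamp = line.split()[0]  # first field of the log line
            adapter = match.group("nic") or match.group("pci")
            events.append((timestamp, adapter))
    return events


if __name__ == "__main__":
    for timestamp, adapter in find_ue_events(SAMPLE_LOG):
        print(timestamp, adapter)
```

    The follow-up "UE lo: MPU bit set" lines are deliberately not matched, so each unrecoverable-error event is counted once per adapter.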



  • 123.  RE: Vsphere 5.5 and Emulex OneConnect 10Gb NIC trouble

    Posted Aug 16, 2018 08:08 AM

    If the NIC is compatible according to the HCL, the NIC drivers are up to date and compatible with the ESXi version, and you are still facing network issues after this, it would also be helpful to get the hardware vendor involved and get their update on it.

    Most of the time the issue is with the firmware; once that's updated the issue will be resolved.

    Please consider marking this answer as "correct" or "helpful" if you think your questions have been answered.

    regards

    gayathri