ESXi-Arm Fling

  • 1.  ESXi 8.0 U3b build-24364478 i225-V + Orange Pi 5 Plus - unable to receive dhcp IP, no connection w/static

    Posted Nov 04, 2024 11:30 AM

    Disclaimer: I realize that the Orange Pi 5 Plus isn't officially supported, but I have gotten it working with the 7.0 fling and USB NICs. I have had success with getting the OS to boot and create a datastore on an NVMe drive by disabling MSI interrupts.

    Troubleshooting performed:

    Have tried known good network cards (validated in x86 servers)

    Have tried suspect card in working machines (x86)

    Have attempted to disable MSI interrupts both at runtime (disableMSI=TRUE boot option) and persistently (esxcli system settings kernel set -s "disableMSI" -v "TRUE")

    • Bolding commands used for clarity.
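For anyone following along, here is a minimal sketch of how the persistent disableMSI setting can be applied and then checked. The function names and the ESXCLI override variable are my own additions for illustration (the override only exists so the commands can be dry-run with ESXCLI=echo); the esxcli subcommands are the ones used above, plus "kernel list -o" to show the configured and runtime values.

```shell
# Hypothetical helpers; ESXCLI is an assumed override, e.g. ESXCLI=echo for a dry run.

# Show the configured and runtime values of the disableMSI kernel setting.
show_disable_msi() {
    "${ESXCLI:-esxcli}" system settings kernel list -o disableMSI
}

# Persistently set disableMSI (defaults to TRUE). A reboot is needed
# before the runtime value follows the configured one.
set_disable_msi() {
    "${ESXCLI:-esxcli}" system settings kernel set -s disableMSI -v "${1:-TRUE}"
}
```

Running set_disable_msi, rebooting, and then running show_disable_msi should make it clear whether the configured value actually took effect at runtime.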

    The issue:

    In preparation for the USB NIC fling not being integrated into the latest 8.0 Arm build, I purchased an Intel i225-V. I also have an M.2 to PCIe x4 adapter to test other supported network cards: a Broadcom 5719 using the ntg3 driver and an Intel PRO 1000 using the ne1000 driver. I upgraded my ESXi 7.0 build to 8.0 U3b build 24364478 using the offline depot.

    After upgrading, I noticed that the NIC is recognized:

    [root@localhost:~] esxcli network nic list
    Name    PCI Device    Driver    Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
    ------  ------------  --------  ------------  -----------  -----  ------  -----------------  ----  -----------
    vmnic0  0000:01:00.0  cndi_igc  Up            Up            1000  Full    88:c9:b3:b5:09:fe  1500  Intel Corporation Ethernet Controller I225-V
    vusb0   Pseudo        cdce      Up            Up             100  Full    00:24:9b:23:c7:50  1500  Realtek USB 10/100/1000 LAN

    The driver loaded and the MAC address matches the device, yet I can't obtain a DHCP address when I set that NIC as primary. I also tried assigning a static IP to the management network, but it was not pingable. A Realtek 8153 USB NIC works (DHCP or static IP) over the same network cable, so I know this isn't a downstream L2/L3 networking issue.

    Driver version:

    [root@localhost:~] esxcli software vib list | grep cndi
    cndi-igc                       1.2.10.0-1vmw.803.0.40.24364478        VMW     VMwareCertified   2024-11-02    host

    I pulled the following info from dmesg (if I need to grab other info, please let me know):

    [root@localhost:~] dmesg | grep cndi_igc
    <6>Loading cndi_igc.v00
    <7>cndi_igc.v00 (MD5: 1a4267c372420ed0bdc710b352f0fbd2): transferred 44KiB (45764 bytes)
    <7>cndi_igc.v00 (MD5: f3251a4090a98db64d5ff120117c28e6): extracted 137KiB (140800 bytes)
    2024-11-03T04:06:28.472Z cpu0:262144)Loading cndi_igc.v00...
    2024-11-03T04:06:28.474Z cpu0:262144)VisorFSTar: 2610: cndi_igc.v00 for 0x22600 bytes
    2024-11-03T04:06:46.263Z cpu2:262534)Loading module cndi_igc ...
    2024-11-03T04:06:46.266Z cpu2:262534)Elf: 2130: module cndi_igc has license VMware
    2024-11-03T04:06:46.270Z cpu2:262534)vmkcndi: vmkcndi_AddDriver:135: added a driver: cndi_igc@46 (0x41ffcc600268/0x41ffcc600340)
    2024-11-03T04:06:46.270Z cpu2:262534)Device: 193: Registered driver 'cndi_igc' from 46
    2024-11-03T04:06:46.270Z cpu2:262534)cndi_igc: cndi_RegisterDriver:107: registered a CNDI driver 0x41ffcc600340 for 0x43056bc05c30
    2024-11-03T04:06:46.270Z cpu2:262534)Mod: 4809: Initialization of cndi_igc succeeded with module ID 46.
    2024-11-03T04:06:46.270Z cpu2:262534)cndi_igc loaded successfully.
    2024-11-03T04:06:46.286Z cpu2:262534)cndi_igc: igc_IdentifyPF:256: (0000:01:00.0) identifying device 8086:15f3:8086:0000
    2024-11-03T04:06:46.286Z cpu2:262534)Device: 362: cndi_igc:driver->ops.attachDevice :0 ms
    2024-11-03T04:06:46.286Z cpu2:262534)Device: 368: Found driver cndi_igc for device 0x476343056bc051c6
    2024-11-03T04:06:46.367Z cpu4:262534)cndi_igc: igc_InitPF:349: (0000:01:00.0) read MAC address 88:c9:b3:b5:09:fe
    2024-11-03T04:06:46.381Z cpu4:262534)Device: 637: cndi_igc:driver->ops.startDevice:96 ms
    2024-11-03T04:06:46.382Z cpu4:262534)Device: 459: cndi_igc:driver->ops.scanDevice:1 ms
    2024-11-03T04:07:50.902Z cpu3:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:08:03.568Z cpu2:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:08:03.964Z cpu0:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:08:04.327Z cpu0:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:08:07.080Z cpu2:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:08:42.197Z cpu4:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:08:43.152Z cpu4:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:15:53.249Z cpu4:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:16:34.045Z cpu3:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:16:36.812Z cpu5:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:16:37.407Z cpu4:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:16:51.615Z cpu3:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)
    2024-11-03T04:16:53.263Z cpu3:262221)cndi_igc: igc_ReconfPF:472: (vmnic0) WoL configured to 0x20 (WUFC = 0x2)

    [root@localhost:~] dmesg | grep 0000:01:00.0
    2024-11-03T04:06:11.052Z cpu0:262144)PCIE: 740: 0000:01:00.0: PCIe v2 PCI Express Endpoint
    2024-11-03T04:06:11.052Z cpu0:262144)PCI: 1057: 0000:01:00.0: probing 8086:15f3 8086:0000 0200 0003
    2024-11-03T04:06:11.052Z cpu0:262144)PCI: 1311: 0000:01:00.0: registering 8086:15f3 8086:0000
    2024-11-03T04:06:11.052Z cpu0:262144)PCI: 1989: 0000:01:00.0: Enabling device, Command register mask: 0x2
    2024-11-03T04:06:11.052Z cpu0:262144)Device: 1551: Registered device: 0x43056bc04fb0 p0000:01:00.0 808615f380860000020000 (parent=0x6f3d43056bc01daa)
    2024-11-03T04:06:46.167Z cpu5:262532)PCI: 1127: 0000:01:00.0 named 'vmnic0' (was '')
    2024-11-03T04:06:46.285Z cpu2:262534)VMK_PCI: 751: 0000:01:00.0: pciBar 0 bus_addr 0xf0000000 size 0x100000
    2024-11-03T04:06:46.286Z cpu2:262534)cndi_igc: igc_IdentifyPF:256: (0000:01:00.0) identifying device 8086:15f3:8086:0000
    2024-11-03T04:06:46.286Z cpu2:262534)DMA: 442: DMA Engine '0000:01:00.0-dma' range unlimited. (non-coherent)
    2024-11-03T04:06:46.286Z cpu2:262534)DMA: 755: DMA Engine '0000:01:00.0-dma' created using mapper 'DMABounce'.
    2024-11-03T04:06:46.367Z cpu4:262534)cndi_igc: igc_InitPF:349: (0000:01:00.0) read MAC address 88:c9:b3:b5:09:fe
    2024-11-03T04:06:46.382Z cpu4:262534)Device: 1551: Registered device: 0x43056bc01220 pci#p0000:01:00.0#0 com.vmware.uplink (parent=0x476343056bc051c6)
    2024-11-03T04:06:56.797Z cpu2:262221)vmkcndi: vmkcndi_Associate:730: PF (0000:01:00.0) now has a name vmnic0
    2024-11-03T04:06:56.799Z cpu2:262221)VMK_PCI: 599: 0000:01:00.0: allocated 3 MSIX interrupts
    2024-11-03T04:20:53.101Z cpu3:264179)WARNING: PCIVPD: 91: 0000:01:00.0: VPD capability is not found
    2024-11-03T04:20:55.482Z cpu4:264208)WARNING: PCIVPD: 91: 0000:01:00.0: VPD capability is not found

    At this point I noticed that 3 MSI-X interrupts had been allocated, even though the Orange Pi 5 Plus doesn't support MSI. Realizing that MSI was still enabled, I disabled MSI interrupts and rebooted. This time the network adapter didn't show up at all; only vusb0 was listed.

    [root@localhost:~] dmesg | grep cndi
    <6>Loading cndi_igc.v00
    <7>cndi_igc.v00 (MD5: 1a4267c372420ed0bdc710b352f0fbd2): transferred 44KiB (45764 bytes)
    <7>cndi_igc.v00 (MD5: f3251a4090a98db64d5ff120117c28e6): extracted 137KiB (140800 bytes)
    2024-11-03T04:47:28.196Z cpu0:262144)Loading cndi_igc.v00...
    2024-11-03T04:47:28.198Z cpu0:262144)VisorFSTar: 2610: cndi_igc.v00 for 0x22600 bytes
    2024-11-03T04:47:46.918Z cpu3:262534)Loading module cndi_igc ...
    2024-11-03T04:47:46.924Z cpu3:262534)Elf: 2130: module cndi_igc has license VMware
    2024-11-03T04:47:46.931Z cpu3:262534)vmkcndi: vmkcndi_init_module:451: CNDI: version 1.2.10.0
    2024-11-03T04:47:46.931Z cpu3:262534)vmkcndi: vmkcndi_AddDriver:135: added a driver: cndi_igc@46 (0x41ffcb400268/0x41ffcb400340)
    2024-11-03T04:47:46.931Z cpu3:262534)Device: 193: Registered driver 'cndi_igc' from 46
    2024-11-03T04:47:46.931Z cpu3:262534)cndi_igc: cndi_RegisterDriver:107: registered a CNDI driver 0x41ffcb400340 for 0x43056bc05b80
    2024-11-03T04:47:46.931Z cpu3:262534)Mod: 4809: Initialization of cndi_igc succeeded with module ID 46.
    2024-11-03T04:47:46.931Z cpu3:262534)cndi_igc loaded successfully.
    2024-11-03T04:47:46.945Z cpu3:262534)cndi_igc: igc_IdentifyPF:256: (0000:01:00.0) identifying device 8086:15f3:8086:0000
    2024-11-03T04:47:46.945Z cpu3:262534)Device: 362: cndi_igc:driver->ops.attachDevice :0 ms
    2024-11-03T04:47:46.945Z cpu3:262534)Device: 368: Found driver cndi_igc for device 0x640243056bc051d4
    2024-11-03T04:47:47.078Z cpu1:262534)cndi_igc: igc_InitPF:349: (0000:01:00.0) read MAC address 88:c9:b3:b5:09:fe
    2024-11-03T04:47:47.093Z cpu1:262534)Device: 637: cndi_igc:driver->ops.startDevice:147 ms
    2024-11-03T04:47:47.093Z cpu1:262534)Device: 459: cndi_igc:driver->ops.scanDevice:0 ms
    2024-11-03T04:48:00.291Z cpu2:262221)vmkcndi: vmkcndi_Associate:730: PF (0000:01:00.0) now has a name vmnic0
    2024-11-03T04:48:00.293Z cpu2:262221)vmkcndi: vmkcndi_ConfigPF:428: vmnic0: failed to allocate needed interrupt vector

    The last message seems to give some clue, and the same message is found in the /var/run/log/vmkernel.log file as an "Info" message, not a warning or error. 
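To compare the two boots quickly, the interrupt-related lines can be pulled out of the log with a small filter. This is only a sketch: irq_alloc_summary is a hypothetical helper name, and it matches just the two message formats quoted in the dmesg output above, so other drivers or builds may word these lines differently.

```shell
# Summarize interrupt allocation results for one NIC from vmkernel log text on stdin.
# $1 = vmnic name (e.g. vmnic0), $2 = PCI address (e.g. 0000:01:00.0)
irq_alloc_summary() {
    nic="$1"
    pci="$2"
    while IFS= read -r line; do
        case "$line" in
            # e.g. "VMK_PCI: 599: 0000:01:00.0: allocated 3 MSIX interrupts"
            *"$pci"*"allocated "*" MSIX interrupts"*)
                echo "MSI-X allocated" ;;
            # e.g. "vmkcndi_ConfigPF:428: vmnic0: failed to allocate needed interrupt vector"
            *"$nic: failed to allocate needed interrupt vector"*)
                echo "interrupt allocation failed" ;;
        esac
    done
}
```

On the host this would be used as: dmesg | irq_alloc_summary vmnic0 0000:01:00.0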

    I've attempted to use a (very old) Intel PRO 1000 dual port card using ne1000, but received a message along the lines of "false tx hang detected on vmnic0". I've also attempted using a slightly newer Broadcom 5719 card using ntg3, but received a similar message.

    I have an Intel i210 on the way which should arrive tomorrow - this would use the igbn driver and I've seen some folks post on Twitter their success in using this card with Ampere Altra. I'm hoping that this will give me some connectivity, else I'll need to downgrade back to 7.0 to use USB NICs.

    I also pulled a support bundle prior to disabling MSI, it is 41.6MB and can share it if needed. What other steps can I take to try to make this work?



  • 2.  RE: ESXi 8.0 U3b build-24364478 i225-V + Orange Pi 5 Plus - unable to receive dhcp IP, no connection w/static

    Posted Nov 05, 2024 10:04 AM

    I've done more testing:

    With the Broadcom 5719, the ntg3 driver has a module parameter for legacy interrupts. I tried enabling it, but now the system purple-screens whenever that module loads (screenshot in post). I'm not convinced that legacy interrupts are required; I only know that they worked for NVMe on an older version of the edk2 UEFI firmware.

    The i210 behaves the same way with igbn as the i225-V did. Neither of these cards has a module parameter for legacy interrupts.

    All of these cards work on this device using Armbian Linux 6.10.6. Might test Windows on it soon to see if the same problems arise with the edk2-rk3588 EFI implementation, which is on version 11.2. Otherwise, I'll probably just hold out for a version of the fling that incorporates the USB NIC fling, or go back to the 7.0 version.




  • 3.  RE: ESXi 8.0 U3b build-24364478 i225-V + Orange Pi 5 Plus - unable to receive dhcp IP, no connection w/static

    Broadcom Employee
    Posted Nov 05, 2024 08:37 PM

    It is curious that ESXi believes it can use MSI/MSI-X if they are not supported. Fixing this would remove the need for the disableMSI options.

    The PSOD screenshot above does not show up for me :-/ If you had NVMe working, and the crash wasn't too early in the boot, you may find a zdump file in /var/core. It should be included in a support bundle taken afterwards, and it will give us more info than the PSOD screen alone.

    Getting the support bundle would help see more logs, but we can start with a /var/log/boot.gz (vmkernel boot log), and the output of `irqinfo` (I would suspect that the PCIe devices don't get any interrupts if they are configured as MSI).
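A rough sketch of gathering the items mentioned in this reply into one directory. The helper name collect_diag, the default output path, and the BOOT_LOG / CORE_DIR override variables are assumptions for illustration; it also assumes irqinfo can be run without arguments, which may not hold on every build, so each step is skipped if the file or command is missing.

```shell
# Hypothetical helper: copy the boot log, capture irqinfo output, and list /var/core.
# $1 = output directory (assumed default: /tmp/diag)
collect_diag() {
    out="${1:-/tmp/diag}"
    boot_log="${BOOT_LOG:-/var/log/boot.gz}"   # vmkernel boot log
    core_dir="${CORE_DIR:-/var/core}"          # where a zdump file may be found
    mkdir -p "$out" || return 1

    if [ -f "$boot_log" ]; then
        cp "$boot_log" "$out/"
    fi
    # Assumption: irqinfo needs no arguments; skipped when not on PATH.
    if command -v irqinfo >/dev/null 2>&1; then
        irqinfo > "$out/irqinfo.txt" 2>&1
    fi
    if [ -d "$core_dir" ]; then
        ls -l "$core_dir" > "$out/var-core-listing.txt"
    fi
    echo "$out"
}
```

Calling collect_diag with no arguments on the host would leave everything under /tmp/diag, ready to attach alongside (or instead of) the full support bundle.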

    Note: the USB fling integration may come sooner rather than later.




  • 4.  RE: ESXi 8.0 U3b build-24364478 i225-V + Orange Pi 5 Plus - unable to receive dhcp IP, no connection w/static

    Posted Nov 07, 2024 08:56 AM
    Let me try again. It's set up on USB boot as the NVMe slot is being used for the NIC. I'll see what other logs I can dig up if there are any.



  • 5.  RE: ESXi 8.0 U3b build-24364478 i225-V + Orange Pi 5 Plus - unable to receive dhcp IP, no connection w/static

    Broadcom Employee
    Posted 30 days ago

    Okay, it looks like a core is stuck in interrupt processing, likely the PCI legacy one. We never had a Broadcom 5719 to test with, and I see that the driver is not "ready for Arm" (mostly missing memory barriers).




  • 6.  RE: ESXi 8.0 U3b build-24364478 i225-V + Orange Pi 5 Plus - unable to receive dhcp IP, no connection w/static

    Posted 27 days ago

    Understood. I think I'll wait for the USB fling to get integrated :)