If I look at what pci_hyperv_intf is doing:
* This driver acts as a paravirtual front-end for PCI Express root buses.
* When a PCI Express function (either an entire device or an SR-IOV
* Virtual Function) is being passed through to the VM, this driver exposes
* a new bus to the guest VM. This is modeled as a root PCI bus because
* no bridges are being exposed to the VM. In fact, with a "Generation 2"
* VM within Hyper-V, there may seem to be no PCI bus at all in the VM
* until a device has been exposed using this driver.
Maybe it also plays a role in a VMware environment. Indeed, if I prevent
the module from loading and then add a Mellanox SRIOV-VF adapter, the
adapter is not detected and mlx5_core does not load.
Is there someone on this list who knows exactly what happens with Mellanox
SRIOV-VF if we don't have this module?
How can I compile Photon3 with the pci_hyperv_intf module?
If I use rpm2cpio
https://packages.vmware.com/photon/4.0/photon_srpms_3.0_x86_64/linux-4.19.315-1.ph3.src.rpm | cpio -idm and then compile and check, there is no pci_hyperv_intf.c
available. However, I now remember that in TKG < 2.1.1 I could see that controller
driver and everything was working.
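Both checks can be scripted. The sketch below reports whether a kernel config file enables an option and whether an unpacked source tree contains the driver file. It assumes that the module is built from CONFIG_PCI_HYPERV_INTERFACE and that the source file may be spelled with hyphens (pci-hyperv-intf.c) rather than underscores; the function names and the example paths are made up for illustration.

```shell
#!/bin/bash
# Sketch only. Assumption: pci_hyperv_intf is controlled by
# CONFIG_PCI_HYPERV_INTERFACE; adjust if your tree differs.

# Print y, m, or n for a Kconfig option in a kernel config file.
kconfig_state() {
    local config_file=$1 option=$2 line
    line=$(grep -E "^${option}=" "$config_file" | head -n 1)
    case "$line" in
        *=y) echo y ;;
        *=m) echo m ;;
        *)   echo n ;;
    esac
}

# List source files whose name matches the module, spelled with either
# hyphens or underscores (pci-hyperv-intf.c / pci_hyperv_intf.c).
find_module_source() {
    local src_root=$1
    find "$src_root" -type f -name 'pci[-_]hyperv[-_]intf.c'
}

# Example usage (paths are placeholders):
#   kconfig_state "/boot/config-$(uname -r)" CONFIG_PCI_HYPERV_INTERFACE
#   find_module_source ./linux-4.19.315
```

If the option reports n and the file search prints nothing, the source tree simply does not carry the module, and no build setting alone would produce it.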
Br,
Marco
On Mon, Dec 9, 2024 at 10:09 AM Marco Stura <marco.stura@broadcom.com> wrote:
> OK, let me give the full background.
>
> We are using TKG 2.1.1 with a BYOD template built by VMware Telco (ex SDE
> now). That template includes the mlx5_core and rdma drivers, but they are
> not loaded at boot. I configured them to load at boot and then added
> SRIOV-VF backed adapters. I then noticed that mlx5_core was not binding
> automatically, so no links were created on those adapters.
>
> I then created a script to bind any Mellanox SRIOV-VF backed adapters that
> are present.
>
> #!/bin/bash
>
> # Check for Mellanox devices and fetch their PCI addresses
> echo "Scanning for Mellanox network cards..."
> MLX_CARDS=$(lspci | grep Mellanox | awk '{print $1}')
>
> # Check if any cards were found
> if [ -z "$MLX_CARDS" ]; then
>     echo "No Mellanox network cards detected!"
>     exit 1
> fi
>
> # Count the number of cards detected
> CARD_COUNT=$(echo "$MLX_CARDS" | wc -l)
> echo "Found $CARD_COUNT Mellanox network card(s):"
> echo "$MLX_CARDS"
>
> # Iterate over each PCI address and bind the mlx5_core driver.
> # lspci prints bus:device.function here, so the 0000: domain is
> # prepended (this assumes all devices are in PCI domain 0).
> for PCI_ADDR in $MLX_CARDS; do
>     echo "Processing Mellanox card at PCI address $PCI_ADDR..."
>
>     # Apply driver override
>     echo "mlx5_core" > "/sys/bus/pci/devices/0000:$PCI_ADDR/driver_override"
>     if [ $? -ne 0 ]; then
>         echo "Failed to apply driver override for $PCI_ADDR. Skipping..."
>         continue
>     fi
>
>     # Bind the mlx5_core driver
>     echo "0000:$PCI_ADDR" > /sys/bus/pci/drivers/mlx5_core/bind
>     if [ $? -ne 0 ]; then
>         echo "Failed to bind mlx5_core driver for $PCI_ADDR. Skipping..."
>         continue
>     fi
>
>     echo "Successfully bound mlx5_core driver to $PCI_ADDR."
> done
>
> echo "Driver binding process completed."
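The script above prepends 0000: to every address, which only holds while all devices sit in PCI domain 0. A minimal sketch of a variant follows. It assumes `lspci -D -n` output (full domain:bus:device.function address plus numeric IDs) and the Mellanox vendor ID 15b3. The function names and the optional sysfs-root argument are hypothetical, added only so the logic can be tried against a scratch directory.

```shell
#!/bin/bash
# Sketch only; function names are illustrative, not from the original script.

# Read `lspci -D -n` style lines on stdin, for example
#   0000:03:00.0 0200: 15b3:101e
# and print the full PCI address of every device with vendor ID 15b3.
mellanox_addrs() {
    awk '$3 ~ /^15b3:/ { print $1 }'
}

# Set driver_override and bind one device to mlx5_core.
# The second argument (sysfs root) defaults to /sys.
bind_mlx5() {
    local addr=$1 sysfs=${2:-/sys}
    echo "mlx5_core" > "$sysfs/bus/pci/devices/$addr/driver_override" || return 1
    echo "$addr" > "$sysfs/bus/pci/drivers/mlx5_core/bind" || return 1
}

# Example usage on a real system (run as root):
#   lspci -D -n | mellanox_addrs | while read -r addr; do
#       bind_mlx5 "$addr" || echo "Failed to bind $addr" >&2
#   done
```

Matching on the numeric vendor ID avoids depending on the device description text, which can vary with the installed pci.ids database.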
>
>
> The script is run at boot by a systemd service that I also created. All
> fine: I can see mlx5_core bind to the adapters, and everything looks
> correct from a PCI and ip link perspective. However, packets are not
> received by the OS IP stack, and packets sent by the IP stack (e.g. a
> ping) are not transmitted by the adapter. For example, when I add an IP
> address and ping, tcpdump shows the ARP going out from the IP stack, but it
> is not sent to the physical network by the adapter.
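One way to narrow this down is to compare the interface counters before and after a ping. The sketch below reads the standard /sys/class/net/<iface>/statistics files. The function names are hypothetical, Net3 is the netdev name from the rdma link output, and the ping target is a placeholder address. A growing tx_packets counter does not by itself prove that a frame reached the wire, so treat the result as a hint only.

```shell
#!/bin/bash
# Sketch only; function names are illustrative.

# Read one counter from <sysfs>/class/net/<iface>/statistics.
iface_counter() {
    local iface=$1 counter=$2 sysfs=${3:-/sys}
    cat "$sysfs/class/net/$iface/statistics/$counter"
}

# Print how much a counter grew while a command ran.
# Usage: counter_delta <iface> <counter> <sysfs-root> <command...>
counter_delta() {
    local iface=$1 counter=$2 sysfs=$3
    shift 3
    local before after
    before=$(iface_counter "$iface" "$counter" "$sysfs") || return 1
    # The command's own exit status is ignored; only the counters matter.
    "$@" >/dev/null 2>&1 || true
    after=$(iface_counter "$iface" "$counter" "$sysfs") || return 1
    echo $((after - before))
}

# Example usage (interface name and target address are placeholders):
#   counter_delta Net3 tx_packets /sys ping -c 3 -I Net3 192.0.2.1
#   counter_delta Net3 rx_packets /sys ping -c 3 -I Net3 192.0.2.1
```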
>
> Then I replaced the drivers by installing Mellanox OFED 24.x, but I hit
> the same issue: everything looks fine, but no packets are going through.
>
> root [ /usr/local/bin ]# lsmod | grep mlx
> mlx5_ib 397312 0
> ib_uverbs 143360 2 rdma_ucm,mlx5_ib
> ib_core 303104 8 rdma_cm,ib_ipoib,iw_cm,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
> mlx5_core 1552384 1 mlx5_ib
> mlxfw 24576 1 mlx5_core
> mlxdevm 172032 1 mlx5_core
> auxiliary 16384 2 mlx5_ib,mlx5_core
> mlx_compat 40960 13 rdma_cm,ib_ipoib,mlxdevm,mlxfw,iw_cm,auxiliary,ib_umad,ib_core,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm,mlx5_core
> hwmon 16384 1 mlx5_core
>
> root [ /usr/local/bin ]# lspci -kvv
> 03:00.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen
> Virtual Function
> DeviceName: pciPassthru30
> Subsystem: Mellanox Technologies Device 0058
> Physical Slot: 64
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> Latency: 0, Cache Line Size: 32 bytes
> NUMA node: 0
> Region 0: Memory at fcd00000 (64-bit, prefetchable) [size=1M]
> Capabilities: [60] Express (v2) Endpoint, MSI 00
> DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s
> <64ns, L1 <1us
> ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset+
> SlotPowerLimit 0.000W
> DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
> RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> FLReset-
> MaxPayload 128 bytes, MaxReadReq 128 bytes
> DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr-
> TransPend-
> LnkCap: Port #0, Speed 5GT/s, Width x32, ASPM L0s, Exit
> Latency L0s <64ns
> ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 5GT/s (ok), Width x32 (ok)
> TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Not Supported, TimeoutDis-,
> NROPrPrP-, LTR-
> 10BitTagComp-, 10BitTagReq-, OBFF Not Supported,
> ExtFmt-, EETLPPrefix-
> EmergencyPowerReduction Not Supported,
> EmergencyPowerReductionInit-
> FRS-, TPHComp-, ExtTPHComp-
> AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
> LTR-, OBFF Disabled
> AtomicOpsCtl: ReqEn-
> LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance-
> SpeedDis-
> Transmit Margin: Normal Operating Range,
> EnterModifiedCompliance- ComplianceSOS-
> Compliance De-emphasis: -6dB
> LnkSta2: Current De-emphasis Level: -6dB,
> EqualizationComplete-, EqualizationPhase1-
> EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
> Capabilities: [9c] MSI-X: Enable+ Count=12 Masked-
> Vector table: BAR=0 offset=00002000
> PBA: BAR=0 offset=00003000
> Capabilities: [100 v1] Vendor Specific Information: ID=0000 Rev=0
> Len=00c
> Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
> ARICap: MFVC- ACS-, Next Function: 0
> ARICtl: MFVC- ACS-, Function Group: 0
> Kernel driver in use: mlx5_core
>
> 03:01.0 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen
> Virtual Function
> DeviceName: pciPassthru31
> Subsystem: Mellanox Technologies Device 0058
> Physical Slot: 65
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> Latency: 0, Cache Line Size: 32 bytes
> NUMA node: 0
> Region 0: Memory at fcc00000 (64-bit, prefetchable) [size=1M]
> Capabilities: [60] Express (v2) Endpoint, MSI 00
> DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s
> <64ns, L1 <1us
> ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset+
> SlotPowerLimit 0.000W
> DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
> RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> FLReset-
> MaxPayload 128 bytes, MaxReadReq 128 bytes
> DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr-
> TransPend-
> LnkCap: Port #0, Speed 5GT/s, Width x32, ASPM L0s, Exit
> Latency L0s <64ns
> ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 5GT/s (ok), Width x32 (ok)
> TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Not Supported, TimeoutDis-,
> NROPrPrP-, LTR-
> 10BitTagComp-, 10BitTagReq-, OBFF Not Supported,
> ExtFmt-, EETLPPrefix-
> EmergencyPowerReduction Not Supported,
> EmergencyPowerReductionInit-
> FRS-, TPHComp-, ExtTPHComp-
> AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
> LTR-, OBFF Disabled
> AtomicOpsCtl: ReqEn-
> LnkSta2: Current De-emphasis Level: -6dB,
> EqualizationComplete-, EqualizationPhase1-
> EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
> Capabilities: [9c] MSI-X: Enable+ Count=12 Masked-
> Vector table: BAR=0 offset=00002000
> PBA: BAR=0 offset=00003000
> Capabilities: [100 v1] Vendor Specific Information: ID=0000 Rev=0
> Len=00c
> Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
> ARICap: MFVC- ACS-, Next Function: 0
> ARICtl: MFVC- ACS-, Function Group: 0
> Kernel driver in use: mlx5_core
>
> root [ /usr/local/bin ]# rdma link show
> 0/1: mlx5_0/1: state ACTIVE physical_state LINK_UP netdev Net3
> 1/1: mlx5_1/1: state ACTIVE physical_state LINK_UP netdev Net4
>
> root [ /usr/local/bin ]# ibv_devinfo
> hca_id: mlx5_0
> transport: InfiniBand (0)
> fw_ver: 22.38.1002
> node_guid: 0050:56ff:fea4:841f
> sys_image_guid: a088:c203:003b:cd0a
> vendor_id: 0x02c9
> vendor_part_id: 4126
> hw_ver: 0x0
> board_id: DEL0000000027
> phys_port_cnt: 1
> port: 1
> state: PORT_ACTIVE (4)
> max_mtu: 4096 (5)
> active_mtu: 1024 (3)
> sm_lid: 0
> port_lid: 0
> port_lmc: 0x00
> link_layer: Ethernet
>
> hca_id: mlx5_1
> transport: InfiniBand (0)
> fw_ver: 22.38.1002
> node_guid: 0050:56ff:fea4:3c15
> sys_image_guid: a088:c203:003b:cd0a
> vendor_id: 0x02c9
> vendor_part_id: 4126
> hw_ver: 0x0
> board_id: DEL0000000027
> phys_port_cnt: 1
> port: 1
> state: PORT_ACTIVE (4)
> max_mtu: 4096 (5)
> active_mtu: 1024 (3)
> sm_lid: 0
> port_lid: 0
> port_lmc: 0x00
> link_layer: Ethernet
>
> If I use TKG 2.5 (Photon5), everything works without any extra steps. The
> only difference I can see is that Photon5 has a PCI controller module,
> pci_hyperv_intf, that is not present in TKG 2.1.1 Photon3. It relates to
> Hyper-V, but might it help with something here?
>
> root [ /home/capv ]# lsmod | grep mlx
> mlx5_ib 368640 0
> ib_uverbs 151552 1 mlx5_ib
> ib_core 385024 2 ib_uverbs,mlx5_ib
> mlx5_core 1433600 1 mlx5_ib
> pci_hyperv_intf 16384 1 mlx5_core
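In lsmod output the last column lists the modules that use the one named in the first column, so the pci_hyperv_intf line above says that mlx5_core holds a reference to it. If that reading is right, it may also be why mlx5_core fails to load when pci_hyperv_intf is kept from loading, but that is an inference rather than something confirmed in this thread. A small sketch for pulling that column out of lsmod-style text follows (the function name is hypothetical):

```shell
#!/bin/bash
# Sketch only; the function name is illustrative.

# Print the comma-separated "Used by" column for one module, reading
# lsmod-style text on stdin. Prints an empty line if the module has no
# users, and nothing at all if the module is not listed.
module_users() {
    awk -v mod="$1" '$1 == mod { print $4 }'
}

# Example usage on a running system:
#   lsmod | module_users pci_hyperv_intf
#   modinfo -F depends mlx5_core    # the declared dependencies, for comparison
```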
>
> Do you understand my issue now?
>
> Br,
> Marco
>
>
--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.
Original Message:
Sent: 12/8/2024 7:58:00 AM
From: Daniel Casota
Subject: RE: MELLANOX OFED Install
What sort of issues? I'm not sure we are talking about the same thing. The community version contains the source files. Would you mind sharing the steps you took before the issue(s) and the output log file?
Original Message:
Sent: Dec 08, 2024 07:44 AM
From: Marco Stura
Subject: MELLANOX OFED Install
I use ConnectX-6. I managed to install the OFED by tweaking install.pl
in the Community version.
The OFED (rpm version) is compiled and installed in the running OS. No ISO.
Anyway, I hit other issues now :-)
Original Message:
Sent: 12/8/2024 3:56:00 AM
From: Daniel Casota
Subject: RE: MELLANOX OFED Install
TKG 2.1 and TKG 2.4 for vSphere come with Photon OS 3, indeed.
According to Release Notes - NVIDIA Docs, the latest OFED version, 24.10-1.1.4.0 (December 2024), does not support ConnectX-3 anymore. Which adapter do you use?
I did a quick test: Photon OS 3 (4.19.324-1) on Azure, deployed as Standard F4s v2 with accelerated networking enabled, gets a ConnectX-4 adapter --> Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] [15b3:1016] (rev 80).
If I understood the install process correctly, an ISO is first created for the kernel version in use; see Installing the Driver - NVIDIA Docs.
Afaik the contrary of "officially support" means mission control your own custom Photon OS image.
Original Message:
Sent: Dec 07, 2024 07:28 AM
From: Marco Stura
Subject: MELLANOX OFED Install
Unfortunately, we have Telco customers running SRIOV on Mellanox and Photon3
(TKG 2.1 and previous versions) in production, for example Singtel with
Ericsson UPF.
On Sat, Dec 7, 2024 at 2:36 PM Alexey Makhalov via Broadcom <
Mail@broadcom.com> wrote:
> We do not officially support it. We had a discussion a year ago with
> NVIDIA to support Photon OS 4.0, but it was deprioritized. -posted to the
> "Photon OS" community
>
> -------------------------------------------
> Original Message:
> Sent: Dec 05, 2024 05:43 AM
> From: Marco Stura
> Subject: MELLANOX OFED Install
>
> Hi guys,
>
> I need to install MELLANOX OFED on Photon3, but I cannot find a suitable
> software package on the NVIDIA download site. I tried the community
> version, but it didn't work.
>
> Could you please point me to where I can download an OFED that can be
> installed on Photon OS?
>
> Thank you
>
> Marco
Original Message:
Sent: 12/7/2024 5:34:00 AM
From: Alexey Makhalov
Subject: RE: MELLANOX OFED Install
We do not officially support it. We had a discussion a year ago with NVIDIA to support Photon OS 4.0, but it was deprioritized.
Original Message:
Sent: Dec 05, 2024 05:43 AM
From: Marco Stura
Subject: MELLANOX OFED Install
Hi guys,
I need to install MELLANOX OFED on Photon3, but I cannot find a suitable software package on the NVIDIA download site. I tried the community version, but it didn't work.
Could you please point me to where I can download an OFED that can be installed on Photon OS?
Thank you
Marco