
NVMEof Datastore Issues

  • 1.  NVMEof Datastore Issues

    Posted Apr 24, 2020 01:00 AM

    Hello, we are testing NVMe-oF with ESXi 7 and I am having issues getting the device to be recognized. I am using Mellanox ConnectX-4 cards and am attempting to access an NVMe device as a test. I am able to discover the controller in the VMware interface.

    The namespace tab also shows the correct disk size and name, 750 GB in this case.

    On the paths tab, the following shows up:

    Runtime Name: vmhba67:C0:T1:L0

    Target: Blank

    Lun: 0

    Status: Dead

    Below is the test config from the Linux server (a couple of quick verification commands follow it). Does anyone have suggestions for next steps in troubleshooting? /dev/nvme0n1 is a freshly erased NVMe drive.

    # load the NVMe target modules
    sudo modprobe nvmet
    sudo modprobe nvmet-rdma
    # configfs is usually mounted already; mount it if not
    sudo /bin/mount -t configfs none /sys/kernel/config/
    # create the subsystem and allow any host to connect
    sudo mkdir /sys/kernel/config/nvmet/subsystems/PSC
    cd /sys/kernel/config/nvmet/subsystems/PSC
    echo 1 | sudo tee -a attr_allow_any_host > /dev/null
    # back the namespace with the NVMe drive and enable it
    sudo mkdir namespaces/1
    cd namespaces/1/
    echo -n /dev/nvme0n1 | sudo tee device_path > /dev/null
    echo 1 | sudo tee -a enable > /dev/null
    # first RDMA port
    sudo mkdir /sys/kernel/config/nvmet/ports/1
    cd /sys/kernel/config/nvmet/ports/1
    echo 10.10.11.1 | sudo tee -a addr_traddr > /dev/null
    echo rdma | sudo tee -a addr_trtype > /dev/null
    echo 4420 | sudo tee -a addr_trsvcid > /dev/null
    echo ipv4 | sudo tee -a addr_adrfam > /dev/null
    sudo ln -s /sys/kernel/config/nvmet/subsystems/PSC/ /sys/kernel/config/nvmet/ports/1/subsystems/PSC
    # second RDMA port
    sudo mkdir /sys/kernel/config/nvmet/ports/2
    cd /sys/kernel/config/nvmet/ports/2
    echo 10.10.12.1 | sudo tee -a addr_traddr > /dev/null
    echo rdma | sudo tee -a addr_trtype > /dev/null
    echo 4420 | sudo tee -a addr_trsvcid > /dev/null
    echo ipv4 | sudo tee -a addr_adrfam > /dev/null
    sudo ln -s /sys/kernel/config/nvmet/subsystems/PSC/ /sys/kernel/config/nvmet/ports/2/subsystems/PSC
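
    For reference, a quick way to sanity-check the target from the Linux side; a sketch assuming nvme-cli is installed, using the first port's address above:

    dmesg | grep nvmet                                  # expect something like "nvmet_rdma: enabling port 1 (10.10.11.1:4420)"
    sudo nvme discover -t rdma -a 10.10.11.1 -s 4420    # subsystem PSC should be listed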



  • 2.  RE: NVMEof Datastore Issues

    Posted Apr 24, 2020 03:54 AM

    Just a follow-up: I was able to add it to another Linux system without issue. Is there somewhere on the ESXi host I can check logs? Is there possibly something wrong with my subnqn? Most vendor appliances have long-winded names; the Linux server accepted PSC, but perhaps VMware can't.
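
    On the NQN question: the NVMe spec expects qualified names of the form nqn.<yyyy-mm>.<reverse-domain>:<identifier>, and some initiators are stricter about this than Linux. A minimal sketch of rebuilding the target subsystem under a compliant name (the example NQN is hypothetical):

    NQN=nqn.2020-04.com.example:psc    # hypothetical spec-compliant name
    sudo rm /sys/kernel/config/nvmet/ports/1/subsystems/PSC
    sudo rm /sys/kernel/config/nvmet/ports/2/subsystems/PSC
    sudo rmdir /sys/kernel/config/nvmet/subsystems/PSC/namespaces/1
    sudo rmdir /sys/kernel/config/nvmet/subsystems/PSC
    sudo mkdir /sys/kernel/config/nvmet/subsystems/$NQN
    # ...then repeat the namespace and port-linking steps from the first post under $NQN.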



  • 3.  RE: NVMEof Datastore Issues

    Posted Apr 24, 2020 04:17 AM

    Found this in the logs. It seems HPP doesn't support the device for some reason (it's a Mellanox ConnectX-4 adapter back to a Linux target). Perhaps they don't support the Linux target, or perhaps I simply need to do a better job of naming my Linux target NQN.

    2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba67:C0:T2:L0': Not supported

    2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)HPP: HppUnclaimPath:3765: Unclaiming path vmhba67:C0:T2:L0

    2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiPath: 8397: Plugin 'HPP' rejected path 'vmhba67:C0:T2:L0'

    2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiClaimrule: 1568: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba67:C0:T2:L0: Not supported

    2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)WARNING: ScsiPath: 8327: NMP cannot claim a path to NVMeOF device vmhba67:C0:T2:L0

    2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiClaimrule: 1568: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba67:C0:T2:L0: Not supported

    2020-04-24T04:01:52.789Z cpu9:2099749 opID=6441911)ScsiClaimrule: 1872: Error claiming path vmhba67:C0:T2:L0. Not supported.

    2020-04-24T04:01:52.809Z cpu9:2099749 opID=6441911)WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba67:C0:T2:L0': Not supported

    2020-04-24T04:01:52.809Z cpu9:2099749 opID=6441911)HPP: HppUnclaimPath:3765: Unclaiming path vmhba67:C0:T2:L0
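
    For anyone digging further: the rules HPP and NMP claim under (65534/65535 in the log above) and HPP's own view of the paths can be listed with esxcli; a sketch, output omitted:

    esxcli storage core claimrule list -c MP    # MP-class claim rules, incl. defaults 65534 (HPP) and 65535 (NMP)
    esxcli storage hpp path list                # per-path state as HPP sees it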



  • 4.  RE: NVMEof Datastore Issues

    Posted May 15, 2021 02:52 AM

    Hello, we also have this issue. Did you resolve it?

    Our storage target is mapped to ESXi with FC-NVMe. We can find the NVMe controller and namespace, but we can't find the storage device.

    1. Find the NVMe controller and namespace.

    _______________

    [root@localhost:~] esxcli nvme fabrics discover -a vmhba68 -W 0x56c92bf803002760 -w 0x56c92bf8033b2760
    Transport Type  Address Family  Subsystem Type  Controller ID  Admin Queue Max Size  Transport Address                            Transport Service ID  Subsystem NQN                        Connected
    --------------  --------------  --------------  -------------  --------------------  -------------------------------------------  --------------------  -----------------------------------  ---------
    FC              Fibre Channel   NVM                     65535                    32  nn-0x56c92bf803002760:pn-0x56c92bf8033b2760  none                  nqn.2004-12.com.inspur:mcs.28827034  true

    [root@localhost:~] esxcli nvme controller list
    Name                                                                           Controller Number  Adapter  Transport Type  Is Online
    -----------------------------------------------------------------------------  -----------------  -------  --------------  ---------
    nqn.2004-12.com.inspur:mcs.28827034#vmhba68#56c92bf803002760:56c92bf8033b2760                467  vmhba68  FC              true

    [root@localhost:~] esxcli nvme namespace list
    Name                                  Controller Number  Namespace ID  Block Size  Capacity in MB
    ------------------------------------  -----------------  ------------  ----------  --------------
    eui.d000000000000001005076000a209c06                467             2         512           10240

    _______________

    2. "esxcli storage core path list" command show the path is dead," esxcli storage core device list" can.t find storage device

    _______________

    fc.200000109bc18a3f:100000109bc18a3f-fc.56c92bf803002760:56c92bf8033b2760-
    UID: fc.200000109bc18a3f:100000109bc18a3f-fc.56c92bf803002760:56c92bf8033b2760-
    Runtime Name: vmhba68:C0:T3:L1
    Device: No associated device
    Device Display Name: No associated device
    Adapter: vmhba68
    Channel: 0
    Target: 3
    LUN: 1
    Plugin: (unclaimed)
    State: dead
    Transport: fc
    Adapter Identifier: fc.200000109bc18a3f:100000109bc18a3f
    Target Identifier: fc.56c92bf803002760:56c92bf8033b2760
    Adapter Transport Details: Unavailable or path is unclaimed
    Target Transport Details: Unavailable or path is unclaimed
    Maximum IO Size: 2097152

    _______________

    3. Some error log lines:

    WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba68:C0:T0:L1': Not supported

    _______________

    see attachment

    _______________
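
    A possible next step after changing anything on the target side is to rescan the FC-NVMe adapter and re-check for the device; a sketch using the adapter name from above:

    esxcli storage core adapter rescan -A vmhba68    # rescan only this adapter
    esxcli nvme namespace list                       # the namespace should still be listed
    esxcli storage core device list                  # the eui.* device only appears once a plugin claims a path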



  • 5.  RE: NVMEof Datastore Issues

    Posted May 17, 2021 03:32 PM

    I have the same issue with an Emulex LPe32000 PCIe Fibre Channel adapter and ESXi 7.0.1.

    I think the issue is due to a bad claim rule.

    I've tried adding a new one, but it's not working.

    I have a feeling ESXi is not prepared for NVMe storage by default.
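
    For what it's worth, VMware's HPP documentation describes adding NVMe claim rules keyed on the controller model; a minimal sketch (the rule number and model string are hypothetical, substitute your array's values):

    esxcli storage core claimrule add -r 429 -t vendor --nvme-controller-model "MyArrayModel" -P HPP
    esxcli storage core claimrule load    # load the new rule into the VMkernel
    esxcli storage core claimrule run     # apply it to unclaimed paths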

     



  • 6.  RE: NVMEof Datastore Issues

    Posted May 18, 2021 12:19 AM

    We also use the LPe32000 PCIe FC adapter and ESXi 7.0.0, and we have this issue.

    We tested with IBM storage, and it works fine.



  • 7.  RE: NVMEof Datastore Issues

    Posted Oct 18, 2021 05:42 PM

    The issue was solved by the storage vendor: they released new firmware that allows the volume block size to be brought down from 4K to the 512-byte block size supported by VMware.

    Although vSphere 7U2 supports 4K devices, an external storage volume with 4K blocks, such as a NetApp EF600, is not visible in vSphere; only 512-byte block size volumes are.

    Controllers, namespaces, and paths are all OK, but a 4K device/volume is not shown in vSphere. Not supported?
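
    For anyone checking their own namespaces: nvme-cli on a Linux host shows which LBA format a namespace uses and can reformat it. A sketch; the format index is an assumption (read the id-ns output first), and nvme format destroys all data on the namespace:

    sudo nvme id-ns /dev/nvme0n1 -H | grep "LBA Format"    # the "(in use)" line is the active format
    sudo nvme format /dev/nvme0n1 --lbaf=0                 # assumed index of a 512-byte LBA format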

    Peter



  • 8.  RE: NVMEof Datastore Issues

    Posted Jan 21, 2022 02:47 AM

    I configured an NVMe block device using 'nvmetcli' with a 512-byte block size on a CentOS VM and then did an nvme connect (NVMe/TCP) to that target. ESXi is able to see the volume, and it's also listed in 'esxcli nvme namespace list', but the path to the target is shown DEAD. Any pointers as to why the path is DEAD?

    esxcli nvme namespace list

    Name                                   Controller Number  Namespace ID  Block Size  Capacity in MB

    -------------------------------------  -----------------  ------------  ----------  --------------

    eui.343337304d1007610025384500000001                 256             1         512          915715

    eui.343337304d1015200025384500000001                 257             1         512          915715

    uuid.b8bbea9b8b34471b97b13222a954e43e                328             1         512           20480 <<<<

     

    esxcli nvme controller list

    Name                                                                                    Controller Number  Adapter  Transport Type  Is Online

    --------------------------------------------------------------------------------------  -----------------  -------  --------------  ---------

    nqn.2014-08.org.nvmexpress_144d_SAMSUNG_MZQLB960HAJR-00007______________S437NE0M100761                256  vmhba2   PCIe                 true

    nqn.2014-08.org.nvmexpress_144d_SAMSUNG_MZQLB960HAJR-00007______________S437NE0M101520                257  vmhba3   PCIe                 true

    testnqn#vmhba65#15.33.8.5:4420                                                                        328  vmhba65  TCP                  true

     

    esxcli storage core path list -p vmhba65:C0:T0:L0

    tcp.vmnic5:3c:fd:fe:c3:93:5d-tcp.unknown-

       UID: tcp.vmnic5:3c:fd:fe:c3:93:5d-tcp.unknown-

       Runtime Name: vmhba65:C0:T0:L0

       Device: No associated device

       Device Display Name: No associated device

       Adapter: vmhba65

       Channel: 0

       Target: 0

       LUN: 0

       Plugin: (unclaimed)

       State: dead <<<<<<<<<<<<<<<<<<<<<<

       Transport: tcp

       Adapter Identifier: tcp.vmnic5:3c:fd:fe:c3:93:5d

       Target Identifier: tcp.unknown

       Adapter Transport Details: Unavailable or path is unclaimed

       Target Transport Details: Unavailable or path is unclaimed

       Maximum IO Size: 1048576

     VMkernel log:

    ===========

    2022-01-21T02:44:37.120Z cpu22:1048893)HPP: HppCreateDevice:3071: Created logical device 'uuid.b8bbea9b8b34471b97b13222a954e43e'.                             

    2022-01-21T02:44:37.120Z cpu22:1048893)WARNING: HPP: HppClaimPath:3956: Failed to claim path 'vmhba65:C0:T0:L0': Not supported                                

    2022-01-21T02:44:37.120Z cpu22:1048893)HPP: HppUnclaimPath:4002: Unclaiming path vmhba65:C0:T0:L0                                                             

    2022-01-21T02:44:37.120Z cpu22:1048893)ScsiPath: 8597: Plugin 'HPP' rejected path 'vmhba65:C0:T0:L0'                                                          

    2022-01-21T02:44:37.120Z cpu22:1048893)ScsiClaimrule: 2039: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba65:C0:T0:L0: Not supported

    2022-01-21T02:44:37.121Z cpu22:1048893)WARNING: ScsiPath: 8496: NMP cannot claim a path to NVMeOF device vmhba65:C0:T0:L0                                     

    2022-01-21T02:44:37.121Z cpu22:1048893)ScsiClaimrule: 2039: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba65:C0:T0:L0: Not supported

    2022-01-21T02:44:37.121Z cpu22:1048893)ScsiClaimrule: 2518: Error claiming path vmhba65:C0:T0:L0. Not supported.
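
    Interesting that HPP creates the logical device and then fails to claim the path. HPP's own listings might narrow down where it gives up; a sketch (ESXi 7 commands, output omitted):

    esxcli storage hpp device list    # was uuid.b8bbea9b8b34471b97b13222a954e43e registered with HPP?
    esxcli storage hpp path list      # HPP's per-path state for vmhba65:C0:T0:L0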





  • 10.  RE: NVMEof Datastore Issues

    Posted Mar 23, 2022 06:37 AM

    I hit this issue with a Linux nvmet target + ESXi host.

    Does anyone know how to dig into why HPP rejects the path with "failed to claim path"?

    I would like to know which doc describes how to debug HPP.

    If you know, please leave some information about that for me.

    Thanks~
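
    One way to reproduce the claim failure on demand while watching the log; a sketch, using the path name from the first post (substitute your own dead path):

    esxcli storage core claiming unclaim -t path -p vmhba67:C0:T1:L0
    esxcli storage core claimrule run -p vmhba67:C0:T1:L0
    tail -n 20 /var/log/vmkernel.log | grep -i hpp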



  • 11.  RE: NVMEof Datastore Issues

    Posted Aug 04, 2022 12:02 PM

    See the link below. VMware expects some functionality that is not in Linux kernel 5, so it can't work.

    We are also trying to get this working, and I found this link:

    https://koutoupis.com/2022/04/22/vmware-lightbits-labs-and-nvme-over-tcp/