VMware vSphere

LUN is invisible from ESXi hosts in cluster

  • 1.  LUN is invisible from ESXi hosts in cluster

    Posted Mar 19, 2013 10:24 PM

    Hi,

    We recently presented a couple of 1 TB NetApp shared LUNs to our ESXi 5.0 hosts. One of the LUNs is now invisible from the ESXi hosts in the cluster; the other LUN, which is in the same storage igroup, is fine. At my request, the storage team re-presented the LUN to the ESXi hosts, but no luck. They have confirmed that there are no issues at the storage end.

    We have identified that all paths for this LUN are in a dead state. They do not return to the active state even after rescanning the HBAs (we also rescanned each HBA individually from ESXi), and rebooting the ESXi hosts did not help either. When I tried to activate the paths by command from ESXi, I got an error along the lines of "path or HBA busy".

    We have only 10 LUNs in the ESX farm, of which 9 are in use. I think ESXi 5.0 has limits of 256 LUNs and 1024 paths, and a LUN can be up to 64 TB, so there should be no issue in that respect.
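
The limit check described above can be sketched in a few lines of Python. This is only an illustration: the three limits are the figures quoted in the post, `HostStorage` is a hypothetical summary type (not an ESXi API), and the path count in the example is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Limits as quoted in the post for ESXi 5.0 (the poster's figures).
MAX_LUNS_PER_HOST = 256
MAX_PATHS_PER_HOST = 1024
MAX_LUN_SIZE_TB = 64


@dataclass
class HostStorage:
    """Hypothetical summary of what one host sees; not an ESXi API."""
    lun_count: int
    path_count: int
    largest_lun_tb: float


def limit_violations(host: HostStorage) -> List[str]:
    """Return human-readable limit violations (empty list if none)."""
    problems = []
    if host.lun_count > MAX_LUNS_PER_HOST:
        problems.append(f"{host.lun_count} LUNs exceeds {MAX_LUNS_PER_HOST}")
    if host.path_count > MAX_PATHS_PER_HOST:
        problems.append(f"{host.path_count} paths exceeds {MAX_PATHS_PER_HOST}")
    if host.largest_lun_tb > MAX_LUN_SIZE_TB:
        problems.append(f"{host.largest_lun_tb} TB LUN exceeds {MAX_LUN_SIZE_TB} TB")
    return problems


# The farm in the post: 10 LUNs, 1 TB for the new ones.
# path_count=40 is an assumed value, purely for illustration.
farm = HostStorage(lun_count=10, path_count=40, largest_lun_tb=1.0)
print(limit_violations(farm))
```

With numbers this far below the quoted limits, the list comes back empty, which matches the poster's conclusion that configuration maximums are not the cause.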

    Please share your input.

    Regards,

    Surendra



  • 2.  RE: LUN is invisible from ESXi hosts in cluster

    Broadcom Employee
    Posted Mar 20, 2013 12:37 AM

    Hi,

    Can you perform a rescan and post the kernel logs?



  • 3.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 20, 2013 12:44 PM

    Hi,

    Please find the kernel logs below; the affected LUN is 9.

    2013-03-20T09:23:10.424Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_getTargetPortInfo:79:Could not find relative target port ID for path "vmhba6:C0:T3:L9" - Not found (19588710
    2013-03-20T09:23:10.424Z cpu6:9035)WARNING: NMP: nmp_SatpClaimPath:2093:SATP "VMW_SATP_ALUA" could not add  path "vmhba6:C0:T3:L9" for device "Unregistered Device". Error Not fo
    2013-03-20T09:23:10.424Z cpu6:9035)WARNING: ScsiPath: 4550: Plugin 'NMP' had an error (Not found) while claiming path 'vmhba6:C0:T3:L9'.Skipping the path.
    2013-03-20T09:23:10.424Z cpu6:9035)ScsiClaimrule: 1329: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba6:C0:T3:L9. Busy
    2013-03-20T09:23:10.424Z cpu6:9035)ScsiClaimrule: 1554: Error claiming path vmhba6:C0:T3:L9. Busy.
    2013-03-20T09:23:10.425Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_determineStatus:559:VMW_SATP_ALUA:Unknown Check condition 0/2 0x5 0x24 0x0.
    2013-03-20T09:23:10.425Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba5:C0:T0:L9" determined to be in unexpected NOT READY state when probed.
    2013-03-20T09:23:10.425Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_determineStatus:559:VMW_SATP_ALUA:Unknown Check condition 0/2 0x5 0x24 0x0.
    2013-03-20T09:23:10.425Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba6:C0:T0:L9" determined to be in unexpected NOT READY state when probed.
    2013-03-20T09:23:10.426Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_determineStatus:559:VMW_SATP_ALUA:Unknown Check condition 0/2 0x5 0x24 0x0.
    2013-03-20T09:23:10.426Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba5:C0:T2:L9" determined to be in unexpected NOT READY state when probed.
    2013-03-20T09:23:10.427Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_determineStatus:559:VMW_SATP_ALUA:Unknown Check condition 0/2 0x5 0x24 0x0.
    2013-03-20T09:23:10.427Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba6:C0:T1:L9" determined to be in unexpected NOT READY state when probed.
    2013-03-20T09:23:10.427Z cpu6:9035)WARNING: vmw_psp_rr: psp_rrSelectPathToActivate:972:Could not select path for device "Unregistered Device".
    2013-03-20T09:23:10.427Z cpu6:9035)WARNING: NMP: nmpPathClaimEnd:1195:Device, seen through path vmhba6:C0:T1:L9 is not registered (no active paths)
    2013-03-20T09:23:10.504Z cpu6:9035)Vol3: 647: Couldn't read volume header from control: Invalid handle
    2013-03-20T09:23:10.504Z cpu6:9035)FSS: 4333: No FS driver claimed device 'control': Not supported
    2013-03-20T09:23:10.636Z cpu6:9035)VC: 1449: Device rescan time 37 msec (total number of devices 15)
    2013-03-20T09:23:10.636Z cpu6:9035)VC: 1452: Filesystem probe time 167 msec (devices probed 14 of 15)
    2013-03-20T09:23:16.179Z cpu5:11141)Vol3: 647: Couldn't read volume header from control: Invalid handle
    2013-03-20T09:23:16.179Z cpu5:11141)FSS: 4333: No FS driver claimed device 'control': Not supported
    2013-03-20T09:23:16.328Z cpu5:11141)VC: 1449: Device rescan time 65 msec (total number of devices 15)
    2013-03-20T09:23:16.328Z cpu5:11141)VC: 1452: Filesystem probe time 289 msec (devices probed 14 of 15)
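
When a log like this runs to thousands of lines, it can help to pull out the runtime path names mechanically. The sketch below is one possible way to do that in Python; the sample lines are copied from the log above, while the function names and the regular expression are illustrative assumptions rather than anything provided by ESXi.

```python
import re

# Runtime path name: adapter, channel, target, LUN (e.g. vmhba6:C0:T3:L9).
PATH_RE = re.compile(r"(vmhba\d+):C(\d+):T(\d+):L(\d+)")

# Five lines copied verbatim from the log in this post.
SAMPLE = """\
2013-03-20T09:23:10.424Z cpu6:9035)WARNING: ScsiPath: 4550: Plugin 'NMP' had an error (Not found) while claiming path 'vmhba6:C0:T3:L9'.Skipping the path.
2013-03-20T09:23:10.425Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba5:C0:T0:L9" determined to be in unexpected NOT READY state when probed.
2013-03-20T09:23:10.425Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba6:C0:T0:L9" determined to be in unexpected NOT READY state when probed.
2013-03-20T09:23:10.426Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba5:C0:T2:L9" determined to be in unexpected NOT READY state when probed.
2013-03-20T09:23:10.427Z cpu6:9035)WARNING: VMW_SATP_ALUA: satp_alua_issueCommandOnPath:676:Path "vmhba6:C0:T1:L9" determined to be in unexpected NOT READY state when probed.
""".splitlines()


def paths_by_lun(lines):
    """Map LUN number -> set of runtime path names mentioned in the lines."""
    result = {}
    for line in lines:
        for match in PATH_RE.finditer(line):
            lun = int(match.group(4))
            result.setdefault(lun, set()).add(match.group(0))
    return result


def not_ready_paths(lines):
    """Sorted paths reported as being in an unexpected NOT READY state."""
    found = set()
    for line in lines:
        if "NOT READY" in line:
            found.update(m.group(0) for m in PATH_RE.finditer(line))
    return sorted(found)


print(paths_by_lun(SAMPLE))
print(not_ready_paths(SAMPLE))
```

Every path in the sample ends in L9, which is consistent with the statement that only LUN 9 is affected.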



  • 4.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 20, 2013 01:46 PM

    Yeah, most likely that LUN is not online on the NetApp array. Ask your storage team to double-check; lun show -m should show it.



  • 5.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 20, 2013 03:57 PM

    We already checked this. The LUN is online and read/write. The storage team has also re-presented it for us, but no luck.



  • 6.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 20, 2013 05:56 AM

    I still wouldn't rule out an issue on the NetApp array. Perhaps the LUN was presented read-only? Is there enough free space on the volume/qtree that the LUN belongs to?

    If these questions have been answered, then perhaps you can offline the NAA ID using esxcfg-scsidevs -o and then do a rescan.

    Does that help?

    ~Sai Garimella



  • 7.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 20, 2013 06:05 PM

    Hi Sai,

    Thanks for the prompt response.

    The LUN is online and read/write, and there is sufficient free space on the volume.

    When we tried to offline the NAA ID using esxcfg-scsidevs -o, we got an "unknown device" error for the NAA ID.



  • 8.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 20, 2013 06:27 PM

    The 24/0 in the error suggests that there might be a reservation on the LUN. On the NetApp array, can you run 'lun persistent_resv show /vol/LUN'? If any reservation is present, try to clear it.

    HTH

    ~Sai Garimella
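
For reference, the numbers in the "Check condition 0/2 0x5 0x24 0x0" log lines can be decoded against the standard SCSI tables. The Python sketch below does that for a small, deliberately incomplete subset of codes. It assumes the fields are host status, device status, sense key, ASC and ASCQ, in that order; the function and table names are illustrative.

```python
import re

# Small subset of the standard SCSI sense keys (not exhaustive).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0xB: "ABORTED COMMAND",
}

# Small subset of additional sense code / qualifier pairs (not exhaustive).
ASC_ASCQ = {
    (0x04, 0x00): "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
    (0x25, 0x00): "LOGICAL UNIT NOT SUPPORTED",
}

# Assumed field order: host status / device status, sense key, ASC, ASCQ.
CHECK_RE = re.compile(
    r"Check condition (\d+)/(\d+) "
    r"(0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)


def decode_check_condition(line):
    """Return the decoded fields as a dict, or None if the line has none."""
    match = CHECK_RE.search(line)
    if match is None:
        return None
    key = int(match.group(3), 16)
    asc = int(match.group(4), 16)
    ascq = int(match.group(5), 16)
    return {
        "host_status": int(match.group(1)),
        "device_status": int(match.group(2)),
        "sense_key": SENSE_KEYS.get(key, "unknown"),
        "asc_ascq": ASC_ASCQ.get((asc, ascq), "unknown"),
    }


# Line copied from the kernel log posted earlier in this thread.
LINE = ("2013-03-20T09:23:10.425Z cpu6:9035)WARNING: VMW_SATP_ALUA: "
        "satp_alua_determineStatus:559:VMW_SATP_ALUA:Unknown Check condition 0/2 0x5 0x24 0x0.")
print(decode_check_condition(LINE))
```

In the standard tables this triple reads as ILLEGAL REQUEST with INVALID FIELD IN CDB. What that means for a particular array is not something the table alone can settle, so treat the decode as a starting point for the conversation with the storage team.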



  • 9.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 20, 2013 10:55 PM

    Hi Sai,

    Do you want us to check the LUN reservation at the storage end or from ESXi? If you mean the storage end, the storage team has clearly told us that there are no issues on their side (and they have re-presented the LUN). Even if there is an issue, we are not in a position to ask them again without showing proof.

    If you mean the ESXi end, no reservation errors are observed in the kernel.log file. I think pending reservations may lead to LUN locking issues, which can be resolved by a reset. But we cannot run this command because the LUN is not detected from ESXi, so it returns "unknown device". Please correct me if I am wrong.

    One more thing: after the storage team re-presented the LUN, we rebooted the ESXi hosts one by one, putting each into maintenance mode first. The LUN was then visible for some time on 4 of the 7 servers, but the paths went to a dead state again and the LUN became invisible.



  • 10.  RE: LUN is invisible from ESXi hosts in cluster

    Broadcom Employee
    Posted Mar 21, 2013 12:37 AM

    Hello Surendra,

    Sorry for the late response. If the paths are showing dead and you are sure the LUN is still online on the storage, I would strongly suggest checking the fabric/Ethernet switch. As you know, an ESX/ESXi host will never mark a path dead on its own; it will keep routing I/O as long as the last available path is visible.

    Active - The path is working and is the current path being used for transferring data.

    Disabled - The path has been disabled and no data can be transferred.

    Standby - The path is working but is not currently used for data transfer.

    Dead - The software cannot connect to the disk through this path.
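
The four path states listed above can be modelled as a small enumeration to summarise a device's paths. This is a minimal Python sketch, not an ESXi API: the names are illustrative, and treating Active and Standby as "working" follows the descriptions in the list. The example uses the LUN 9 path names from the kernel log and the original poster's statement that all of them are dead.

```python
from collections import Counter
from enum import Enum


class PathState(Enum):
    ACTIVE = "active"      # working and currently carrying I/O
    DISABLED = "disabled"  # administratively disabled, no I/O
    STANDBY = "standby"    # working but not currently carrying I/O
    DEAD = "dead"          # host cannot reach the disk via this path


# States described as "working" in the list above.
WORKING = {PathState.ACTIVE, PathState.STANDBY}


def summarize(paths):
    """Count paths per state and report whether any path still works."""
    counts = Counter(paths.values())
    return {
        "counts": {state.value: counts.get(state, 0) for state in PathState},
        "has_working_path": any(s in WORKING for s in paths.values()),
    }


# LUN 9 as described in the thread: every path is dead.
lun9 = {
    "vmhba6:C0:T3:L9": PathState.DEAD,
    "vmhba5:C0:T0:L9": PathState.DEAD,
    "vmhba6:C0:T0:L9": PathState.DEAD,
    "vmhba5:C0:T2:L9": PathState.DEAD,
    "vmhba6:C0:T1:L9": PathState.DEAD,
}
print(summarize(lun9))
```

A device with no working path at all matches the "not registered (no active paths)" message in the kernel log.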

    Is your storage active-active or active-passive? Please provide the storage model number and the path policy you have selected for the LUN.

    Most likely a zoning or LUN presentation issue!



  • 11.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 22, 2013 10:14 AM

    Hi Sreec,

    There are no connectivity issues from the fabric switch to the ESXi hosts; we have already verified that. It is active-active storage with round-robin path selection. Please see the attached log files for more details.



  • 12.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 21, 2013 04:10 AM

    If you have a chance to use that LUN via RDM in a Linux VM or an ESXi-stateless VM, have a look at the VMFS volume and check whether it still has a valid MBR/GPT table.

    Last week I spent two days at a customer with similar symptoms.
    The first sector of the volume had been wiped blank; after that sector was repaired, the mount problems were gone.
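
A quick way to run the check described above from a Linux VM is to read the first two sectors of the device and look for the well-known signatures: 0x55 0xAA at offset 510 of the first sector for an MBR, and the ASCII string "EFI PART" at the start of the second sector for a GPT header. The Python sketch below assumes 512-byte logical sectors and works on raw bytes; the function name is illustrative, and it inspects signatures only, not the partition entries themselves.

```python
SECTOR = 512  # assumption: 512-byte logical sectors


def partition_table_status(data: bytes) -> str:
    """Classify the first two sectors of a disk by their signatures."""
    if len(data) < 2 * SECTOR:
        raise ValueError("need at least the first two sectors")
    lba0 = data[:SECTOR]
    lba1 = data[SECTOR:2 * SECTOR]
    if not any(lba0):
        return "blank first sector"
    has_mbr_signature = lba0[510:512] == b"\x55\xaa"
    has_gpt_signature = lba1[:8] == b"EFI PART"
    if has_mbr_signature and has_gpt_signature:
        return "GPT header signature present"
    if has_mbr_signature:
        return "MBR signature present"
    return "no valid MBR signature"


# Synthetic examples; on a real system the bytes would come from the
# device itself, e.g. open("/dev/sdX", "rb").read(1024), where /dev/sdX
# is a placeholder for the RDM disk.
wiped = bytes(2 * SECTOR)
mbr_only = bytearray(2 * SECTOR)
mbr_only[510:512] = b"\x55\xaa"
print(partition_table_status(wiped))
print(partition_table_status(bytes(mbr_only)))
```

Reading the sectors is harmless; repairing a wiped sector is not, so take a copy of the current contents before writing anything back.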



  • 13.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 21, 2013 01:25 PM

    I was referring to reservations on the SAN array.

    Is/was there data on the LUN? Is the LUN coming from the same controllers as the other LUNs?

    Is this LUN by chance shared with any other hosts in a different cluster?



  • 14.  RE: LUN is invisible from ESXi hosts in cluster

    Posted Mar 21, 2013 07:48 PM

    Hi Sai,

    Please find my answers inline below.

    Is/was there data on the LUN? Is the LUN coming from the same controllers as the other LUNs?

    There was data, but after this issue occurred the LUN was detected on one server after a rescan, and we then moved the VM that was on that LUN. There is no data on it now, which is why we dared to have it re-presented. The LUN comes from the same controllers and igroup.

    Is this LUN by chance shared with any other hosts in a different cluster?

    No, it is shared only with this particular cluster (7 ESXi hosts).