VMware vSphere

Lost access to datastore

  • 1.  Lost access to datastore

    Posted Jun 02, 2021 04:21 PM

    Hi All

    Hope someone out there can help.

    I have 3 ESXi hosts running 6.7 U3 (2 production and 1 development) connected via iSCSI to a Lenovo DE4000H storage array. The array presents 3 volumes, which all 3 ESXi hosts can see. I recently rebooted the development ESXi host, and when it came back online it could no longer see the datastores over the iSCSI connection.

    I have checked the iSCSI adapter in the ESXi web interface and it reports as online. I have verified the IQNs between the ESXi host and the storage array, and all seems to be in order. I can even ping the IP address given to the SAN port on the array, but no matter what I try I am unable to see the datastores any more. This is only happening on the one server that was rebooted. The other 2 can see the volumes fine and we are able to browse the datastores if required.

    In the log /var/log/vobd.log there are some errors 5488265684us: [esx.problem.storage.connectivity.lost] Lost connectivity to storage device naa.6d039ea00014999e0000017b5xxxxxxx. Path vmhba64:C1:T0:L3 is down. Affected datastores: "dev_disk".

    I have followed a lot of suggestions found on the forums about removing the links, adding them back again and then rescanning, but nothing seems to help. I have shut down the ESXi host, replugged the iSCSI cables and powered the server up again, but running the command esxcfg-scsidevs -m only shows the local disk, so no VMs can be started, as all the images are on the external iSCSI disks.

    Any suggestion would be great as I am banging my head against a wall to get this working again.
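When vobd.log contains many events like the one quoted above, it can help to reduce them to a list of affected paths and devices. The following is a minimal sketch for the ESXi shell; `down_paths` is a hypothetical helper name, and the pattern assumes the message wording shown in this post.

```shell
# Print "<path> <device>" for every "Lost connectivity" event read from stdin.
down_paths() {
    sed -n 's/.*Lost connectivity to storage device \([^ ]*\)\. Path \([^ ]*\) is down.*/\2 \1/p'
}

# On the host, summarise how often each path was reported down.
if [ -r /var/log/vobd.log ]; then
    down_paths < /var/log/vobd.log | sort | uniq -c
fi
```

Each output line pairs a runtime path name such as vmhba64:C1:T0:L3 with its naa device ID, which shows whether one path or every path to the array is affected.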



  • 2.  RE: Lost access to datastore

    Posted Jun 02, 2021 04:44 PM

    Hello.
    Are the physical servers Lenovo?
    Is the connection between the servers and the DE4000H storage direct, or does it go through one or two Ethernet switches? If so, are the switches Lenovo?
    Did you update the firmware of the servers and the storage as part of the installation?
    Were the physical servers and storage purchased together as part of a solution?

     



  • 3.  RE: Lost access to datastore

    Posted Jun 03, 2021 07:18 AM

    Hi Enrique

    Are the physical servers Lenovo?
    Yes, all 3 servers are Lenovo, model SR635 Type 7Y99.
    Is the connection between the servers and the DE4000H storage direct, or does it go through one or two Ethernet switches? If so, are the switches Lenovo?
    The connection is direct, so there are no switches in between.
    Did you update the firmware of the servers and the storage as part of the installation?
    No changes were made to the environment. We simply shut the server down and restarted it.
    Were the physical servers and storage purchased together as part of a solution?
    Yes, all of them were bought together around 1 year ago and installed shortly afterwards. The solution has been working fine since the installation.


  • 4.  RE: Lost access to datastore

    Posted Jun 03, 2021 02:45 PM


  • 5.  RE: Lost access to datastore

    Posted Jun 03, 2021 03:31 PM

    Hi

    Please see attached zip file. 

    I have run the tool as you have suggested and attached.



  • 6.  RE: Lost access to datastore

    Posted Jun 03, 2021 05:29 PM

    Hello.
    I did not get any information from the DSA.
    Do you have access to the server's service processor, which is used to manage and monitor the server remotely? At Lenovo it is called the XClarity Controller.
    It is the Ethernet port labeled XCC, to the left of the video connector on the back of the server.
    If you have access, you can capture screenshots of the server's firmware levels and attach them to the post.

    Below is a link showing how to get the service data from the Lenovo server.

    https://www.youtube.com/watch?v=wqDqQZS6eRM

     

    Default
    User: USERID
    Password: PASSW0RD (0 is zero).

    Run the following commands to verify the network and iSCSI configuration:
    esxcfg-vswitch -l
    esxcfg-vmknic -l
    esxcli iscsi adapter list

    esxcli network nic list

    Run the following command for each vmnicX used for the iSCSI connection to see its driver and firmware versions:

    esxcli network nic get -n vmnicX

     



  • 7.  RE: Lost access to datastore

    Posted Jun 04, 2021 10:02 AM
    Hi
     
    I do have full access to the server, but it does not look like we have the XClarity feature.
    I can see the network port that you mentioned, although it is only marked with a spanner symbol, not XCC. I have changed my laptop's network settings to be in the same range as the controller's default IP address (192.168.70.125), but no joy in gaining access.
    After a little further browsing I found a table at the following link that states our model is not supported for XCC (the last row below).
    Server                 XCC Standard    XCC Advanced    XCC Enterprise
    SE350 (7Z46 / 7D1X)    Supported       Most models**   Some models**
    ST50 (7Y48 / 7Y50)     Not supported   Not supported   Not supported
    ST250 (7Y45 / 7Y46)    Most models*    Upgrade         Upgrade
    SR150 (7Y54)           Most models*    Upgrade         Upgrade
    SR250 (7Y51 / 7Y52)    Most models*    Upgrade         Upgrade
    ST550 (7X09 / 7X10)    Most models*    Upgrade         Upgrade
    SR530 (7X07 / 7X08)    Most models*    Upgrade         Upgrade
    SR550 (7X03 / 7X04)    Most models*    Upgrade         Upgrade
    SR570 (7Y02 / 7Y03)    Most models*    Upgrade         Upgrade
    SR590 (7X98 / 7X99)    Most models*    Upgrade         Upgrade
    SR630 (7X01 / 7X02)    Most models*    Upgrade         Upgrade
    SR635 (7Y98 / 7Y99)    Not supported   Not supported   Not supported
     
    Here is the output of the commands you asked me to run. Hope this gives you some guidance to my issue
     
    [root@esxi-vs2c:~] esxcfg-vswitch -l
    Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
    vSwitch0         11776       4           128               1500    vmnic0

      PortGroup Name        VLAN ID  Used Ports  Uplinks
      dev-10-new            2010     0           vmnic0
      dev-1                 2001     0           vmnic0
      Unity-dev             1001     0           vmnic0
      Production-Old        3001     0           vmnic0
      dmz-15                3115     0           vmnic0
      dmz-14                3114     0           vmnic0
      dmz-12                3112     0           vmnic0
      dmz-13                3113     0           vmnic0
      dmz-11                3111     0           vmnic0
      dmz-10                3110     0           vmnic0
      dmz-9                 3109     0           vmnic0
      dmz-8                 3108     0           vmnic0
      dmz-7                 3107     0           vmnic0
      dmz-6                 3106     0           vmnic0
      dmz-5                 3105     0           vmnic0
      dmz-4                 3104     0           vmnic0
      dmz-3                 3103     0           vmnic0
      dmz-2                 3102     0           vmnic0
      dmz-1                 3101     0           vmnic0
      dmz-0                 3100     0           vmnic0
      Production            3002     0           vmnic0
      VM Network            0        0           vmnic0
      Management Network    111      1           vmnic0

    Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
    StorageA         11776       4           1024              1500    vmnic4

      PortGroup Name        VLAN ID  Used Ports  Uplinks
      StorageA              0        1           vmnic4

    Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
    StorageB         11776       4           1024              1500    vmnic5

      PortGroup Name        VLAN ID  Used Ports  Uplinks
      StorageB              0        1           vmnic5

    [root@esxi-vs2c:~] esxcfg-vmknic -l
    Interface  Port Group/DVPort/Opaque Network        IP Family IP Address                              Netmask         Broadcast       MAC Address       MTU     TSO MSS   Enabled Type                NetStack
    vmk0       Management Network                      IPv4      172.16.xxx.xxx                           255.255.255.0   172.16.xxx.xxx   b0:26:28:e8:xx:xx 1500    65535     true    STATIC              defaultTcpipStack
    vmk0       Management Network                      IPv6      fe80::b226:28ff:fee8:xxxx               64                              b0:26:28:e8:xx:xx 1500    65535     true    STATIC, PREFERRED   defaultTcpipStack
    vmk1       StorageA                                IPv4      192.168.xxx.xxx                         255.255.255.0   192.168.xxx.xxx 00:50:56:62:xx:xx 1500    65535     true    STATIC              defaultTcpipStack
    vmk1       StorageA                                IPv6      fe80::250:56ff:fe62:xxxx                64                              00:50:56:62:xx:xx 1500    65535     true    STATIC, PREFERRED   defaultTcpipStack
    vmk2       StorageB                                IPv4      192.168.xxx.xxx                         255.255.255.0   192.168.xxx.xxx 00:50:56:65:xx:xx 1500    65535     true    STATIC              defaultTcpipStack
    vmk2       StorageB                                IPv6      fe80::250:56ff:fe65:xxxx                64                              00:50:56:65:xx:xx 1500    65535     true    STATIC, PREFERRED   defaultTcpipStack

    [root@esxi-vs2c:~] esxcli iscsi adapter list
    Adapter  Driver     State   UID                                        Description
    -------  ---------  ------  -----------------------------------------  ----------------------
    vmhba64  iscsi_vmk  online  iqn.1998-01.com.vmware:esxi-vs2c-5cf8737b  iSCSI Software Adapter

    [root@esxi-vs2c:~] esxcli network nic list
    Name    PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
    ------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  -------------------------------------------------------
    vmnic0  0000:01:00.0  ntg3    Up            Up            1000  Full    b0:26:28:e8:xx:xx  1500  Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
    vmnic1  0000:01:00.1  ntg3    Up            Down             0  Half    b0:26:28:e8:xx:xx  1500  Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
    vmnic2  0000:01:00.2  ntg3    Up            Down             0  Half    b0:26:28:e8:xx:xx  1500  Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
    vmnic3  0000:01:00.3  ntg3    Up            Down             0  Half    b0:26:28:e8:xx:xx  1500  Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
    vmnic4  0000:45:00.0  i40en   Up            Up           10000  Full    68:05:ca:af:xx:xx  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
    vmnic5  0000:45:00.1  i40en   Up            Up           10000  Full    68:05:ca:af:xx:xx  1500  Intel(R) Ethernet Controller X710 for 10GbE SFP+
    vmnic6  0000:81:00.0  ntg3    Up            Down             0  Half    b0:26:28:f8:xx:xx  1500  Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet
    vmnic7  0000:81:00.1  ntg3    Up            Down             0  Half    b0:26:28:f8:xx:xx  1500  Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet
    [root@esxi-vs2c:~] esxcli network nic get -n vmnic4
       Advertised Auto Negotiation: true
       Advertised Link Modes: Auto, 10000BaseT/Full
       Auto Negotiation: true
       Cable Type: DA
       Current Message Level: 0
       Driver Info:
             Bus Info: 0000:45:00:0
             Driver: i40en
             Firmware Version: 7.00 0x80005183 1.2203.0
             Version: 1.8.6
       Link Detected: true
       Link Status: Up
       Name: vmnic4
       PHYAddress: 0
       Pause Autonegotiate: false
       Pause RX: false
       Pause TX: false
       Supported Ports: DA
       Supports Auto Negotiation: true
       Supports Pause: true
       Supports Wakeon: false
       Transceiver:
       Virtual Address: 00:50:56:5e:xx:xx
       Wakeon: None

    [root@esxi-vs2c:~] esxcli network nic get -n vmnic5
       Advertised Auto Negotiation: true
       Advertised Link Modes: Auto, 10000BaseT/Full
       Auto Negotiation: true
       Cable Type: DA
       Current Message Level: 0
       Driver Info:
             Bus Info: 0000:45:00:1
             Driver: i40en
             Firmware Version: 7.00 0x80005183 1.2203.0
             Version: 1.8.6
       Link Detected: true
       Link Status: Up
       Name: vmnic5
       PHYAddress: 0
       Pause Autonegotiate: false
       Pause RX: false
       Pause TX: false
       Supported Ports: DA
       Supports Auto Negotiation: true
       Supports Pause: true
       Supports Wakeon: false
       Transceiver:
       Virtual Address: 00:50:56:52:xx:xx
       Wakeon: None
     
    Regards


  • 8.  RE: Lost access to datastore

    Posted Jun 04, 2021 02:16 PM

    Hello.
    The network configuration, including iSCSI, looks normal, but there are a few details to note:
    If you are not using IPv6, it is preferable to disable it.
    The recommended MTU for the iSCSI configuration is 9000.
    On the adapters used for iSCSI (vmnic4 and vmnic5), the driver (1.8.6) and the firmware (7.0) are among those recommended by VMware in its compatibility matrix for version 6.7 Update 3.

    https://www.vmware.com/resources/compatibility/detail.php?deviceCategory=io&productid=37976&vcl=true
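For the MTU 9000 recommendation, the change has to be made on the vSwitch and on the VMkernel port, and the array ports must use the same MTU, otherwise large frames are dropped. The sketch below only prints the commands for review rather than running them. `icmp_payload` and `jumbo_plan` are hypothetical helper names, and CONTROLLER_A_IP is a placeholder for the array port address, which is not given in the thread.

```shell
# Largest ICMP payload that fits in one frame:
# MTU minus the 20-byte IPv4 header and the 8-byte ICMP header.
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Print the commands that raise the MTU on one storage vSwitch and its
# VMkernel port, followed by a don't-fragment ping that verifies the path.
jumbo_plan() {
    vswitch=$1 vmk=$2 target=$3 mtu=${4:-9000}
    echo "esxcli network vswitch standard set -v $vswitch -m $mtu"
    echo "esxcli network ip interface set -i $vmk -m $mtu"
    echo "vmkping -I $vmk -d -s $(icmp_payload "$mtu") $target"
}

jumbo_plan StorageA vmk1 CONTROLLER_A_IP
```

If the vmkping with the don't-fragment flag fails while a plain ping succeeds, some hop on that path is still using a smaller MTU.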

    The Intel Ethernet Controller X710 cards that ship with some brand-name servers (HP, Dell and others) have caused a lot of problems.


    In these cases you can use the native (VMware) driver instead of the manufacturer's (partner) driver.
    Another option, which has been tested on versions 6.0 and 6.5, is to disable TSO and LRO.

    The last option is to replace the adapters.

    I recommend trying the second option. The details are below; the ESXi host needs a reboot for the changes to take effect.

    To disable TSO:

    Run this command to determine if hardware TSO is enabled on the host:

        esxcli system settings advanced list -o /Net/UseHwTSO

    Run this command to disable TSO at the host level:

        esxcli system settings advanced set -o /Net/UseHwTSO -i 0

    (This command uses 0 (zero) to disable and 1 to enable.)

    To disable LRO:

    Run this command to determine if LRO is enabled for the VMkernel adapters on the host:

        esxcli system settings advanced list -o /Net/TcpipDefLROEnabled

    Run this command to disable LRO for all VMkernel adapters on the host:

        esxcli system settings advanced set -o /Net/TcpipDefLROEnabled -i 0

    (This command uses 0 (zero) to disable and 1 to enable.)

    You will need to schedule downtime for the ESXi host to make the changes and reboot it. Afterwards, rescan the HBA and storage and check whether you have access to the iSCSI storage.
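After the reboot, both settings can be read back in one pass to confirm they are 0. This is only a sketch: `int_value` is a hypothetical helper, and it assumes the `advanced list` output contains an `Int Value:` line for integer options.

```shell
# Print the current integer value from the output of
# `esxcli system settings advanced list -o <option>` read on stdin.
int_value() {
    awk -F': ' '$1 ~ /^ *Int Value$/ { print $2 }'
}

# On the ESXi host itself (skipped where esxcli is not available):
if command -v esxcli >/dev/null 2>&1; then
    for opt in /Net/UseHwTSO /Net/TcpipDefLROEnabled; do
        printf '%s = %s\n' "$opt" \
            "$(esxcli system settings advanced list -o "$opt" | int_value)"
    done
fi
```

The helper matches only the line that starts with `Int Value`, so the `Default Int Value` line in the same output is ignored.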


    If the problem continues, we could do a remote session (free of charge) for a general check of the ESXi host and the storage. You would need to have access and the usernames and passwords at hand.

     



  • 9.  RE: Lost access to datastore

    Posted Jun 04, 2021 02:41 PM

    Hello.
    About the Lenovo SR635: I was surprised that it does not have the XCC, which I thought was standard on all Lenovo servers, but now I see that it does not.
    According to the product guide, this server has Lenovo XClarity Provisioning Manager Lite.
    The user guide is here:
    https://sysmgt.lenovofiles.com/help/topic/LXPML/LXPM_Lite_user_guide.pdf


    If you have downtime on the server, it would be good to enter the BIOS (UEFI) and capture the server's firmware level. It would also help to know whether the server has Lenovo XClarity Provisioning Manager or Lenovo XClarity Provisioning Manager Lite.

     



  • 10.  RE: Lost access to datastore

    Posted Jun 05, 2021 03:27 PM

    Hi

    Thank you for your time that you are spending on this problem

    I have done what you suggested: changed the MTU size to 9000 on both storage links (vmnic4 and vmnic5) and disabled IPv6. I have disabled TSO and LRO using the commands that you supplied. I have rebooted the ESXi server and rescanned vmhba64, but unfortunately no joy. I am still not seeing any of the datastores that I should see.

    When I check the iSCSI sessions on the Lenovo DE4000H storage array, I see that there is no mention of the 3rd server. Could it be that I have a faulty 10Gb network card in the server, even though I am able to ping the endpoint IP addresses (Controller A port 0e and Controller B port 0e)?

    Session (SSID)     Initiator User Label   Initiator iSCSI Name                        Associated Host   Connections (CID)
    0x00023D000002:2   vs2b_1                 iqn.1998-01.com.vmware:esxi-vs2b-280xxxxx   vs2b              0x0
    0x00023D000002:2   vs2a_1                 iqn.1998-01.com.vmware:esxi-vs2a-3b8xxxxx   vs2a              0x0
    0x00023D000001:1   vs2b_1                 iqn.1998-01.com.vmware:esxi-vs2b-280xxxxx   vs2b              0x0
    0x00023D000001:1   vs2a_1                 iqn.1998-01.com.vmware:esxi-vs2a-3b8xxxxx   vs2a              0x0
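A successful ping only shows that ICMP reaches the controller ports; it does not show that an iSCSI login took place. The sketch below compares the array's session listing against the initiators that should be present, then shows the host-side view of the same question. `missing_initiators` is a hypothetical helper, and the IQN fragments passed to it follow the naming seen in this thread.

```shell
# Print each expected IQN fragment that does not appear in the array's
# session listing (listing read from stdin, fragments given as arguments).
missing_initiators() {
    sessions=$(cat)
    for frag in "$@"; do
        printf '%s\n' "$sessions" | grep -F -q -- "$frag" || echo "$frag"
    done
}

# Host-side view on the affected ESXi host (skipped where esxcli is absent):
if command -v esxcli >/dev/null 2>&1; then
    esxcli iscsi session list -A vmhba64                 # active logins, if any
    esxcli iscsi adapter discovery statictarget list     # configured static targets
    esxcli iscsi adapter discovery sendtarget list       # configured dynamic targets
fi
```

With the listing above saved to a file, `missing_initiators esxi-vs2a esxi-vs2b esxi-vs2c < sessions.txt` would report only esxi-vs2c. An empty session list on the host combined with a configured target points at the login or discovery step rather than at the cabling.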
     
    I will check the BIOS option out on Monday and create a bootable USB with LXPM.
    The server is basically down, so if you are good to do a remote check, any time is good, you can private message me on kenneth.wendt@gmail.com to arrange.


  • 11.  RE: Lost access to datastore

    Posted Jun 14, 2021 02:53 PM

    Hi Enrique

    Thought I would just post the details here as well for a completion of the diagnostics.

    IBM have replaced the 10GB adapter and I reconfigured the link to have the correct details, but it seemed to have made no difference.
    I then went ahead and reinstalled the ESXi with the image from the Lenovo website for version 6.7U3. I then recreated the link for storage once again making use of the document you supplied me. Again I could not see the storage. 
    I then removed the link from dynamic targets and recreated it under static targets, like I originally had it. I could now see the storage via 1 path, Controller A.
    For some strange reason when I add the second path for Controller B, it does not seem to save it, even though it says it was successful under recent tasks.
    Good news is that I can see the datastores once more. Bad news is that the link is in a degraded state due to not being able to save the 2nd static path.


  • 12.  RE: Lost access to datastore

    Posted Jun 14, 2021 04:13 PM

    Hi Again

    I have followed the instructions in the video at https://tv.netapp.com/detail/video/6062700336001/vmware-configuration-guide-for-e-series-integration-with-esxi, however when I get to time stamp 00:08:08, where I need to save the static target devices, only the first one is saved. I have tried different browsers in case it was a browser caching issue, but I get the same outcome in both Firefox and Edge. Any idea why I am limited to one device? I see you added four links, but I only have two, one per controller.

    If you would like to do another remote session, we can plan for tomorrow.



  • 13.  RE: Lost access to datastore
    Best Answer

    Posted Jun 16, 2021 12:53 PM

    Hi Enrique

    Thank you for all the assistance on this strange issue. It is finally resolved.

    For any future visitors here are the high level steps taken.

    1. IBM replaced the physical network SFP card. But still had the same issue and was not able to detect any of the datastores. Not sure this was necessary

    2. I reloaded the ESXi with a newer version of 6.7U3 that I obtained from the Lenovo website, not VMware, as this is a specific ESXi version with all the extra Lenovo goodies included.

    3. I was now able to add 1 static target and could see the datastores. However, I was not able to get the second path to save. No matter how many times I added it, it would disappear once I clicked save.

    4. Added the second iSCSI path using the command line. #esxcli iscsi adapter discovery statictarget add -A vmhbaXX -a ip_address:port_number -n iqn_number_of_storage_device

    5. Rescanned for new devices from cli : #esxcli iscsi discovery rediscover -A vmhbaXX

    This solved my issues. Shout out to Enrique for sticking with the problem through to the end.



  • 14.  RE: Lost access to datastore

    Posted Jun 22, 2021 01:44 PM

    The command in step 5 should read: esxcli iscsi adapter discovery rediscover -A vmhbaXX (XX = the number of your adapter)
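Steps 4 and 5, with the corrected rediscover command, can be wrapped in one small function. This is a sketch rather than the poster's exact procedure: `add_static_target` and the `ESXCLI` variable are hypothetical, and the storage adapter rescan and final listing are additions for verification.

```shell
ESXCLI=${ESXCLI:-esxcli}   # set ESXCLI=echo for a dry run

# Add one static iSCSI target, then rediscover, rescan and list the result.
# Arguments: adapter (e.g. vmhba64), target address as ip:port, target IQN.
add_static_target() {
    adapter=$1 address=$2 iqn=$3
    "$ESXCLI" iscsi adapter discovery statictarget add -A "$adapter" -a "$address" -n "$iqn" &&
    "$ESXCLI" iscsi adapter discovery rediscover -A "$adapter" &&
    "$ESXCLI" storage core adapter rescan -A "$adapter" &&
    "$ESXCLI" iscsi adapter discovery statictarget list -A "$adapter"
}
```

Running it first with `ESXCLI=echo` prints the four commands without touching the host, for example `ESXCLI=echo; add_static_target vmhba64 ip_address:3260 iqn_of_storage_device`. The final listing should then show one entry per controller port that was added.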