VMware vSphere

IO issues VMs in one host

  • 1.  IO issues VMs in one host

    Posted Jul 09, 2013 03:30 AM

    hi guys

    I have a cluster of two ESXi servers running build 1065491. For the past month, monitoring has been reporting issues with some VMs: pings are being lost.

    I then found that the VMs losing pings were all hosted on the same ESXi server. Today an issue was reported with the vCenter appliance, which did not respond. Doing more research, I found that all VMs hosted on that particular host, both Windows and Linux, were showing these messages:

    source: LSI_SAS

    Event ID: 129

    Reset to device, \Device\RaidPort0, was issued.

    source: disk

    Event ID: 153

    The IO operation at logical block address 142f8d for Disk 0 was retried.

    kernel: [40114.926402] mptscsih: ioc0: attempting task abort! (sc=ffff8802116c3d80)

    kernel: [40114.926410] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 05 57 80 00 00 40 00

    kernel: [40115.055129] mptscsih: ioc0: task abort: SUCCESS (rv=2002) (sc=ffff8802116c3d80) (sn=529576)

    Any idea what could be causing this issue, or how to fix it?

    Important: I moved one Windows 2012 VM from the non-problematic ESXi server to the problematic one, and the LSI_SAS and disk messages started to show up on it.

    thanks a lot guys
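
For anyone reading the guest-side messages above, the CDB bytes in the Linux kernel line can be decoded by hand to see which blocks the aborted read was for. A minimal Python sketch, assuming the standard READ(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8). The function name `decode_read10` is made up for illustration:

```python
def decode_read10(cdb_hex):
    """Decode a 10-byte SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, blocks


# CDB copied from the mptscsih message in the post above
lba, blocks = decode_read10("28 00 00 05 57 80 00 00 40 00")
print(lba, blocks)        # prints: 350080 64

# The Windows event 153 reports its LBA in hex as well
print(int("142f8d", 16))  # prints: 1322893
```

So the aborted Linux command was an ordinary 64-block read. The decoding shows nothing unusual about the request itself, which fits the idea that the delay is somewhere below the guest.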



  • 2.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 04:26 AM

    It looks like an issue with the LSI controller hardware in use. Can you involve your hardware vendor to run a sanity test on it? Also make sure the server BIOS, firmware, and drivers are up to date.



  • 3.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 04:41 AM

    This server is not using local storage; in fact it has no disks and boots from storage. Could it be an issue with the Fibre Channel card?

    The firmware is just one version behind...

    How am I supposed to update drivers when ESXi installs them all at install time?



  • 4.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 07:10 AM

    Can you confirm which controller you are using, or post the output of # lspci?



  • 5.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 12:56 PM

    interesting

    Like I said, this server has no internal disks, but it looks like the controller is there:

    00:0c:00.0 Mass storage controller: LSI Logic / Symbios Logic LSI2004 [vmhba0]

    So even when it is not used for internal disks, could it be affecting the server? Or is it the Fibre Channel HBAs?

    00:11:00.0 Serial bus controller: QLogic Corp ISP2532-based 8Gb Fibre Channel to PCI Express HBA [vmhba1]

    00:1b:00.0 Serial bus controller: QLogic Corp ISP2532-based 8Gb Fibre Channel to PCI Express HBA [vmhba3]

    thanks

    BTW, in about 4 hours I am going to update all the firmware on the server, but any input will be appreciated.



  • 6.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 01:01 PM

    The aborts generated are for vmhba0 which is the LSI controller as stated earlier...



  • 7.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 01:15 PM

    Sorry to keep asking, but how do you know they are generated for vmhba0? I have not posted any logs yet.



  • 8.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 02:01 PM

    Hi,

    Can you post the vmkernel log file from the affected ESXi host?

    Regards

    Mohammed



  • 9.  RE: IO issues VMs in one host
    Best Answer

    Posted Jul 09, 2013 02:45 PM

    Hi,

    I understand that the LSI_SAS and mptscsih messages are appearing inside the guests, which have a virtual LSI controller. I agree up to a point that the LSI vmhba0 in the server might have nothing to do with the error messages inside the guests, since your VMs are not on a local datastore but on FC.

    So, let us take a look at the usual suspects:

    1. Like Memmad pointed out, the vmkernel log will show whether the FC LUN is having any trouble.

    2. Does the ESXi host itself respond very slowly when running commands like esxcfg-scsidevs -m or esxcfg-rescan vmhbaX (where X is the adapter number of your FC HBA)?

    3. Is the FC LUN configured for multipathing, and is it facing any path issues? Again, this can be seen in the vmkernel log.

    4. I am sure the swap file is on the FC LUN, but check whether the most recently updated swap file in the VM directory has a current timestamp. This is to rule out any delay in swap file writes caused by FC issues.

    5. There were earlier issues similar to yours (VMs losing pings, slower performance), but they were on iSCSI and in 4.1; KB here.

    It does seem that the guests are aborting or resetting their SCSI commands, which is the symptom here.
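
One way to approach the log checks in points 1 and 3, and the earlier question of which adapter the aborts belong to, is to tally which vmhbaN names show up on suspicious-looking lines of the vmkernel log. A rough Python sketch. The keyword list and the helper name `tally_adapters` are assumptions for illustration, not anything ESXi defines, so tune them against your own logs:

```python
import re
from collections import Counter

# Assumption: words that hint at storage trouble; adjust to what your logs contain.
KEYWORDS = ("abort", "reset", "timeout", "dropped", "failed")
ADAPTER = re.compile(r"\bvmhba\d+\b")


def tally_adapters(lines, keywords=KEYWORDS):
    """Count lines that mention both a vmhbaN adapter and one of the keywords."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if not any(word in lowered for word in keywords):
            continue
        # set() so an adapter named twice on one line is counted once
        for adapter in set(ADAPTER.findall(line)):
            counts[adapter] += 1
    return counts
```

To use it, copy the log off the host (on ESXi 5.x it normally lives at /var/log/vmkernel.log) and run `tally_adapters(open("vmkernel.log", errors="replace")).most_common()`. If one FC adapter dominates the result, that is the path to look at first.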



  • 10.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 03:45 PM

    thanks a lot guys for your input

    OK, I found this:

    1. no logs. I was trying to get vmkernel for you guys

    2013-07-09_0943 - karlochacon's library

    2. This command takes some time compared to the other ESXi server, which works normally:

    # esxcfg-scsidevs -m


    Right now I am rebooting the server to update its firmware.



  • 11.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 03:57 PM

    1. no logs. I was trying to get vmkernel for you guys

    Hmm, that might be because a scratch partition was not set.

    This command takes some time compared to the other ESXi server, which works normally

    Let's hope it fails with an error message we can work from.

    Just in case, you can watch the vmkernel activity by pressing Alt+F12 on the ESXi server console while running esxcfg-rescan vmhbaX.



  • 12.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 06:36 PM


  • 13.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 07:15 PM

    I was able to pick out this KB from the clues in the screenshot :) Hope that helps.



  • 14.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 07:20 PM

    thanks a lot

    Yeah, I was thinking about this too, even though this ESXi host has not hung yet.

    esxcli system settings kernel list -o iovDisableIR



  • 15.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 07:22 PM

    in fact after using this

    # esxcli system settings kernel set --setting=iovDisableIR -v TRUE

    this command is working as it should

    # esxcfg-scsidevs -m


    :)


    Now I am going to add some workload to this server, monitor it, and get back to you guys.



  • 16.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 07:30 PM

    Great...

    Also, the message

    vmklnx iodm event vmhba1 frame dropped 206 times in 60s

    is kinda interesting to dig into :)

    Happy to help,

    zXi
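
To put a number on that message, the count and interval can be pulled out and turned into a per-second rate. A small Python sketch that assumes the line format is exactly as quoted above. The regex and the name `frame_drop_rate` are illustrative, not part of any VMware tool:

```python
import re

# Assumption: the message always has the shape quoted in the post above.
FRAME_DROP = re.compile(r"(vmhba\d+) frame dropped (\d+) times in (\d+)s")


def frame_drop_rate(line):
    """Return (adapter, drops per second) for a frame-dropped line, else None."""
    match = FRAME_DROP.search(line)
    if match is None:
        return None
    adapter = match.group(1)
    drops = int(match.group(2))
    seconds = int(match.group(3))
    return adapter, drops / seconds


print(frame_drop_rate("vmklnx iodm event vmhba1 frame dropped 206 times in 60s"))
```

For the quoted line this works out to roughly 3.4 dropped frames per second on vmhba1, one of the two QLogic FC HBAs from the lspci output earlier in the thread.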



  • 17.  RE: IO issues VMs in one host

    Posted Jul 09, 2013 08:04 PM

    Yeah, I could not find anything about that message. The only thing that caught my attention was the reference to vmhba1.



  • 18.  RE: IO issues VMs in one host

    Posted Oct 20, 2015 05:53 PM

    Did you ever figure this out?



  • 19.  RE: IO issues VMs in one host

    Posted Mar 17, 2019 10:55 AM

    Hi, I am seeing the same problem right now (2019) in ESXi 6.5.

    I can not quite follow what was wrong in this case.



  • 20.  RE: IO issues VMs in one host

    Posted Sep 14, 2023 12:51 PM

    Hi!

    I have the same issue here. Is there any way to fix it?