VMware vSphere


Weird disk latency issue on new R760 with onboard storage. Please help.

Chok45

Chok45Sep 06, 2023 06:21 AM

  • 1.  Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 12, 2023 01:10 PM

    We purchased a new Dell R760 with 7 onboard NVMe SSDs set up in a RAID 5 on a PERC H965i.
    The server is set up with the custom Dell ISO of ESXi 8.0.1 build-21813344.
    I was setting up a new Windows server to act as a proxy for Veeam and noticed the server hung for around 15 minutes (which might not be related to the issue I am posting about). So I checked with esxtop to see if something was going on. Everything was fine except for the storage area. There was barely any disk activity on the Windows server, but the latency numbers are a real head-scratcher.
    This is one sample; it happens randomly with just 1 VM set up.
    CMD/s: 626.54
    READS/s: 622.37
    WRITES/s: 4.16
    MBREAD/s: 2.57
    DAVG/cmd: 136384.03
    KAVG/cmd: -136383.89 (Yes that is negative)
    GAVG/cmd: .014
    QAVG/cmd: 142262.59
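
    For reference, esxtop documents guest latency as the sum of device and kernel latency (GAVG = DAVG + KAVG). The following is a minimal Python sketch, assuming only that relationship; the function names and the 0.5 ms tolerance are made up for illustration. It checks whether a sample's components cancel out to a small guest latency, which is one hint that the huge DAVG and negative KAVG may be a counter problem rather than real device latency.

```python
def implied_guest_latency(davg_ms: float, kavg_ms: float) -> float:
    """Guest latency implied by GAVG = DAVG + KAVG (all values in ms)."""
    return davg_ms + kavg_ms


def looks_like_counter_glitch(davg_ms: float, kavg_ms: float, gavg_ms: float,
                              tolerance_ms: float = 0.5) -> bool:
    """Heuristic (an assumption, not a VMware rule): the components add up to
    roughly the reported GAVG, yet the kernel component is negative."""
    consistent = abs(implied_guest_latency(davg_ms, kavg_ms) - gavg_ms) <= tolerance_ms
    return consistent and kavg_ms < 0


# Idle sample quoted above: DAVG, KAVG, and the reported GAVG
print(round(implied_guest_latency(136384.03, -136383.89), 2))
print(looks_like_counter_glitch(136384.03, -136383.89, 0.014))
```

    A sample with sane values (small positive DAVG, KAVG near zero) is not flagged by this check, so it only separates "components cancel out" from "latency is really high".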

    esxtop idle.png

    I downloaded Iometer to load test the storage, and the numbers were what I would expect.
    CMD/s: 225445.98
    READS/s: 113026.56
    WRITES/s: 112419.42
    MBREAD/s: 110.38
    DAVG/cmd: 0.06
    KAVG/cmd: 0.00
    GAVG/cmd: 0.06
    QAVG/cmd: 0.00
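
    Dividing MBREAD/s by READS/s gives a rough average read size for each sample. This is a small Python sketch of that arithmetic, assuming esxtop's "MB" means 1024 KiB; the helper name is hypothetical.

```python
def average_io_size_kib(mb_per_s: float, ios_per_s: float) -> float:
    """Average I/O size in KiB, assuming 1 MB = 1024 KiB."""
    if ios_per_s <= 0:
        raise ValueError("ios_per_s must be positive")
    return mb_per_s * 1024.0 / ios_per_s


# Idle sample: MBREAD/s 2.57 at READS/s 622.37
print(f"idle: {average_io_size_kib(2.57, 622.37):.1f} KiB per read")
# Load sample: MBREAD/s 110.38 at READS/s 113026.56
print(f"load: {average_io_size_kib(110.38, 113026.56):.1f} KiB per read")
```

    Both samples work out to small reads of a few KiB, so the two runs differ mainly in I/O rate, not in I/O size.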

    esxtop load.png

    I have opened a ticket with VMware but wanted to ask the community if anyone has seen anything like this.
    Also, I was on the phone with Dell ProSupport for about 3 hours, and they wanted me to call VMware since they could not find anything.
    All drivers and firmware on the storage are up to date.
    I will not put this server into production until an answer is found. My fear is that I will move servers over and there will be an issue.

     



  • 2.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 12, 2023 03:11 PM

    Hi, it sounds really strange, and I don't think there is something wrong with the hypervisor.

    If the machine is not yet in production, you could try a Linux live CD and test the virtual volume performance (I guess you'll need to destroy the current datastore and format it with a Linux-usable filesystem).



  • 3.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 15, 2023 01:47 PM

    The KAVG/cmd is kernel-related latency, and that negative value definitely does not look good.

    If this server is not in production, I'd recommend dissolving the R5 volume and checking whether that resolves the value. I've seen some very weird stuff when using RAID volumes. Using the "loose" disks would let you test whether it is the configuration of the controller or the controller itself.



  • 4.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 17, 2023 01:42 PM

    So I deleted the RAID 5, and the issue still shows up on the BOSS RAID 1 where ESXi is installed.
    It no longer shows the negative value, but this is still concerning.

    2023-07-17 08_37_11-esx7.illinoiseyecenter.com - PuTTY.png

     

    I have VMware and Dell both working on the issue because I am not sure who is to blame.

    Also I have two other identical servers like this one with the same exact issue.

    I will let the community know what is found out in case anyone else runs into this issue.



  • 5.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 19, 2023 11:54 AM

    I remember that the PERC cards (including the BOSS controllers) may have difficulties sharing IRQs and that this could be fixed by resetting the configurations on those cards. But it has been a while and I haven't done that for a long time.

    Did support already check the Firmware on the cards? Perhaps (re-)flashing the firmware on the cards will alleviate the problems you are having.



  • 6.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 19, 2023 02:52 PM

    Thanks for the suggestion. This morning I reset the BOSS and PERC RAID configurations and recreated the BOSS volume.

    The issue still appears.

    I am going to reload the system using ESXi 8.0 instead of 8.0U1 to see if it has something to do with that release.



  • 7.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 21, 2023 07:19 PM

    Hello  ,

    After starting esxtop, press "u" to switch to the disk latency view, set the refresh interval to 2 seconds, and see whether the DAVG value goes high.

    If the DAVG value goes high on any disk in the "u" view, the issue is with that disk; if that disk is local, use iLO/iDRAC/KVM to run the extended diagnostic check on the disks.
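
    Instead of watching the screen, the same check can be done on a capture taken in esxtop batch mode (for example `esxtop -b -d 2 -n 30 > capture.csv`). The Python sketch below scans such a CSV for DAVG columns whose peak exceeds a threshold. The counter suffix "Average Driver MilliSec/Command" is an assumption about the batch-mode header; the function name and the 20 ms threshold are hypothetical.

```python
import csv
import io

# Assumed suffix of DAVG columns in esxtop batch-mode (CSV) output.
DAVG_SUFFIX = "Average Driver MilliSec/Command"


def davg_peaks_over(csv_text: str, threshold_ms: float = 20.0) -> dict:
    """Return {column name: peak DAVG in ms} for columns above threshold_ms."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return {}
    columns = [i for i, name in enumerate(header) if name.endswith(DAVG_SUFFIX)]
    peaks = {header[i]: 0.0 for i in columns}
    for row in reader:
        for i in columns:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            peaks[header[i]] = max(peaks[header[i]], value)
    return {name: peak for name, peak in peaks.items() if peak > threshold_ms}


# Tiny synthetic capture (not real esxtop output) to show the shape of the result.
sample = (
    "ts,diskA Average Driver MilliSec/Command,diskB Average Driver MilliSec/Command\n"
    "t1,0.05,136384.03\n"
    "t2,0.06,0.10\n"
)
print(davg_peaks_over(sample))
```

    Running it on a real capture would mean passing the file contents, e.g. `davg_peaks_over(open("capture.csv").read())`.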



  • 8.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 21, 2023 07:30 PM

    So I just checked that, and the DAVG is pretty much zero when it happens.

    esxtopu.png

    esxtopd.png



  • 9.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 21, 2023 09:27 PM

    Hello  ,

    The screenshot you shared of the local disk latency shows that the latency is near 0. This means the disk is performing well without any issues.

    The weird values shown for the HBA are a glitch; to address that, perform 2 tasks.

    1. Reboot the ESXi host and see if the issue appears again.

    2. Upgrade the driver and firmware to the latest compatible version to confirm that the HBA stats are clean.

     



  • 10.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 28, 2023 04:34 PM

    This is what support has told me about this.

    "The abnormal latency reported in the esxtop values is been investigated on by the Engineering team.
    I believe this behavior to be a cosmetic one, as there are no other issues reported on the host, however we would not be able to confirm the same until we have an confirmation and further action plan provided by the Engineering team."

     



  • 11.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 28, 2023 04:45 PM

    Hello  ,

    As I mentioned, it's a cosmetic issue and there is no latency. There are 3 possible action plans that the support team might suggest.

    1. Reboot the ESXi host and see if the issue is observed again.

    2. Upgrade the HBA driver and firmware to the latest compatible version as per the VMware HCL matrix.

    3. Upgrade ESXi to the next build.

    This is not a real issue, and you can continue using the same host in production.

     



  • 12.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jul 28, 2023 04:57 PM

    A reboot does not fix the issue.

    All HBA drivers and firmware have already been applied. This was done when I opened a ticket with Dell Support.

    I installed multiple versions of ESXi; I even went back to version 7.0U3n, which the HCL matrix lists as supported for the server.

    It happened on all versions.

    I was just updating the community in case anyone else has this issue.

    If I get any more information, I will pass it along.

    Thanks for responding.



  • 13.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 03:01 PM

    We have exactly the same issue with 8x Dell R760 servers and the Dell PERC12 H965i RAID controller: extremely high latencies on the local RAID with 6x NVMe SSDs. We tried different VMware versions with the Dell customized image, from 7 to 8 and so on. It seems there is a firmware or driver problem with ESXi. At the moment these systems are useless because performance is slower than with SATA drives.



  • 14.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 03:05 PM

    I can confirm that this is not cosmetic. On snapshot removal, latencies jump to 400 ms and higher. 6x NVMe SSDs in RAID 5.



  • 15.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 03:21 PM

    This is the last email I received from tech support. I am going to respond letting them know another customer is having the same issue.

    It looks like it's an intermittent issue on 8.0.1 and wasn't seen on the latest main, but since it was still seen in 8.0.1, we are now trying to root cause the issue on 8.0.1.
    As per the current investigation, the following are the observations.

    1.) These values go out of range on 8.0.1 and not on main (we are yet to confirm this with adequate experimentation)
    2.) It is seen on large IO sizes usually around 4M or higher.
    3.) On lower-size IOs we don't see this issue and values are just fine in that case.
    4.) This issue seems to occur only in nvme case not in scsi devices.


    With that said we are still investigating the root cause of the issue.



  • 16.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 03:35 PM

    Holy **bleep**. That explains why the latency jumps when a backup is running.



  • 17.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 03:58 PM

    Also, not sure of your configuration but I also had this issue with my R760's and error messages.

    But it was easily solved.

    Re: Failed to cleanup registration key on volume - VMware Technology Network VMTN



  • 18.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 08:01 PM

    Mhh, we use the BOSS card with VMFS for an HCI deployment with StarWind VSAN. I cannot delete the partition because the OS of the StarWind appliance resides on it. But we have no issues with the BOSS card, only with the PERC12 H965i.



  • 19.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 08:26 PM

    Dell released a new ESXi image for vSphere 7 yesterday:

     

    https://www.dell.com/support/home/de-de/drivers/driversdetails?driverid=x3djh&oscode=xi70&productcode=poweredge-r760

    There is also a new driver for the PERC RAID controller, which is not the native one. I will give it a try.



  • 20.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 01, 2023 08:41 PM

    I will also try the Dell customized image A12. It has bcm_mpi3 version 8.1.1.0.0.0-1OEM integrated.

    https://www.dell.com/support/home/de-de/drivers/driversdetails?driverid=pk7wn&oscode=xi70&productcode=poweredge-r760



  • 21.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 05, 2023 07:51 PM

    8.0U1 A04 was also released. I updated one of my servers to this version and my issue still persists.

     



  • 22.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 06, 2023 06:21 AM

    Yeah, I also tried it with no luck.



  • 23.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 06, 2023 07:27 AM

    Tried the latest Broadcom driver from the Broadcom website.

    Both ESXi systems crashed with high kernel latency on the Dell H965.

     



  • 24.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 10, 2023 02:49 PM

    At the moment there is no solution. I tried every driver I could find.

    We got latency spikes of 60 to 90 seconds. This means all virtual machines will crash.

     

     



  • 25.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 11, 2023 12:16 PM

    Have you opened a case with Dell or VMware?

    I would be interested in seeing what they tell you about it.



  • 26.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 11, 2023 01:40 PM

    Sure. They are investigating the log files. But unfortunately there is no solution yet. We are thinking about replacing the PERC12 with a PERC11.



  • 27.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 28, 2023 06:40 PM

    Any fix on your end?

    I have contacted Dell about replacing my NVMe setup with a SAS SSD setup.

    I will let you know what they say.



  • 28.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 29, 2023 06:15 AM

    Yeah, we replaced the H965i with an H755N RAID controller. After that we also deactivated the cache of the NVMe SSDs. The problem seems to be fixed. The H965i sucks.

    Over the last day, latency was under 1 ms.

     



  • 29.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 29, 2023 12:07 PM

    Found out some interesting things!

    The same problem with high latency on large block sizes also exists on the H755N with firmware 52.21.1-5149, released on 22.09.23.

    On firmware 52.21.0-4606 everything works great! Maybe you can give it a try and downgrade the H965i firmware to 8.0.0.0.18-86 too? Maybe Dell has implemented fixes for PERC11 and PERC12 in the latest builds, and the same problem occurs on the latest firmware for both controllers.

    PERC H965i RAID Controller Firmware Version 8.0.0.0.18-86 | Driver Details | Dell Germany



  • 30.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 29, 2023 12:20 PM

    Thanks. I will communicate this to dell. It won't let me downgrade the firmware with the packages available.....

     



  • 31.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 29, 2023 12:22 PM

    Hey, why can't you downgrade the firmware?

    If you can pass the RAID controller through to a Windows VM, it should work.

     



  • 32.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 29, 2023 12:34 PM

    The only file format it shows for older firmware is a BIN file, and I have not done an update using a BIN file before.

     

     



  • 33.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 29, 2023 01:10 PM


  • 34.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Sep 29, 2023 01:42 PM

    Thanks. I downgraded and it still does the same thing with latency.

    I did get this from vmware on my specific issue.

    The Engineering team have shared the update that:

    “We have been actively debugging this issue, but looks to be a tricky one. We have added debug logs from where the stats are fetched, but we see no anomalies there, yet esxtop reports high and negative stats sometimes. We have not yet root caused the issue, debug is still in progress.”

    I will keep you updated with the progress.



  • 35.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 02, 2023 02:23 PM

    Can you confirm this issue occurs with RAID 1 or RAID 10 and not just RAID 5? Everyone I've seen reporting this issue so far was using a RAID 5 configuration. I am looking at buying an R760 and am now considering picking up the PERC 11 version for now, though I am not sure how easy it would be to upgrade in the future.



  • 36.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 02, 2023 03:49 PM

    In my situation it happens on the BOSS Card in a RAID 1 and the NVMe RAID 5.

    My personal thought is it is the NVMe part of the setup that is causing the issue but I cannot confirm this until Dell replaces my configuration with a SAS SSD setup or something else.

    Dell is currently working on something with it, i will let you know what i find out.

     



  • 37.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 02, 2023 04:37 PM

    Great- thank you for the information. Can you confirm your Power mode is set to Performance in the BIOS and in VMWare?



  • 38.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 02, 2023 04:57 PM

    Yes. Both are set. I just received a message from my rep at Dell saying they are actively working on the issue.



  • 39.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 08, 2023 04:40 PM

    I am also seeing this latency issue.  We just implemented 4 R760s running VMware ESXi 8.0.1, 22088125, Dell custom ISO A04, with vSAN 8 ESA.  We use BOSS in RAID 1 for the OS and have 6 NVMe drives at 3.49 TB per host.  I see high latency on the storage path.  Disk latency looks to be very low, as indicated in this thread.  The HBA is reporting very high latency in the hundreds of thousands, as high as 500,000.  Looks like I will be transferring 21 TB back to the old system.

     

    Storage path from one of the R760's.  All hosts have high numbers

    pcie.b100-pcie.0:0-eui.36563130575165740025384300000003  Read latency  Average  ms  238,185  262,582  23,730  238,283.28
    pcie.c400-pcie.0:0-eui.36563130575163600025384300000003  Read latency  Average  ms  44,618  2,241.194
    pcie.100-pcie.0:0-t10.NVMe____Dell_BOSS2DN1____________________________0100000992435000  Read latency  Average  ms  29,998853,652047,928.68
    pcie.b000-pcie.0:0-eui.36563130575163640025384300000003  Read latency  Average  ms  42,699  1,616.539
    pcie.c300-pcie.0:0-eui.36563130575165750025384300000003  Read latency  Average  ms  16,502  500,096  21,591.111
    pcie.ae00-pcie.0:0-eui.36563130575165720025384300000003  Read latency  Average  ms  49,460  1,970.4
    usb.vmhba32-usb.0:0-mpx.vmhba32:C0:T0:L0  Read latency  Average  ms  0
    pcie.af00-pcie.0:0-eui.36563130575165770025384300000003  Read latency  Average  ms  52,370  2,486.539
    pcie.af00-pcie.0:0-eui.36563130575165770025384300000003  Write latency  Average  ms  5,761  491,199  1,674  34,693.3
    pcie.b000-pcie.0:0-eui.36563130575163640025384300000003  Write latency  Average  ms  144,727  7,676.672
    usb.vmhba32-usb.0:0-mpx.vmhba32:C0:T0:L0  Write latency  Average  ms  0
    pcie.b100-pcie.0:0-eui.36563130575165740025384300000003  Write latency  Average  ms  1,039  214,675  4,763.828
    pcie.c300-pcie.0:0-eui.36563130575165750025384300000003  Write latency  Average  ms  1,031  249,404  310  37,498.707
    pcie.100-pcie.0:0-t10.NVMe____Dell_BOSS2DN1____________________________0100000992435000  Write latency  Average  ms  26,136  1,393.111
    pcie.ae00-pcie.0:0-eui.36563130575165720025384300000003  Write latency  Average  ms  19,114  496,225  31,999.482
    pcie.c400-pcie.0:0-eui.36563130575163600025384300000003  Write latency  Average  ms  135,047  6,945.561


  • 40.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 08, 2023 05:51 PM

    After I posted this, it occurred to me that we are running vSAN, so we wouldn't need a RAID controller. I went into the inventory, and this is all we have controller-wise:

    405-AACD : No Controller

    403-BCRU : BOSS-N1 controller card + with 2 M.2 480GB (RAID 1) for the OS.

    Any ideas?



  • 41.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 06:54 AM

    Hey Guys,

     

    we found out some very interesting things.

    First one: abnormally high latency inside esxtop is a known issue, and Dell has updated the Known Issues section:

    VMware vSphere ESXi 8.x on Dell PowerEdge Systems Release Notes | Dell Germany

    Second one: we also had real latency issues with the H755N AND H965i RAID controllers in 2x R760xs systems.

    After hours of troubleshooting, we connected the RAID cards to a different SL connector on the server mainboard.

    After that, all latency issues were gone!! If we connect them back, restart the host, and run copy tests, latency jumps back to 300 - 1000 ms. Really, really strange. We can replicate this issue on other Dell R760xs systems too.

    The really strange thing is that this issue also occurs on the R760, which has a different mainboard design. We have 6x Dell R760 with dual H965i RAID controllers. One of the RAID cards per server has the same problem, on every system: one RAID card works fine, the other has latency issues. On R760 dual-RAID configs it is not possible to easily connect the cards to another connector on the mainboard because both connectors are in use. On the R760xs single config this is possible.

    In the factory config of the R760xs, the RAID cards were connected to SL5_CPU1_PA3 on the mainboard. We connected them to SL4_CPU2_PA2 and then latency was great. The problem exists with different RAID cards, so it cannot be the card itself.

     



  • 42.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 07:08 AM

    We also downgraded to vSphere 7 because on 8 we had problems with the iSCSI iSER adapter, which was lost after reboot.



  • 43.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 12:37 PM

    So are you thinking it has something to do with the mainboard?

    I was finally connected to a couple of Dell senior support engineers, and they had me download an SLI ISO to test the setup using fio and iostat.

    I broke the RAID and set all the NVMe drives to non-RAID, then I recreated the RAID and ran the test.

    Everything was sent to Dell this morning, so I will have to wait and see what they say.

    Also, I have told them to read through this thread to see what everyone else is saying.

    I hope to get a solution at some point from Dell or VMware, because I have a bunch of new equipment that is just bricks right now.



  • 44.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 12:49 PM

    Yeah, it can be. At the moment our systems are running absolutely fine without latency issues since we changed the mainboard connector from SL5_CPU1_PA3 to SL4_CPU2_PA2.

     

     

     



  • 45.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 12:51 PM

    Yeah, I can absolutely understand you. We are in the same boat. We had 6x R760 servers, which Dell will replace with systems with a single H755 RAID controller and SAS SSDs (instead of NVMe). Hopefully these systems will run better.



  • 46.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 12:54 PM

    Do you have a single RAID controller config? If yes, you can try attaching the data cable to the second connector on the mainboard and check whether the latency problems still exist. Be careful: you have to clear the config in iDRAC because of validation errors after you change this port.



  • 47.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 12:59 PM

    Each server has a dual H965i, but drives are only connected to one of the two raid cards.

    Once dell support gets back to me I might try doing that.



  • 48.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 01:08 PM

    Okay, then you can try attaching 3 drives to the Controller 1 backplane and 3 drives to the Controller 2 backplane.

    Create RAID 5 volumes and datastores in VMware and check performance. Maybe you will see that the second controller has no problems. Then you have exactly the same problem we have.

     

    Controller 1 (left) is connected to SL1_CPU2_PA1

    Controller 2 (right) is connected to SL3_CPU1_PA2

    Regardless of which controller is connected to SL3_CPU1_PA2, that one has latency issues under VMware (local) AND via PCIe passthrough.

     



  • 49.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 01:16 PM

    Did you run the fio test with big block sizes (>=64KB)?

     

    We found out that the latency problem only occurs with big block sizes.



  • 50.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 01:19 PM

    128k. This is the configuration I used for the test.

    [global]
    rw=write
    numjobs=1
    iodepth=128
    ioengine=libaio
    time_based
    runtime=600
    bs=128k
    direct=1



  • 51.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 01:29 PM

    I think a 128k block size is too small to reproduce the issue.

     

    Can you please check how the results look with 256K, 512K and 1MB block sizes?

    We saw the problem with block sizes bigger than 512K.
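
    To sweep several block sizes without editing the job file each time, a small Python helper can print one fio command line per size. This is only a sketch: the options mirror the job file posted above, the function name is hypothetical, and `/dev/sdX` is a placeholder target (sequential writes to a raw device destroy its contents).

```python
import shlex


def fio_command(block_size: str, target: str, runtime_s: int = 600,
                iodepth: int = 128, numjobs: int = 1) -> str:
    """Build a sequential-write fio command line for one block size."""
    args = [
        "fio",
        f"--name=seqwrite-{block_size}",
        f"--filename={target}",
        "--rw=write",
        f"--bs={block_size}",
        f"--iodepth={iodepth}",
        f"--numjobs={numjobs}",
        "--ioengine=libaio",
        "--direct=1",
        "--time_based",
        f"--runtime={runtime_s}",
    ]
    return shlex.join(args)


# Print one command per block size; /dev/sdX is a placeholder, not a real device.
for bs in ["256k", "512k", "1m", "4m"]:
    print(fio_command(bs, "/dev/sdX"))
```

    The 4m entry is included because VMware support mentioned I/O sizes around 4M or higher earlier in the thread.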



  • 52.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 02:34 PM

    This is on a RAID5 with 7 drives. Block size 512k

    slciec_0-1696861985015.png

     

    Block size 1024k

    slciec_1-1696862008814.png

     



  • 53.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 08:05 PM

    So, unless I am mistaken- those results look good? Those are based on being plugged into which backplane connector?

     

    What does it look like if you set numjobs= to a higher value?



  • 54.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 09, 2023 08:35 PM

    I increased the jobs to 16, and this is what I got back.

    slciec_0-1696882522992.png

    I also ran this to measure random read/write performance.

    fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=sbd --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

    Results

    slciec_1-1696883647370.png

     

    slciec_2-1696883654692.png

    The numbers look fine to me unless I am reading something wrong.

    Also, I would note this test was not done with ESXi installed. I am using a Dell image running Rocky Linux 8.8 with all the appropriate drivers and test utilities on it.

    After running all these tests, I get the feeling this is not specifically hardware related as much as it is ESXi related, with drivers or something.

     



  • 55.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 10, 2023 07:45 AM

    Yeah, I absolutely agree that it can be a problem with ESXi and PowerEdge x60 systems. We didn't see the latency problems when Microsoft Windows was installed locally with the latest driver. The problem only occurred on VMware ESXi, with or without PCIe passthrough. Maybe there is something wrong with the PCIe bus or something else. The problem exists only on one connector on the mainboard with VMware.



  • 56.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 10, 2023 06:04 PM

    Dell got back to me and basically told me the thing we already knew.

    "The hardware is performing as expected. While in the Support Live Image, all the drives observes extremely low latency times on the tests you performed, and the overall performance was very good and pretty consistent. All of this does point toward ESXi/VMware being the bottleneck, unfortunately."

    I updated my ticket with VMware, but if I don't hear back from them I am unsure what to do next.

     



  • 57.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 17, 2023 05:15 PM

    VMWare got back to me and told me this is a cosmetic issue and the numbers I am seeing are wrong.

    slciec_0-1697562880580.png

     

    So in my case this is probably true since I don't see any issues when running load test on the virtual machine.

     



  • 58.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Oct 18, 2023 06:33 AM

    Hey,

    as I said, the high latency values in esxtop are a bug and a cosmetic thing in VMware vSphere.

    But we saw real latency issues in our environment inside Windows VMs with PCIe passthrough enabled. On a local file copy, latency jumped to 60 sec and more and the datastores crashed.

    We can replicate the issue without installing StarWind VSAN.



  • 59.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Nov 30, 2023 05:55 PM

     

    Hello Pal, 

     

    Greetings for the Day.

    Hope you are doing well. 

    I see there is huge latency, and with this latency your virtual machines will be very slow.

    We can do a deep dive and pull the storage sense parameters.

    Run the commands and, if you are interested, share the output.

    ssh to host 

    cd /var 

    cd run 

    cd logs or log 

    then run command 

    Top 10 instances of "performance has deteriorated" in vmkernel.log:
    cat vmkernel.log | grep "performance has deteriorated" | awk '{print $20, $21}' | sort -nr | head -10

    Count number of times "performance has deteriorated" per LUN:
    cat vmkernel.log | grep "performance has deteriorated" | egrep -o "eui.[0-9a-f]+_*" | sort | uniq -c | sort -nr
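
    The per-device count can also be done offline on a copy of the log. The Python sketch below assumes the warning has the form "Device <id> performance has deteriorated"; that wording, the function name, and the sample lines are illustrative assumptions, not output from the hosts in this thread.

```python
import re
from collections import Counter

# Assumed message shape: "Device <id> performance has deteriorated. ..."
PATTERN = re.compile(r"Device (\S+) performance has deteriorated")


def count_deteriorated(lines):
    """Return [(device id, count), ...] sorted by count, highest first."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common()


# Synthetic lines for illustration only.
sample_lines = [
    "WARNING: Device eui.aaaa performance has deteriorated. I/O latency increased",
    "some unrelated vmkernel message",
    "WARNING: Device eui.bbbb performance has deteriorated. I/O latency increased",
    "WARNING: Device eui.aaaa performance has deteriorated. I/O latency increased",
]
print(count_deteriorated(sample_lines))
```

    For a real log, pass an open file handle instead of the sample list, for example `count_deteriorated(open("vmkernel.log"))`. An empty result matches what the grep pipelines would show when the message never appears.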


  • 60.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Dec 05, 2023 01:51 PM

    Sorry for the late reply, I have been out.

    I have looked thru the logs and do not see any instances of "performance has deteriorated".

    The only thing I have to go off of is the latency numbers.



  • 61.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Dec 05, 2023 01:59 PM

    Please run the commands below and share the output.

    egrep "H:0x" /var/run/logs/vmkernel* | grep -v "H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x20 0x0" | grep -v "H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0" | grep -v "Error" | awk '{print $1, $5, $13, $15, $16, $17, $21, $22, $23}' | sed 's/,/ /g' | sed 's/ / || /g'| grep -v vmhba | sort -u -k6

    egrep "H:0x" /var/run/log/vmkernel* | grep -v "H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x20 0x0" | grep -v "H:0x0 D:0x2 P:0x0 Sense Data: 0x5 0x24 0x0" | grep -v "Error" | awk '{print $1, $5, $13, $15, $16, $17, $21, $22, $23}' | sed 's/,/ /g' | sed 's/ / || /g'| grep -v vmhba | sort -u -k6

    grep "I/O error" /var/run/log/vmkernel* | awk '{print $1, $6, $10}' | sed 's/ / || /g' | sort -u -k5 | egrep -v "table|for"

    grep "I/O error" /var/run/logs/vmkernel* | awk '{print $1, $6, $10}' | sed 's/ / || /g' | sort -u -k5 | egrep -v "table|for"

     

    grep APD vobd.log | grep esx.problem.storage.apd.start
    grep -ve " 0x85" -e " 0x4d" -e " 0x1a" -e " 0x12" /var/log/vmkernel.log | less 
    grep -ve " 0x85" -e " 0x4d" -e " 0x1a" -e " 0x12" vmkernel.log | grep "sense data" | less
    grep "Not found (APD)" tdlog/logs/vmkwarning.all | grep -i 2022-09-15T12:2 | awk '{print $9}' | sort | uniq -c

     



  • 62.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Dec 05, 2023 02:29 PM

    There was no output from any of the commands.



  • 63.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Dec 05, 2023 02:38 PM

    If you are comfortable, we can do a Teams meeting for 10 minutes.

     



  • 64.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Dec 05, 2023 02:40 PM

    Click here to join the meeting

     

    Meeting ID: 296 308 313 21
    Passcode: AtNZwn

    Download Teams | Join on the web

    Learn More | Meeting options



  • 65.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Dec 05, 2023 02:41 PM

    Thanks, but at this time it is not necessary.

    The only latency I am seeing is write latency.

    I still have a ticket open with vmware and will follow up with them.



  • 66.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jan 09, 2024 08:23 PM

    Hey ,

    any news on this issue?

    Cheers,
    Wolfram



  • 67.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jan 09, 2024 08:32 PM

    No. My case is still open and they have not released a patch since 9-21. The last communication I received was this.

    Thank you for being patient with us.

    I would like to inform you that We are still trying to figure out a root cause. In the interim we will be creating a KB stating these are false positives and will not affect functionality or performance.

    The process of creating KB is started and we will update you once we have it ready for customer access.

    Again in my case it does not seem to be affecting the performance based on my monitoring, but I know others have had issues.



  • 68.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jan 09, 2024 08:51 PM

     Ah, sorry, I mixed it up -- it was  who was experiencing the real latency issues with PERC12 H965i when connected to SL5_CPU1_PA3 and no issues when connected to SL4_CPU2_PA2.

     Any news from your side from VMware and/or Dell support regarding these issues?



  • 69.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jan 09, 2024 09:29 PM

    Unfortunately the case was closed by Dell. Since then we have never built an ESXi server with the H965i RAID controller. We configured every new system with H755 RAID cards. But at the moment it seems that we will move the whole infrastructure from VMware to another vendor because of the termination of our VCPP.



  • 70.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted Jan 09, 2024 09:41 PM

     Oh, wow, all of that sucks... m(



  • 71.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted May 16, 2025 02:16 PM

    I think we have run into a similar issue.  R760, NVMe HWRAID, ESXi 8.  Performance is great at first, but after a couple of hours it drops 90%.  We wiped the array and rebuilt it as RAID 10, and it seems to be better.




  • 72.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted 4 days ago

    @ARTHUR DOGRAMACIAN

    We are seeing the same issue.  Dell R660 (basically a 760 in a different form factor).  NVMe disks on the BOSS controller, which has ESXi installed on it.

    Data disks (vms) are connected to SAN.  

    For the vmhba0 which is the ESXi install BOSS we see extreme #s and VMs now have latency since moved to this host... 




  • 73.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted 3 days ago

    @Ed Fraser

    We ended up setting drives up in a RAID10 configuration, as no matter what we did, RAID5 would degrade for no reason whatsoever.  Also, controller refused to ever let us add disks to an existing array.  Maybe that feature is listed but unsupported with NVME HWRAID, but I couldn't find that in any documentation.  Dell also wants through on any virtual drive created with NVME HWRAID.




  • 74.  RE: Weird disk latency issue on new R760 with onboard storage. Please help.

    Posted 3 days ago

    @ARTHUR DOGRAMACIAN Our BOSS NVMe disk pair with ESXi on it is a RAID 1 mirrored set. I wonder if that matters.

    We also have an H965i but it only has a single disk on it.  Not used for data or anything.  Just sitting there idle.