
vSAN 7.0 poor write performance and high latency with NVMe

  • 1.  vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 09, 2020 08:02 PM

    Hi All,

    Having some vSAN write performance issues, I would appreciate your thoughts.

    The basic spec;

    5x vSAN ready nodes: 2x AMD EPYC 7302 16-core processors, 2TB RAM, 20x NVMe disks across 4 disk groups, and 4x Mellanox 25GbE networking with jumbo frames configured end-to-end.

    When running any workloads, including HCIBench, we are observing really poor write performance. See below: 30 minutes of 30+ms write latency. Reads are through the roof at 400k+ IOPS, but writes sit between 20-40k IOPS depending on parameters. It took 12 hours to consolidate a 10TB snapshot the other day!

     
     
     

    [Screenshot 2020-11-09 194510.jpg]

    Things I have tried:

    • Disabled vSAN checksum - this made a ~2k IOPS improvement.
    • Followed the AMD tuning guide: NPS=1, which is the default and suits the workload.
    • Increased the stripe width from 1 to 2; this improved reads but made writes worse.
    • De-dupe and compression are not enabled.
    • Tried mirroring and FTT=0; some small improvement but nothing significant.
    • Fully patched, both hardware (firmware) and software.

    Notes:

    • vSAN insight shows no issues.
    • Really expected 60K+ write IOPS.

    Any ideas, please? We really expected better.



  • 2.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 09, 2020 09:42 PM

    Hi there!
    It seems you've already done some deep troubleshooting, so let me ask about a few things you didn't mention.

    - Are your hosts compatible with ESXi 7.0?
    - Are your disks on on-disk format version 11?
    - Have you checked that your NIC and HBA firmware and driver versions are up to date? I believe they are, since you mentioned hardware and software are up to date.
    - Can you run proactive tests? https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vsan-monitoring.doc/GUID-B88B5900-33A4-4821-9659-59861EF70FB8.html
    - Did you run HCIBench tests with different block sizes?
    - Do you have the vSAN VLAN and VMkernel interface separated from management and vMotion?



  • 3.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 10, 2020 08:57 AM

    Thanks for helping here.

    • Yes, all hosts are on the HCL from a major vendor; they are vSAN ready nodes.
    • Yes, disks are on v11 and the data has been redistributed.
    • Support have checked that the firmware is all up to date as per the VMware HCL.
    • Proactive tests report no issues.
    • We ran easyrun and made our own 60/40 64k parameter file.
    • vSAN network is dedicated pNICs and VLAN.


  • 4.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 09, 2020 09:47 PM

    Hi,

    Are you running 7.0 or 7.0 U1 here?


    "Took 12 hours to consolidate a 10TB snapshot the other day!"
    Are you running HCIBench alongside a production workload? This is not in any way advisable, and it actually means the benchmarks are not a valid baseline (as they are contending with the other workloads for resources and available storage).
    Why would anyone have 10TB snapshots lying around in their environment?


    Specifically, which NVMe models are in use for the cache and capacity tiers (and with what driver + firmware combination)?


    When you tested (anything) with FTT=0, did you configure it so the FTT=0 vmdk(s) ran/were deployed only on the node where the data was stored? (Otherwise it isn't really a good test, as the IO still has to traverse the inter-node network to commit writes.)


    Just so that you are aware (and this applies to all storage in general): not all IOPS are equal, so stating that X is expected to do Y IOPS doesn't really paint the full picture. For example, pushing 100,000 4K IOPS is the same throughput as 6,250 64K IOPS. From the throughput in the picture you shared this looks to be a ~64K block size (though there is no way of telling whether it is, say, half 4K and half 512K).
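
    To put rough numbers on that, here is a quick check with bc (using the figures from the example above, not measurements from this cluster):

    # throughput = IOPS x block size
    echo "100000 * 4 / 1024" | bc    # 100,000 IOPS at 4K -> ~390 MB/s
    echo "6250 * 64 / 1024" | bc     # 6,250 IOPS at 64K  -> ~390 MB/s (same throughput)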


    I would advise opening a Support Request with vSAN GSS, we have a dedicated team for Performance cases.



  • 5.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 10, 2020 09:05 AM

     

    Thanks for taking a look at this:

    • It started on 7.0 and is now on 7.0 Update 1 (build 16850804), kernel 7.0.1 (x86_64).
    • No, we are not running HCIBench alongside anything; that would be madness.
    • 10TB snapshot - long story. Basically, someone created a snapshot when the VM was built last week and forgot about it. Then the DBA restored 10TB of data!
    • Cache disks are all Dell Express Flash PM1725b 1.6TB SFF, F/W 1.1.0.
    • Capacity disks are Dell Express Flash PM1725b 3.2TB SFF, F/W 1.1.0.
    • The FTT=0 test was to eliminate the vSAN network in case that was the issue; I presumed all writes would start local to the host?
    • It was a 64K test, so I'm impressed you can see that from the image.

    Looking forward to hearing from you soon.



  • 6.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 11, 2020 07:55 PM


     

    Can you update the disks to on-disk format (ODF) v13? I suggest this because, in any vSAN update that introduces a new ODF version, the vast majority of the performance enhancements only come into effect once the disks have been upgraded to that new version (which is why these versions exist at all, aside from specific feature enablement, e.g. encryption in v5).
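
    If you want to check where the disks currently sit from the CLI rather than the vSphere Client, something along these lines should show it per disk (the exact field name can vary slightly by build):

    esxcli vsan storage list | grep -i "format version"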

     

    Thanks for clarifying that you aren't running HCIBench while other workloads are running, but are you running it on an empty vsandatastore? If not, this can have implications. The main two are that the caches already have data on them, and that data stored on the disk groups may limit (and/or dictate) where test data can be placed. In an extreme case (e.g. if utilisation in the cluster or on certain disks is relatively high), the test data could in theory push individual disk utilisation above 80% (the default CLOM rebalance threshold), and the test would then be in contention with a reactive rebalance. Are you flushing the cache between tests, and have you checked the per-disk storage utilisation via RVC during these tests (see the example below)?
    If you are running this alongside other data that cannot be moved off temporarily, and you have the resources available to evacuate one node, you could test it as a 1-node vSAN (I know, only FTT=0 then, but it will give a good idea of per-host capabilities).
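
    For reference, a per-disk utilisation check from RVC looks roughly like this (the vCenter address and inventory path are placeholders for your environment):

    # connect to vCenter with RVC, then run the disk stats command from the RVC prompt
    rvc administrator@vsphere.local@vcsa.example.local
    > vsan.disks_stats /vcsa.example.local/Datacenter/computers/vSAN-Cluster
    # keep an eye on the per-disk used-capacity percentage while the tests run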

     

    As an aside on how long a snapshot of a given size takes to consolidate: this isn't just a matter of how much data the cluster can write. As you are aware, this isn't the only VM using the cluster, and what can determine consolidation time even more is the VM's own usage of the snapshot and base-disk data during that period (plus other sources of contention such as backups).

     

    Regarding the FTT=0 tests done already - I haven't played with HCIBench in quite some time but I do recall at one point there being some issue with placement of FTT=0 data not being 'pinned' to the respective host(s) as it is supposed to (or at least expected to).

     

    What VM layout and numbers were you running during these tests? Is it possible that it was just pushing I/O against a very limited number of components, across a very limited number of vmdk objects, on a very limited number of disks?


    Good points have been made above, and you should be aiming to dig deeper rather than focusing on one set of graphs in isolation; vSAN Observer data and esxtop data can also help with this.

     

    My guess of 64K is nothing to be impressed about: average IO size = throughput per second divided by IOPS.
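
    For example, plugging in the round numbers from the original post (illustrative only, since the exact throughput isn't readable here), ~40,000 write IOPS at 64K works out to roughly 2,500 MB/s, and dividing that back out recovers the IO size:

    echo "40000 * 64 / 1024" | bc      # 40,000 IOPS at 64K -> ~2500 MB/s
    echo "2500 * 1024 / 40000" | bc    # 2500 MB/s / 40,000 IOPS -> 64 (KB average IO size)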



  • 7.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 10, 2020 09:28 AM

    Hi,

    First of all, you should check the metrics on the backend (select host > Monitor > vSAN > Performance > Backend/Disk/etc.) to find out where the bottleneck and latency spikes appear. This schema could help you:

     

    Secondly, I'd suggest running HCIBench with these parameters: 3 VMs per host, 5 vmdks per VM, 30GB vmdk size, 4K 100% write, 30 min warmup and a 1h test, with initialization on (zero if there is no dedupe & compression on the cluster, random if you have enabled it). After the test you can save the observer data ("save results" button), which contains information from every vSAN module. Screenshots from it could help to find the issue.
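
    If you also want to reproduce a similar write profile outside HCIBench (HCIBench drives fio/vdbench for you), a stand-alone fio run along these lines is a rough approximation; the device path, queue depth and job layout are assumptions rather than part of the suggestion above:

    # run inside a test VM against a dedicated, empty test disk (this writes to it directly)
    fio --name=vsan-4k-write --filename=/dev/sdb --rw=randwrite --bs=4k \
        --direct=1 --ioengine=libaio --iodepth=4 --numjobs=5 \
        --ramp_time=1800 --runtime=3600 --time_based --group_reporting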



  • 8.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 08, 2021 07:46 AM

    Hello, did you solve this problem? We have a similar issue with all-flash SAS on a 10Gbit network.

    4 hosts, 3 disk groups.

     



  • 9.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 08, 2021 08:43 AM

    Hi,

    Not really, though upgrading to 7.0.2 helped. In the end Dell arranged for a VMware SME to engage; he called out some minor tweaks but ultimately said it's correctly configured and working as expected. Given the all-NVMe disks and 4x 25GbE networking, I still don't think it's running A1.



  • 10.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 08, 2021 08:56 AM

    Same here. I'm trying 4x 10Gbit LACP but still see high latency, and sometimes congestion. We are considering buying a 25Gbit switch, but I don't know if it will help.



  • 11.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 14, 2022 12:41 PM

    Do you still have this issue, or is it solved?



  • 12.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 14, 2022 01:19 PM

    We reduced the write IOPS going to vSAN and now it's fine, but we are waiting for 25Gbit NICs to switch vSAN from 10G to 25G, so I hope that will help. Then I can try the previous setup and let you know.



  • 13.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 15, 2022 12:57 PM

    What kind of workload are you running? What GuestOS?

    Or are you just running HCIBench against the environment?



  • 14.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 15, 2022 01:18 PM

    Windows, SQL workloads. We also used HCIBench.



  • 15.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 15, 2022 01:43 PM

    OK, interesting. Have you set the PVSCSI adapter to the recommended values mentioned here: https://kb.vmware.com/s/article/2053145? If so, set it back to the defaults and test again with SQL.

    What kind of workload profile have you tested with HCIBench?



  • 16.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 15, 2022 01:44 PM

    Thanks, yes VMware tried all these. I have moved away from this project now. Hope you get yours sorted.



  • 17.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 15, 2022 02:12 PM

    Ah ok no worries. Just wanted to see if you had the same issues as I had. I sorted it out, just wanted to help.



  • 18.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 15, 2022 02:17 PM

    What was your issue?



  • 19.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Jul 15, 2022 02:35 PM

    We had Splunk running on Linux, and by default Linux runs with a max sector size of 512K. That means an IO can have a block size as large as 512K, and Splunk was writing at that size. vSAN takes this and splits it into 64K blocks. In addition, we had the maximum PVSCSI settings configured. After changing the max sector size to 64K, cmd_per_lun to 32 and ring_pages to 8, we went from 77ms down to 3ms at the VM layer and everything worked flawlessly.
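
    For anyone wanting to replicate this, the guest-side changes described above look roughly like the following on Linux (the device name and modprobe file path are illustrative; the values are the ones from this post):

    cat /sys/block/sdb/queue/max_sectors_kb        # typically 512 by default
    echo 64 > /sys/block/sdb/queue/max_sectors_kb  # cap guest IO size at 64K (not persistent across reboots)
    # PVSCSI queue settings, persisted as module options (take effect after a reboot / initramfs rebuild)
    echo "options vmw_pvscsi cmd_per_lun=32 ring_pages=8" > /etc/modprobe.d/vmw_pvscsi.conf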



  • 20.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Nov 04, 2022 05:15 PM

    Hi Brian

    I seem to be facing this exact issue with Splunk on vSAN. I see 512K blocks getting thrown at vSAN and have confirmed with our Linux admins that max_sectors_kb still defaults to 512, which seems to be the issue (I will confirm next week).

    I am curious though; we default PVSCSI in our Puppet configs to the settings per https://kb.vmware.com/s/article/2053145:

    vmw_pvscsi.cmd_per_lun=254

    vmw_pvscsi.ring_pages=32

     

    What made you move away from the large-scale IO PVSCSI settings after setting max_sectors_kb to 64K?



  • 21.  RE: vSAN 7.0 poor write performance and high latency with NVMe

    Posted Aug 02, 2022 10:03 PM

    We had major write latency with vSAN. After logging a call with VMware, they highlighted that it is a bug that has been fixed in 7.0 U3f:

    Thank you for sharing the requested details.

    I have reviewed the logs and below are my findings.
    Issue:
    High write latency on a disk group.

    Assessment:
    6 node vSAN all flash cluster with deduplication and compression enabled.
    ESXi version: ESXi 7.0 Update 3e (build-19482537)

    We could see that one of the disk groups on host '<hostname>' was reporting high log congestion, causing latency on the disk group; this could impact multiple VMs that have components placed on this disk group.

    [Image is no longer available]

    We could see the host '<hostname>' was taken into maintenance mode on 18th July at 11:16 UTC, which resolved the issue.

    2022-07-18T09:05:33.743Z: [GenericCorrelator] 5641261302076us: [vob.user.maintenancemode.entering] The host has begun entering maintenance mode
    2022-07-18T11:16:19.384Z: [GenericCorrelator] 5649106943053us: [vob.user.maintenancemode.entered] The host has entered maintenance mode

    There is a known issue with the current ESXi build when 'unmapFairness' and 'GuestUnmap' are enabled.
    Please find the KB# below:
    https://kb.vmware.com/s/article/88832?lang=en_us

    Resolution
    Update to vSAN/ESXi 7.0 U3f which contains the code fix for this issue.

    Workaround
    To disable unmap, SSH into each host in the cluster and run the following commands:
    # esxcfg-advcfg -s 0 /VSAN/GuestUnmap
    # esxcfg-advcfg -s 0 /LSOM/unmapFairness
    Place the host into maintenance mode with 'Ensure accessibility' and reboot the host to make the new settings active.
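
    To confirm where each host currently stands before and after the change, the same advanced options can be read back (assuming the option paths quoted above):
    # esxcfg-advcfg -g /VSAN/GuestUnmap
    # esxcfg-advcfg -g /LSOM/unmapFairness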