VMware vSphere


Problem with memory tiering and client VMs

  • 1.  Problem with memory tiering and client VMs

    Posted Dec 15, 2025 04:56 AM

    Hi there,

    It's in my homelab, so YES, it is unsupported hardware! But I'll be happy if somebody has some new ideas to investigate. 😊

    The context: a 3-node Minisforum MS-01 cluster with ESXi 8.0 U3g, all cores active (performance & efficiency) and 96 GB of RAM each - all identical - no vSAN

    I purchased 3 dedicated NVMe disks and wanted to activate memory tiering.
    From that point on, I started having issues with some VMs, with one thing in common: all are Windows 11 VMs.
    All other VMs didn't show any sign of trouble.

    Symptoms:
    - VM not reachable anymore, no network connectivity; it can happen while actively using it
    - black screen in the vCenter console
    - no VMware Tools reporting anymore
    - very high CPU consumption, which puts the cluster in a very bad state (DRS score below 30% when it's usually >95%)
    - the VM comes back as if nothing happened when migrated to another host, until it starts again; there is also no trace in the VM's Event Viewer

    What I've tried so far, without success:
    - host affinity rule (the VM still becomes unavailable at some point, even without moving, even when powered on on the host where it should be)
    - changed CPU allocation at start (both "assigned at power on" and the alternative)
    - different memory tiering ratios
    - changed the scheduler setting in the VM: bcdedit /set hypervisorschedulertype classic

    As soon as I turned the tiering feature off, everything was OK again (and it had been rock stable for months before).

    Any idea I could try out? Did somebody achieve proper memory tiering in a mini-PCs lab?

    Thanks in advance! πŸ˜‰



    -------------------------------------------


  • 2.  RE: Problem with memory tiering and client VMs

    Broadcom Employee
    Posted Dec 16, 2025 04:42 AM

    What kind of NVMe devices are you using? On Reddit a while back someone reported issues as well, and in their case the NVMe were thermal throttling as a result of the system overheating. 

    -------------------------------------------



  • 3.  RE: Problem with memory tiering and client VMs

    Posted Dec 16, 2025 05:00 AM

    Hi Duncan,

    I'm using Samsung 990 Pro disks (1 TB).
    They were installed on the fastest port of the MS-01s (the Gen 4 x4).

    I don't see why only Win11 VMs would be impacted so much by thermal throttling, but they are clearly reacting differently compared to my other workloads (mostly AlmaLinux + Windows servers).
    Would you have any clue about an architecture difference that could lead to this?
    Also: is there any way to check whether thermal throttling is happening on the host?
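    (For anyone looking for a starting point on the temperature question: one way to at least read the drive's reported temperature from the ESXi shell is the SMART interface. The device identifier below is a placeholder; list your own devices first.)

    ```shell
    # List storage devices to find the NVMe's identifier (usually t10.NVMe____...)
    esxcli storage core device list | grep -i nvme

    # Dump the SMART attributes for that device; the "Drive Temperature" row
    # shows the current value in degrees Celsius
    esxcli storage core device smart get -d t10.NVMe____Samsung_SSD_990_PRO_1TB_placeholder
    ```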

    What I'll try next on my side is to disable the efficiency cores on one host, pin the Win11 VMs to that host, and see if the issue still occurs.

    But of course, if you have any other idea that I could try, I'm open to any suggestion! 😊

    -------------------------------------------



  • 4.  RE: Problem with memory tiering and client VMs

    Broadcom Employee
    Posted Dec 16, 2025 05:55 AM

    Ah, you have P and E cores; ESXi doesn't support that, so that could indeed also be the problem. Sorry, I completely read over that. The 990 Pro normally shouldn't be an issue. Not sure if you can detect thermal throttling with your hosts, but in ESXi you may be able to check the system temperature. The Reddit post on the issue is here, by the way: https://www.reddit.com/r/vmware/comments/1mlutso/esx_9_nvme_tiering_literally_unusable_performance/

    -------------------------------------------



  • 5.  RE: Problem with memory tiering and client VMs

    Posted Dec 17, 2025 02:14 AM

    Hi Duncan,

    Yesterday, I tested deactivating the E cores, but unfortunately, it didn't change anything. My client VMs crashed after a while.

    I think it's a sinking ship. One of the options I still have is to upgrade to VCF 9 and hope it handles this better. But since my employer is currently unwilling to invest in VMware until they have final pricing for the licences and an overview of the alternatives, I was unable to renew my certification in time, and as a result I am no longer eligible for VMUG...

    In short, that's beside the point, but apart from reducing the ratio (as suggested by Dave) or leaving one host with memory tiering disabled to run these VMs on, there is not much else I can do.

    But in the process, I found something interesting:
    - when E cores are enabled, my host shows 14 logical processors, i.e. one per physical core
    - but when E cores are disabled, it shows 12, which corresponds to the 6 P cores with hyperthreading
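    (To compare the two settings, the host's own view of packages, cores, and threads can be read from the ESXi shell:)

    ```shell
    # Shows CPU Packages, CPU Cores and CPU Threads as ESXi sees them;
    # with Hyper-Threading active on a P-core-only setup, Threads should be 2x Cores
    esxcli hardware cpu global get
    ```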

    So, I'm really wondering which settings will perform best! 😊

    -------------------------------------------



  • 6.  RE: Problem with memory tiering and client VMs

    Posted Dec 17, 2025 02:25 AM

    Hi Duncan,

    Some news about the P & E cores: I disabled all E cores on one host, created affinity rules just for this host, and enabled memory tiering only on that host.

    Unfortunately, this did NOT solve the issue, and all 3 client VMs I have became unavailable after a while.

    So I guess this is a sinking ship: unless somebody comes up with some "magic parameter" for Windows client VMs, I'll remain stuck.

    And I'm unfortunately not v9 "eligible", as my employer didn't want to invest money to renew my certification, so I can't apply for VMUG licences anymore. Maybe next year, but if the alternatives are more mature by then, we might migrate within 2-3 years, which is exciting but also sad at the same time considering what I've dedicated to VMware for almost two decades.

    BUT, I learned something interesting switching off the E cores:

    • when all cores are activated, my host looks like a 14-core server

    • but when I disable the E cores, it looks like a 12-core one???

    Which is interesting because, according to the Intel website, with Hyper-Threading the total should be 20 threads (6x2 + 8), meaning the P cores are not seen as hyper-threaded IF the E cores are enabled. I'm wondering what the performance difference between the two settings would be.

    https://www.intel.com/content/www/us/en/products/sku/232135/intel-core-i913900h-processor-24m-cache-up-to-5-40-ghz/specifications.html

    I'll try reducing the ratio as suggested by Dave, but I might end up with one host out of 3 not using memory tiering and run the Win11 VMs on that one.

    -------------------------------------------



  • 7.  RE: Problem with memory tiering and client VMs

    Broadcom Employee
    Posted Dec 16, 2025 09:33 AM

    What ratio are you using? The differences between the tech preview version (8.0U3) and 9.0 are night and day in many areas, including performance. The recommended DRAM:NVMe ratio for 8.0U3 is 4:1, so only 25% comes from NVMe.
    Are you actually seeing pages move to NVMe? Is the NVMe device showing reads/writes?

    I would start by putting the ratio back to 25%, or even lower. Raising this ratio in 8.0U3 could be the culprit, in addition to the unsupported devices.
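    (For reference, in the 8.0U3 tech preview the NVMe tier size is exposed as an advanced host option: the NVMe tier as a percentage of DRAM, where 25 corresponds to the recommended 4:1 ratio. A sketch of checking and resetting it from the ESXi shell; the option name is from the tech-preview documentation, so verify it on your build. It takes effect after a host reboot.)

    ```shell
    # Show the current NVMe tier percentage (100 = NVMe tier as large as DRAM)
    esxcli system settings advanced list -o /Mem/TierNvmePct

    # Set it back to the recommended 25% (4:1 DRAM:NVMe), then reboot the host
    esxcli system settings advanced set -o /Mem/TierNvmePct -i 25
    ```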

    -------------------------------------------



  • 8.  RE: Problem with memory tiering and client VMs

    Posted Dec 17, 2025 02:18 AM
    Edited by fehret Dec 17, 2025 02:19 AM

    Hi Dave,

    I was using 100%, but my current workloads are really not heavy on those hosts.

    I don't even have enough workload to fill my 96 GB of RAM yet; I wanted memory tiering to spin up labs and nested VMs (which is also NOT recommended, I know).

    I still have an old Dell server with enough RAM, but in Europe the electricity bill hurts a little, and it's really not practical for lab work when you need 15-20 minutes to spin everything up.

    Regarding checking whether pages are moving to NVMe: do you have any recommendations/docs on how to do that?

    But I'll try the ratio down and let you know, thanks for the suggestion! πŸ˜‰

    -------------------------------------------



  • 9.  RE: Problem with memory tiering and client VMs

    Posted Dec 17, 2025 04:39 AM
    Edited by fehret Dec 17, 2025 04:38 AM

    Update on the ratio: not working either with 25%... and below that, there's no point! 🤣

    -------------------------------------------



  • 10.  RE: Problem with memory tiering and client VMs

    Broadcom Employee
    Posted Dec 17, 2025 07:17 AM

    I have not tried this on 8.x, but on the command line you can look at "memstats -r vmtier-stats" to see whether the tier is actively being used or not. Note, and this is often forgotten, we tier out memory when there's host pressure, not just because we can. There needs to be a reason, so if there's no reason, you won't see tiering.

    -------------------------------------------



  • 11.  RE: Problem with memory tiering and client VMs

    Broadcom Employee
    Posted Dec 17, 2025 10:40 AM

    I just tested it, and if you go to the command line you can indeed see the stats. Just as an example, I powered on a lab in my own environment with Memory Tiering, deployed some VMs, and overloaded the host to ensure there was memory pressure; below you can see that there are memory pages stored in Tier1 (NVMe).

    memstats -r vmtier-stats -u mb -s name:memSize:isTiered:active:tier1Target:tier1Alloc:consumed:tier1Consumed:tier1ConsumedPeak
    
     VIRTUAL MACHINE MEMORY TIER STATS: Wed Dec 17 15:27:27 2025
     -----------------------------------------------
       Start Group ID   : 0
       No. of levels    : 12
       Unit             : MB
       Selected columns : name:memSize:active:tier1Target:consumed:tier1Consumed
    
    --------------------------------------------------------------------------
               name    memSize     active tier1Target   consumed tier1Consumed
    --------------------------------------------------------------------------
          vm.533611       4096        384           0        371             5
          vm.533612       4096        382           0        368             4
          vm.533613       4096        379           0        365             4
          vm.533614       4096        353           0        336             1
          vm.533615       4096        386           0        374             5
    --------------------------------------------------------------------------
              Total      20480       1883           0       1812            18
    --------------------------------------------------------------------------
    -------------------------------------------



  • 12.  RE: Problem with memory tiering and client VMs

    Posted Dec 18, 2025 08:54 AM

    Hi Duncan,

    I tested the command and I can confirm tiering works if I put "pressure" on it.

    In the example below, the first command is when all 3 hosts are available and the second one is when one is in maintenance. We can clearly see the difference.

    BTW: nice article on your blog! πŸ‘ŒπŸ˜‰

    But regardless of whether tiering is occurring or not, my Win11 VMs will start going crazy (except, of course, when memory tiering is completely off).

    ChatGPT also gave me some interesting thoughts (no real solution): it might be related to vTPM. But however I turn this issue around, I don't understand why ONLY Win11 VMs have it. I have a couple of Win2019 and Win2022 VMs and none of them had issues, even though they are also encrypted and have vTPM. The GPOs are also almost all similar, as I use CIS benchmarks for all of them.

    I'll try a brand-new, freshly installed VM soon to see if it has something to do with legacy stuff (the current VMs were upgraded from Win10 to Win11), but still no clue for the moment! 😊

    PS: the fact that it crashes when all 3 hosts are on and no memory tiering is occurring rules out thermal throttling, IMHO.

    -------------------------------------------



  • 13.  RE: Problem with memory tiering and client VMs

    Broadcom Employee
    Posted Dec 19, 2025 06:49 AM

    I am just happy it rules out issues with memory tiering :-) 

    -------------------------------------------



  • 14.  RE: Problem with memory tiering and client VMs

    Posted Dec 29, 2025 04:36 AM

    Well, without tiering active: no problem; with memory tiering active: PROBLEMS! 🤣

    I still need to test the fresh-VM idea... I'll let you know.

    -------------------------------------------



  • 15.  RE: Problem with memory tiering and client VMs

    Posted Dec 29, 2025 10:10 AM

    Spoiler: a fresh new Win 11 VM didn't solve the issue 🙃

    -------------------------------------------



  • 16.  RE: Problem with memory tiering and client VMs

    Posted Jan 15, 2026 04:10 AM

    Hi there,

    So, a report from the field: frustrated! 😑

    I've configured 2 out of the 3 hosts with memory tiering and put my Win 11 VMs, with affinity rules, on the 3rd.

    When I start labs and such, vSphere fills up the physical memory on all 3 hosts instead of using memory tiering, so the 3rd host's RAM ends up completely full while the two tiering-enabled hosts are barely used (even after the labs' boot-up / first rush).

    Not what I expected... 😁

    -------------------------------------------