VMware vSphere

  • 1.  HTAware mitigation tool for L1TF

    Posted Feb 08, 2019 02:00 AM

    Hi Guys,

    Has anyone run the HTAware mitigation tool from VMware for L1TF? If so, could you please share your experience using the tool and your findings after running it?

    Cheers,

    Onil



  • 2.  RE: HTAware mitigation tool for L1TF

    Posted Feb 08, 2019 10:42 AM

    I've used it. Most of my clusters are not CPU-bound; the limiting resource on them is memory. I have not noticed any reduction in performance from enabling VMkernel.Boot.hyperthreadingMitigation on any of my ESXi hosts. If I look at the performance charts for CPU % utilised on any of my vSphere clusters, I cannot see any difference between before and after I enabled it.

    I still have my most CPU-intensive cluster to do this weekend and will let you know how that looks next week.



  • 3.  RE: HTAware mitigation tool for L1TF

    Posted Feb 11, 2019 01:57 AM

    Hi cjscol

    Thanks for the update. How did it go with your most CPU-intensive cluster? Did you end up running the HTAware tool and making the recommended changes?



  • 4.  RE: HTAware mitigation tool for L1TF

    Posted Feb 13, 2019 04:16 PM

    I am seeing an increase in CPU contention on the virtual machines after enabling VMkernel.Boot.hyperthreadingMitigation on my most heavily used ESXi cluster. See the screenshot below from vROps; VMkernel.Boot.hyperthreadingMitigation was enabled at about midday on Feb 9.



  • 5.  RE: HTAware mitigation tool for L1TF

    Posted Feb 11, 2019 03:33 PM

    It's been quite some time since we ran HTAware (and subsequently switched to the Side-Channel Aware (SCA) scheduler). The tool did not have a lot of recommendations for us - I think it flagged a couple of VMs with many cores, and a couple of very busy VMs, as potential problems.

    Since then, our experience has been that the main problem is latency - we have quite a few terminal server VMs where users noticed lagging keyboard and mouse response, for example. On one cluster in particular, we had no choice but to disable the SCA scheduler.

    For VMs with non-interactive workloads, we haven't had any complaints, but the safety margins have shrunk. In particular, VMs with a bad vCPU/pCPU ratio (say 8 vCPUs on a system with 12 cores per socket) experience high latency or poor CPU ready values compared to pre-SCA times.
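
    To put rough numbers on the 8 vCPU / 12 core example, here is a minimal sketch. It assumes two hardware threads per core with hyperthreading, and models the scheduler's effect as leaving only one usable thread per core. The function name `socket_share` is made up for illustration. It counts thread slots only, not real throughput, since a second hyperthread is worth far less than a full core.

```python
def socket_share(vcpus: int, cores_per_socket: int, threads_per_core: int) -> float:
    """Fraction of one socket's schedulable hardware threads that a VM
    occupies when all of its vCPUs are runnable at the same time."""
    return vcpus / (cores_per_socket * threads_per_core)


# Example from the post: 8 vCPUs on a socket with 12 cores.
with_ht = socket_share(8, 12, threads_per_core=2)   # both threads of each core usable
sca_like = socket_share(8, 12, threads_per_core=1)  # assumption: one usable thread per core

print(f"both threads usable: {with_ht:.0%} of the socket's thread slots")
print(f"one thread per core: {sca_like:.0%} of the socket's thread slots")
```

    Under these assumptions the same VM goes from needing about a third of the socket's thread slots to about two thirds, which matches the shrinking safety margin described above.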

    I have to say I'm quite disappointed with VMware in this regard - it's been four months now, and we're still stuck with the SCA scheduler in a state that effectively disables hyperthreading. Microsoft's Hyper-V core scheduler, in comparison, has chosen to only schedule vCPUs belonging to the same VM on any HT pair - so there's more leeway for scheduling decisions, while still mitigating risks that L1TF creates on the VM/hypervisor boundary.
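
    The difference between the two policies, as I understand them, can be written down as a toy check on a single hyperthread pair. This is only a sketch of the rules as described in this post, not of either vendor's actual scheduler code; the names `CorePair`, `allowed_same_vm_only` and `allowed_one_thread_only` are made up for illustration.

```python
from typing import Optional, Tuple

# One physical core modelled as a pair of hyperthreads. Each entry is the
# name of the VM running on that thread, or None if the thread is idle.
CorePair = Tuple[Optional[str], Optional[str]]


def allowed_same_vm_only(core: CorePair) -> bool:
    """Rule as described for the Hyper-V core scheduler in the post:
    both sibling threads may be busy only if they run the same VM."""
    a, b = core
    return a is None or b is None or a == b


def allowed_one_thread_only(core: CorePair) -> bool:
    """Stricter rule matching the post's description of the SCA scheduler
    'effectively disabling hyperthreading': at most one busy sibling."""
    a, b = core
    return a is None or b is None


placements = {
    "same VM on both threads": ("vm1", "vm1"),
    "two different VMs": ("vm1", "vm2"),
    "one thread idle": ("vm1", None),
}
for label, core in placements.items():
    print(f"{label}: same-VM rule={allowed_same_vm_only(core)}, "
          f"one-thread rule={allowed_one_thread_only(core)}")
```

    The only placement the two rules disagree on is the first one: two vCPUs of the same VM sharing a core. That case is the extra scheduling leeway mentioned above.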

    Alex.