VMware Workstation

  • 1.  Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 13, 2023 04:31 PM

    Using VMware Workstation Pro 17.5 on a Windows 11 host (i7-13700H CPU) with VBS and Hyper-V disabled, I'm able to start a VM with virtualized performance counters enabled. The guest OS (Ubuntu 22) recognizes the counters: dmesg reports PMU version 5 and cpuid reports that counters are supported. The problem is that "perf stat -e cpu-cycles /bin/date" reports zero cpu-cycles. Running the same Ubuntu version natively on the hardware works fine and reports a nonzero count. Any idea what could be wrong?



  • 2.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 14, 2023 03:17 AM

    I suspect there is a difference in performance counter capabilities between e-cores and p-cores.
    What is the output of MSR 0x345 of all cores when running Ubuntu natively on the hardware?

    sudo rdmsr --all 0x345

    You could try

    (1) disabling the e-cores in the UEFI on the host (if that is possible), or
    (2) setting the affinity of the VM's vmware-vmx.exe process to only the p-cores from Task Manager, or
    (3) setting the Performance profile of the Windows host to "High Performance"; this also seems to work as a workaround that avoids scheduling the vmware-vmx.exe process on the e-cores.

    There is a vmx edit equivalent to (2) (see spoiler). This assumes processors 0-11 are the hyperthreaded p-cores and processors 12-19 are the e-cores of the i7-13700H.
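
    For tools that take a hexadecimal affinity mask instead of per-CPU checkboxes, the numbering above translates into a bitmask with one bit per logical processor. A minimal sketch, assuming the 0-11 / 12-19 split described above holds on your machine (the function name is just for illustration):

```python
def affinity_mask(cpus):
    """Return an integer bitmask with one bit set per logical processor index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask

# Assumed numbering from the post: 0-11 are p-core threads, 12-19 are e-cores.
p_cores = range(0, 12)
e_cores = range(12, 20)

print(hex(affinity_mask(p_cores)))  # 0xfff
print(hex(affinity_mask(e_cores)))  # 0xff000
```

    If the host enumerates the cores in a different order, the ranges need to be adjusted accordingly.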



  • 3.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 15, 2023 02:07 PM

    Thanks for your reply. The MSR 0x345 register has value 0x74FF for the p-cores and 0x174FF for the e-cores when running natively on the hardware. Running as guest, it has the value 0x2000 (the same value as in a guest on my W10 host with VMware 15, on which the perf counters are working correctly).
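
    To make the three values easier to compare, here is a small decoder for MSR 0x345 (IA32_PERF_CAPABILITIES). The field layout is an assumption taken from the Linux kernel's "union perf_capabilities" as far as I know it, so treat the field names as a sketch and not as an authoritative reference:

```python
# Assumed bit layout: (field name, first bit, width in bits).
FIELDS = [
    ("lbr_format", 0, 6),
    ("pebs_trap", 6, 1),
    ("pebs_arch_reg", 7, 1),
    ("pebs_format", 8, 4),
    ("smm_freeze", 12, 1),
    ("full_width_write", 13, 1),
    ("pebs_baseline", 14, 1),
    ("perf_metrics", 15, 1),
    ("pebs_output_pt_available", 16, 1),
]

def decode_perf_capabilities(value):
    """Split a raw MSR 0x345 value into the assumed bit fields."""
    return {
        name: (value >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }

for label, value in [("p-core", 0x74FF), ("e-core", 0x174FF), ("guest", 0x2000)]:
    nonzero = {k: v for k, v in decode_perf_capabilities(value).items() if v}
    print(label, hex(value), nonzero)
```

    Under this layout the guest value 0x2000 has only the full-width-write bit set, and the e-core value differs from the p-core value only in bit 16.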

    Neither disabling the e-cores in the host's BIOS nor adding the "processor.use" lines to the vmx file resolved the problem. But I found an error message in the dmesg log (see below) that might be related. I'm not sure if this is a kernel or a VMware issue, but with the kernel upgraded to 6.5.0 (on an Ubuntu 23 guest) I get the same error.



  • 4.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 16, 2023 02:55 AM

    With the failed write to MSR 0x38f, it is probably as good as vPMC not being enabled in the VM. My guess is that rdmsr 0x38f would either fail (this is what happened with vPMC unchecked in an Ubuntu 22.04 VM in version 16.2.5) or, if the read succeeds, return 0x0.

    Since the failing write is 0x0001000f000000ff, the 0xff in bits 0-7 means that the code attempting the write determined that hyperthreading is disabled.

    Just for comparison, on an i7-8700K Ubuntu 22.04 host running version 16.2.5 with an Ubuntu 22.04 VM,

    rdmsr 0x38f
    returns 70000000f (on both host and guest) with hyperthreading enabled on host
    returns 7000000ff (on both host and guest) with hyperthreading disabled on host

    cpuid returns in the guest VM
    number of counters per logical processor = 0x4 (4) with hyperthreading enabled on host
    number of counters per logical processor = 0x8 (8) with hyperthreading disabled on host

    The results are independent of whether the ht flag is present in /proc/cpuinfo in the guest VM. It seems the PMC code uses something different from the /proc/cpuinfo code to decide whether to show the ht flag.

    I guess if you run rdmsr --all 0x38f on a Linux host, the p-cores would show 0x0f in bits 0-7 as there is HT by default, while the e-cores would show 0xff in bits 0-7 as e-cores don't have HT at all.
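
    The values above follow a simple pattern: one enable bit per general-purpose counter starting at bit 0, and one enable bit per fixed counter starting at bit 32. A minimal sketch of how such a mask is composed (the function name is mine; the optional bit 48 is only there to reproduce the value from the failing write):

```python
def global_ctrl_mask(num_generic, num_fixed, perf_metrics=False):
    """Compose a mask for MSR 0x38f from counter counts."""
    mask = (1 << num_generic) - 1          # general-purpose counters: bits 0..N-1
    mask |= ((1 << num_fixed) - 1) << 32   # fixed counters: bits 32..32+M-1
    if perf_metrics:
        mask |= 1 << 48                    # bit 48, set in the failing write
    return mask

print(hex(global_ctrl_mask(4, 3)))        # 0x70000000f  (HT enabled on the i7-8700K)
print(hex(global_ctrl_mask(8, 3)))        # 0x7000000ff  (HT disabled on the i7-8700K)
print(hex(global_ctrl_mask(8, 4, True)))  # 0x1000f000000ff (value from the failing write)
```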

    Considering that the attempt to write MSR 0x38f assumes HT is off (maybe because this was determined from an e-core instead of a p-core), perhaps try flipping the affinity to the e-cores instead of the p-cores. But it is possible that with affinity set to the e-cores the VM would become slow and unusable. I don't have access to a machine with e-cores/p-cores, so I can't say from direct experience, but some posts here seem to suggest that running VMs on e-cores makes them slow or unusable.

    An alternative is to disable hyperthreading on the p-cores from the host UEFI (if possible), so that both p-cores and e-cores run without hyperthreading; presumably writing MSR 0x38f with 8 counters will then succeed instead of crashing.

    As to whether it is a bug in the Linux kernel/PMC code or in VMware, it is hard to say, but I would lean towards the kernel code being at fault, considering that this would likely run fine in VMs on host CPUs without e-core/p-core differences and that it is the Linux code crashing and not VMware. But that doesn't explain why it runs fine on the bare-metal i7-13700H.



  • 5.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 19, 2023 08:23 AM

    It is strange that even though the e-cores are already disabled on the host, the kernel/PMC code somehow still sees an e-core (a logical CPU without hyperthreading, hence the attempt to write 0xff instead of 0x0f to bits 0-7 of MSR 0x38f).

    Perhaps a less drastic workaround than disabling hyperthreading on the p-cores on the host is to mask out the hybrid architecture, in addition to setting affinity to the p-cores or disabling the e-cores.

    This is done by adding to the VM vmx

    cpuid.7.edx = "----:----:----:----:0---:----:----:----"

    This will make

    cpuid -l 0x7 | grep hybrid

    show "hybrid part" as false instead of true.
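
    For anyone unsure which bit such a mask string touches: it is read left to right from bit 31 down to bit 0, in groups of four. A small sketch that lists the forced bits (it only handles the characters "-", "0" and "1" used in this thread; the function name is made up):

```python
def forced_bits(mask):
    """Return {bit_number: forced_value} for a 32-position vmx cpuid mask string."""
    bits = mask.replace(":", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bit positions")
    forced = {}
    for index, char in enumerate(bits):
        if char in "01":
            forced[31 - index] = int(char)  # leftmost character is bit 31
    return forced

print(forced_bits("----:----:----:----:0---:----:----:----"))  # {15: 0}
```

    So the line above clears bit 15 of leaf 0x7 edx and leaves all other bits alone.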

    From what I read, code intended to run on Intel 12th gen and newer should check this hybrid flag first and then check leaf 0x1a eax register to see what type of core it has. But it is quite possible that PMC code checks the hybrid flag and uses some other method to determine whether it is e-core or p-core with hyperthreading.

    Version 16.2.5 does not seem to pick up the CPU leaf 0x1a from the vmx so the output of

    cpuid -r -l 0x1a

    shows the eax bits as all zeroes even though I had set bits 24-31 to some values. I was trying to fool the VM running on the i7-8700K host into believing that it has e-cores/p-cores, but the only thing that was successful was having the "hybrid part" show as true in the VM.



  • 6.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 22, 2023 09:16 AM

    With both e-cores disabled and hyperthreading disabled in the host's UEFI, I still get the same result and the same message in the 'dmesg' output:

    [    3.674540] Performance Events:  AnyThread deprecated, Alderlake Hybrid events, Intel PMU driver.

    [    3.674540] core: cpu_core PMU driver:
    [    3.674540] ... version:                5
    [    3.674540] ... bit width:              48
    [    3.674540] ... generic registers:      8
    [    3.674540] ... value mask:             0000ffffffffffff
    [    3.674540] ... max period:             000000007fffffff
    [    3.674540] ... fixed-purpose events:   4
    [    3.674540] ... event mask:             0001000f000000ff
    [    3.676104] unchecked MSR access error: WRMSR to 0x38f (tried to write 0x0001000f000000ff) at rIP: 0xffffffffb42b84f4 (native_write_msr+0x4/0x40)
     
    The guest reports "hybrid part" disabled for all 4 cores. Cpuid on 0x1a also shows zeroes in all registers. The rdmsr output on the guest: 0x345: 0x2000, 0x38f: 0x0. I also tried to write MSR 0x38f manually, but the write fails in all cases where I set bit 48 (EN_PERF_METRICS), for any value of bits 0..7 (when leaving bit 48 at zero, the write succeeds).
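
    For reference, this is how the rejected value breaks down. A minimal sketch that assumes the usual layout of MSR 0x38f (general-purpose counter enables in the low bits, fixed counter enables from bit 32, EN_PERF_METRICS at bit 48); the function name is only illustrative:

```python
PERF_METRICS_BIT = 48

def decode_global_ctrl(value):
    """Break a raw MSR 0x38f value into the counters it enables."""
    return {
        "generic_counters": [i for i in range(32) if (value >> i) & 1],
        "fixed_counters": [i for i in range(16) if (value >> (32 + i)) & 1],
        "en_perf_metrics": bool((value >> PERF_METRICS_BIT) & 1),
    }

rejected = 0x0001000F000000FF
print(decode_global_ctrl(rejected))

# Same value with bit 48 cleared, i.e. the kind of write that succeeded.
accepted = rejected & ~(1 << PERF_METRICS_BIT)
print(hex(accepted))  # 0xf000000ff
```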
     


  • 7.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 22, 2023 11:45 AM

    If the "hybrid part" was already showing "false" perhaps it is a VMware software bug not bringing all the necessary bits in.

    You could force the hybrid part to be "true" by flipping bit 15 of leaf 0x7 edx. This worked even with version 16.2.5.

    cpuid.7.edx = "----:----:----:----:1---:----:----:----"

    In addition, set bits 24-31 of the leaf 0x1a eax register to the value of the host p-core, such as this

    cpuid.1a.eax = "0100:0000:----:----:----:----:----:----"

    But I am not optimistic that VMware Workstation 17.5.0 will pick up leaf 0x1a from the vmx (it didn't work with version 16.2.5). I am now leaning towards this being a VMware bug; possibly the features of Intel 12th gen and later are not fully available in the guest.
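
    To predict what the guest should report if a mask is honored, the mask can be applied to a register value by hand. A rough sketch, assuming "-" leaves the underlying bit unchanged and "0"/"1" force it (other mask characters are not handled here, and the function name is hypothetical):

```python
def apply_mask(mask, value):
    """Apply a 32-position vmx-style cpuid mask string to a register value."""
    bits = mask.replace(":", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bit positions")
    result = 0
    for index, char in enumerate(bits):
        bit = 31 - index                  # leftmost character is bit 31
        if char == "1":
            result |= 1 << bit            # force the bit on
        elif char == "-":
            result |= value & (1 << bit)  # assumed: keep the underlying bit
        elif char != "0":
            raise ValueError("unsupported mask character: " + char)
    return result

# Leaf 0x1a eax reading all zeroes in the guest, with bits 24-31 forced to 0x40.
print(hex(apply_mask("0100:0000:----:----:----:----:----:----", 0x00000000)))  # 0x40000000
```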



  • 8.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 22, 2023 12:59 PM

    With these lines in the vmx, the hybrid part is reported as true for all (4) CPUs, and it also works for the core type, which is now reported as Intel Core (eax=0x40000000). But wrmsr 0x38f is still failing and the counters are still not working. Maybe I should report this as a VMware issue?



  • 9.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 22, 2023 01:24 PM

    Interesting that the vmx cpuid 0x1a mask was picked up; either that or setting the hybrid flag made the guest OS kernel do additional stuff.

    The cpuid.1a.eax masking in the vmx (if that is where the guest OS got it from) should match the host p-core, considering that the VM has affinity to only p-cores. What I put out there (40h) is what I thought would be the likely value for a p-core based on Intel documentation (two other values were "reserved" and another was "Intel Atom"). So try to match it with the results for the p-core of the host from

    cpuid -r -l 0x1a

    To be clear, I don't work for VMware. So yeah, I think you should report this to VMware that the counters are not available inside a VM running on a host with an Intel 13th gen (very likely with any Intel CPU with p-core/e-core) while it works fine at bare-metal.



  • 10.  RE: Using vPMC with Linux guest on i7-13700H Windows host

    Posted Nov 22, 2023 03:42 PM

    On the host eax is 0x40000001, so I tried that, but with the same result. So I'll report this issue to VMware.
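
    For completeness, this is how I read that eax value, plus a helper to turn a full register value into a vmx-style mask string. A sketch assuming the documented split of leaf 0x1a eax into core type (bits 24-31, 0x20 = Intel Atom, 0x40 = Intel Core) and native model ID (bits 0-23); the helper names are made up:

```python
CORE_TYPES = {0x20: "Intel Atom", 0x40: "Intel Core"}

def decode_leaf_1a_eax(eax):
    """Return (core type name, native model ID) for a leaf 0x1a eax value."""
    core_type = (eax >> 24) & 0xFF
    native_model_id = eax & 0xFFFFFF
    return CORE_TYPES.get(core_type, "reserved/unknown"), native_model_id

def to_vmx_mask(value):
    """Format a 32-bit value as a fully specified mask string, bit 31 first."""
    bits = format(value, "032b")
    return ":".join(bits[i:i + 4] for i in range(0, 32, 4))

print(decode_leaf_1a_eax(0x40000001))  # ('Intel Core', 1)
print(to_vmx_mask(0x40000001))         # 0100:0000:0000:0000:0000:0000:0000:0001
```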

    I also tried the Performance Counter Monitor from Intel (https://github.com/intel/pcm) on the guest, and the 'pcm-core' tool from this toolkit shows CPU cycle and instruction counts. But this tool might be using a different interface to access the counters, so I'm not sure if this says anything about the correct functioning of the virtual counters.