VMware vSphere

  • 1.  Estate Wide random CPU Ready Spikes

    Posted Oct 09, 2024 12:23 PM

    Hi All,

    We need to throw this one open a little, as we are running out of options and ideas and will shortly be opening a ticket.

    We are seeing random CPU Ready spikes of 400 ms when the estate is otherwise totally inactive. (In this case it's a carbon copy of prod, where the issue is also seen.)

    Running ESXi 7.0 U3 (Dell version).

    Affecting Windows and Linux builds.

    A spike will occur at a random interval on a random VM.

    No VMs have a ridiculous quantity of CPUs assigned.

    Each VM has a Reservation and Limit equal to the MHz of the vCPUs assigned. (Customer requirement)

    Hosts are not overprovisioned [by the total MHz provided by the pCPUs vs. that assigned to the vCPUs]. (Customer requirement)

    A total of 20% of the overall pCPU MHz has been reserved for host overhead.

    VMware Tools and VM hardware versions are out of date; we are planning an uplift to evaluate the impact.

    ESXi host CPU / memory utilisation is on tickover only.

    Does anyone have any thoughts?

    P



  • 2.  RE: Estate Wide random CPU Ready Spikes

    Posted 29 days ago

    Interesting case. Those random CPU Ready spikes often hint at brief vCPU scheduling contention, even when overall host usage seems low. It's worth checking your vCPU-to-pCPU ratios, affinity/anti-affinity rules, and any background tasks like snapshots or backups that might be stalling CPU scheduling. Monitoring via esxtop at the moment of the spike and checking NUMA configuration and power-management settings can also help pinpoint root causes.
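
    To put the 400 ms figure from the original post in context, it can help to convert the CPU Ready summation (milliseconds) into a percentage of the sampling interval. The sketch below assumes the value came from a vCenter real-time chart, which samples every 20 seconds; the original post does not say which interval was used, so treat the interval and the function name `ready_percent` as assumptions. If the counter is the VM-level aggregate, dividing by the vCPU count gives a rough per-vCPU figure.

```python
def ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU Ready summation value (ms) to a percentage.

    ready_ms   -- CPU Ready summation reported for one sample
    interval_s -- chart sampling interval in seconds (20 s is the real-time
                  chart interval; this is an assumption about the source data)
    vcpus      -- divide by the vCPU count when ready_ms is a VM-level total
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    interval_ms = interval_s * 1000
    return ready_ms * 100 / interval_ms / vcpus


if __name__ == "__main__":
    # 400 ms of ready time in a 20-second sample
    print(ready_percent(400, interval_s=20))
    # the same 400 ms spread across a hypothetical 4-vCPU VM
    print(ready_percent(400, interval_s=20, vcpus=4))
```

    Under the 20-second assumption, 400 ms works out to 2% for a single vCPU. The same 400 ms read from a chart with a longer roll-up interval would be a much smaller percentage, so it is worth confirming the interval before comparing against any threshold.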

    For a similar discussion on this platform, see: https://community.broadcom.com/property managementdoes-cpu-ready-increase-when-you-spread-it-across-processors




  • 3.  RE: Estate Wide random CPU Ready Spikes

    Posted 27 days ago

    Random CPU Ready spikes across an entire estate usually indicate a systemic resource-contention issue rather than a host-specific fault. In VMware or similar virtualized environments, this often happens when multiple VMs compete for the same physical CPU cycles, especially if vCPU allocations are oversized compared to the underlying pCPU capacity. It can also be triggered by DRS migration delays, noisy-neighbor workloads, or misconfigured CPU reservations/limits that cause scheduling stalls during peak operations.

    If the spikes occur consistently across clusters, it's worth checking whether recent changes were made to HA/DRS settings, host power management profiles, or BIOS configurations such as C-states and hyper-threading. Reviewing firmware consistency and ensuring that all hosts match in microcode and CPU generation can also eliminate unpredictable scheduling behavior. Pulling esxtop or performance logs during spike windows will help confirm whether the ready time correlates to queue depth, storage latency, or VM bursts. Addressing these root causes typically stabilizes CPU scheduling and reduces estate-wide performance anomalies.
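
    As a rough illustration of pulling esxtop data around a spike window: esxtop can be run in batch mode (for example `esxtop -b -d 2 -n 150 > capture.csv`) and the resulting CSV scanned offline. The sketch below assumes the batch output uses perfmon-style column names of the form `\\host\Group Cpu(<id>:<name>)\% Ready`; that naming, the function name `find_ready_spikes`, and the 5% default threshold are assumptions for illustration, so check them against the header of a real capture.

```python
import csv
import io


def find_ready_spikes(csv_text, threshold_pct=5.0):
    """Return (timestamp, column_name, value) for each "% Ready" sample
    above threshold_pct in esxtop batch-mode CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Assumed column naming: \\host\Group Cpu(<id>:<name>)\% Ready
    ready_cols = [
        i for i, name in enumerate(header)
        if "Group Cpu(" in name and name.endswith("% Ready")
    ]

    spikes = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        timestamp = row[0]  # first column holds the sample timestamp
        for i in ready_cols:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip missing or non-numeric cells
            if value > threshold_pct:
                spikes.append((timestamp, header[i], value))
    return spikes
```

    To use it on a capture, read the file and pass its contents, e.g. `find_ready_spikes(open("capture.csv", newline="").read(), threshold_pct=5.0)`. Note that the group-level value is summed over the worlds in the group, so a multi-vCPU VM can legitimately show a higher figure than a single-vCPU VM under the same per-vCPU contention.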




  • 4.  RE: Estate Wide random CPU Ready Spikes

    Posted 27 days ago
    Edited by Carl Bidwell 27 days ago

    Those random CPU Ready spikes can be tricky! Often, they happen due to scheduling contention even when hosts seem idle. It's worth checking your vCPU-to-pCPU ratios, any reservations or limits, and host BIOS/power settings. Capturing esxtop data during a spike can really help pinpoint what's going on. Keeping VMware Tools and VM hardware versions up to date also makes a difference. Sharing some screenshots or logs might help the community narrow down the cause faster.

    For another interesting discussion on this platform, see: https://community.broadcom.com/vmware-cloud-foundation/discussion/input-type-selling-property-to-capture-array-with-multiple-properties