VMware vSphere


high VM CPU ready time with High CPU Usage in OS

  • 1.  high VM CPU ready time with High CPU Usage in OS

    Posted Nov 14, 2014 01:08 PM

    Hi All,

    I have a Windows 2003 Ent x32 edition VM with 8 vCPUs, configured as 8 virtual sockets with 1 core per socket. The VM hardware version is 9 and VMware Tools is up to date.

    It is running on a host whose CPU is half utilized. The host has two physical CPUs with 10 cores each, which makes 20 cores. Hyper-threading is enabled, so there are 40 logical processors in total.

    The CPU Ready time % for the VM is close to 26. Inside the OS the CPU utilization is close to 80 %. This is a Citrix Server and many users connect to it for application access.

    I searched for possible causes and have checked the points below.

    • Host is not over-utilized
    • Inside the OS all the vCPUs are being used
    • There is no limit
    • Memory inside the OS is fine and is not on the higher side

    Please suggest what I can check next.

    Thanks

    Vaibhav



  • 2.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 14, 2014 01:42 PM

    Hello,

    8 virtual CPUs is a lot for only 4 GB of memory. Can you check whether your OS is swapping in any way? Perhaps the CPU is just waiting until it can serve memory from the hard drive. Use RAMMap to get a more granular view of the physical memory layout.

    Can you also post an esxtop screenshot along with the VM screen (the initial one, or press v), CPU mapping (press c) and memory mapping (press m)?



  • 3.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 14, 2014 02:16 PM

    Hi Alistar,

    My bad I did not mention RAM -- RAM assigned to the VM is 20GB and inside the OS it is fine. No issues related to RAM



  • 4.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 19, 2014 06:41 AM

    Hi All,

    Any suggestions here?

    There are close to 20 users connected to the server. This is a Citrix server. RAM utilization is fine on the server. %RDY goes up to 30. What should I try next?



  • 5.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 19, 2014 08:06 AM

    Hi All,

    As per Jason, to calculate the effective %RDY you need to divide the %RDY by the number of vCPUs. http://www.yellow-bricks.com/2010/01/05/esxtop-valuesthresholds/#comment-5861

    In my case %RDY is 30 and I have 8 vCPUs, so the effective %RDY is 30/8 = 3.75. This is fine for a VM.
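
    As a rough sketch of the arithmetic above (the function name is made up for this illustration and is not part of any VMware tool), the per-vCPU figure is simply the VM-level %RDY divided by the vCPU count:

```python
def ready_per_vcpu(vm_ready_pct: float, num_vcpus: int) -> float:
    """Average %RDY per vCPU: the VM-level %RDY divided by the vCPU count."""
    if num_vcpus <= 0:
        raise ValueError("num_vcpus must be positive")
    return vm_ready_pct / num_vcpus


# Values from this thread: %RDY of 30 on an 8-vCPU VM.
print(ready_per_vcpu(30, 8))  # 3.75
```

    Note that this is an average; it says nothing about whether one vCPU is waiting far longer than the others.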

    So now the question is why the co-stop value is high. If I check the real-time values for the VM, the maximum co-stop was 82 and the latest was 13. As per the article http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2000058 a high co-stop value can be seen if there is high CPU utilization within the virtual machine's guest operating system. This is true in my case.

    So does this mean I can safely ignore the co-stop value, or is there a formula, like the one for %RDY, to calculate the effective co-stop?



    Thanks



  • 6.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 19, 2014 03:18 PM

    hey Ivaibhavt,

    Do you have any snapshots on that Citrix VM? If so, I would consolidate them as soon as possible, as snapshots can increase the co-stop time.

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2000058

    Also, co-stop time is the amount of time it takes to schedule the request to all the vCPUs when processing a request. One way to reduce this is to drop the number of vCPUs in the machine. Even though you are currently running at 80% CPU on the Citrix VM, it could be running more like 110% due to the lag, or the time it has to wait for the co-stop. You can try dropping it to 6 vCPUs and see if you get any performance gains.

    Here is a better article describing co-stop:

    VMware KB: High co-stop (%CSTP) values seen during virtual machine snapshot activities

    Some other things that could be causing this: how many pCPUs does your host have? Are you keeping things within the NUMA nodes? Do you have hyper-threading on? Hyper-threading can sometimes hurt in MS TS farms or Citrix farms due to the high demands placed on the CPUs. This is due to the way hyper-threading works. Hyper-threading essentially cuts a core in half, with both halves sharing the core's cache, so sometimes when you have large requests the cache fills up and you have to wait for the hyper-threaded core to process the information. The best way to think of this is as a road. What is the best way to get more traffic down the road? You draw a dotted line down the middle and allow traffic in both directions or in the same direction. However, what happens when a car comes down the road that is as wide as both lanes? It has to wait until there is room on the road before it can drive forward.

    Also, what other VMs are running on the host / cluster that the Citrix VM is running on?

    Anyhow, I hope this has helped. Give us some more information and we should be able to point you in the right direction.



  • 7.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 20, 2014 02:36 PM

    Hi JPM300,

    Thank you for replying and explaining the hyperthread functionality.

    There are a total of 6 VMs on the host. Host memory utilization is not even at the halfway mark. The host CPU is 60 to 70 % utilized. There is no limit / reservation defined on the VM.

    The other VMs running on the host do not consume a great deal of resources; they are normal ones.

    I did check and there was no snapshot on the VM. I just want to share a behavior that I am unable to understand.

    When I vMotion the VM (for which the ready time was high) to another host, the ready time comes down. What has vMotion got to do with ready time? I am not able to understand this.

    Thanks



  • 8.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 20, 2014 03:00 PM

    If you vMotion the VM that is having problems to a different host and the ready time drops, that would mean the new host has less stress on its CPU, which is why you see the ready time drop. It could be that the host you moved it from has more CPU usage or scheduling issues than the new host, which is why you are seeing less ready time.

    On the host where the ready time is high, SSH into the CLI and run esxtop. At the top of the screen you will see the CPU load average. If the load average is over 1, it means your host is maxed out on CPU and is starting to hit contention:

    VMware KB: Troubleshooting ESX/ESXi virtual machine performance issues

    Under the CPU Constraints

    Examine the load average on the first line of the command output.

    A load average of 1.00 means that the ESXi/ESX Server machine’s physical CPUs are fully utilized, and a load average of 0.5 means that they are half utilized. A load average of 2.00 means that the system as a whole is overloaded.
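
    A minimal sketch of how those load-average figures might be read, using only the three reference points quoted above (0.5, 1.00 and 2.00). The function name and the wording of the labels are assumptions made for this illustration:

```python
def describe_load_average(load_average: float) -> str:
    """Map an esxtop CPU load average onto the interpretation quoted from the KB."""
    if load_average < 0:
        raise ValueError("load average cannot be negative")
    if load_average >= 2.0:
        return "overloaded"
    if load_average >= 1.0:
        return "fully utilized"
    # Below 1.00 the value reads as a fraction of physical CPU capacity in use.
    return f"about {load_average:.0%} utilized"


for value in (0.5, 1.0, 2.0):
    print(value, "->", describe_load_average(value))
```

    Anything between 1.00 and 2.00 is reported here as "fully utilized"; the KB text does not define a separate label for that range.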


    Also use esxtop to measure your co-stop time and ready time:

    http://www.yellow-bricks.com/esxtop/


    I suspect that if you were to drop your vCPU count a little, you might see a slight improvement, given the scheduling issues that seem to be happening.


    Citrix and MS TS farms are tricky in virtualization due to their unique requirements and the load on the servers. Typically when virtualizing Citrix or MS TS you end up scaling out instead of up, whereas in a physical server environment you typically scale up. For example: physical box, 40-60 users per server; virtual box, 20-30 users per server. This is not to say you can't have LARGE Citrix VMs, it's just trickier to find the sweet spot between maximum number of users per server, performance, vCPU count, etc.


    VMware has a Citrix best practice guide here:


    http://www.vmware.com/files/pdf/solutions/vmware-citrix-xenapp-best-practices-EN.pdf

    This might have some extra useful information. Also, most Citrix / TS farms have special settings for the AV on the system so that it doesn't spawn too many AV engines for each user, etc.

    Hope this has helped.



  • 9.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 21, 2014 10:29 AM

    Hi JPM300,

    Thank you for explaining the scenario. My case is a little different, so let me explain.

    The VM in question (with high ready time) is VMa. It was on Host A. Host A CPU utilization is close to 73%. If I check the esxtop value, I see 0.73 as the CPU load.

    A colleague of mine vMotioned VMa to Host B. There, both the host CPU utilization and the ready time for VMa were on the higher side, so my colleague vMotioned VMa back to Host A.

    The guest OS utilization is the same as before. The Host A CPU utilization is the same as before. Nothing special was done on the VM in terms of reservation/limit/share/datastore movement. Nothing at all, just a vMotion back to the same host.

    This was done last evening. Around 20 hours have gone by and the ready time is now in the hundreds, sometimes less. Earlier it was in the high thousands.
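
    The post does not give units for these numbers. If they are the "Ready" summation values in milliseconds from the vSphere Client real-time chart (an assumption), they can be turned into a percentage by dividing by the sampling interval, which is 20 seconds for the real-time chart. A rough sketch, with an illustrative function name and made-up sample values:

```python
REALTIME_INTERVAL_S = 20  # the real-time performance chart samples every 20 seconds


def ready_ms_to_percent(ready_ms: float, interval_s: int = REALTIME_INTERVAL_S) -> float:
    """Convert a CPU ready summation (ms per sample) into a percentage of the interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return ready_ms * 100 / (interval_s * 1000)


# Illustrative samples only, not measurements from this thread.
print(ready_ms_to_percent(5000))  # 25.0
print(ready_ms_to_percent(400))   # 2.0
```

    For a VM-level counter the summation covers all vCPUs together, so on a multi-vCPU VM the result can be divided by the vCPU count to get a per-vCPU figure.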

    Sorry, I should have mentioned this before. What I am unable to understand is how vMotioning the VM back to the original host, with no other change at all, can bring the ready time down.

    Thanks



  • 10.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 21, 2014 03:11 PM

    Hey Ivaibhavt

    Thanks for the reply. A few questions:

    What is Host B's CPU ready time and co-stop time? How many vCPUs does your VMa have? Are both hosts the same hardware model number?



  • 11.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 24, 2014 07:55 AM

    Hey JPM300,

    I was out during the weekend so couldn't reply. Here are the details.

    Current Host (Host A) on which the VM (VMA) resides is a Proliant BL460c Gen8

    Host B is also a Proliant BL460c Gen8.

    Both servers have two E5-2670 processors. Each processor has 10 cores. Hyper-threading is enabled, so this makes 40 logical processors per host.

    VMa has 8 virtual sockets with 1 core assigned to each virtual socket. This is the CPU given to the VM.

    Correct me if I am wrong: you would like to know the co-stop value and the ready time for Host B. This is the host on which VMa was placed for some time before being vMotioned back to Host A.

    Thanks



  • 12.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 25, 2014 06:20 AM

    Hi JPM300,

    Yesterday we had the Delhi VCP Club Meet. I met a storage guy and explained the issue to him.

    He said it is possible that a hung process was driving the ready time into the high thousands. When you vMotion, a clone of the VM gets created on the destination host, and during that process the hung process might have been released.



  • 13.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 27, 2014 07:28 AM

    Hi there again,

    Something has come to mind: if it is impossible to break this single VM down into two "smaller" ones, perhaps you could try disabling core sharing for this machine? You can do it in the VM's properties under Resources -> Advanced CPU by setting Core Sharing to None, as in this screenshot: https://vmxp.files.wordpress.com/2014/10/image007.png. The VMkernel scheduler should be smart enough not to put anything on the already loaded physical cores, but one can never know. At least I think it is worth a try.

    If you'd like to find out which process is really using the most CPU, I recommend running SysInternals' Process Explorer and observing the process that consumes the most CPU. This can also be caused by a hung/resource-consuming thread inside a process, so once you find what takes the most CPU time, double-click the process, select the "Threads" tab and note down the three busiest threads.

    Have you tried to observe the NUMA spread via esxtop while the VM is busy, as described in this thread? (VM's on NUMA nodes)

    Oh, and if you could post screenshots of your two hosts with the VMs running on them (their core counts/memory) and your hosts' total CPU ready time, that would help a lot. This issue might not be caused by just this single VM but also by something else in your environment.



  • 14.  RE: high VM CPU ready time with High CPU Usage in OS

    Posted Nov 27, 2014 11:26 AM

    I see there is a great deal of discussion about ready time within the guest. Of course this is where the users will feel it and where we admins see it via performance monitoring.

    A very basic fact.. The more cores assigned to the guest the higher readytime values you will see. Using esxtop is a far better way to evaluate if you really do have ready time issues. Via esxtop any readytime value with double digits is of concern. Any value over 20 is likely to impact the users.
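
    Those two rules of thumb can be written down as a small check. This is only a sketch of the thresholds stated above (double digits, and over 20); the function name and the labels are invented for this illustration:

```python
def classify_esxtop_ready(ready_pct: float) -> str:
    """Apply the rule of thumb: double digits is a concern, over 20 likely hurts users."""
    if ready_pct < 0:
        raise ValueError("ready_pct cannot be negative")
    if ready_pct > 20:
        return "likely to impact users"
    if ready_pct >= 10:
        return "of concern"
    return "acceptable"


print(classify_esxtop_ready(26))  # the %RDY reported at the start of this thread
```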

    *****A little more on the guest before discussing things from a host/hardware perspective.*****

    If you look at why virtualization originally gained popularity, it was because we all had hundreds of expensive servers which were utilizing 2-20% of their resources. So we would pile as many systems onto an ESX server as we could. Ready time was acceptable as the servers were not forward facing (as in, customers were not on the console), and a few ms of latency was acceptable for a file to be retrieved or written, a DB to be accessed, etc.

    We all pushed back at the idea of virtualising servers which had heavy resource requirements. This was, on the whole, a sensible thing.

    Times have moved on, and we now have high percentages of our servers virtualised. The environment is easier to manage if we structure as many servers as possible on a similar model, so we virtualise many servers which really are not ideally suited for virtualisation.

    Database servers are the obvious one to mention, but we do it anyway because it gives us centralized management, simple DR, backup, etc.

    Citrix is another which is not ideal for virtualisation, but we do it. Because it's forward facing, a few ms of latency is felt by the user and seen as jerky mouse movement.

    We have built and manage a large Citrix XenApp environment. We have found some best practices to follow.

    Do not over commit resources.

    Look at your physical CPU to allocated CPU ratio. Do not exceed 1 to 1 where XenApp is concerned. VMware in fact advises preallocating/reserving ALL guest resources.

    More servers with fewer vCPUs. The most effective server would have 1 vCPU, but that's not practical. Limit XenApp to 4 vCPUs. If you need more capacity, build another XenApp server.

    Look at your application base. If you have ugly apps which are resource pigs, push them to yet another XenApp server. E.g. one user opens an application which loads a 3GB JVM and all 19 others suffer. Web pages which are Flash intensive can grab loads of CPU. Client-side rendering or GRID offload cards are the only solution here, or just block it at the firewall, or of course degrade it.

    If you have slow disk subsystems you could use RAM caching to speed up disk activity.
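
    The 1-to-1 guideline above can be checked with simple arithmetic. In this sketch the host figures (2 sockets with 10 cores each) and the 8-vCPU VM come from this thread, but the vCPU counts of the other five VMs are hypothetical placeholders, since they were never posted:

```python
def vcpu_to_core_ratio(vcpus_per_vm: list[int], physical_cores: int) -> float:
    """Ratio of allocated vCPUs to physical cores (hyper-threads are not counted)."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


physical_cores = 2 * 10            # two sockets, 10 cores each
vm_vcpus = [8, 4, 2, 2, 2, 2]      # 8 is the Citrix VM; the rest are hypothetical

ratio = vcpu_to_core_ratio(vm_vcpus, physical_cores)
print(f"{ratio:.2f} vCPUs per physical core")
print("within 1:1" if ratio <= 1.0 else "over-committed")
```

    Counting physical cores rather than the 40 logical processors is the more conservative choice, since two hyper-threads share one core.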

    *****Hardware and Hypervisor*****

    I would ask some more questions about the environment from a hypervisor perspective.

    What version of ESXi are you running?

    Which build is this based on, e.g. ESXi 5.0 U2 HP release or VMware release?

    What are your firmware levels baselined at on the chassis and blade?

    What type of storage are you using, and over what medium are you connecting?

    cheers