VMmark

CPU utilization goes up to 99%? What are the probable reasons?

  • 1.  CPU utilization goes up to 99%? What are the probable reasons?

    Posted Dec 03, 2013 03:58 PM

    When running VMmark with 10 tiles, I am getting high CPU utilization, close to 99%. What are the probable reasons? We are using the best servers in terms of configuration. Please let me know which parameters, other than the processor, BIOS options, etc., could be responsible.



  • 2.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Dec 03, 2013 04:06 PM

    This is a very wide open question as there could be any number of reasons why your CPU utilization is close to 99%.  Although you say you're using "the best servers in terms of configuration", where are you getting your expectations from?  Are you comparing your configuration with similar VMmark2 results?  What do you think your utilization should be at?



  • 3.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Dec 03, 2013 04:07 PM

    We are using flash-based memory for our run.



  • 4.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Dec 03, 2013 04:10 PM

    That doesn't answer any of my questions.  Just using flash-based memory doesn't mean you'll never see a CPU bottleneck.

    Although you say you're using "the best servers in terms of configuration", where are you getting your expectations from?  Are you comparing your configuration with similar VMmark2 results?  What do you think your utilization should be at?



  • 5.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Dec 03, 2013 04:20 PM

    Yes, we are comparing our configuration with VMmark 2 results. We are getting QoS errors for the mail server. I expect that utilization close to 99% leads to latency; it should be around 90-95%.



  • 6.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Dec 03, 2013 04:23 PM

    We get a QoS error for the mail server. Do the Host Power Management policies (“High performance,” “Balanced,” “Low power,” or “Custom”) have anything to do with utilization and QoS-related issues? If so, which one should be used?



  • 7.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Dec 03, 2013 04:35 PM

    Again, there could be any number of reasons why you're seeing higher CPU utilization.  Unless you have set up a testbed identical to a published result, you'll need to adjust your expectations according to the configuration you're running against.  Most likely you'll need to compare your result in a lot more detail to set accurate expectations based on a published result.

    That is, are you running the same servers (BIOS/FW)? The same CPUs (family/model/stepping)? The same memory (model, size, speed)? The same NICs (model, driver)? The same HBAs (model, driver, speed)? The same ESXi (version, tunables, etc.)?  These are just a few of the potential places where even a slight difference might impact your overall performance results.

    Yes, Host Power Management policies can impact performance (and CPU/QoS).  Typically High performance would be what you would select for a performance only VMmark2 run.

    I would recommend reviewing some of the published results in detail and seeing how submitters have configured their environments and what tunables they thought had an impact.
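The component-by-component comparison suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `diff_configs` and the field names are hypothetical (not part of VMmark), and apart from the CPU models and clock speeds mentioned in this thread, the values are placeholders to be replaced with data from your testbed and the published disclosure.

```python
def diff_configs(ours, published):
    """Return {field: (our_value, published_value)} for every field that differs."""
    fields = sorted(set(ours) | set(published))
    return {
        f: (ours.get(f), published.get(f))
        for f in fields
        if ours.get(f) != published.get(f)
    }


# CPU entries are taken from this thread; every other value (and all of the
# "published" NIC/ESXi entries) is a hypothetical placeholder.
ours = {"cpu_model": "E5-2650 v2", "cpu_ghz": 2.6, "nic_speed_gb": 1, "esxi": "5.1.0"}
published = {"cpu_model": "E5-2690", "cpu_ghz": 2.9, "nic_speed_gb": 10, "esxi": "5.1.0"}

for field, (mine, theirs) in diff_configs(ours, published).items():
    print(f"{field}: ours={mine} published={theirs}")
```

Listing only the differing fields keeps the focus on the places where expectations drawn from a published result may not carry over.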



  • 8.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Dec 08, 2013 05:09 PM

    Please find the details below:

    Processor model: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60 GHz
    Processor sockets: 2
    Processor cores per socket: 8
    Logical processors: 32
    Hyperthreading: Enabled
    Manufacturer: Supermicro
    Model: X9DRW-7/iTPF

    Joshua/Rebecca, it would be great if you could offer some insight into the CPU utilization issue.



  • 9.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Dec 09, 2013 12:45 PM

    Hi Vikas,

    What are you asking us to do here?  If you read my previous comment you'll see that this is a complex issue and unfortunately I don't really see any submissions that include systems all that similar to yours.  Have you compared your tunings to what others have done?

    With just a cursory comparison to a recent 2-socket submission with SSDs and E5-2690 processors (2.9 GHz versus your slower 2.6 GHz E5-2650s), I see that that run was only able to get 10 tiles in as well.  That suggests to me that you're at a reasonable level of performance.

    What level of CPU utilization are you expecting and how did you come to this conclusion?



  • 10.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Feb 10, 2014 04:24 PM

    Hi Joshua,

    We are running two identical servers with the configuration below:

    Processor model: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60 GHz
    Processor sockets: 2
    Processor cores per socket: 8
    Logical processors: 32
    Hyperthreading: Enabled
    Manufacturer: Supermicro
    Model: X9DRW-7/iTPF

    So in total (cumulative across the two servers):

    Sockets = 4; cores = 32; logical processors = 64

    We are using 1 Gb NICs, and all the tuning parameters are almost the same as in the published results on the VMmark site.

    So, could the processor speed be a plausible reason for this? Please shed some light on this issue.
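As a quick check of the cluster totals quoted above, the arithmetic can be written out in Python. This is a minimal sketch; the `Host` class and its field names are hypothetical, and it assumes Hyper-Threading exposes two logical processors per core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    sockets: int
    cores_per_socket: int
    threads_per_core: int  # assumed 2 with Hyper-Threading enabled, else 1

    @property
    def cores(self):
        return self.sockets * self.cores_per_socket

    @property
    def logical_processors(self):
        return self.cores * self.threads_per_core


# Two identical hosts, as described in this post.
hosts = [Host(sockets=2, cores_per_socket=8, threads_per_core=2)] * 2

total_sockets = sum(h.sockets for h in hosts)
total_cores = sum(h.cores for h in hosts)
total_logical = sum(h.logical_processors for h in hosts)
print(total_sockets, total_cores, total_logical)  # 4 32 64
```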



  • 11.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Feb 10, 2014 04:51 PM

    Adding to the information furnished above by Vikas:

    1. The servers constituting the testbed each have ESXi 5.1.0, 256 GB of memory, BIOS version 3.0, and an 8 Gb fibre HBA.

    Now, since everything else looks very similar (except the processor clock speed, 2.6 vs. 2.9 GHz), I am wondering what makes the cluster reach saturation with fewer tiles (as far as CPU utilization is concerned).

    In the published results I can see valid scores with 10 tiles, while in our case even 7 tiles gives an error-free run with invalid QoS numbers. I was referring here and here to conclude about cluster saturation.



  • 12.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Feb 10, 2014 05:31 PM

    Can you provide an updated summary of what you're seeing?  In the first post Vikas states that he is running 10 tiles and seeing CPU saturation, but in Anandbibhuti's post he says 7 tiles and failing QoS.  We should start at the first signs of compliance or issues (with respect to tile count) and work our way up from there.



  • 13.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Feb 10, 2014 05:50 PM

    Hi Joshua,

    The updated status is that even with 7 tiles the CPU utilization is reported as very high (~98-99%). This results in an incorrect QoS value (marked with * in the final result/Score.out file) for one or more tiles. But with 6 tiles (barring a few instances), things look fine, with a valid, fully compliant run. (The number of tiles was reduced gradually from 10 to try to find the threshold of CPU saturation.) In both scenarios, STAX does not report any error.



  • 14.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Feb 10, 2014 06:00 PM

    Can you upload a zip of your results folder so I can take a look?  Ideally with a result at 7 tiles and REPORTER=1 in your VMMARK2.config.



  • 15.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Mar 05, 2014 05:53 PM

    Please find attached the results for the 9-tile compliant run and the 10-tile non-compliant run, along with the VMmark2 cluster report. Please let us know the probable reason for the non-compliant run at 10 tiles.

    The CPU utilization metrics are as follows:

    Server-1
    10 tiles: maximum CPU utilization 71%
    9 tiles: maximum CPU utilization 67%

    Server-2
    10 tiles: maximum CPU utilization 81%
    9 tiles: maximum CPU utilization 75%



  • 16.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Mar 06, 2014 09:25 PM

    I took a look at this and saw that those CPU util numbers above look comparable to the results Josh posted in Reply 8. Olio is network-intensive, and it looks like that last tile saturates the vSwitch it resides on.


    May I ask why you don't run the same build of ESXi on both servers?


    Also, the versions of VMware tools are old on all the VMs. Since Olio is so network-intensive, I would recommend updating the tools to make sure you're using the latest VMXNET drivers then rerunning the test. I don't think that'll make much of a difference though. Are you using VMXNET 3?


    Lisa



  • 17.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Mar 09, 2014 06:30 PM

    1. VMware Tools are up to date now.

    2. We are moving ahead with making the ESXi build the same on both servers.

    3. Also, all Linux virtual machines use VMXNET 3, but the Windows machines (mailserver and virtualclient) use E1000 network adapters. The Olio machines that show incorrect QoS use VMXNET 3 anyway.

    But do you think points 2 and 3 really have a negative impact on the test results? All the NICs in our environment are 1 Gb. Could that be a problem area?



  • 18.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Mar 10, 2014 04:00 PM

    As I wrote in reply 15, the problem is likely saturation of the 1 Gb NIC. Go through recent 2.5 submissions and find out what type of NIC was used. If there are any recent submissions with 1 Gb NICs, how many were used?
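To judge whether a link is close to saturation, measured throughput can be compared with the link's nominal capacity. The sketch below is a rough illustration only: `link_utilization` is a hypothetical helper, the 940 Mb/s figure is a made-up placeholder rather than a measurement from this thread, and protocol overhead is ignored, so usable capacity will be somewhat lower than nominal.

```python
def link_utilization(throughput_mbit_s, link_gbit_s):
    """Fraction of nominal link capacity in use (1 Gb/s taken as 1000 Mb/s)."""
    if link_gbit_s <= 0:
        raise ValueError("link speed must be positive")
    return throughput_mbit_s / (link_gbit_s * 1000.0)


# Placeholder throughput of 940 Mb/s, evaluated against a 1 Gb and a 10 Gb link.
for link in (1, 10):
    util = link_utilization(940, link)
    print(f"{link} Gb link: {util:.0%} of nominal capacity")
```

The same traffic that nearly fills a 1 Gb link leaves most of a 10 Gb link idle, which is why the NIC type used in comparable submissions is worth checking.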





  • 19.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Posted Apr 08, 2014 06:38 AM

    Things work fine now that all the NICs have been changed to 10 Gb. No more incorrect data for the Olio workload.



  • 20.  RE: CPU utilization goes up to 99%? What are the probable reasons?

    Broadcom Employee
    Posted Apr 08, 2014 02:23 PM

    Glad to hear it.

    Lisa