
Agent java process using 100% CPU

  • 1.  Agent java process using 100% CPU

    Posted Feb 24, 2009 01:21 PM
    Hi,

    I have Hyperic 4.0.1 monitoring 11 platforms and on 10 of them it is working great but on the other 1 the Java process is just hogging all the CPU and I don't know why.

    Basically there are 2 web servers that are both being monitored. They use the same hardware and were set up by the same people using the same binaries at the same time. All config files are identical. One machine works perfectly and the other is having this problem with Java.

    I have tried restarting the agent and testing newer versions of the agent, as well as excluding unwanted plugins in the agent.properties file, but there is still no change.

    The only time I was able to get the CPU back down was by turning off collection of all metrics and restarting the agent. I then added the metrics back one by one and the CPU usage stayed at a negligible level. It remained this way for about 2 weeks but has now started to creep back up again, and no change has been made to the machine that I am aware of.

    Does anyone have any ideas what could be causing Hyperic/Java to do this?

    Thanks,

    Andy


  • 2.  RE: Agent java process using 100% CPU

    Posted Feb 24, 2009 05:39 PM
    Hi,

    What OS are you running on the server? Are you using an HQ Agent package with the bundled JRE?
    Check if there are special JAVA_HOME settings for the user you are using to start the Agent.
    Did you already upgrade the HQ Agent to the latest 4.0.3 version on that host?

    Cheers,
    Mirko


  • 3.  RE: Agent java process using 100% CPU

    Posted Feb 25, 2009 09:21 AM
    Hi,

    All the machines are running Red Hat Enterprise Linux 4.

    Originally I was using the bundled JRE. I have since tried a different JRE, but that had no effect.

    I have also tried 4.0.2 and 4.0.3 and still get the exact same symptoms.

    There are no special JAVA_HOME settings for the hyperic user that I can see.

    The thing I'm finding most odd is that it's not a constant CPU usage at the moment, it seems to be about every 3-4 minutes. At first I thought it was something to do with the agent talking to the server but that should be happening more often so I ruled that out.

    Thanks,

    Andy

    Additional thought: I've changed the agent on the server I'm having problems with, so all the files have changed, and yet the problem remains. Is it possible that the problem could lie at the server end in some way?

    Message was edited by: kwayley



  • 4.  RE: Agent java process using 100% CPU

    Posted Feb 25, 2009 05:34 PM
    Hi,

    I support kwayley. Same problem here.

    I have 2 servers monitored with Hyperic 4.0.3 with identical hardware and configuration features. One works perfectly and the other doesn't.

    Even with agent.logLevel=DEBUG I can't detect the problem. The agent's Java process takes up 100% of the CPU for 2 minutes every 10-15 minutes.

    Services running on this machine are:

    SSH
    PostgreSQL
    MySQL
    NTPD
    Apache


    Temporarily I have installed hyperic-hq-agent-3.2.6 and now it works right.


    Regards,
    KPCasting

    Message was edited by: KPCasting



  • 5.  RE: Agent java process using 100% CPU

    Posted Feb 26, 2009 11:23 AM
    I just tried what KPCasting did and installed 3.2.6 on this machine, and likewise my CPU load has dropped dramatically. I can still see the CPU climbing to 100% just as before, but the time it stays at that level has fallen right down.

    The average CPU usage on this machine is now about 15% but it is about 5% on its twin.


  • 6.  RE: Agent java process using 100% CPU

    Posted Feb 26, 2009 05:01 PM
    Can you try starting the 4.0.3 agent without the Java Service Wrapper? Perhaps the Wrapper is configured in a specific way that is not playing nicely with your system?

    ./bundles/agent-4.0.3-EE/bin/hq-agent-nowrapper.sh start


  • 7.  RE: Agent java process using 100% CPU

    Posted Feb 26, 2009 06:10 PM
    I tried running hq-agent-nowrapper.sh and although the performance was better, it was still not as good as that of the other node. CPU on this one was still hitting 100%, whereas on the other it never went above 6%.

    Performance running the 4.0.1 agent without the wrapper is similar to that of running the 3.2.6 agent as normal. Both leave average CPU usage roughly 10 percentage points higher than on the other, identical machine.


  • 8.  RE: Agent java process using 100% CPU

    Posted Feb 26, 2009 06:33 PM
    Is this a Linux or Unix system? If you are familiar with prstat and jstack, try to check which thread is taking most of the CPU cycles.

    The Java stack trace could then tell us a bit more about which part of the agent is causing the load.
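
On Linux, where prstat is usually not available, one way to follow the suggestion above is to list the process's threads with ps, pick the busiest thread ID, and convert it to the hexadecimal nid value that HotSpot thread dumps print for each thread. The sketch below is only an illustration: the function names hot_threads and tid_to_nid are made up for this example, and the %CPU figure that ps reports is averaged over the thread's lifetime, so top -H -p <pid> may give a better picture of short spikes.

```shell
#!/bin/sh
# List the busiest threads (LWPs) of a process, highest %CPU first.
# $1 = PID of the agent's java process, $2 = number of threads to show (default 5).
hot_threads() {
    ps -L -o lwp=,pcpu= -p "$1" | sort -k2,2nr | head -n "${2:-5}"
}

# Convert a decimal thread ID into the hex "nid" form used in HotSpot thread dumps.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}
```

For example, hot_threads 12345 (12345 standing in for the agent's PID) lists the five busiest threads, and tid_to_nid 28314 prints nid=0x6e9a, which can then be searched for in a thread dump. If jstack cannot attach, sending the JVM a SIGQUIT with kill -QUIT <pid> makes HotSpot print a thread dump to the process's standard output; where that output ends up depends on how the agent was started.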


  • 9.  RE: Agent java process using 100% CPU

    Posted Feb 27, 2009 12:30 PM
    It is a Red Hat Enterprise 4 Linux system (2.6.9-55.ELsmp)

    I don't have prstat. Tried jstack but all I get is a load of errors:

    "Thread 28314: (state = BLOCKED)
    Error occurred during stack walking:
    sun.jvm.hotspot.debugger.DebuggerException: sun.jvm.hotspot.debugger.DebuggerException: get_thread_regs failed for a lwp
    at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal$LinuxDebuggerLocalWorkerThread.execute(LinuxDebuggerLocal.java:134)
    at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.getThreadIntegerRegisterSet(LinuxDebuggerLocal.java:437)
    at sun.jvm.hotspot.debugger.linux.LinuxThread.getContext(LinuxThread.java:48)
    at sun.jvm.hotspot.runtime.linux_x86.LinuxX86JavaThreadPDAccess.getCurrentFrameGuess(LinuxX86JavaThreadPDAccess.java:75)
    at sun.jvm.hotspot.runtime.JavaThread.getCurrentFrameGuess(JavaThread.java:252)
    at sun.jvm.hotspot.runtime.JavaThread.getLastJavaVFrameDbg(JavaThread.java:211)
    at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:50)
    at sun.jvm.hotspot.tools.JStack.run(JStack.java:41)
    at sun.jvm.hotspot.tools.Tool.start(Tool.java:204)
    at sun.jvm.hotspot.tools.JStack.main(JStack.java:58)
    Caused by: sun.jvm.hotspot.debugger.DebuggerException: get_thread_regs failed for a lwp
    at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.getThreadIntegerRegisterSet0(Native Method)
    at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.access$800(LinuxDebuggerLocal.java:34)
    at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal$1GetThreadIntegerRegisterSetTask.doit(LinuxDebuggerLocal.java:431)
    at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal$LinuxDebuggerLocalWorkerThread.run(LinuxDebuggerLocal.java:109)"

    There is more than that but you get the idea.


  • 10.  RE: Agent java process using 100% CPU

    Posted Mar 01, 2009 04:52 PM
    I am running an agent (4.0.3) on Windows with no wrapper.
    I have seen the agent go up to 50% CPU when monitoring via HQ (1 min polling).
    If I monitor with perfmon I can see it go even higher (very short duration but high peaks).

    I would like to know:

    1. What would be a normal CPU usage for the agent? (I would expect no more than 2%.)
    2. What parameters can be used to optimize it? (Debug level? Excluding plugins? Changing the log tracking interval?)


  • 11.  RE: Agent java process using 100% CPU

    Posted Mar 02, 2009 12:11 PM
    Yesterday morning, just after the scheduled backup task that runs every morning and uses 100% of both CPU cores, the agent's CPU usage dropped to the level I would expect given what the other node is doing.

    Unfortunately I am still running the old agent, and I don't really want to put the new agent back until I can feel confident that the CPU isn't just going to shoot up again.

    Can anyone think why the CPU would drop down all by itself like this?

    I have also noticed this when applying the exclude.plugin option to all of the servers that are being monitored. Initially the memory used by the agent increases and then after a day to a week it drops right down for no obvious reason.


  • 12.  RE: Agent java process using 100% CPU

    Posted Mar 09, 2009 09:16 AM
    Okay, this is really starting to bug me now. At the beginning of last week the agent ran perfectly, but since then the CPU usage has been creeping back up.

    To give you an idea of what I've been dealing with please see the attached picture. The red line is purely because of Hyperic. The green line shows the equivalent CPU core from the other machine.


  • 13.  RE: Agent java process using 100% CPU

    Posted Mar 09, 2009 12:00 PM
    Can you run the agent in debug and upload the log file?
    It would be interesting to see if it is related to log tracking.
    For me the problem went away completely when I reduced how often log tracking runs (although that was on Windows).


  • 14.  RE: Agent java process using 100% CPU

    Posted Mar 09, 2009 12:55 PM
    Log file attached while running agent 3.2.6.

    Would it be helpful to see the log from a 4.0.* agent as well?


  • 15.  RE: Agent java process using 100% CPU

    Posted Mar 16, 2009 09:17 AM
    Both versions of the agent are still causing me problems as they are both being far too CPU hungry. Does anyone have anything additional I could try in order to get it fixed?


  • 16.  RE: Agent java process using 100% CPU
    Best Answer

    Posted Mar 16, 2009 09:36 AM
    Although I can't be sure that log tracking is the cause, as it was in my case, I would try running the agent without log tracking, or with a very high interval for it, and comparing the CPU usage.

    Try putting this in the agent.properties:
    track.interval=99999

    (the units are in seconds)


  • 17.  RE: Agent java process using 100% CPU

    Posted Mar 16, 2009 09:42 AM
    Thanks, I'll give that a try and see what happens. :)


  • 18.  RE: Agent java process using 100% CPU

    Posted Mar 16, 2009 12:22 PM
    Yup, it was log tracking that was causing the problem.

    Basically one node was set up to track a HUGE PHP log and the other one wasn't. Now that I've removed tracking for that log, everything has returned to normal.

    Thanks very much. :)
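
Since the cause here turned out to be one very large tracked log, a quick way to look for similar candidates is to list oversized files under the directories whose logs are tracked. This is a minimal sketch, assuming a POSIX shell with find, du and sort; the function name large_logs and the 100 MB default threshold are arbitrary choices for illustration, not Hyperic settings.

```shell
#!/bin/sh
# Print files under directory $1 larger than $2 megabytes (default 100),
# largest first, as "<size in KB> <path>".
large_logs() {
    dir=$1
    min_kb=$(( ${2:-100} * 1024 ))
    find "$dir" -type f -size +"${min_kb}k" -exec du -k {} \; | sort -nr
}
```

For example, large_logs /var/log 500 would list files over 500 MB under /var/log, largest first. Which of those files are actually tracked still has to be checked in the log tracking configuration of the monitored resources.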


  • 19.  RE: Agent java process using 100% CPU

    Posted Dec 03, 2012 04:55 AM

    Hi SLTB/kwayley,

    I am facing the same issue.

    Currently I am using CentOS 5.5 and Hyperic 4.4.0.

    The CPU is hogged up to 100% for a few seconds every 15 minutes. Currently no track.interval is defined in agent.properties, and the default interval is 5 minutes.

    If the issue were related to log tracking, it should happen every 5 minutes.

    Waiting for a reply. Thanks.

    Regards,

    Sanjiv Singh
