DX Application Performance Management

CA Tuesday Tip: 2013 - Java Agent Hangs, Crashes, OOM, High CPU, Agent conf

  • 1.  CA Tuesday Tip: 2013 - Java Agent Hangs, Crashes, OOM, High CPU, Agent conf

    Posted 04-10-2013 11:47 AM
    CA APM Tuesday Tip by Sergio Morales, Principal Support Engineer for 4/10/2013

    Hi Everyone,
    Here is an update of my previous post from 2011. Below is a checklist of the points you must review whenever you run into any of the Java agent issues named in the title:

    Checklist:

    1. Is the issue affecting all the AppServers?

    2. Does the problem occur on start-up?

    3. Verify that the configuration is supported; see the APM compatibility guides:
    https://support.ca.com/irj/portal/anonymous/phpsupcontent?contentID=883df031-705e-425b-9a0e-73130da8a204&productID=5974
    If your configuration is not listed, open an enhancement request via the CA APM Community site at the following address:

    https://caideation.secure.force.com/ideation/idealist?c=09a30000000JkclAAC&sort=popular&lang=en_US&lrUrl=https://communities.ca.com/web/ca-wily-global-user-community/welcome&isLoggedIn=false&lrCUrl=https://communities.ca.com/web/ca-wily-global-user-community/welcome&lrCname=CAWily/APM Global User Community
    Link to instructional video for the proper way to submit enhancement requests:
    www.ca.com/media/idea_vision_21610/index.html

    4. If the agent is configured correctly, check whether it has written any log files.

    5. Look for possible syntax errors in the Autoprobe.log. A syntax error in the PBD/PBL files can prevent the agent from starting up.

    6. Find out whether the problem is related to the Agent instrumentation or to a JVM bug. This is a very common issue.

    NOTE: A JVM crash points to a defect in the JVM itself, because a JVM is not supposed to crash under any circumstances.

    Open the IntroscopeAgent profile, set introscope.autoprobe.enable=false, and restart the JVM.
    If the problem persists, that confirms it is not related to the Agent instrumentation:

    a. Try switching from -javaagent to -Xbootclasspath.
    If the problem persists, you need to open a support incident with the JVM vendor.
    b. Upgrade to the latest JVM or use an alternative JVM.

    If the problem goes away, re-enable the instrumentation but disable SQLAgent, any custom PBD/PBL, and Agent extensions:
    a) Stop the Appserver.
    b) In the IntroscopeAgent profile, set
    introscope.autoprobe.enable=true
    c) Disable SQLAgent by moving SQLAgent.jar out of the AGENT directory.
    If you are using v9.1, you can use: introscope.agent.sqlagent.sql.turnoffmetric=true
    d) Disable JMX collection by setting introscope.agent.jmx.enable=false
    e) Turn off the tracers for network, file system, and System File Metrics in the toggles PBD file by commenting out these lines:

    #TurnOn: SocketTracing
    #TurnOn: UDPTracing
    #TurnOn: FileSystemTracing
    #TurnOn: ManagedSocketTracing

    f) Disable any additional Agent extensions such as ChangeDetector, LeakHunter, and PowerPacks.
    g) Disable any additional custom PBD, agent extension, or formatter created by the Professional Services team.
    If the problem no longer occurs, reintroduce the components one at a time until the problem reappears.
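
As a rough illustration of steps b) and d), the profile changes can be scripted so that each test run starts from a known state. This is only a sketch: the helper name apply_profile_overrides and the sample profile text are made up for this example, and it assumes the profile is a plain key=value properties file. The two property names are the ones given in the steps above.

```python
def apply_profile_overrides(profile_text, overrides):
    """Return profile text with each key in `overrides` set to its new value.

    Existing key=value lines are rewritten in place, keys that are not
    present yet are appended at the end, and comment lines (starting
    with '#') are left untouched.
    """
    remaining = dict(overrides)
    out = []
    for line in profile_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in remaining:
                line = f"{key}={remaining.pop(key)}"
        out.append(line)
    # Append any override that did not match an existing line.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Settings for the "instrumentation on, JMX collection off" test run.
ISOLATION_OVERRIDES = {
    "introscope.autoprobe.enable": "true",
    "introscope.agent.jmx.enable": "false",
}

# Hypothetical profile content, used only to demonstrate the helper.
sample = "# sample profile\nintroscope.autoprobe.enable=false\n"
patched = apply_profile_overrides(sample, ISOLATION_OVERRIDES)
print(patched)
```

Keep a backup of the original profile before writing the patched text back, so that you can re-enable one component at a time.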

    7. If the issue is related to memory, make sure you set introscope.agent.reduceAgentMemoryOverhead=true in the IntroscopeAgent.profile

    8. If the issue is related to an OOM, confirm whether the problem occurs in native or heap space. You should be able to confirm this from the stack trace or thread dump. The only component that works directly with native memory is Platform Monitor, so if the issue occurs in native memory, try disabling it.

    9. If the issue is related to high CPU, disable Platform Monitor. For more details about how to disable this extension, see the Agent guide.

    10. If the issue is related to memory, a crash, or performance and you are using v9.1 with IBM JDK J9, switch to AgentNoRedefNoRetrans.jar and IntroscopeAgent.NoRedef.profile instead of Agent.jar and IntroscopeAgent.profile

    11. If the issue is related to memory, a crash, or performance and you are using v9.x, try disabling deep inheritance (introscope.autoprobe.deepinheritance.enabled=false). The deep inheritance cache is built at startup by looking at each class loaded by the JVM, so it can add significant memory and CPU overhead when a huge number of classes are loaded at startup. Currently the cache is unlimited in size until we start aging out entries that are 10 minutes old.

    12. If you are using 9.1.1.1+ with Oracle RAC as the backend, you might notice some overhead because we use reflection to get the correct RAC instance name. You should see some relief in CPU utilization if you add the following agent property: introscope.agent.sqlagent.cacheConnectionsURLs=true

    13. Mixed mode is NOT supported. If you have legacy extensions, tracers, or PBDs, switch the agent to legacy mode. Turning on mixed mode (new tracers + legacy) will cause performance issues.

    14. If the problem persists and you are using 9.1.x, switch to the pre-9.1 legacy mode by setting:
    introscope.agent.configuration.old=true (this is a hidden property, so you must add it to the profile yourself)
    The legacy PBD/PBL files are located in wily/examples/legacy. Copy all of them to wily/core/config and reconfigure the agent as below:
    introscope.autoprobe.directivesFile=default-typical-legacy.pbl,hotdeploy,spm-legacy.pbl

    15. If you are using v9.1 and Java 1.7, add -XX:-UseSplitVerifier to the normal Agent parameters in order to start the Java 7 JVM without the new verifier. Reason: JSR 202 changed class verification to use type checking. Class files with version number 51 are verified exclusively by the type-checking verifier, so their methods must have StackMapTable attributes when appropriate. The exceptions you will see if -XX:-UseSplitVerifier is not used are:
    java.lang.VerifyError:StackMapTable error:bad offset
    java.lang.ClassFormatError:Illegal local variable table
    java.lang.InternalError
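
To check whether a given class is affected, you can read the version number from the class file header. The sketch below is illustrative only: the function names are made up for this example, and it relies on the standard class file layout (a 4-byte magic number 0xCAFEBABE followed by 2-byte minor and major versions, all big-endian).

```python
import struct

JAVA_7_MAJOR_VERSION = 51  # class file version mentioned in item 15


def class_file_major_version(data):
    """Return the major version stored in the first 8 bytes of a .class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    magic, _minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file (bad magic number)")
    return major


def needs_stack_map_tables(data):
    """True when the class is verified only by the type-checking verifier."""
    return class_file_major_version(data) >= JAVA_7_MAJOR_VERSION


# Synthetic headers for a Java 7 (51) and a Java 6 (50) class file.
java7_header = struct.pack(">IHH", 0xCAFEBABE, 0, 51)
java6_header = struct.pack(">IHH", 0xCAFEBABE, 0, 50)
print(needs_stack_map_tables(java7_header), needs_stack_map_tables(java6_header))
```

For a real class, pass the first bytes of the file, for example open(path, "rb").read(8).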

    16. For some appservers, additional configuration steps are required in order to enable the agent.

    For example, the error “java.lang.NoClassDefFoundError: com/wily/introscope/agent/trace/IMethodTracer” will occur if you are using OSGi Felix configurations.

    a) If Glassfish 3.1.2, open <glassfish_home>\glassfish\config\osgi.properties and add the wily classes to the property: eclipselink.bootdelegation=oracle.sql, oracle.sql.*, com.wily.*
    b) If Glassfish 3.1.1, open <glassfish_home>\glassfish\osgi\felix\conf\config.properties and add a wildcard pattern for the wily classes: org.osgi.framework.bootdelegation=sun.*,com.sun.*,com.wily.*
    c) If Weblogic and Apache Felix, add sling.bootdelegation.com.wily=com.wily.* to the sling.properties file. This setting unconditionally adds the com.wily.* package to the org.osgi.framework.bootdelegation property.
    d) If Tomcat and Apache felix, add the below system property:
    -Datlassian.org.osgi.framework.bootdelegation=com.wily.*,sun.*,net.customware.*,org.apache.*
    e) If Jboss6, use the below JVM options:
    -javaagent:%WILY_HOME%\Agent.jar -Dcom.wily.introscope.agentProfile=%WILY_HOME%\core\config\IntroscopeAgent.profile -Xbootclasspath/p:%JBOSS6_HOME%\lib\jboss-logmanager.jar -Djava.util.logging.manager=org.jboss.logmanager.LogManager -Dorg.jboss.logging.Logger.pluginClass=org.jboss.logging.logmanager.LoggerPluginImpl
    f) If Jboss7: See JBoss7 1 Issues.doc

    If you see the exception below, the problem is related to https://issues.jboss.org/browse/JBAS-7427.
    The LogManager is in the wrong state because of a JBoss class-loading problem. Application class loading is controlled by the application server, so you should raise this with the JBoss AS vendor:

    java.lang.IllegalStateException: The LogManager was not properly installed (you must set the "java.util.logging.manager" system property to "org.jboss.logmanager.LogManager")
    at org.jboss.logmanager.Logger.getLogger(Logger.java:61)
    at org.jboss.as.server.Main.main(Main.java:83)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
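
When going through a large appserver log, it can help to scan for the two signatures described in item 16 before reading the whole file. The following is a minimal sketch, not a supported tool: the function name triage and the hint texts are made up for this example, and the search strings are taken from the error messages quoted above.

```python
# (search string, hint) pairs based on the errors quoted in item 16.
KNOWN_SIGNATURES = [
    (
        "NoClassDefFoundError: com/wily/",
        "OSGi boot delegation does not include com.wily.* (see item 16 a-d)",
    ),
    (
        "The LogManager was not properly installed",
        "JBoss LogManager class-loading problem (see JBAS-7427)",
    ),
]


def triage(log_text):
    """Return the hints whose search string appears in the log text."""
    return [hint for needle, hint in KNOWN_SIGNATURES if needle in log_text]


sample_log = (
    "Caused by: java.lang.NoClassDefFoundError: "
    "com/wily/introscope/agent/trace/IMethodTracer\n"
)
for hint in triage(sample_log):
    print(hint)
```

An empty result only means that neither signature was found; it does not rule out other agent problems.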

    17. Review the “Known issues” section of the latest product readme files:
    For v8.x: Introscope8.x.x.x_README.pdf
    For v9.0.x: APM_Known_Issues9.0.x.x.pdf
    For v9.1.x: APM_Release_Notes_9.1.x.x_EN.pdf

    APM Product Releases and Announcement: https://support.ca.com/irj/portal/anonymous/phpsupcontent?contentID=378db89f-3375-42c3-a07d-9e983d13c0a6&productID=5974



    What to collect if the problem persists?

    Collect the following information and open an incident with CA Support.
    1. Zipped content of AGENT_HOME/logs
    2. IntroscopeAgent.profile
    3. For OOM or high CPU situations, a series of 5 thread dumps from the application server, spaced 5-10 seconds apart.
    4. Appserver logs
    5. App server config or startup script files.
    6. Core dump, if applicable.
    7. Exact version of the application server, JVM, and OS.
    8. In case of OOM, a heap dump. Additional JVM switches are required for this.
    For the Sun JVM, add the following switch: -XX:+HeapDumpOnOutOfMemoryError
    9. GC log. Additional JVM switches are required for this.
    For the Sun JVM, add the following switches: -Xloggc:<filename>.log -XX:+PrintGCDetails
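
The thread dump series from point 3 can be automated. The sketch below assumes a Sun/HotSpot JDK with the jstack tool on the PATH; other JVMs use different mechanisms. The function names and the injectable dump and sleep parameters are choices made for this example only.

```python
import subprocess
import time


def jstack_dump(pid):
    """Run the JDK's jstack tool for one process and return its output."""
    result = subprocess.run(
        ["jstack", str(pid)], capture_output=True, text=True, check=True
    )
    return result.stdout


def collect_thread_dumps(pid, count=5, interval_seconds=5,
                         dump=jstack_dump, sleep=time.sleep):
    """Take `count` thread dumps, waiting `interval_seconds` between them."""
    dumps = []
    for i in range(count):
        dumps.append(dump(pid))
        if i < count - 1:  # no need to wait after the last dump
            sleep(interval_seconds)
    return dumps


# Example (needs a real Java process id, so it is left as a comment):
# for n, text in enumerate(collect_thread_dumps(12345), start=1):
#     with open(f"threaddump_{n}.txt", "w") as fh:
#         fh.write(text)
```

Passing the dump function as a parameter makes it easy to swap in another mechanism if jstack is not available for your JVM.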

    Regards,
    Sergio

    Attachment(s)

    JBoss7 1 Issues.doc   34K 1 version


  • 2.  Re: CA Tuesday Tip: 2013 - Java Agent Hangs, Crashes, OOM, High CPU, Agent conf

    Posted 02-18-2015 10:35 AM

    Hi Sergio,

     

    I am trying to monitor a TIBCO ActiveMatrix Service Grid 3.2, but the following message is shown in the application server log file.

     

    TIBCO-AMX-HPA-000047: error initializing notification group 
             at com.tibco.amf.hpa.core.common.notification.NotificationGroup.<init>(NotificationGroup.java:188) 
             at com.tibco.amf.hpa.core.runtime.notification.StatusNotificationGroup.<init>(StatusNotificationGroup.java:91) 
             at com.tibco.amf.hpa.core.runtime.notification.StatusManagerStateMachine.initStatusNotificationGroup(StatusManagerStateMachine.java:432) 
             at com.tibco.amf.hpa.core.runtime.notification.StatusManagerStateMachine.bind(StatusManagerStateMachine.java:110) 
             at com.tibco.amf.hpa.core.runtime.node.NodeStateMachine.bind(NodeStateMachine.java:127) 
             at com.tibco.amf.hpa.tibcohost.node.internal.Activator.initStateMachine(Activator.java:290) 
             at com.tibco.amf.hpa.tibcohost.node.internal.Activator.init(Activator.java:143) 
             at com.tibco.amf.hpa.tibcohost.node.internal.Activator.start(Activator.java:112) 
             at org.eclipse.osgi.framework.internal.core.BundleContextImpl$1.run(BundleContextImpl.java:711) 
             at java.security.AccessController.doPrivileged(Native Method) 
             at org.eclipse.osgi.framework.internal.core.BundleContextImpl.startActivator(BundleContextImpl.java:702) 
             at org.eclipse.osgi.framework.internal.core.BundleContextImpl.start(BundleContextImpl.java:683) 
             at org.eclipse.osgi.framework.internal.core.BundleHost.startWorker(BundleHost.java:381) 
             at org.eclipse.osgi.framework.internal.core.AbstractBundle.resume(AbstractBundle.java:389) 
             at org.eclipse.osgi.framework.internal.core.Framework.resumeBundle(Framework.java:1131) 
             at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:559) 
             at org.eclipse.osgi.framework.internal.core.StartLevelManager.resumeBundles(StartLevelManager.java:544) 
             at org.eclipse.osgi.framework.internal.core.StartLevelManager.incFWSL(StartLevelManager.java:457) 
             at org.eclipse.osgi.framework.internal.core.StartLevelManager.doSetStartLevel(StartLevelManager.java:243) 
             at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:438) 
             at org.eclipse.osgi.framework.internal.core.StartLevelManager.dispatchEvent(StartLevelManager.java:1) 
             at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230) 
             at org.eclipse.osgi.framework.eventmgr.EventManager$EventThread.run(EventManager.java:340) 
     Caused by: java.lang.NoClassDefFoundError: com/wily/introscope/agent/trace/IMethodTracer 
             at com.tibco.tibjms.TibjmsxSessionImp._createProducer(TibjmsxSessionImp.java:1021) 
             at com.tibco.tibjms.TibjmsQueueSession.createSender(TibjmsQueueSession.java:92) 
             at com.tibco.tibjms.admin.MessengerUtil.<init>(MessengerUtil.java:51) 
             at com.tibco.tibjms.admin.TibjmsAdmin.<init>(TibjmsAdmin.java:421) 
             at com.tibco.tibems.qin.TibQinGroupConnectionImpl._queryGroupConnections(TibQinGroupConnectionImpl.java:1380) 
             at com.tibco.tibems.qin.TibQinGroupConnectionImpl._createGroupConnection(TibQinGroupConnectionImpl.java:1311) 
             at com.tibco.tibems.qin.TibQinGroupConnectionImpl.initialize(TibQinGroupConnectionImpl.java:1909) 
             at com.tibco.tibems.qin.TibQinGroupConnectionFactory.createGroupConnection(TibQinGroupConnectionFactory.java:73) 
             at com.tibco.neo.gms.GroupConnectionFactory.createGroupConnection(GroupConnectionFactory.java:62) 
             at com.tibco.amf.hpa.core.common.notification.NotificationGroup.<init>(NotificationGroup.java:184) 
             ... 22 more 
     Caused by: java.lang.ClassNotFoundException: com.wily.introscope.agent.trace.IMethodTracer 
             at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:513) 
             at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:429) 
             at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:417) 
             at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107) 
             at java.lang.ClassLoader.loadClass(ClassLoader.java:247) 
             ... 32 more
    
    
    
    

    It seems that there is a problem loading OSGi packages, as you describe in item 16 of your checklist (the additional configuration steps for OSGi Felix setups, which cover Glassfish, Weblogic, Tomcat, and JBoss).

    Do you know where I can find this configuration option for this AppServer?

     

    Regards!!



  • 3.  Re: CA Tuesday Tip: 2013 - Java Agent Hangs, Crashes, OOM, High CPU, Agent conf

    Posted 02-19-2015 03:53 AM

    Hello,

    A similar issue was reported in the past. Try the procedure below; if it does not help, open a support ticket.

    Regards,

    Sergio

     

    ===

    The procedure is as follows:

    1. Open machine.xmi, which in our case is located in <TIBCO_HOME>/tibco/data/tibcohost/<HOST>/tools/machinemodel

    2. Find the runtime configuration for your node, which looks like the following:

    <runtimes xsi:type="machinemodel:OSGiRuntime" name="<NODE_NAME>" description="TIBCO ActiveMatrix ...>

     

    3. Modify one of the following properties under that runtime configuration:

    a. Append com.wily.* to the org.osgi.framework.bootdelegation value

    <frameworkProperties key="org.osgi.framework.bootdelegation" value="com.ibm.*,com.sun.activation.*,... *,sun.*,com.wily.*"/>

     

    OR

    b. Set osgi.compatibility.bootdelegation to true

    <frameworkProperties key="osgi.compatibility.bootdelegation" value="true"/>

     

    4. Set the JVM args

    <frameworkProperties key="jvm.args" value="-javaagent:d:/wily/Agent.jar -Dcom.wily.introscope.agentProfile=d:/wily/core/config/IntroscopeAgent.profile...">

     

    You can also specify JVM args in java.extended.properties in the .tra file:

    -javaagent:d:/wily/Agent.jar -Dcom.wily.introscope.agentProfile=d:/wily/core/config/IntroscopeAgent.profile

     

    For this specific customer setup, specifying -Dorg.osgi.framework.bootdelegation or -Dosgi.compatibility.bootdelegation in the .tra file didn't work. The changes had to be made in machine.xmi.
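
If you have many nodes to change, step 3a can be scripted. This is only a sketch under assumptions: the sample XML is a simplified stand-in (the <machine> root element and node name are made up, and a real machine.xmi has more attributes and namespaces), and the function name is chosen for this example. Always back up machine.xmi before modifying it.

```python
import xml.etree.ElementTree as ET

BOOTDELEGATION_KEY = "org.osgi.framework.bootdelegation"


def add_wily_bootdelegation(runtime_element, pattern="com.wily.*"):
    """Append `pattern` to the bootdelegation value of one <runtimes> element.

    Returns True if a bootdelegation property was found, False otherwise.
    """
    for prop in runtime_element.iter("frameworkProperties"):
        if prop.get("key") == BOOTDELEGATION_KEY:
            entries = [e.strip() for e in prop.get("value", "").split(",") if e.strip()]
            if pattern not in entries:  # avoid adding the pattern twice
                entries.append(pattern)
            prop.set("value", ",".join(entries))
            return True
    return False


# Simplified, hypothetical stand-in for a machine.xmi fragment.
SAMPLE = """<machine>
  <runtimes name="node1">
    <frameworkProperties key="org.osgi.framework.bootdelegation" value="com.ibm.*,sun.*"/>
  </runtimes>
</machine>"""

root = ET.fromstring(SAMPLE)
node = root.find("runtimes[@name='node1']")
changed = add_wily_bootdelegation(node)
print(ET.tostring(root, encoding="unicode"))
```

Running the function a second time leaves the value unchanged, so it is safe to apply repeatedly.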



  • 4.  Re: CA Tuesday Tip: 2013 - Java Agent Hangs, Crashes, OOM, High CPU, Agent conf

    Posted 05-06-2015 01:06 PM

    Hi again Sergio,

     

    I have installed a Java agent on a TIBCO BusinessWorks server running on Windows. The appserver runs as a Windows service, but the agent is only monitoring the Administration console, and I do not know how to monitor all the projects inside this BW server.

    I followed this guide to monitor the Administration console: TIBCO BusinessWorks - DevTest Solutions - 8.0 - CA Wiki, but I could not find every project in the Windows registry.

     

    bwengine.gif

    register.gif

     

    I also tried to add the -javaagent directive directly in the bwengine.tra files as follows:

     

    bwengine.gif

    No logs are generated with this configuration. Process Explorer shows an .exe program running for each project, but there is no Windows service associated with any of the bwengine.exe processes.

    bwengine.gif

    Do you know where I can configure the -javaagent directive to load the agent in each BW project?

     

    Thanks and regards!