  • 1.  Divide by zero error on HQ server (?)

    Posted Aug 10, 2006 09:16 PM
    I know y'all have to be getting tired of me. :(

    I have an agent system that is unable to communicate with the HQ server beyond the initial setup.

    This agent system is a Windows 2003 Server, 32-bit, x86. I uncompress the 2.7.3-91 agent, then run hq-agent.exe. I set it up the way I do all my agents: all defaults except for the server port, which I set to 80. I then go to the Dashboard on the 2.7.3 server, and there is the system and its servers/services/whatever, ready to be imported. I import them. Thereafter, the agent cannot send metrics.

    I turned on DEBUG, and get the following:

    2006-08-10 15:52:40,456 INFO [CommandsServer] Setting the HQ server to: http://ftw-mon-core-01:80/jboss-lather/JBossLather
    2006-08-10 15:52:40,456 INFO [ConfigPopulateThread] Starting config populate thread
    2006-08-10 15:52:40,550 INFO [AutoinventoryCommandsServer] Autoinventory report successfully sent to server.
    2006-08-10 15:52:54,331 ERROR [CommandListener] Failed handling new connection
    org.hyperic.hq.agent.AgentConnectionException: Error negotiating auth: Remote host closed connection during handshake
    at org.hyperic.hq.bizapp.agent.server.SSLConnectionListener.handleNewConn(SSLConnectionListener.java:94)
    at org.hyperic.hq.bizapp.agent.server.SSLConnectionListener.getNewConnection(SSLConnectionListener.java:159)
    at org.hyperic.hq.agent.server.CommandListener.listenLoop(CommandListener.java:168)
    at org.hyperic.hq.agent.server.AgentDaemon.start(AgentDaemon.java:680)
    at org.hyperic.hq.agent.server.AgentDaemon$RunnableAgent.run(AgentDaemon.java:744)
    at java.lang.Thread.run(Unknown Source)
    2006-08-10 15:53:15,689 INFO [RuntimeAutodiscoverer] Running runtime autodiscovery for .NET 1.1
    2006-08-10 15:53:15,736 INFO [RuntimeAutodiscoverer] .NET 1.1 discovery took 0.047
    2006-08-10 15:53:18,408 ERROR [ScheduleThread] Metric Value not found: \.NET CLR Data\SqlClient: Current # pooled and nonpooled connections
    2006-08-10 15:53:18,424 ERROR [ScheduleThread] Metric Value not found: \.NET CLR Loading(_Global_)\Rate of appdomains
    2006-08-10 15:53:18,424 ERROR [ScheduleThread] Metric Value not found: \.NET CLR Loading(_Global_)\Rate of Load Failures
    2006-08-10 15:53:18,471 ERROR [ScheduleThread] Metric Value not found: \.NET CLR Networking\Bytes Sent
    2006-08-10 15:53:18,471 ERROR [ScheduleThread] Metric Value not found: \.NET CLR Loading(_Global_)\Rate of Assemblies
    2006-08-10 15:53:20,611 ERROR [SenderThread] Error sending measurements: Remote error while invoking 'measurementSendReport: org.hyperic.lather.LatherRemoteException: Runtime exception: RuntimeException; CausedByException is:
    / by zero


  • 2.  RE: Divide by zero error on HQ server (?)

    Posted Aug 10, 2006 10:57 PM
    Not at all, Brad. Keep 'em coming. This is actually a known issue. We are investigating the cause of it, but at the moment we only have a band-aid fix for the symptom. The fix is in the source tree, but not in the binaries yet. Hopefully we'll get to the bottom of this soon, or at least before the release of the next version. Thanks.

    Charles


  • 3.  RE: Divide by zero error on HQ server (?)

    Posted Aug 11, 2006 03:20 PM
    Is there any known workaround, or am I going to be unable to monitor this specific server in the interim?

    I should add to my report that I switched to SSL and it made no difference at all.

    On another note, the .NET and IIS plugins go NUTS on this system! 8MB of log in no time trying the same detections over and over and over in a seemingly interminable loop.


  • 4.  RE: Divide by zero error on HQ server (?)

    Posted Aug 11, 2006 08:09 PM
    It looks like you enabled more than the default set of metrics for .NET and likely for IIS too?
    For .NET it seems certain metrics are not available on all platforms and/or depend on which .NET components are in use. You can run the plugin from the command line to see which are available:

    .\jre\bin\java -jar pdk\lib\hq-product.jar -p dotnet -t ".NET 1.1"

    You should disable metrics collection for any failed metrics. At some point we'll figure out the dependencies here or just gracefully handle counters that are not present on the system rather than logging ERRORs. You can also use the command line to test IIS: -p iis -t "IIS 5.x"
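
    To decide which metrics to disable, it may help to pull the distinct "Metric Value not found" entries out of the agent log instead of scrolling through megabytes of repeats. The following is a minimal sketch, not part of HQ: it assumes the log lines look like the ERROR lines quoted earlier in this thread, and the function names and the log path passed on the command line are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumed line format, taken from the log excerpt in this thread:
# 2006-08-10 15:53:18,471 ERROR [ScheduleThread] Metric Value not found: \.NET CLR Networking\Bytes Sent
MISSING_METRIC = re.compile(
    r"ERROR \[ScheduleThread\] Metric Value not found: (?P<metric>.+)$"
)


def missing_metrics(lines):
    """Count how many times each metric is reported as not found."""
    counts = Counter()
    for line in lines:
        match = MISSING_METRIC.search(line.rstrip("\r\n"))
        if match:
            counts[match.group("metric").strip()] += 1
    return counts


def summarize_log(path):
    """Read a log file (path is supplied by the caller) and count misses."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return missing_metrics(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the most frequently failing metrics first.
    for metric, count in summarize_log(sys.argv[1]).most_common():
        print(f"{count:6d}  {metric}")
```

    Run it with the path to the agent's log file as the only argument. Each metric it prints is a candidate for having collection disabled on that platform.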

    As for the divide-by-zero, I don't think there is a workaround at the moment; we're looking into it.


  • 5.  RE: Divide by zero error on HQ server (?)

    Posted Aug 14, 2006 02:24 PM
    No, when I set up HQ server I turned off all .NET monitoring, as we don't use it. I turned off almost all IIS metrics. I think the only ones I left on were connections and availability. This is the global metric default.


  • 6.  RE: Divide by zero error on HQ server (?)

    Posted Sep 05, 2006 02:37 PM
    Do you have any news on a fix for the divide-by-zero error? I have run into it too.

    Rgds

    Simon


  • 7.  RE: Divide by zero error on HQ server (?)

    Posted Sep 05, 2006 04:36 PM
    The HEAD tag has several fixes for this problem, if you want to build from the tree. However, we are still testing those fixes. Otherwise, we are targeting a release date for 2.7.4 in about 2 weeks.

    Charles



  • 8.  RE: Divide by zero error on HQ server (?)

    Posted Sep 28, 2006 04:01 PM
    Do you know which version of the enterprise edition has fixed this problem?


  • 9.  RE: Divide by zero error on HQ server (?)

    Posted Sep 28, 2006 09:22 PM
    You can follow this bug report on our Jira:

    http://jira.hyperic.com/browse/HHQ-217

    A workaround was put in for both 2.7.4 and 2.6.28 so that the agent will not crash when this condition occurs.


  • 10.  RE: Divide by zero error on HQ server (?)

    Posted Sep 29, 2006 09:13 PM
    Resolved in new version.