DX NetOps

  • 1.  JAVA SERVICE ON DATA AGGREGATOR

    Posted Jul 20, 2020 10:14 AM
    Failure in the Java service of the Data Aggregator.
    The Java service on the Data Aggregator server is restarting frequently.
    It displays these messages:

    - JMS broker does not respond to heartbeat in 300 secs.. (error count:10)
    - FailoverTransport; Transport (tcp://127.0.0.1:61616) failed , attempting to automatically reconnect: java.io.EOFException

    How can I get around this? Any guidance?

    Cássio Pinheiro - PRODEMGE

    ------------------------------
    Analista
    Cia de Tecnologia da Informação de Minas Gerais - PRODEMGE
    ------------------------------


  • 2.  RE: JAVA SERVICE ON DATA AGGREGATOR

    Broadcom Employee
    Posted Jul 20, 2020 10:28 AM
    Try stopping and starting the ActiveMQ service on the DA.

    Once it's started, confirm it's listening on all 4 ports:

    netstat -an | grep 61
    You should see LISTEN for 61616, 61618, 61620, and 61622.
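
    As a rough sketch of that check, the function below reads `netstat -an` output and prints any of the four ports that has no LISTEN entry. The function name `missing_amq_ports` is made up for illustration, and the pattern assumes the usual `address:port` (or `address.port`) layout of the local-address column.

```shell
# Ports the thread says ActiveMQ should be listening on.
AMQ_PORTS="61616 61618 61620 61622"

# Hypothetical helper: read `netstat -an` output on stdin and print each
# expected port that has no LISTEN entry. Prints nothing if all are listening.
missing_amq_ports() {
    amq_listing=$(cat)
    for amq_port in $AMQ_PORTS; do
        # Match the port at the end of the local address, e.g. "0.0.0.0:61616",
        # followed later on the same line by the LISTEN state.
        if ! printf '%s\n' "$amq_listing" | grep -Eq "[:.]${amq_port}[[:space:]].*LISTEN"; then
            echo "$amq_port"
        fi
    done
}
```

    Usage on the DA host would be `netstat -an | missing_amq_ports`; empty output means all four ports are in LISTEN state.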

    Then try starting the DA.  Another possibility is that the DA is taking too long to load the attributes from the DB, so the AMQ connections time out and get recreated over and over.

    If you keep seeing the message below over and over and it's been an hour, it's possible your DA doesn't have enough memory to load the DB.  Startup requires more memory than normal run time because we need to load the DB into structures before we construct the objects in the item repository cache.

    INFO | xtenderThread-97 | 2020-06-15 13:05:49,106 | ItemDbInterfaceImpl | vertica.impl.ItemDbInterfaceImpl 752 | re.database.irep.vertica | | Waiting for Attribute loading to complete - Load time: 0:04:00.018

    If this is happening, open a support case.
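
    To see how long attribute loading has been running, a sketch like the one below can pull the timestamp and load time out of those log lines. It assumes the pipe-delimited layout and the `H:MM:SS.mmm` load-time format seen in the single sample line above; the function name `attribute_load_times` is hypothetical, and the log file location is not given in this thread.

```shell
# Hypothetical helper: read DA log lines on stdin and, for each
# "Waiting for Attribute loading" line, print:
#   <timestamp> <load time> <load time in whole seconds>
attribute_load_times() {
    awk -F'|' '
    /Waiting for Attribute loading to complete - Load time: / {
        ts = $3                           # third pipe-delimited field is the timestamp
        gsub(/^ +| +$/, "", ts)           # trim surrounding spaces
        n = split($0, parts, "Load time: ")
        split(parts[n], hms, ":")         # assumed format H:MM:SS.mmm
        secs = hms[1] * 3600 + hms[2] * 60 + int(hms[3])
        print ts, parts[n], secs
    }'
}
```

    Feed it the DA log on stdin; a last value approaching 3600 seconds corresponds to the one-hour mark mentioned above.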


  • 3.  RE: JAVA SERVICE ON DATA AGGREGATOR

    Broadcom Employee
    Posted Jul 20, 2020 12:51 PM
    To add to what Jeff said: if you do end up opening an issue, it would be useful to know whether this has been happening all along or is a recent occurrence.  If the behavior is new, has anything changed in the environment?  For example, have you started monitoring QoS, and has that caused the number of polled items to increase sharply?

    Joe


  • 4.  RE: JAVA SERVICE ON DATA AGGREGATOR

    Posted Jul 21, 2020 02:19 PM
    Hi Jeffrey and Joseph.

    Unfortunately, the context of the problem is different from the one described above. The heartbeat error occurs sporadically and clears on its own a few seconds later, but it still has a visible impact.

    Any other suggestions?  I would rather not stop monitoring this process.

    Cássio




  • 5.  RE: JAVA SERVICE ON DATA AGGREGATOR

    Broadcom Employee
    Posted Jul 21, 2020 02:25 PM
    Is the DA Java process restarting, or the ActiveMQ Java process?

    What scale are you running?  How many devices and polled items?  How much memory is on the DA machine?
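
    One quick way to answer the memory question, assuming the DA runs on Linux, is to summarize `/proc/meminfo`. This is only a sketch; the function name `mem_summary` is made up for illustration.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# total RAM and swap in GiB (the kernel reports these values in kB).
mem_summary() {
    awk '
    $1 == "MemTotal:"  { mem = $2 }
    $1 == "SwapTotal:" { swap = $2 }
    END { printf "RAM %.1f GiB, swap %.1f GiB\n", mem / 1048576, swap / 1048576 }'
}
```

    Run it on the DA host as `mem_summary < /proc/meminfo`.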

    You can try running an on-demand report for the DA item in PC, set up like the config below. Check DA heap usage for about 8 hours leading up to a DA restart.




  • 6.  RE: JAVA SERVICE ON DATA AGGREGATOR

    Posted Jul 21, 2020 03:38 PM
    Edited by GPD Jul 21, 2020 03:50 PM
    Jeffrey.

    ActiveMQ is restarting.

    About 2900 devices
    5-minute polling
    16 GB RAM
    8 GB swap


    Report created; awaiting data collection.



    Cássio




  • 7.  RE: JAVA SERVICE ON DATA AGGREGATOR
    Best Answer

    Broadcom Employee
    Posted Jul 21, 2020 03:50 PM
    Please open a support case.

    Collect a DA CARE: cd /opt/IMDataAggregator/RemoteEngineer, then run ./re.sh
    Upload the zip file it creates to the case.
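
    A small sketch for finding the archive afterwards. The thread does not say where `re.sh` writes the zip, so looking in the RemoteEngineer directory itself is an assumption, and `newest_zip` is a hypothetical helper name.

```shell
# Hypothetical helper: print the most recently modified .zip file in the
# given directory, or nothing if the directory contains no .zip files.
newest_zip() {
    ls -t "$1"/*.zip 2>/dev/null | head -n 1
}

# Intended use on the DA host (assumes the zip lands in the same directory):
#   cd /opt/IMDataAggregator/RemoteEngineer && ./re.sh && newest_zip .
```

    If nothing is printed, check the output of `re.sh` for the actual location of the file.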