Automic Workload Automation

 View Only
  • 1.  JDE agent disconnecting

    Posted May 30, 2023 03:24 AM
      |   view attached

    Hi Team,

    There is a JDE agent which is disconnecting often. Below is the error we found in the agent report.

    2013309 RA Agent is starting ...
    20230530/151046.361 - U02013340 Automatic restart of Jobs is enabled.
    20230530/151047.262 - U02000106 Reading content of connection object 'CONN.JDEDWARDSAGENT.JDEDWARDSCONNECTION.AUSYDWE1UA01'.
    20230530/151047.369 - U02000107 Connection object 'CONN.JDEDWARDSAGENT.JDEDWARDSCONNECTION.AUSYDWE1UA01' with time stamp '1651191533000' was successfully transferred to the Agent.
    20230530/151047.372 - U02000106 Reading content of connection object 'CONN.JDEDWARDSAGENT.JDBCCONNECTION.E1TEST92'.
    20230530/151047.493 - U02000107 Connection object 'CONN.JDEDWARDSAGENT.JDBCCONNECTION.E1TEST92' with time stamp '1674192121000' was successfully transferred to the Agent.
    20230530/151133.516 - U02013315 RA Agent 'JDEDWARDSAGENT' has been initialized.
    20230530/151133.519 - U07001001 Charset used by the Agent: 'ISO-8859-15'
    20230530/151133.626 - U02000204 Error recreating the Job instance: 'java.io.FileNotFoundException:..\temp\restart19279292 (The system cannot find the file specified)'
    20230530/151133.677 - U00003620 Routine 'UCEX_R' forces trace because of error.
    20230530/151133.679 - U00003449 Output to the TRACE file is finished.
    20230530/151133.695 - U02000048 TRACE file opened with flags '0000000000000001', max.trace: '9', memory trace: '0', search type: 'S', search string: ''.
    20230530/151133.787 - U00003449 Output to the TRACE file is finished.
    20230530/151133.799 - U02000048 TRACE file opened with flags '999900000000000000', max.trace: '9', memory trace: '0', search type: 'S', search string: ''.
    20230530/151133.804 - U02000204 Error recreating the Job instance: 'com.uc4.ex.restart.RestartObjectInputStream$ClassNotAllowedException:Class name not accepted: java.util.GregorianCalendar'
    20230530/151133.858 - U02013342 Restart of RA Job '19279313' will be performed.
    20230530/151133.887 - U02013342 Restart of RA Job '19279296' will be performed.
    20230530/151133.911 - U02000204 Error recreating the Job instance: 'java.io.FileNotFoundException:..\temp\restart19279297 (The system cannot find the file specified)'
    20230530/151133.990 - U02013342 Restart of RA Job '19277285' will be performed.
    20230530/151134.006 - U02013342 Restart of RA Job '19279307' will be performed.
    20230530/151134.059 - U02000204 Error recreating the Job instance: 'java.io.FileNotFoundException:..\temp\restart19269391 (The system cannot find the file specified)'
    20230530/151134.154 - U02000204 Error recreating the Job instance: 'java.io.FileNotFoundException:..\temp\restart19280083 (The system cannot find the file specified)'
    20230530/151134.256 - U02013342 Restart of RA Job '19280082' will be performed.
    20230530/151134.300 - U02013342 Restart of RA Job '19273486' will be performed.
    20230530/151134.336 - U00003620 Routine 'UCEX_R' forces trace because of error.
    20230530/151134.338 - U00003449 Output to the TRACE file is finished.
    20230530/151134.353 - U02000048 TRACE file opened with flags '0000000000000001', max.trace: '9', memory trace: '0', search type: 'S', search string: ''.
    20230530/151134.462 - U02013342 Restart of RA Job '18993411' will be performed.
    20230530/151134.510 - U00003449 Output to the TRACE file is finished.
    20230530/151134.523 - U02000048 TRACE file opened with flags '999900000000000000', max.trace: '9', memory trace: '0', search type: 'S', search string: ''.
    20230530/151134.528 - U02000015 Periodical job test started.
    20230530/151134.533 - U02000204 Error recreating the Job instance: 'com.uc4.ex.restart.RestartObjectInputStream$ClassNotAllowedException:Class name not accepted: java.util.GregorianCalendar'
    20230530/151134.692 - U02000204 Error recreating the Job instance: 'java.io.FileNotFoundException:..\temp\restart19281254 (The system cannot find the file specified)'
    20230530/151134.773 - U02013342 Restart of RA Job '19274719' will be performed.
    20230530/151134.810 - U02000204 Error recreating the Job instance: 'java.io.FileNotFoundException:..\temp\restart19281286 (The system cannot find the file specified)'
    20230530/151135.030 - U02013342 Restart of RA Job '19259903' will be performed.
    20230530/151135.094 - U00029000 Report '19280085' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEPBY.TXT' has been started.
    20230530/151135.136 - U00029000 Report '19281355' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQEXRG.TXT' has been started.
    20230530/151136.112 - U00029000 Report '19280088' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEXSB.TXT' has been started.
    20230530/151136.211 - U00029000 Report '19281266' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEXRL.TXT' has been started.
    20230530/151136.217 - U00029002 Report '19281266' ended abnormally.
    20230530/151136.439 - U00029000 Report '19281315' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEUSF.TXT' has been started.
    20230530/151136.448 - U00029001 Report '19281315' ended normally.
    20230530/151136.524 - U00029000 Report '19281324' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEQXU.TXT' has been started.
    20230530/151136.534 - U00029001 Report '19281324' ended normally.
    20230530/151136.654 - U00029000 Report '19281255' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEYVQ.TXT' has been started.
    20230530/151136.739 - U00029000 Report '19281356' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQEUSF.TXT' has been started.
    20230530/151138.587 - U00029000 Report '19281256' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEJAL.TXT' has been started.
    20230530/151139.816 - U00029002 Report '19281256' ended abnormally.
    20230530/151139.883 - U00029000 Report '19280107' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQEXRL.TXT' has been started.
    20230530/151141.638 - U00029000 Report '19281257' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEXRG.TXT' has been started.
    20230530/151141.643 - U00029002 Report '19281257' ended abnormally.
    20230530/151142.025 - U00029000 Report '19281264' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEQXJ.TXT' has been started.
    20230530/151142.295 - U00029000 Report '19281271' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEUSB.TXT' has been started.
    20230530/151142.323 - U00029000 Report '19281272' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEXRV.TXT' has been started.
    20230530/151142.346 - U00029000 Report '19281357' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQEJAL.TXT' has been started.
    20230530/151143.087 - U00029000 Report '19281281' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\PLOG_OABQEXRK.TXT' has been started.
    20230530/151143.110 - U00029000 Report '19281325' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQEXRH.TXT' has been started.
    20230530/151144.036 - U00029000 Report '19281358' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQEYVR.TXT' has been started.
    20230530/151145.309 - U02000005 Job 'JOBS.R59NZ00001_AUPA0004' with RunID '19277291' started.
    20230530/151145.313 - U02000003 Job 'JOBS.R59NZ00001_AUPA0004' started with RunID '19277291'.
    20230530/151145.373 - U00029000 Report '19281359' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQEQXU.TXT' has been started.
    20230530/151146.112 - U02000005 Job 'JOBS.R56AU00022_NZPNC0001' with RunID '19280099' started.
    20230530/151146.115 - U02000003 Job 'JOBS.R56AU00022_NZPNC0001' started with RunID '19280099'.
    20230530/151146.540 - U02000005 Job 'JOBS.R42520_NZPP0001' with RunID '19269393' started.
    20230530/151146.543 - U02000003 Job 'JOBS.R42520_NZPP0001' started with RunID '19269393'.
    20230530/151146.737 - U00029000 Report '19281360' for file 'G:\Automic\Agents\RA.JD.Edwards\bin\..\temp\OABQFAOS.TXT' has been started.
    20230530/151148.404 - U02000005 Job 'JOBS.R55LP001_PWB0001' with RunID '19281343' started.
    20230530/151148.407 - U02000003 Job 'JOBS.R55LP001_PWB0001' started with RunID '19281343'.

    Please help us to find the root cause and resolution.


    Thanks and regards,

    Ashmitha

    Attachment(s)

    txt
    19281339.LOG (2).txt   404 KB 1 version


  • 2.  RE: JDE agent disconnecting

    Broadcom Employee
    Posted May 30, 2023 06:54 AM

    Please get in contact with Broadcom Support and open a ticket.

    Contact Support - Support Portal - Broadcom support portal



    ------------------------------
    Michael K. Dolinek

    Engineering Program Manager | Agile Operation Division
    Broadcom Software
    ------------------------------



  • 3.  RE: JDE agent disconnecting

    Posted Jun 01, 2023 01:20 AM

    Hi,

    As this version is out of support, we have not raised a case with the support team.

    We have to solve this issue ASAP. So, will you please help us with this issue?

    Regards,
    Ashmitha




  • 4.  RE: JDE agent disconnecting

    Posted Jun 01, 2023 01:12 PM

    You need the log from the previous agent execution to see what caused the shutdown or termination of the agent, it will have the trace reference as well if enabled.  The log you attached is the start up log after that event occurred and won't tell you much about what happened prior.  Also, the errors about missing job files in the start up report is something I see, sometimes, when the agent is stopped while there are active job executions; it is not related to the reason the agent stopped in the first place and I don't know why the JDE agent sometimes recovers and resumes these jobs and other times not.

    From your startup log your Java max heap size may be too small especially if what you describe as a connection loss is actually the agent stopping (note: "connection" issues between the Agent and AE or Agent and JDE DB will be reported in the log along with its retry attempts, it will not cause the agent to stop).  You should consider adding more memory to the agent startup parameter in the service manager and see if this stabilizes your agent assuming your able to find evidence of heap dumps and size at or above your startup parameter at that time.

    Prior to upgrading, we had similar issues and had to increase the max heap size to work around it, the upgraded agent manages memory better.  Also we find using Output Search Pattern in the JDE job definition to register JDE printer output/logs causes the agent memory to spike during scans due to the size of the PrinterOutput folder in JDE and retention policy for these in our case.   If you don't need to register any files from JDE report output, I would clear this out for all jobs where it is not applicable to help manage the agent's memory usage.   Also consider running those jobs that need output registration in a separate Automic Job Queue so you can manage the concurrency of execution, maybe a few at a time rather than unlimited, keeping in mind the job status scan occurs in incremental batch and can trigger many scans simultaneously.

    Hope this helps.




  • 5.  RE: JDE agent disconnecting

    Posted Jun 01, 2023 02:39 PM

    A few observations.  The log you attached is the JDE agent report (log) after the agent has started, it is an indicator of an agent's startup routine, not previous shutdown/termination logging.  A simple connection loss either to the JDE DB for job status polling or to the AE via a CP process will just log the event and keep trying to reestablish any connections.  In both of these cases the agent never physically stops to shutdown.

    I think your agent is physically stopping or terminating, not loosing a connection to the AE.   As a result you see the errors about missing job generation files the JDE agent uses to manage these executions in the start up log, which I have seen on my own system when the agent is stopped and there are JDE jobs still running/registered; it doesn't always do a good job self recovering from these uncoordinated shutdowns.  To see the cause of the agent shutdown or termination you'll need to open the Agent report for the previous execution of it which ended, and possibly any trace files generated or registered from that event.

    I noticed your memory heap size for the JDE agent is relatively small (512MB?) and you're running an older version of the agent.  After increasing the max heap size and eventually upgrading, my memory heap dumps stopped occurring.  Since upgrading isn't an option, you can confirm in any trace files generated what the heap size was at the time of agent shutdown and increase the max heap size to something with more room, or simply double it and see if you get more stability from the agent after you confirm it is terminating unexpectedly.

    Are you performing job output file scans through the JDE solution directly?  If so are these file registrations necessary in "every" job?  If not, then clear out the Output Search Pattern string from the RA JDE job definition where they aren't required so the agent doesn't try to scan after the job completes.   The agent Java process also performs these file scans uses a lot of memory depending on job concurrency and size of the directory being scanned, and for us was the cause of agent terminations until we increased the agent Java heap size from Service Manager and managed jobs needing to run the Output Search Pattern through an Automic Queue to throttle how many could execute simultaneously to keep the memory usage in check for this particular event.

    Hopefully this gets you pointed in the right direction and can resolve your issues assuming this is caused by a heap dump.