AppWorx, Dollar Universe and Sysload Community

OAE agent Success script intermittent issue where it does not get called.

  • 1.  OAE agent Success script intermittent issue where it does not get called.

    Posted Jan 06, 2020 10:28 AM
    Hi everyone,

    I was asked by management to see if the user community had any knowledge of this problem with their
    Applications Manager instances. Your help and insight are greatly appreciated.

    We are currently running V9.1.1, and will be upgrading our production system this weekend to V9.3.1.
    Our test instances were upgraded from V9.1.1 to V9.3.1, and we have been performing testing.

    We are still testing against our test systems, where we noticed a problem. Our Automation team
    told me they are seeing an intermittent issue running OAE concurrent manager jobs: sometimes the
    success scripts are called, and sometimes not. The Automation team also told me they saw this
    issue in V9.1.1 from time to time, but with V9.3.1 it seems to show up more often. My first
    thought is that the upgraded version may be running faster, and it seems that way.

    I looked at known issues with V9.3.1, and could not find anything related to OAE Success script issues.
    I also searched the Broadcom Community, and nothing really stands out.

    This is all I know about this but thought I would give it a shot asking the user Community. If there is
    something I can do to possibly debug this in more detail, that would be great. The OAE agents have the
    debug option on.

    Thank you for the help,

    Rich


  • 2.  RE: OAE agent Success script intermittent issue where it does not get called.
    Best Answer

    Posted Jan 09, 2020 12:25 PM
    Hi everyone,

    I have some great news for you on this case. We did a bunch of testing this morning, and think
    we know what is happening.

    For many years now we have had an upgd, a test, and a prod instance. The upgd and test instances
    both have OAE_DEVL (HRDEVL), OAE_DTDT (HRDTDT), OAE_TEST (HRTEST), and OAE_UPGK (HRUPGK) defined
    to them. Our production system only has OAE_PROD (HRPROD). As a side note, we do the same for the
    Banner agents BANDEVL, BANFAID, BANTEST, and BANUPGD, which are shared by upgd and test.
    Production only has BANPROD.

    We ran a series of HRMS test jobs this morning on our AMTEST system against OAE_TEST, but this
    time we had OAE_TEST stopped on AMUPGD. We ran this 5 times without any issues. We even ran an
    additional, longer-running HRMS job on AMTEST, which also worked.

    We then started OAE_TEST on AMUPGD, and requested the jobs to run on AMTEST. In the first series
    of tests it failed two out of three times. In the second series of tests it worked two out of
    three times.

    We then stopped OAE_TEST on AMUPGD, and requested the jobs to run on AMTEST. We ran the same
    HRMS process flow, and all three jobs worked.
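
    To make the pattern easier to see, here is a rough tally of the runs described above. This is
    only a sketch: the counts are typed in by hand from this post (the extra longer-running job is
    left out), and the names `runs` and `summarize` are made up for illustration. Nothing here
    talks to Applications Manager.

```python
# Each entry: (state of OAE_TEST on AMUPGD, jobs requested on AMTEST, jobs that worked).
# Counts are hand-copied from the test series described in this post.
runs = [
    ("stopped", 5, 5),  # first series, agent stopped on AMUPGD
    ("started", 3, 1),  # failed two out of three times
    ("started", 3, 2),  # worked two out of three times
    ("stopped", 3, 3),  # agent stopped again, all three worked
]


def summarize(runs):
    """Return {state: (total jobs, jobs that worked)} summed over all series."""
    totals = {}
    for state, total, ok in runs:
        prev_total, prev_ok = totals.get(state, (0, 0))
        totals[state] = (prev_total + total, prev_ok + ok)
    return totals


summary = summarize(runs)
for state, (total, ok) in summary.items():
    print(f"OAE_TEST {state} on AMUPGD: {ok} of {total} jobs worked")
```

    With the agent stopped on AMUPGD every job worked; with it started, only half did. The sample
    is small, so treat this as a strong hint rather than proof.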

    From what we could tell the problem is related to having the agent active on two instances of
    Appman. The second Appman instance was not even running any HRMS jobs, but the same agent was
    active on both sides. This seems to be enough of a problem to cause this to work sometimes, and
    fail others.

    It seems we will need to make sure the same OAE agent is not active on both sides; it can only
    be active on one. We most likely got away with this when the database was using DBMS pipes
    rather than Advanced Queuing.
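
    A simple way to catch this kind of overlap is to list which agents are active on each instance
    and flag any that appear more than once. The sketch below assumes you maintain that list by
    hand; the dictionary, the function name `find_shared_agents`, and the "production" label are
    placeholders for illustration, and it does not query Applications Manager in any way.

```python
from collections import defaultdict


def find_shared_agents(active_agents):
    """Return {agent: sorted list of instances} for agents active on 2+ instances."""
    instances_by_agent = defaultdict(set)
    for instance, agents in active_agents.items():
        for agent in agents:
            instances_by_agent[agent].add(instance)
    return {
        agent: sorted(instances)
        for agent, instances in instances_by_agent.items()
        if len(instances) > 1
    }


# Hand-maintained inventory (assumption): agent names are taken from this post,
# "production" is just a placeholder label for the production instance.
active_agents = {
    "AMUPGD": {"OAE_DEVL", "OAE_DTDT", "OAE_TEST", "OAE_UPGK"},
    "AMTEST": {"OAE_DEVL", "OAE_DTDT", "OAE_TEST", "OAE_UPGK"},
    "production": {"OAE_PROD"},
}

shared = find_shared_agents(active_agents)
for agent in sorted(shared):
    print(f"{agent} is active on: {', '.join(shared[agent])}")
```

    In our setup this would flag all four non-production OAE agents, which matches what we saw:
    the fix is to stop each one on whichever instance is not using it.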

    The Automation team will be performing additional testing today, and we will see how this goes.
    Now that we might have found the issue, we feel a lot more comfortable about upgrading
    production.

    I have heard before from support that this setup is not supported, but it was this way for many
    years and worked.

    I have a support case open to see if I can learn more about what is happening internally to cause
    this to happen.

    Thank you very much,

    Rich