Hi everyone,
I have some great news for you on this case. We did a bunch of testing this morning, and think
we know what is happening.
For many years now we have had an upgd, test, and prod instance. The upgd and test instances both have
OAE_DEVL (HRDEVL), OAE_DTDT (HRDTDT), OAE_TEST (HRTEST), and OAE_UPGK (HRUPGK) defined to them.
Our production system only has OAE_PROD (HRPROD). As a side note, we do the same for the Banner
agents BANDEVL, BANFAID, BANTEST, and BANUPGD, which are shared by upgd and test. Production only
has BANPROD.
We ran a series of HRMS test jobs this morning on our AMTEST system against OAE_TEST, but this
time we had OAE_TEST stopped on AMUPGD. We ran this 5 times without any issues. We even ran an
additional longer-running HRMS job on AMTEST, and it all worked.
We then started OAE_TEST on AMUPGD and requested the jobs to run on AMTEST. In the first series
of tests the jobs failed two out of three times. In the second series of tests they worked two out of
three times.
We then stopped OAE_TEST on AMUPGD again and requested the jobs to run on AMTEST. We ran the same
HRMS process flow, and all three jobs worked.
From what we could tell, the problem is related to having the same agent active on two instances of
Appman. The second Appman instance was not even running any HRMS jobs, but the same agent was
active on both sides. This seems to be enough to make the jobs work sometimes and fail other
times.
It seems what we will need to do is make sure we do not have the same OAE agent active on both
sides. It can only be active on one. We most likely got away with this when the database was
using DBMS pipes rather than Advanced Queuing.
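As a rough illustration of that rule, here is a minimal Python sketch that flags any agent marked as active on more than one instance. It does not talk to Applications Manager at all. The active_agents dictionary is a hypothetical, hand-maintained list, the function name find_shared_agents is made up for this example, and the started/stopped states shown are invented sample data (only the instance and agent names come from this thread).

```python
from collections import defaultdict

# Hypothetical, hand-maintained inventory of which agents are currently
# started on each instance. The states below are sample data only.
active_agents = {
    "AMUPGD": {"OAE_DEVL", "OAE_TEST"},
    "AMTEST": {"OAE_DTDT", "OAE_TEST"},
}


def find_shared_agents(active_agents):
    """Return {agent: sorted instance names} for agents active on 2+ instances."""
    instances_by_agent = defaultdict(list)
    for instance, agents in active_agents.items():
        for agent in agents:
            instances_by_agent[agent].append(instance)
    return {
        agent: sorted(instances)
        for agent, instances in instances_by_agent.items()
        if len(instances) > 1
    }


conflicts = find_shared_agents(active_agents)
for agent, instances in sorted(conflicts.items()):
    print(f"{agent} is active on: {', '.join(instances)}")
```

With the sample data above, only OAE_TEST is reported, because it is the only agent listed as active on both AMUPGD and AMTEST.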
The Automation team will be performing additional testing today, and we will see how this goes.
Now that we might have found the issue, we feel a lot more comfortable about upgrading
production.
I have heard from support before that doing this was not supported, but it was set up this way for many
years and worked.
I have a support case open to see if I can learn more about what is happening internally to cause
this.
Thank you very much,
Rich
Original Message:
Sent: 01-06-2020 10:27 AM
From: Richard Blumlein
Subject: OAE agent Success script intermittent issue where it does not get called.
Hi everyone,
I was asked by management to see if the user community had any knowledge of this problem with their
Applications Manager instances. Your help and insight are greatly appreciated.
We are currently running V9.1.1, and will be upgrading our production system this weekend to V9.3.1.
Our test instances were upgraded from V9.1.1 to V9.3.1, and we have been performing testing.
We are still in the process of testing against our test systems, where we noticed a problem. I was
told by our Automation team they are seeing an intermittent issue running OAE concurrent manager jobs.
They told me sometimes the success scripts are called, and sometimes not. The Automation team also
told me they saw this issue in V9.1.1 from time to time, but now with V9.3.1 it seems to be showing
itself even more. My first thought is that maybe the upgraded system is running faster, and it does seem that way.
I looked at known issues with V9.3.1, and could not find anything related to OAE Success script issues.
I also searched the Broadcom Community, and nothing really stands out.
This is all I know about this, but I thought I would give it a shot asking the user community. If there is
something I can do to possibly debug this in more detail, that would be great. The OAE agents have the
debug option on.
Thank you for the help,
Rich