ESP Workload Automation


 How to get an ESP failure from an AS400 job in MSGW status

kmallen posted May 07, 2021 02:05 PM
Hi,
When an AS400 job fails it often goes into MSGW status and waits for an operator to reply before it's completely cancelled.  ESP considers this as being still in an executing state, with a status message of "Message waiting for a reply".  Is there any way to make this go automatically into a failed state?
Thanks
Broadcom Employee Lucy Zhang

Hi Keith,

There is nothing that can be done from either the ESP side or the agent side.

Is it possible to change the AS400 job so that it's not interactive?

Thank you,

Lucy

TERESA JOHNSON

We trigger a simple REXX to check the status of the AS400 WOB.  We check for two messages because the characters changed at some stage after an upgrade.




Thanks
Teresa
TERESA JOHNSON
Here are the WOBs that stop and start the trigger process.





Thanks
Teresa
kmallen
To Lucy, no, it's not really that the job is interactive.  It's just that the AS400 often reacts to a program problem by putting the job into MSGW state.  In theory the computer operator can sometimes fix the issue (such as disk space) and tell the job to continue.  In reality, MSGW is almost always an actual program failure, so the job is just waiting for the operator to confirm that.  Maybe it's asking too much of ESP, but it would be nice if there were an option on the ESP AS400 job to tell it that MSGW status should cause an alert (similar to overdue or max run time exceeded).

To Teresa, that looks interesting, I will play around with this option.  It may be a bit of overhead if I want to do this for hundreds of jobs though, so I may just do this for the more critical jobs.

Thanks all
TERESA JOHNSON
Keith

You could try wildcarding the job name for the JOBONCSF WOB.  We only have a handful of jobs, so it wasn't an issue.

Or have a look and see if the below would suit better.

Thanks
Teresa


CA Tuesday Tip by Lucy Zhang, Principal Support Engineer for 10/11/2011

The JOBONCSF and CSFJOB functions are widely used in REXX code to retrieve CSF data and take action as needed.

But many users run into their limitation: they can filter only on job name, so if the job names follow no pattern, the wildcard '-' has to be used to retrieve ALL entries on the CSF, which can cause an 878 abend.

The undocumented command LCSF supports the same filter criteria as the freeform filter on the CSF, which is documented in the Operator's Guide. With a proper filter, LCSF returns far fewer entries than JOBONCSF/CSFJOB. This command was added for the web-based interface, so its output format is easy to parse rather than easy to read.

You can issue the LCSF command from page mode, from REXX code, or in a batch job.
Here is an example to show JOBs that are in Waiting status from page mode:
LCSF STATUS='WAITING-'

The following JCL will list all the jobs in Applications that are COMPLETED and on APPLHOLD:
Note: Please change sub_sys and prefix to proper values.
//ESPBATCH JOB
//ESP EXEC PGM=ESP,PARM='SUBSYS(sub_sys)'
//STEPLIB DD DSN=prefix.SSCPLINK,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
LCSF COMPLETED AND APPLHOLD
//

To retrieve all jobs on the critical path from the CSF, you can code REXX in an ESP Proc:
X=TRAPOUT('LINE.')
"ESP LCSF CRITICAL_PATH"
X=TRAPOUT('OFF')

The above is from Knowledge Documents TEC500752 and TEC553699.
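
As a rough, untested sketch of how the trapped LCSF output might then be scanned in an ESP Proc: the loop below walks the LINE. stem and flags any line containing a search string. Several things here are assumptions and not taken from the tip: that TRAPOUT sets LINE.0 to the number of trapped lines, that SAY is an acceptable way to report a hit at your site, and that the text held in the hypothetical NEEDLE variable actually appears in LCSF output (the LCSF output format is undocumented, so check a sample first).

```rexx
REXXON
  /* NEEDLE is a hypothetical search string; adjust to your output */
  NEEDLE = 'MESSAGE WAITING'

  /* Trap the LCSF output into the LINE. stem, as in the tip above */
  X = TRAPOUT('LINE.')
  "ESP LCSF CRITICAL_PATH"
  X = TRAPOUT('OFF')

  /* Assumption: LINE.0 holds the count of trapped lines */
  DO I = 1 TO LINE.0
    /* TRANSLATE with one argument uppercases the line */
    IF POS(NEEDLE, TRANSLATE(LINE.I)) > 0 THEN
      SAY 'Needs attention:' LINE.I
  END
REXXOFF
```

Replacing the SAY with whatever alerting your site already uses would be the natural next step; the filter on the LCSF command itself should be narrowed first so that as few entries as possible are trapped.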