I am trying to submit a job and, if it receives a specific exit code (or User Status Message) then wait 5 minutes and execute again. If it receives that same specific exit code (or User Status Message) again, wait 5 minutes and then re-execute. Rinse and repeat until it gets a 0, then stop.
So far I have tried using RERUNM and RESUBMIT, but neither seems to "obey" the time dependency (5 minutes). The result is that the job runs every second (or less) until it gets a 0.
I've also tried using an insert task, but I'm not sure how to go about cleaning the app up once everything is done. I've added a task that completes everything but so far it is cleaning everything up before the app gets a chance to run!
Any help would be much appreciated. I can send job statements if needed.
P.S. - We do not have "Restart Option".
Will you review the "Resubmitting a job 5 minutes after it fails" in "Examples cookbook" and see if it helps?
For some reason I am unable to create alerts. I noticed that the only two alerts we have are configured in the ESP parm file. So I'm wondering if you create alerts in the parm file, are you not able to create alerts in the page screen?
Yes, you can create an alert dynamically in ESP page mode and use it right away. You also need to add it to ESPPARM so that it is not lost after the ESP STC is recycled.
Hope this helps,
I figured out what the problem is. In the Examples cookbook, under the "recipe" that you referred to (Resubmitting a job 5 minutes after it fails), I was following the instructions explicitly. Step 2 says:
2. Define the Alert using the ALERTDEF command or initialization parameter. For example:
OPER ALERTDEF ID(BAD) EVENT(CYBER.RESUB_JOB)
The problem lies in how it tells you to define the alert. I had to use the Advanced User Guide to find how alerts should be defined. On p. 137, it says that you have to use the keyword "ADD" when defining the alert. So, the command in Step 2 in the cookbook should read:
OPER ALERTDEF ADD ID(BAD) EVENT(CYBER.RESUB_JOB)
I realize that the cookbook probably is not being updated any more. However, I thought I would add this remark here.
For the record, this works perfectly. I have made a few modifications but the general concept is exactly what we needed.
Thank you again for your help!
I checked on DocOps and it is also missing the 'add'
DocOps allows comments, so I added your finding.
Hopefully it will be corrected in a future DocOps update.
Thank you for bringing this to our attention. The syntax used in the example on DocOps is now correct. If you have any further questions or comments, please feel free to reach out and let us know.
I hope no one minds me resurrecting this thread. Using the alert works perfectly if ESP fails the job. So my next step in this little process is to determine how to use the ALERT statement to trigger the event based on a non-standard status. In this case, I am using an HTTP job, and the user status will either be "Code:202 Status: Accepted" or "Code 200 Status OK".
Essentially, if the job gets a user status of "Code:202 Status: Accepted", I need to wait 5 minutes and resubmit, and continue to do so until it gets a "Code 200 Status OK", then stop. However, when I get a "Code:202 Status: Accepted", ESP completes the job, so the application is complete and I cannot resubmit the job, even with an alert.
While I've added a task to keep the application alive through the process, I am having difficulty determining how to trigger the NOTIFY syntactically. The basic syntax is NOTIFY <condition> ALERT(ALERTNAME). According to the command reference, it is possible to use PNODE here. However, I have not been able to get this to work:
NOTIFY PNODE('Code:202 Status: Accepted') ALERT(ACPD)
I have also tried this:
IF MNSTATUS EQ 'Code:202 Status: Accepted' THEN DO
  NOTIFY <what condition do I use here?> ALERT(ACPD)
ENDDO
Would it be possible to create a variable and use that for the condition?
Any advice would be helpful!
It is easiest to have the notify trigger in every case and have "IF" statements in the appl that gets triggered determine what to do. I have some code for a similar scenario......This code below restarts a failed job 3 times. Unfortunately I have a few of these in my library and I do not know which is the best version. I have not verified the code below but you can get an idea from this.
APPL DPFAILR
JCLLIB 'POWDO03.TEST.JCL'
DOCLIB 'POWDO03.TEST.DOC'
OPTIONS RESTARTSTEP
DELAY=1
NUMTIMES=3
/*ECHO "RESUB DELAY" %DELAY
/*ECHO "RESUB NUMBER OF TIMES" %NUMTIMES
/*ECHO "MNSUB#" %MNSUB#
/*ECHO "MNJOB" %MNJOB
/*ECHO "MNMXCMPC" %MNMXCMPC
/*ECHO "MNTAG" %MNTAG
IF %MNJOB = 'CMDSXYZ' THEN DO
  IF %MNSUB# < %NUMTIMES THEN DO
    IF %MNMXCMPC > 0 THEN DO
      JOB LINK1 LINK PROCESS
        RUN ANY
        RELDELAY %DELAY
        RELEASE (LINK2)
      ENDJOB
      JOB LINK2 LINK PROCESS
        RUN ANY
        ECHO %MNJOB 'WILL BE RESTARTED'
        ESPNOMSG AJ %MNJOB RESUBMIT APPL(%MNAPPL..%MNAPPLGEN)
      ENDJOB
    ENDDO
    ELSE DO
      ESPNOMSG AJ DPTST006 RELEASE APPL(%MNAPPL..%MNAPPLGEN)
    ENDDO
  ENDDO
  ELSE DO
    JOB DPNOTIFY
      RUN ANY
    ENDJOB
  ENDDO
ENDDO
JOB TEST1 LINK PROCESS
  RUN ANY
ENDJOB
Thank you for providing this. However, I'm not sure that it is very helpful. My problem is that the job isn't failing. It is marked complete whether it gets "Code:202 Status: Accepted" or "Code 200 Status OK". What I need to do is resubmit if it gets a 202, and continue resubmitting every 5 minutes until it gets a 200.
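To make the requirement concrete outside of ESP, here is a minimal Python sketch of the loop being described: call the status check, wait 5 minutes on a 202, stop on a 200. It is only an illustration of the logic, not ESP syntax. The function name poll_until_ok, the check callable, and the max_attempts cap are assumptions made up for this example.

```python
import time
from typing import Callable


def poll_until_ok(check: Callable[[], int],
                  retry_code: int = 202,
                  ok_code: int = 200,
                  delay_seconds: float = 300.0,
                  sleep: Callable[[float], None] = time.sleep,
                  max_attempts: int = 100) -> int:
    """Call check() until it returns ok_code.

    check is a hypothetical callable that performs the HTTP request and
    returns the status code. Returns the number of attempts it took.
    """
    for attempt in range(1, max_attempts + 1):
        code = check()
        if code == ok_code:
            return attempt          # done, stop resubmitting
        if code != retry_code:
            # Anything other than 200/202 is treated as a real failure.
            raise RuntimeError(f"unexpected status code {code}")
        sleep(delay_seconds)        # wait 5 minutes before the next try
    raise TimeoutError(f"no {ok_code} after {max_attempts} attempts")
```

Passing the sleep function in as a parameter keeps the wait testable; in real use the default time.sleep applies.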
I guess what I'm trying to do is monitor PNODE status and take action. This is confusing because all of the examples I have basically allow you to take action based on "normal" statuses, such as FAIL, ABEND, or END. So the question is, if the Job / User status is something odd, like "Code 202 Status Accepted", how can you do something with that? Here is what I have tried; it does not work, but perhaps it gives an idea of what I'm trying to do:
HTTP_JOB CC120W02
  AGENT BCAPP601
  RUN TODAY
  INVOCATIONTYPE GET
  SERVLET_URL http://bx3was1:7089/check_ftc_delta_status
  NOTIFY MNSTATUS EQ 'Code:202 Status: Accepted' ALERT(ACPD)
  ENDDO
ENDJOB
Have you considered using the LCSF command?
You could issue the command and then use REXX to parse the reply.
Enter the following command from Page Mode:
LCSF (JOBNAME EQ 'CC120W02') AND (PNODE NE 'COMPLETE')
Does it produce output that you could parse to make a decision as to what you want to do next?
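If the reply can be captured as text, the parsing step might look something like the sketch below. It is written in Python rather than REXX purely for illustration, and it assumes each job line has the form "prefix jobname account applname gen pnode status..." with an empty Job Qual column. That layout is an assumption, so check it against what LCSF actually returns on your system. The names CsfRow, parse_csf_line and needs_resubmit are made up for this example.

```python
from typing import NamedTuple, Optional


class CsfRow(NamedTuple):
    job: str
    appl: str
    gen: int
    pnode: str
    status: str


def parse_csf_line(line: str) -> Optional[CsfRow]:
    """Parse one job line; return None for headers or unrecognised lines."""
    # Split into at most 7 fields so the free-text status stays in one piece.
    parts = line.split(None, 6)
    if len(parts) < 6 or not parts[4].isdigit():
        return None
    _prefix, job, _account, appl, gen, pnode = parts[:6]
    status = parts[6].strip() if len(parts) > 6 else ""
    return CsfRow(job, appl, int(gen), pnode, status)


def needs_resubmit(row: CsfRow) -> bool:
    """True when the user status shows the request was only accepted (202)."""
    return row.status.startswith("Code:202")
```

A caller would run parse_csf_line over each line of the reply, skip the None results, and resubmit only when needs_resubmit is true for the job in question.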
I think you will need to trigger the alert based on the normal conditions (FAIL, ABEND, or END) and then, in the next appl, put the "IF" logic to determine what to do.
IF %MNSTATUS EQ 'Code:202 Status: Accepted' THEN DO.......
So, I have been working with IF logic on this. However, I believe I am missing something somewhere. Here is my definition:
HTTP_JOB CC120W02
  RUN TODAY
  AGENT BCAPP601
  INVOCATIONTYPE GET
  SERVLET_URL http://bx3was1:7089/check_ftc_delta_status
  IF MNSTATUS EQ 'Code:202 Status: Accepted' THEN +
    ESP TRIGGER TSTOPER.CC120W02 ADD
  ENDDO
ENDJOB
The status is not being evaluated. The "TRIGGER" statement is being done every time the job runs. So instead of the TRIGGER statement only happening when the job gets a '202', it's happening regardless. Eventually this results in failures due to the process.
Have I keyed something wrong here?
The MNSTATUS variable (monitor variable) can only be used after an alert or monitor is triggered. It will not have any value in the job itself.
I think you will need to use a NOTIFY statement every time. When the alert gets triggered, it will build a second application. The "IF" logic goes in that APPL.
I guess that brings me back around to one of my original questions - what is the proper syntax for using %MNSTATUS to test for a specific condition (i.e. "Code 202: Status Accepted") and alert based on that?
Can you add the following to the WOB definition and rerun and see what is in the variable?
SEND 'MNSTATUS:%MNSTATUS.' USER(*)
All I get is MNSTATUS: ESP(J83892C)
Now you know why the IF logic is not working.
From the Advanced User Guide:
Using monitor variables
You can use a set of built-in variables called monitor variables to get the most flexibility from job monitoring and Alert processing. Monitor variables work much the same as other symbolic variables that ESP Workload Manager processes. However, unlike other symbolic variables, monitor variables are only available with job monitoring and Alert processing.
See Don Powell's comment above: you will need to use a NOTIFY statement to have the monitor variable values populated.
Did you try my comment above with the LCSF command? Does that provide the information you need?
Follow the steps below.
1) Add the NOTIFY to the appl
NOTIFY JOBEND ALERT(JEND)
2) Define the JEND ALERT
OPER ALERTDEF ADD ID(JEND) EVENT(evntprfx.JEND)
3) In the appl that event alert triggers (evntprfx.jend)
Add this line to see what value MNSTATUS has in it.
SEND 'MNSTATUS:%MNSTATUS.' USER(username)
Create the condition to see the exact value.
4) In the same appl that the alert triggers, add the "IF logic" with the exact value substituted for 'Code:202....'
IF %MNSTATUS EQ 'Code:202 Status: Accepted' THEN DO
Do whatever here
Let me know if you have any questions.
Give me a call 214-473-1191....
Please forgive me, but would you be willing to tell me what you mean by, "Create the condition to see the exact value"?
The "Code:202 Status: Accepted" message that you show may not be the exact value that is stored in the MNSTATUS variable.
Have the client run the job and have it end with the condition that creates the "Code:202 Status: Accepted" status.
The Alert will trigger and the MNSTATUS variable will be populated.
The "SEND 'MNSTATUS:%MNSTATUS.' USER(username)" will display the exact message in the MNSTATUS variable.
Then the exact 'Code:202...' message should be replaced in the IF statement. "IF %MNSTATUS EQ 'Code:202 Status: Accepted' THEN DO"
The next time when it is triggered and receives the "Code:202 Status: Accepted" The "IF" logic will be true and it will go through that piece of code after the IF.
Let me know if this did not help...... I can give you a call if we need to.
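As a small illustration of why the exact value matters (Python here, not ESP; the function name is_accepted is made up for this example), an equality test only succeeds when every character matches, including punctuation and trailing blanks:

```python
EXPECTED = "Code:202 Status: Accepted"


def is_accepted(mnstatus: str) -> bool:
    """Exact comparison, like an EQ test against a quoted literal."""
    return mnstatus == EXPECTED


# Values that look alike to a reader but do not compare equal.
for value in ("Code:202 Status: Accepted",
              "Code 202 Status Accepted",
              "Code:202 Status: Accepted "):
    # repr() makes hidden differences such as trailing blanks visible.
    print(repr(value), is_accepted(value))
```

This is why it helps to display the variable first and copy the displayed text into the IF statement rather than retyping it from memory.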
That makes plenty of sense. So I added it to the job, and all I get in the console is this:
(where 'j83892c' is my username)
Here's what is in CSF:
    Job Name  Account  ApplName  Gen#  P Node    Job Qual  Job Status
___ CC120W01  ?        CC120APP  55    COMPLETE            Code:200 Status: OK
___ CC120W02  ?        CC120APP  55    COMPLETE            Code:202 Status: Accepted
___ KEEPOPEN  ?        CC120APP  55    WAITING             WAITING UNTIL 23.59 14 AP
So it looks to me like %MNSTATUS is, indeed, not what I need to be looking for. So now I need to figure out exactly where that 202 is coming from, and if there is a way I can monitor for it!
In your note "I added it to the job" What job did you add it to?
The MN variables are not available in the initial application. They are only available in the application that is initiated by the alert.
1) Application 1 - CC120APP
This application has job CC120W01 and NOTIFY in it. MN variables are not populated here.
2) Notify Condition is satisfied and Alert is triggered
3) Application 2 is triggered by the alert
MN variables are available here. Put the "SEND 'MNSTATUS:%MNSTATUS.' USER(username)" in this application to see the value of MNSTATUS.
I started out by adding 'MNSTATUS:%MNSTATUS' to a job that I ran in adhoc, just to see if I could get a message from it. Based on the behavior I saw (including messages being sent to an unrelated user even though I specified my user ID) I have reason to believe that there may be something "hinky" with our ESP instance. I will give you a call and see if we can't do a WebEx to verify. If this is the case, I will have to open a ticket.
I don't think "Hinky" is involved here.
MN variables are populated by an alert. They are only available in the application that the alert triggers. To see what alerts have been defined on the system issue OPER ALERTDEF. I have one called FAIL that triggers POWDO03.FAIL.
In my example: The MN variables will only be available in step 3)
1) I have an application that looks like this:
APPL TEMPLATE
JCLLIB 'POWDO03.TEST.JCL'
NOTIFY FAILURE ALERT(FAIL)
JOB DPTST001
  RUN DAILY
ENDJOB
2) When DPTST001 fails the NOTIFY is initiated and in my case it triggers alert FAIL. The ALERT FAIL triggers event POWDO03.FAIL.
3) Event POWDO03.FAIL triggers application FAIL
MN variables will be available here only.
For example MNJOB will contain DPTST001
I left my phone number. Give me a call and we can discuss this.