How do you handle tasks that fail and are restarted multiple times?
More and more products out there can fail internally (e.g., a Java failure), and the task simply ends.
In SSM, the task is then restarted because its current and desired states do not match. I know there is SSMRETRY; I'm just wondering whether people have come up with any other processes.
Thanks in advance for replies.
In our REXX that does the START for an STC, if we see X starts for that STC in Y minutes, we set the task to DOWN DOWN, don't let the start issue, and alert someone to a possible loop condition. Once the task is set to DOWN DOWN, the counter is reset to zero, so a manual start afterwards works.
On occasion, when someone is manually starting and stopping a task (usually during testing), the limit is also reached and the task doesn't come up, but a message is displayed telling the person to try again (since the counter has been reset to zero).
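For anyone who wants to see the counting logic spelled out, here is a rough sketch in Python rather than OPS/REXX. It is not our actual code and it makes no SSM calls. The class name `StartLoopGuard`, the method `request_start`, and the returned strings are made up for illustration, and it assumes the attempt that becomes the X-th start inside the Y-minute window is the one that gets blocked.

```python
import time
from collections import deque


class StartLoopGuard:
    """Hypothetical helper: blocks a start when a task has been
    started too many times within a sliding time window."""

    def __init__(self, max_starts, window_seconds):
        self.max_starts = max_starts          # the "X" in the description
        self.window_seconds = window_seconds  # the "Y mins", in seconds
        self._starts = {}                     # task name -> recent start times

    def request_start(self, task, now=None):
        """Return "START" to let the start issue, or "DOWN" to block it."""
        if now is None:
            now = time.monotonic()
        history = self._starts.setdefault(task, deque())
        # Forget start attempts that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        history.append(now)
        if len(history) >= self.max_starts:
            # Possible loop: reset the counter so the next manual
            # start is allowed, and tell the caller to block this one.
            history.clear()
            return "DOWN"
        return "START"


guard = StartLoopGuard(max_starts=3, window_seconds=600)
for t in (0, 60, 120, 130):
    print(t, guard.request_start("TASKA", now=t))
```

In the real rule, the "DOWN" branch is where the task would be set to DOWN DOWN and the alert or "try again" message would be sent. Because the counter is cleared at that point, the attempt at `t=130` is allowed again.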
'Real' abends are handled differently, using Stateman counters.