We have various jobs in our environment that use the :RESTART function to allow multiple steps to be accomplished in the same job, but I've recently been questioning whether I've been writing these jobs properly.
Generally speaking, we design jobs on the model "perform step 1; if successful, perform step 2", and establish restart points for both steps. There are typically two ways I see jobs like this written:
Design #1:
:RESTART STEP1
echo "Running step 1"
./step1.sh
retval=$?
if ! test $retval -eq 0
then
exit $retval
fi
:RESTART STEP2
echo "Running step 2"
./step2.sh
retval=$?
(exit $retval)
Design #2:
:RESTART STEP1
echo "Running step 1"
./step1.sh
retval=$?
if test $retval -eq 0
then
continue="Y"
fi
:RESTART STEP2
if test "$continue" = "Y"
then
echo "Running step 2"
./step2.sh
retval=$?
fi
(exit $retval)
Now there seem to be tradeoffs with each approach:
- With design #1, if the first step fails, the Restart Point prompt upon restart is prefilled with "STEP1", as we want. However, because of the 'exit' command, the job aborts without ever executing the UC4 footer code. I haven't noticed any problem from skipping the footer beyond the job output looking incomplete, but I'm not sure whether the footer does something more important than I realize.
- With design #2, if the first step fails, we skip over step 2 (as intended) and the job ends properly: the UC4 footer is generated and everything. However, when we go to restart the job, the Restart Point prompt is prefilled with 'STEP2', since that was the last restart point the job passed during the first execution, even though the failure was in step 1. Our operators have to remember to change the restart point back to STEP1 by hand.
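To make the tradeoff concrete, here is a rough plain-shell simulation of the two control flows when step 1 fails. It is only a sketch and does not involve UC4 at all: the `restart:` and `footer` lines are stand-ins I made up for "the job passed a :RESTART line" and "the UC4 footer ran", and `step1`/`step2` are hypothetical stubs for ./step1.sh and ./step2.sh.

```shell
#!/bin/sh
# Simulation only, no UC4 involved. The "restart:" and "footer" markers
# are invented stand-ins for a :RESTART line and the UC4 footer code.

step1() { return 3; }   # hypothetical stub: step 1 fails with status 3
step2() { return 0; }   # hypothetical stub: step 2 would succeed

# Design #1: exit as soon as step 1 fails.
design1() {
  (
    echo "restart:STEP1"
    step1
    retval=$?
    if ! test "$retval" -eq 0
    then
      exit "$retval"        # leaves before the STEP2 marker and the footer
    fi
    echo "restart:STEP2"
    step2
    retval=$?
    echo "footer"
    exit "$retval"
  )
}

# Design #2: remember success in a flag and fall through to the end.
design2() {
  (
    proceed="N"
    echo "restart:STEP1"
    step1
    retval=$?
    if test "$retval" -eq 0
    then
      proceed="Y"
    fi
    echo "restart:STEP2"    # passed even though step 1 failed
    if test "$proceed" = "Y"
    then
      step2
      retval=$?
    fi
    echo "footer"
    exit "$retval"
  )
}

out1=$(design1); rc1=$?
out2=$(design2); rc2=$?

# Report the last restart marker each design passed.
last1=$(printf '%s\n' "$out1" | grep '^restart:' | tail -n 1)
last2=$(printf '%s\n' "$out2" | grep '^restart:' | tail -n 1)
echo "design 1: status=$rc1 last=$last1"
echo "design 2: status=$rc2 last=$last2"
```

With step 1 failing, design 1 stops after the STEP1 marker and never prints the footer line, while design 2 passes the STEP2 marker and prints the footer. Both return step 1's exit status.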
Has anyone figured out a way to implement restart points like this in a way that both generates the UC4 footer and leaves the default restart point at the point of failure? Or are there other approaches to using these that people have developed?
Thanks in advance!
-- Daryl