We have 2 LPARs, both running OPS/MVS, but NOT using MSF. We have shared DASD. We do NOT have any other console automation products (Automation Point, etc.).
We recently had a situation where OPS/MVS crashed on one LPAR and the failure went unnoticed. I am trying to develop a way to monitor OPS from the other LPAR that is better than just issuing RO smfid,D A,OPSMAIN and interrogating the results.
Does anyone have a better suggestion?
Assuming you have the Automatic Restart Manager (ARM) component of z/OS enabled on the system, you can register OPS with it by setting the ARMELEMNAME OPS/MVS parameter (refer to the documentation for complete setup details). Then, when OPS fails abnormally, ARM will restart it (assuming you want it started right back up). You could also enable a rule that has only an )INIT section coded: if OPS has been restarted after its initial start for the IPL, the rule invokes your in-house notification procedures to send out an alert/info message. The rule would look something like:
)MSG OPSRESTART
)INIT
  if OPSINFO('PRODUCTSTARTS') > 1 then
    do
      /* Invoke your in-house alert automation: email, page, WTO, etc. */
    end
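For the ARM registration mentioned above, a minimal sketch of the parameter setting might look like the following. It assumes the parameter is set from your OPS/MVS initialization REXX (commonly the OPSSPA00 member) using the OPSPRM function. The element name OPSMVS1 is a made-up value for illustration only, and any companion ARM parameters or ARM policy definitions your site needs are not shown, so check the product documentation before using it.

```rexx
/* Fragment for the OPS/MVS initialization REXX (assumed: OPSSPA00). */
/* 'OPSMVS1' is a hypothetical element name; pick one that fits your */
/* ARM policy. Other ARM-related setup is intentionally omitted.     */
T = OPSPRM('SET','ARMELEMNAME','OPSMVS1')
```

The element name you choose would also have to match whatever your ARM policy expects for that LPAR.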
Another method would be to simply issue a second START command for OPS after it initializes. Since only one copy with the same SSID can 'run' at a time, the second copy would just wait around as a stand-by and start if the active copy ends. You would then need to add logic to your system shutdown automation to cancel this 'stand-by' copy of OPS so that it doesn't start while you are shutting the system down.