As I said, we had several instances of users leaving tasks in PASSIVE mode, as well as starting new tasks and not stopping them prior to an IPL. I just wrote a REXX that checks the list of running tasks and compares it to the tasks in the SSM table, the list of tasks in the STCB4OPS variable, and the list of tasks in the Exclude_All variable in SHUTSYS2 (I made this a global variable as well so it is easier to access). If it finds anything in PASSIVE mode or not in SSM, it generates an e-mail which is sent to our Systems Programmers. Since we do our IPLs over the weekend, I have it run at 15:00 every Friday on every system. That way my team can catch tasks before they cause problems at IPL time and have a chance to catch people before they leave. As a secondary check, I also run the exact same check when the Operators issue a SHUTMAINT or SHUTSYS. In this case it will also display a nice, big, red flower box on the console alerting them to the tasks that may disrupt the IPL. I have been called a couple of times since this went in. Nothing major, and I really shouldn't be getting a call at all, but I digress. This sounds like what you are looking for.
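To make the comparison concrete, here is a rough sketch of the logic in Python rather than OPS/REXX. It is not my actual rule. The function name, the shape of the inputs (a plain list of running task names and a dictionary of SSM task name to mode), and the idea that STCB4OPS and Exclude_All act as lists of tasks that are allowed to run outside SSM are all assumptions made for illustration.

```python
def find_ipl_risks(running, ssm_modes, stcb4ops, exclude_all):
    """Return (passive, unmanaged) task lists, each sorted by name.

    running     -- names of tasks currently active (hypothetical input)
    ssm_modes   -- dict of SSM task name -> mode, e.g. "ACTIVE" or "PASSIVE"
    stcb4ops    -- task names from the STCB4OPS variable
    exclude_all -- task names from the Exclude_All variable
    """
    # Assumption: tasks in either variable are known and should not be flagged
    # just because they are missing from the SSM table.
    known_outside_ssm = set(stcb4ops) | set(exclude_all)

    passive = sorted(t for t in running if ssm_modes.get(t) == "PASSIVE")
    unmanaged = sorted(
        t for t in running
        if t not in ssm_modes and t not in known_outside_ssm
    )
    return passive, unmanaged


def build_report(passive, unmanaged):
    """Build the text for the e-mail or console box; empty string if all clear."""
    lines = []
    if passive:
        lines.append("Tasks in PASSIVE mode: " + ", ".join(passive))
    if unmanaged:
        lines.append("Tasks not in SSM: " + ", ".join(unmanaged))
    return "\n".join(lines)
```

In the real thing the same result feeds both the Friday e-mail and the console box at SHUTMAINT or SHUTSYS time, so keeping the check separate from the reporting is what lets one check serve both.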
I have also looked into resetting the counter. I can't remember for sure, but I believe I came to the conclusion that it was too difficult to find an appropriate spot to reset it. Your first thought, as was mine, is to reset it when the task reaches the UP state, but something sticks in my mind that it wasn't quite that easy. I just don't remember what stopped me from doing that. There are also End of Memory rules, which detect when an address space has left memory. SSMEOM is an example of one such rule. SSMEOM is what sets all of my tasks' current states to DOWN. It is incredibly handy.
As for locking down OPS, I too have just gone through that process. If you are not already using External Security, I suggest you do so, as it will be much easier to maintain. Prior to rolling out External Security, my entire company had access to CA-OPS. Now it is limited to just those who need it. The downside is that External Security is not as granular as I would like it to be, at least for our environment. For example, we have no way to restrict someone from enabling/disabling rules in a rule set. I have one group that has one rule set in CA-OPS, and that is where all of their rules reside. I wanted to let them have access to just that one rule set, but the only way to lock it down is to use a security rule to augment External Security. This is one of those things I would like to see CA try to improve. The other minor concern is that the MSF resources are checked more frequently than you would think. Access is "required" in some strange places. This means that users will have more access to MSF than I would like.
One final thought on this Friday afternoon: if you remove SSMRETRY, you do run the risk of a "bad" task starting an infinite number of times unless you modify the SSMEOM rule. I had to do that when I removed SSMRETRY for them. I had SSMEOM check the task name and, if it matched one of their tasks, immediately set it to FAILED instead of TERM in the appropriate section of the select statement. Thinking about it recently, however, I suspect that one of the SSMRETRY options, perhaps the counter, could accomplish the same thing. I would need to look at SSMRETRY again to see what parameters it uses, but if you set the time limit to zero and the retry limit to one, you have essentially removed the SSMRETRY behavior without removing the SSMRETRY rule. That is, the logic would still function, but there would be no time limit to worry about and the task would be put into FAILED status after one failed attempt. You would still have to reset the FAILED state, but that is much easier for even the most novice user to do on their own. Again, I will need to look at SSMRETRY a little closer, but it is quitting time on a Friday. Hopefully this helps.
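For what it is worth, here is how I picture the two options, sketched in Python rather than OPS/REXX. This is not the SSMRETRY or SSMEOM code, and I have not checked it against the real parameters. The function name, the `retry_limit` and `time_limit` arguments, the `no_retry_tasks` set, and the rule that a time limit of zero means "no window" are all assumptions for illustration.

```python
def state_after_failure(task, failure_times, now, retry_limit, time_limit,
                        no_retry_tasks=frozenset()):
    """Decide the state to set when a task ends abnormally.

    failure_times -- times (in seconds) of this task's failures, including
                     the one that just happened (hypothetical bookkeeping)
    Returns "FAILED" to stop restarts, or "DOWN" to let the task be restarted.
    """
    # Option 1: the SSMEOM tweak. Named tasks go straight to FAILED.
    if task in no_retry_tasks:
        return "FAILED"

    # Option 2: retry counting. Assumption: a time limit of zero means every
    # failure counts, no matter how long ago it happened.
    if time_limit > 0:
        counted = [t for t in failure_times if now - t <= time_limit]
    else:
        counted = list(failure_times)

    return "FAILED" if len(counted) >= retry_limit else "DOWN"
```

With `retry_limit=1` and `time_limit=0`, the first failure already returns "FAILED", which is the same outcome as the task-name check but without touching SSMEOM.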