The script has been revised so that it no longer cycles through all active OneClick servers during the REST call. Once it makes a successful connection, it proceeds with the rest of the script. It will only cycle through every OneClick server if one or more cannot be queried.
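The connect-until-success behavior described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual script: `first_reachable` and the `query` callable are hypothetical names, and `query` stands in for whatever REST request the script makes to a OneClick server.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_reachable(
    servers: Iterable[str],
    query: Callable[[str], str],
) -> Optional[Tuple[str, str]]:
    """Return (server, response) for the first server that answers, else None.

    `query` is a hypothetical stand-in for the script's REST request.
    It is expected to raise OSError (or a subclass) when a server
    cannot be reached.
    """
    for server in servers:
        try:
            response = query(server)
        except OSError:
            # This server could not be queried; try the next one.
            continue
        # First successful connection: stop cycling and carry on.
        return server, response
    # Every server failed.
    return None
```

In real use, `query` could wrap `urllib.request.urlopen` with a timeout; `urllib.error.URLError` is a subclass of `OSError`, so connection failures would be caught by the handler above.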
**UPDATE**
The same SetScript and ClearScript can be used on both primary and secondary servers.
Hello Sean,
Currently, as soon as GAS registers the alarm service on the secondary, the notification service kicks in, so it's all a matter of how long the failover takes to complete. Long activation times should not be an issue. My current customer had extremely long activation times at one point (more than 5 hours). SANM kept processing, but at a slightly slower rate, until the models were 100% activated.
I ran a test again today in the Dev environment, and as soon as the failover was complete (MLS failover), I got bombarded with notifications. I have been unable to verify whether there is any store-and-forward mechanism.
I do not have any heartbeat checks in the code - this is all based on the out-of-box functionality. If this is something you'd like to try out, be my guest. Hopefully, with a little more work, we can get this or something similar added to the official Spectrum code.
Karen, this is a very clever solution. Kudos!
Follow-up questions regarding the delay between the time an MLS fails and the time the OC detects and registers it:
- Is there a heartbeat between OC and MLS, and how do you configure that frequency?
- How does GAS (global alarm sync) affect this solution's timing in failure and recovery?
- How much delay does this test introduce into notification? In a major outage (e.g., 1,000 alarms), could this add up to a measurable aggregate delay?
Not that delay is bad, just that calling it out helps to anticipate the worst-case aggregate delay. For instance, if the delay is 5 minutes in a major outage, it may be wise to wait until alarms settle before restoring the primary SS.
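As a rough illustration of the aggregate-delay concern, here is a back-of-the-envelope estimate in Python. It assumes the check runs once per alarm and that the checks run strictly one after another; the 300 ms per-check figure is a made-up number for illustration, not a measurement, and `worst_case_delay_seconds` is a hypothetical helper.

```python
def worst_case_delay_seconds(alarm_count: int, per_check_ms: int) -> float:
    """Upper bound on added delay if every alarm's check runs serially."""
    return alarm_count * per_check_ms / 1000


# 1,000 alarms at an assumed 300 ms per check
minutes = worst_case_delay_seconds(1000, 300) / 60
print(minutes, "minutes")  # prints: 5.0 minutes
```

If the checks overlap or the script short-circuits after the first successful connection, the real figure would be lower, so this is only an upper bound under the stated assumptions.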