We use the following watch to monitor Check Point HA state:
Name: watchChkPtHAMismatch
Developer ID: 0xffff0000
Author: elewis1
Last Modified Time: Jul 17, 2013 1:19:14 PM EDT
Model Type: Host_Device
Data Type: Boolean
Expression: ( ( ( ( ( haState == "standby" ) & ( haIdentifier == 1 ) ) | ( ( haState == "active" ) & ( haIdentifier == 2 ) ) ) | ( haState == "initializing" ) ) | ( haState == "Ready" ) )
Instance: None
Active By Default: No
Evaluate: By Polling every 0 Days + 00:05:00
Inheritable: Yes
Threshold:
Threshold violated if value == TRUE.
Threshold reset if value != TRUE.
Generate Major alarm with cause code 0xfff00003.
Alarm is user clearable.
Watch will not be reset upon user clearing of alarm.
Here is my documentation on why and how this works. We have a large firewall environment and it's been 100% accurate.
"checks active for nodes configured as primary and standby for nodes configured as secondary"
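Primary reason this works: haIdentifier is hardcoded per firewall, so it identifies the configured role no matter which node is currently active.
Why is this coded assembly style?
Watch expressions do not allow string conversion. Using the boolean result of a text comparison is a way around this.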
OIDs checked:
SNMPv2-SMI::enterprises.2620.1.5.6.0 haState (text string of the current HA state)
SNMPv2-SMI::enterprises.2620.1.5.8.0 haIdentifier (integer of what the firewall is configured to be: 1 for primary, 2 for secondary)
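If your polling tool wants dotted numeric OIDs, the two objects above can be expanded as in this small sketch. It relies only on the fact that SNMPv2-SMI::enterprises is the standard subtree 1.3.6.1.4.1. The helper name to_numeric_oid is made up for illustration and is not part of any SNMP library.

```python
# SNMPv2-SMI::enterprises is the standard subtree 1.3.6.1.4.1.
ENTERPRISES_PREFIX = "1.3.6.1.4.1"
SYMBOLIC_MARKER = "SNMPv2-SMI::enterprises."


def to_numeric_oid(symbolic: str) -> str:
    """Expand 'SNMPv2-SMI::enterprises.<rest>' into a dotted numeric OID.

    Hypothetical helper for illustration only.
    """
    if not symbolic.startswith(SYMBOLIC_MARKER):
        raise ValueError(f"unsupported OID form: {symbolic}")
    return f"{ENTERPRISES_PREFIX}.{symbolic[len(SYMBOLIC_MARKER):]}"


# The two objects the watch reads, taken from the list above.
HA_STATE_OID = to_numeric_oid("SNMPv2-SMI::enterprises.2620.1.5.6.0")
HA_IDENTIFIER_OID = to_numeric_oid("SNMPv2-SMI::enterprises.2620.1.5.8.0")

if __name__ == "__main__":
    print("haState     ", HA_STATE_OID)
    print("haIdentifier", HA_IDENTIFIER_OID)
```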
All possible combinations (terms in expression order: standby*primary + active*secondary + initializing + Ready; a result of 1 raises the alarm):
node-01 active 0*1+1*0+0+0 Result: 0
node-01 standby 1*1+0*0+0+0 Result: 1
node-01 ready 0*1+0*0+0+1 Result: 1
node-01 init 0*1+0*0+1+0 Result: 1
node-02 active 0*0+1*1+0+0 Result: 1
node-02 standby 1*0+0*1+0+0 Result: 0
node-02 ready 0*0+0*1+0+1 Result: 1
node-02 init 0*0+0*1+1+0 Result: 1
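To double-check the combinations outside of Spectrum, here is a minimal Python sketch that evaluates the same expression for every state and identifier pair. It treats each text comparison as 0 or 1, AND as multiplication and OR as addition, which is the "assembly style" reading used above. The function names are my own, and the string literals are copied from the watch expression. Python compares strings case-sensitively, which may or may not match how the watch engine compares them.

```python
from itertools import product

# State strings exactly as they appear in the watch expression.
STATES = ("active", "standby", "Ready", "initializing")
IDENTIFIERS = (1, 2)  # 1 = configured primary, 2 = configured secondary


def terms(ha_state: str, ha_identifier: int) -> tuple:
    """Return the six comparison results as 0/1, in expression order."""
    return (
        int(ha_state == "standby"), int(ha_identifier == 1),
        int(ha_state == "active"), int(ha_identifier == 2),
        int(ha_state == "initializing"),
        int(ha_state == "Ready"),
    )


def watch_value(ha_state: str, ha_identifier: int) -> bool:
    """True means the threshold is violated and the alarm is raised."""
    s, p, a, q, i, r = terms(ha_state, ha_identifier)
    return bool(s * p + a * q + i + r)


def truth_table() -> list:
    """List (identifier, state, arithmetic, result) for every combination."""
    rows = []
    for ident, state in product(IDENTIFIERS, STATES):
        s, p, a, q, i, r = terms(state, ident)
        arithmetic = f"{s}*{p}+{a}*{q}+{i}+{r}"
        rows.append((ident, state, arithmetic, int(watch_value(state, ident))))
    return rows


if __name__ == "__main__":
    for ident, state, arithmetic, result in truth_table():
        print(f"node-0{ident} {state:<12} {arithmetic} Result: {result}")
```

Only two combinations come out as 0: active on the configured primary and standby on the configured secondary. Everything else trips the threshold.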