Hey guys,
Our team recently received an alert with this message: Initiating failover from remote Hub /<DOMAIN>/<ORIGIN>/<HUB>/spooler.
-- We checked the status of the Primary hub and found it stable and working as expected.
-- I then opened the HA probe and found it in RED with this status: The probe is up but not connected
-- I pinged the Primary hub from the HA hub and got replies, so basic network connectivity between them seems to be in place.
-- Can you guys enlighten me on what's going on with the probe?
For additional info, here are the log entries that keep recurring in the HA probe (I changed some details to hide our environment for security reasons):
Oct 7 20:03:42:419 HA: INFO: Calling UpdateCache(0)
Oct 7 20:03:42:419 HA: INFO: UpdateCache - cache will be updated in 3000 seconds
Oct 7 20:03:42:419 HA: INFO: checking connection to /<DOMAIN>/<ORIGIN>/<HUB>/spooler (<IP of HA HUB>:48007)
Oct 7 20:03:42:419 HA: SREQUEST: _status -><IP of HA HUB>/48007
Oct 7 20:03:43:577 HA: nimSessionWaitMsg: got error on client session: 10054
Oct 7 20:03:43:577 HA: WARN: Failed to contact primary hub '/<DOMAIN>/<ORIGIN>/<HUB>/spooler': communication error. Issuing state change.
Oct 7 20:03:43:577 HA: INFO: state == '1'
Oct 7 20:03:43:577 HA: INFO: gConnected: 0
Oct 7 20:03:43:577 HA: ciGetDeviceIdentifiers - <IP of HA HUB> found (remote device) [<Random Letter + Numbers>]
Oct 7 20:03:43:577 HA: INFO: Failover 'wait_time' has expired. Changing state.
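
One more note on the ping test: ping only exercises ICMP, while the log shows the probe talking TCP to port 48007 and getting error 10054 (on Windows that is normally WSAECONNRESET, i.e. the connection was reset by the remote side). So a TCP-level check against that port may tell more than ping does. Below is a minimal sketch of such a check, not anything taken from the HA probe itself; the function name check_tcp_port and the host value are placeholders I made up, and the port is just the one from the log.

```python
import socket


def check_tcp_port(host, port, timeout=5.0):
    """Attempt a plain TCP connect and report whether it succeeded.

    Returns (True, "connected") on success, or (False, "<error text>")
    if the connect is refused, reset, or times out.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, resets, timeouts, and DNS failures.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Placeholder values: replace with the Primary hub address; 48007 is
    # the port that appears in the HA probe log above.
    host = "127.0.0.1"
    port = 48007
    ok, detail = check_tcp_port(host, port)
    print(f"{host}:{port} -> {'OK' if ok else 'FAILED'} ({detail})")
```

Run from the HA hub, a failure here would point at a firewall or listener problem rather than general network reachability. A success only shows that the TCP handshake completes; it does not prove the hub answers the probe's _status request, since a reset can still happen after the connection is opened.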