Layer7 Privileged Access Management

Cluster Synchronization - Is there a tombstone age?

  • 1.  Cluster Synchronization - Is there a tombstone age?

    Posted 26 days ago
    I remember being on a call with product management about a year ago, in which the PM mentioned a (sort-of) tombstone age in relation to cross-site synchronization.

    That is to say, a period of time (or number of changes to the DB) during which a secondary site cluster member can be re-synced, but after which a re-sync is no longer possible and a cluster restart is required.

    Use Case:
    CA PAM 3.2.4
    Multi Site Cluster
    One node in a secondary site falls out of sync, all other nodes in all sites are synced.

    Is there a tombstone age that, once passed, would require a cluster restart to get all nodes back in sync? Or can that secondary node stay out of sync for a very long time and still be re-synced with a simple "RE-SYNC SITE MEMBER"?

    Thanks in advance.

    ------------------------------
    Services Architect
    HCL Technologies Ltd
    ------------------------------


  • 2.  RE: Cluster Synchronization - Is there a tombstone age?

    Posted 26 days ago
    I think I've found it: https://docops.ca.com/ca-privileged-access-manager/3-2-4/en/deploying/set-up-a-cluster/

    Secondary members can "self-heal" after being disconnected. Up to a configurable number of missed transactions, members download data until they catch up. This threshold defaults to 10,000 transactions, above which the member requires resyncing.

    I believe this is true for 3.2.4 & 3.2.5; however, the 3.3 Multi Site Cluster and Secondary Sites documentation doesn't explicitly mention this threshold.

    ------------------------------
    Services Architect
    HCL Technologies Ltd
    ------------------------------



  • 3.  RE: Cluster Synchronization - Is there a tombstone age?

    Posted 25 days ago

    Hello Sebastiano,

    Unlike earlier versions of PAM, r3.3 and newer use MySQL asynchronous group replication (in single-primary mode), a mechanism that guarantees delivery.

    Once a member (re)joins the cluster, it receives all the updates it missed while it was unavailable.

    Note: because of the new method in 3.3, it is highly recommended to have at least three nodes configured in the primary cluster site.

    (Otherwise the cluster can run into quorum loss, which basically renders the entire PAM cluster unavailable.)

    For further details about MySQL asynchronous group replication, please see its documentation:

    https://dev.mysql.com/doc/refman/8.0/en/group-replication.html
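
    To illustrate why three nodes are recommended in the primary site, here is a minimal sketch of a plain majority-quorum rule. It is only a simplified model, not PAM or MySQL code: the function names `has_quorum` and `tolerated_failures` are hypothetical, and the sketch assumes the group stays available only while strictly more than half of its members can reach each other.

```python
def has_quorum(total_members: int, reachable_members: int) -> bool:
    """Majority rule (assumed model): strictly more than half must be reachable."""
    return reachable_members > total_members // 2


def tolerated_failures(total_members: int) -> int:
    """Number of members that can be lost while a majority still remains."""
    return (total_members - 1) // 2


if __name__ == "__main__":
    for size in (2, 3):
        print(
            f"{size} nodes: tolerates {tolerated_failures(size)} failure(s), "
            f"quorum after losing one node: {has_quorum(size, size - 1)}"
        )
```

    Under this model a two-node group cannot lose any member without losing its majority, while a three-node group can lose one.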

     

    Best Regards,
    Andreas

  • 4.  RE: Cluster Synchronization - Is there a tombstone age?

    Posted 25 days ago
    Thank you Andreas.

    Can you confirm that my understanding of this passage is correct for 3.2.4 & 3.2.5?

    Secondary members can "self-heal" after being disconnected. Up to a configurable number of missed transactions, members download data until they catch up. This threshold defaults to 10,000 transactions, above which the member requires resyncing.

    Is this saying that a cluster restart is required to resync a secondary node that is more than 10,000 transactions behind?

    Or is it really saying that we should only need to perform a "RE-SYNC SITE MEMBER" to synchronize the secondary node (which would then suggest that a secondary node can stay out of sync for an indefinite amount of time and be re-synced whenever)?

    Finally, the blurb says that the threshold is configurable. In which cases would it be advisable to increase or decrease it?

    Much obliged.

    ------------------------------
    Services Architect
    HCL Technologies Ltd
    ------------------------------



  • 5.  RE: Cluster Synchronization - Is there a tombstone age?
    Best Answer

    Posted 25 days ago
    The configurable transaction limit of 10,000 is for automatic self-healing. If a node falls behind due to temporary network problems, usage spikes, etc., it can catch up as long as it is behind by less than the limit. Once the limit is exceeded, the CM database of this node becomes inactive. A node re-sync was meant to always work, independent of the current state of the local database, because it involves a download of the current database from the master node. However, there is a known problem in recent 3.2.x releases where this does not work. We have a defect open with PAM Engineering to get it fixed. For now, a full cluster restart is required to get the node back in sync.
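
    The behaviour described above can be summarised as a small decision sketch. This is only an illustration of the rules in this post, not PAM code: `Recovery`, `recovery_action`, and the `resync_defect` flag are hypothetical names, and treating a node that is exactly at the limit as still able to self-heal is an assumption.

```python
from enum import Enum


class Recovery(Enum):
    IN_SYNC = "in sync"
    SELF_HEAL = "automatic self-heal"
    RESYNC = "re-sync site member"
    CLUSTER_RESTART = "full cluster restart"


DEFAULT_LIMIT = 10_000  # default transaction limit quoted in this thread


def recovery_action(
    missed_transactions: int,
    limit: int = DEFAULT_LIMIT,
    resync_defect: bool = False,
) -> Recovery:
    """Pick the recovery step for a node that is behind by `missed_transactions`.

    `resync_defect` models the known 3.2.x problem where a node re-sync
    does not work, so a full cluster restart is needed instead.
    """
    if missed_transactions < 0:
        raise ValueError("missed_transactions must be non-negative")
    if missed_transactions == 0:
        return Recovery.IN_SYNC
    if missed_transactions <= limit:  # assumption: the limit itself still self-heals
        return Recovery.SELF_HEAL
    return Recovery.CLUSTER_RESTART if resync_defect else Recovery.RESYNC


if __name__ == "__main__":
    print(recovery_action(500).value)
    print(recovery_action(25_000).value)
    print(recovery_action(25_000, resync_defect=True).value)
```

    In this sketch, elapsed time plays no role: only the number of missed transactions decides whether the node can catch up on its own.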


  • 6.  RE: Cluster Synchronization - Is there a tombstone age?

    Posted 25 days ago
    Cheers Ralf.

    I suspected as much.

    Thanks

    ------------------------------
    Services Architect
    HCL Technologies Ltd
    ------------------------------



  • 9.  RE: Cluster Synchronization - Is there a tombstone age?

    Posted 24 days ago
    Hi team

    We currently have a cluster configured with two sites, each containing two nodes. At the moment it is not viable for us to have three nodes at each site. If release 3.3 is applied, what impact could that have on the solution? Could we still apply this release?


    Julian Riaño
    MSL



  • 10.  RE: Cluster Synchronization - Is there a tombstone age?

    Posted 23 days ago
    Hello Julian,

    The suggestion to have at least 3 nodes only applies to a Multi-Master / Primary site.

    A secondary site with only 2 nodes is perfectly fine.

    Regards,
    Andreas