Release Automation

HA and Disaster Recover Setup

  • 1.  HA and Disaster Recover Setup

    Posted 12-14-2015 09:56 PM

    There is a lot of reference material on HA, but not much on DR.

     

    I would appreciate any reference design that could serve as a starting point for enabling both HA and DR.

     

    Below is a draft overview of the stack placement (after several iterations) that one client is trying to pursue (note: all NACs are connected to each NES).

    1. Three data centres with strict firewalls and limited bandwidth between them
    2. Data centre 3 will have the most deployment activity (Dev, SIT, UAT)
    3. Database replication CANNOT be performed between DC3 and DC2. Thus the DBs below are in DC1 and replicated to DC2 for DR purposes.
    4. Some of the requirements:
      1. NAC1 and NAC2 shall complement each other for HA. NAC3 will be passive while DC1 is up and running.
      2. NAC3 shall kick in (with a reasonable SLA turnaround) when the RA servers in DC1 are down.
      3. Switch back to the DC1 servers when the infra is back up and running
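
To make the failover and switch-back requirements concrete, here is a minimal sketch of the decision logic only. It is not an RA feature: the `FailoverPolicy` class, the threshold values, and the idea of counting consecutive health checks are all assumptions for illustration. How DC1 health is actually measured is left out.

```python
from dataclasses import dataclass


@dataclass
class FailoverPolicy:
    """Hypothetical policy: serve from DC1 (NAC1/NAC2) or from the passive NAC3."""
    fail_threshold: int = 3      # assumed: consecutive failed DC1 checks before failing over
    recover_threshold: int = 3   # assumed: consecutive good DC1 checks before switching back
    active_site: str = "DC1"
    _failures: int = 0
    _successes: int = 0

    def observe(self, dc1_healthy: bool) -> str:
        """Record one DC1 health-check result and return the site that should be active."""
        if dc1_healthy:
            self._successes += 1
            self._failures = 0
        else:
            self._failures += 1
            self._successes = 0

        if self.active_site == "DC1" and self._failures >= self.fail_threshold:
            self.active_site = "NAC3"   # DC1 considered down: passive node takes over
        elif self.active_site == "NAC3" and self._successes >= self.recover_threshold:
            self.active_site = "DC1"    # DC1 stable again: switch back
        return self.active_site
```

Requiring several consecutive results in each direction is one way to avoid flapping between sites on a single missed check; the right thresholds depend on the SLA the client agrees to.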

     

    I am curious to know:

    1) Whether Nexus HA will work in the setup below during actual usage. Have we encountered any glitches or performance issues due to sync delay/latency on Nexus?

    2) Whether it is redundant to have an SLB/proxy for the NACs, given that the performance bottleneck for RA is actually at the database. Switching to another NAC would not necessarily resolve a load issue, right?

     

    HA and DR.PNG



  • 2.  Re: HA and Disaster Recover Setup

    Posted 12-15-2015 04:07 AM

    Hi,

     

    The NAC load balancer does not help with load issues, as the NACs work as active/passive pairs. The role of the load balancer / proxy is to ensure that UI traffic is directed to the active node. I am not aware of any customers using a load balancer between NAC and Nexus, but this should work.
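
As a rough illustration of that routing role, the sketch below picks the node that UI traffic should go to. The `pick_active_node` function and the `is_active` probe are hypothetical names; how a NAC reports its active/passive state is not covered in this thread, so the probe is passed in rather than implemented.

```python
from typing import Callable, Iterable, Optional


def pick_active_node(
    nodes: Iterable[str],
    is_active: Callable[[str], bool],
) -> Optional[str]:
    """Return the first node whose probe reports it as active, or None if none does.

    `is_active` is a caller-supplied check (an assumption here); a node whose
    probe raises a network error is treated as not active.
    """
    for node in nodes:
        try:
            if is_active(node):
                return node
        except OSError:
            # Unreachable node: skip it and try the next one.
            continue
    return None
```

Because only one node in the pair is active at a time, this selects a target rather than spreading load, which is why the proxy does not relieve a database bottleneck.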

     

    Regards

    Keith



  • 3.  Re: HA and Disaster Recover Setup

    Posted 12-16-2015 09:24 AM

    We are currently setting up a DR environment between two data centers that closely resembles what you are proposing above. The biggest challenge for us so far has been replication of the Nexus repositories and the SQL databases, but we ended up overcoming it. We are now in the final step of configuring NIMI to provide a full mesh and will probably start testing failover in February. If you haven't already found it, this link will take you to a Nexus document with directions on which directories to exclude from replication. We use standard log shipping to replicate the database. Two RA tables need to be excluded: "dbo.master_nac" and "dbo.nac_nodes".