Layer7 API Management


DR for API Portal 4.3.2

  • 1.  DR for API Portal 4.3.2

    Posted Jan 21, 2020 01:32 PM
    Hi,

    We have installed version 4.3.2 of the Portal in our lower environments in standalone mode, i.e. we have 3 nodes, one each for Portal, Jarvis and DB.

    We will be starting the production installation pretty soon. In terms of the HA/DR options, can you please confirm whether the following makes sense?

    Assume we have 2 data centers and independent gateway clusters are deployed on each of them.

    On site 1, we would have the active instance of the Portal with 3 nodes (1 each for Portal, Jarvis and DB). This would be integrated with the gateway hosted on site 1.

    For DR purposes, does the following setup make sense?
    1. Database - have a passive MySQL DB on site 2, with replication set up from site 1 to this node.
    2. Jarvis - out of the box, there doesn't seem to be an option to fail over to a second node, so we were planning to take daily snapshots of the Jarvis node on site 1 and restore them to site 2.
    3. Portal - deploy a new Docker swarm cluster on site 2 with just one node that gets the same config as the node on site 1.
    3.a. Do we need to take any snapshots of the Portal node on site 1, similar to the Jarvis node? I.e. does Portal store anything on the node itself that needs to be replicated/copied over manually?

    Any inputs are greatly appreciated.


  • 2.  RE: DR for API Portal 4.3.2

    Broadcom Employee
    Posted Jan 21, 2020 06:02 PM
    Dear Keshava Murthy Jayaram,
    The document below could give you some ideas:
    https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/layer7-api-management/api-developer-portal/4-3-1/install-configure-and-upgrade/scale-ca-api-portal/high-availability.html

    Regards,
    Mark


  • 3.  RE: DR for API Portal 4.3.2

    Posted Jan 21, 2020 10:25 PM
    Hi Mark,

    I have already gone through the documentation and did not find the required information. For example, there is no mention of how to configure HA for Jarvis. Similarly, for the Portal, it's not clear whether both Portal instances across the 2 data centers should point to a single database node or whether they can point to the primary/secondary nodes independently.

    Additionally, as asked in my questions: does the Portal node store any data on the local file system that needs to be copied/replicated should a failover situation arise?


  • 4.  RE: DR for API Portal 4.3.2

    Broadcom Employee
    Posted Jan 22, 2020 12:21 AM
    Dear Keshava Murthy Jayaram,
    According to the "Prerequisites" section of that document, a MySQL cluster is required.
    The document below has more info about MySQL/Jarvis:
    https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/layer7-api-management/api-developer-portal/4-2/ca-api-developer-portal/deployment-reference-architecture.html

    Regards,
    Mark


  • 5.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 02:57 AM
    Thanks Mark. The links shared focus more on the HA configuration. Would you have similar documentation links for a DR scenario, where we have a passive instance of Portal set up that needs to be kept in sync with the active instance?


  • 6.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 05:09 AM
    Edited by Deactivated User Jan 22, 2020 05:13 AM
    If you are using Portal version 4.2.x onward, it uses Docker Swarm (https://docs.docker.com/engine/swarm/admin_guide/), so a cluster of at least 3 nodes is required for minimum HA. Place a manager node at each site; further nodes can then be added, but keep the number of manager nodes odd, since per the Docker guide it is the managers that form the quorum. This reduces the risk of a split-brain scenario and maintains the quorum.
    It works well with a 3-node cluster, which gives a fault tolerance of 1 node (https://docs.docker.com/engine/swarm/admin_guide/#add-manager-nodes-for-fault-tolerance)
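
    As a rough illustration of the quorum arithmetic in the Docker guide linked above (a majority of managers must stay reachable), here is a minimal Python sketch. The function names are made up for this example and are not part of Docker or Portal.

```python
def quorum_size(managers: int) -> int:
    """Number of managers that must stay reachable (a strict majority)."""
    if managers < 1:
        raise ValueError("a swarm needs at least one manager")
    return managers // 2 + 1


def fault_tolerance(managers: int) -> int:
    """Number of managers that can be lost while a majority remains."""
    return managers - quorum_size(managers)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} manager(s): quorum {quorum_size(n)}, "
              f"tolerates {fault_tolerance(n)} failure(s)")
```

    Note that going from 3 to 4 managers does not add any fault tolerance, which is why odd manager counts are preferred.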



  • 7.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 05:22 AM
    Thanks Ronald. As I mentioned, I am looking for some inputs on DR as opposed to HA. We would have a passive instance of Portal; in that case, what do we need to consider from a Portal standpoint?


  • 8.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 05:23 AM
    Edited by Deactivated User Jan 22, 2020 05:25 AM
    For DR purposes, does the following setup make sense?
    1. Database - have a passive MySQL DB on site 2, with replication set up from site 1 to this node.
    We did this for one of our clients. We enabled active-active replication between the MySQL servers and then put a virtual IP and hostname in front of them, so that MySQL was served in active-passive mode. Replication ensures that both MySQL servers hold the most current data, and the virtual IP takes care of DB failover if the active server is not reachable. Portal was then pointed to this virtual hostname.

    2. Jarvis - out of the box, there doesn't seem to be an option to fail over to a second node, so we were planning to take daily snapshots of the Jarvis node on site 1 and restore them to site 2.
    This should be fine. You should, however, use a load balancer hostname (IP) to allow a seamless transition when Jarvis comes up on the other side. (If it is possible to move to Portal 4.4, you should go ahead, as Jarvis is not required there.)

    3. Portal - deploy a new Docker swarm cluster on site 2 with just one node that gets the same config as the node on site 1.
    Read the above. You just add the node to the swarm, and Docker will handle the failover (FA) for you.
    3.a. Do we need to take any snapshots of the Portal node on site 1, similar to the Jarvis node? I.e. does Portal store anything on the node itself that needs to be replicated/copied over manually?
    As long as you have the database, you can restart Portal on any site by running portal.sh (with the Docker images downloaded at that site).
    You may hit a DB lock issue, which can be resolved over here



  • 9.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 05:33 AM
    Thanks a lot for the inputs Ronald, it is very useful.

    Just a follow-up question on your response to point 3. Does the node need to be added to the swarm on site 1? Can it not be in its own swarm on site 2, totally independent of the site 1 configuration, so that in a DR situation we just point the LB to this new node?


  • 10.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 06:32 AM
    You will have to point to the swarm on Site 1.

    Portal works with DB locks, which means that at any given time only 1 instance of Portal can run against the MySQL database.

    If you can afford some downtime, then you could do the following:
    1. Create a swarm on Site 1 that works pointing to the MySQL database.
    2. Create a swarm on Site 2 (the COLD site) pointing to the MySQL database, and bring it down once all images are downloaded.
    3. Create a script that polls Site 1 for uptime and queries a service that checks whether Docker is running (we used APM for monitoring).
    4. If Docker or the site is down, carry on with the steps below.
    5. Clear the locks from the DB (the same UPDATE command each time; the databases to clear are portal, tenant_provisioning, apim_otk, rbac):
    mysql> use portal;
    mysql> UPDATE DATABASECHANGELOGLOCK SET locked=0, lockgranted=null, lockedby=null WHERE id=1;
    6. Run portal.sh on the secondary.
    7. When the primary Docker is up again, run docker stack rm portal on the secondary, repeat step 5, and then start portal.sh.
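
    The polling in steps 3 and 4 could be sketched in Python roughly as below. This is only an illustration under stated assumptions: the probe callable is a hypothetical stand-in for whatever health check you use (the post mentions APM), and the "several consecutive failures before failing over" threshold is an assumption added here to avoid reacting to a single blip. Neither comes from the Portal documentation.

```python
import time
from typing import Callable


def wait_for_failover(
    probe: Callable[[], bool],
    failures_needed: int = 3,
    interval_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `probe` until it fails `failures_needed` times in a row.

    `probe` is a hypothetical health check for Site 1: it returns True
    when Docker/Portal on the primary answers, False (or raises) otherwise.
    Returns the total number of probes made before failover was declared.
    """
    if failures_needed < 1:
        raise ValueError("failures_needed must be at least 1")
    consecutive = 0
    probes = 0
    while True:
        probes += 1
        try:
            healthy = bool(probe())
        except Exception:
            # A probe that blows up is treated the same as "site is down".
            healthy = False
        consecutive = 0 if healthy else consecutive + 1
        if consecutive >= failures_needed:
            return probes
        sleep(interval_seconds)
```

    Once the function returns, the script would go on to steps 5 and 6 (clear the locks, then run portal.sh on the secondary).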

    This becomes quite complicated, so using a Docker swarm cluster is the best way forward: as soon as the primary site comes back up, Docker resumes polling and sets the master back to the original node.




  • 11.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 06:34 AM
    Having said the above, you can definitely have a HOT and a COLD site.
    But you cannot have 2 HOT sites. If you are looking for that scenario, you should use the Docker swarm capability.


  • 12.  RE: DR for API Portal 4.3.2

    Posted Jan 22, 2020 06:40 AM
    Thanks again for the detailed explanation, Ronald. Yes, we are looking for an active-passive setup of Portal. We would have only 1 instance active at any given time, so the steps you detailed are what we intend to follow.


  • 13.  RE: DR for API Portal 4.3.2
    Best Answer

    Posted Jan 22, 2020 07:22 AM
    Okay. You need to inform the users that there will be downtime with this approach.

    You can use this setup with just a 1-node swarm on each site.

    Requirements:
    1. Ensure the Portal requirements are met for Docker, storage, memory and CPU.
    Steps:
    1. On Site 1, download Portal version 4.x (offline or online).
    2. Configure Portal at {portal_folder}/conf/portal.conf
    3. Set the DB parameters, Jarvis parameters, and email parameters if required.
    4. Set up keys for TSSG and HTTP if you are using signed certs.
    5. Run sudo ./portal.sh
    6. This downloads all the Docker images and creates the DB tables on the MySQL (community or enterprise) server you pointed to.
    6.1. Run sudo ./status.sh to confirm all components of Portal are running.
    7. Make a backup of the following:
    7.1. {portal_folder}/conf/portal.conf
    7.2. {portal_folder}/certs/
    8. Remove Portal on Site 1 (run sudo docker stack rm portal). This only removes the Portal stack entries and cleans up the Portal locks in the DB.
    9. On Site 2, download Portal version 4.x (offline or online).
    10. Ensure your MySQL database is replicated and the VIP is working as expected.
    11. Replace the corresponding files on this site with the ones backed up in step 7.
    12. Start Portal (sudo ./portal.sh).
    13. Once Portal is up and running, perform step 8 on the secondary.
    14. Start Portal again on Site 1 (the HOT site) (sudo ./portal.sh).
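
    The backup in step 7 and the restore in step 11 could be scripted along these lines. This is a minimal Python sketch: the function name and the backup location are made up for illustration, and only the two paths named in step 7 (conf/portal.conf and certs/) are copied.

```python
import shutil
from pathlib import Path


def copy_portal_config(src_root: Path, dst_root: Path) -> None:
    """Copy conf/portal.conf and the certs/ directory from src_root to dst_root.

    The same relative layout is kept on both sides, so the function can be
    used for the backup (portal folder -> backup folder) and for the
    restore on the other site (backup folder -> portal folder).
    """
    conf_src = src_root / "conf" / "portal.conf"
    conf_dst = dst_root / "conf" / "portal.conf"
    conf_dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(conf_src, conf_dst)

    # dirs_exist_ok (Python 3.8+) lets the restore overwrite an existing certs/.
    shutil.copytree(src_root / "certs", dst_root / "certs", dirs_exist_ok=True)
```

    For example, copy_portal_config(Path(portal_folder), Path(backup_folder)) on Site 1, then, after moving the backup folder across, copy_portal_config(Path(backup_folder), Path(portal_folder)) on Site 2.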

    You now have a cold failover DR site, and the HOT site is running with Portal configured.

    Make sure all the DNS configuration required by Portal is in place.


    This setup will work.

    During DR:
    1. You may just need to clear the DB locks (as per the UPDATE command above).
    2. Run sudo ./portal.sh on the secondary.
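
    The lock clearing mentioned above applies the same UPDATE to four databases (portal, tenant_provisioning, apim_otk, rbac, as listed earlier in the thread). A small Python sketch that only generates the SQL text could look like this; the function and constant names are made up for the example, and the script does not connect to MySQL itself.

```python
# Databases named earlier in this thread as holding a DATABASECHANGELOGLOCK row.
LOCKED_DATABASES = ("portal", "tenant_provisioning", "apim_otk", "rbac")

# The UPDATE statement exactly as posted earlier in the thread.
UNLOCK_STATEMENT = (
    "UPDATE DATABASECHANGELOGLOCK "
    "SET locked=0, lockgranted=null, lockedby=null WHERE id=1;"
)


def build_unlock_script(databases=LOCKED_DATABASES) -> str:
    """Return SQL text that releases the lock row in each database."""
    lines = []
    for name in databases:
        lines.append(f"USE {name};")
        lines.append(UNLOCK_STATEMENT)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_unlock_script(), end="")
```

    The output can be reviewed first and then fed to the mysql client on the database VIP. Only run it when you are sure no other Portal instance is still up, since these locks are what keep two instances from running against the same database.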




    ------------------------------
    Pre-Sales Consultant
    ------------------------------



  • 14.  RE: DR for API Portal 4.3.2

    Broadcom Employee
    Posted Jan 27, 2020 06:11 PM
    Hi:

    Are you aware that Portal 4.4 removes Jarvis and replaces it with Druid within the Portal stack? If you are interested in that version and the documentation is not adequate for your use case, let us know and we can try to help.

    Thanks, Alex.

    ------------------------------
    Broadcom
    ------------------------------