Automic Workload Automation


Expert’s opinions on AWA high availability, fault tolerant and Zero downtime upgrade feature

  • 1.  Expert’s opinions on AWA high availability, fault tolerant and Zero downtime upgrade feature

    Posted Oct 19, 2016 04:16 AM

    Dear product and community experts,

    I have a use case for the AE solution with the following deployment and operational requirements, and I would like to hear the opinions and advice of product and community experts.

    I would also like to discuss how best to use AWA product features such as “multi-server operation”, “non-stop operation”, “NetArea”, “Zero downtime upgrade”, and various other configuration options to achieve a fault-tolerant, highly available, and scalable automation platform.

    My primary requirements are as follows:

    Note: Architecture Diagrams are attached to the bottom of the discussion.

    1.       Deployed to two datacentres, DC1 and DC2 (a third DC may be added in future).

    2.       Deployed initially (Phase 1) to two servers: “AE_1A” in DC1 and “AE_1B” in DC2, as depicted in Diagram1. The future plan (Phase 2) is to add more servers to the AE farm: “AE_2A” in DC1 and “AE_2B” in DC2, as depicted in Diagram2.

    3.       It should be a single AE system (named PROD) in which all CPs and WPs are distributed across all available AE servers in the AE farm.

    4.       All distributed CPs should be active and should serve any client that connects to any CP on any AE server (Diagram1 and Diagram3).

    5.       At any time, an admin should be able to put all CPs on one node (or on multiple nodes) into hot-standby mode, in which the CPs no longer serve clients but are activated when the other active CPs in the AE system are no longer available (Diagram2 and Diagram4).

    6.       All AE agents should be configured to prefer the CPs in their own datacentre.

    7.       If an agent cannot connect to the CPs in its own datacentre, it should connect to the CPs in the other datacentre.

    8.       The solution should also be compatible with the “Zero downtime upgrade” feature.

    Assumptions:

    1.       I have all the required product licenses.

    2.       All components can communicate with each other across the datacentres (no firewall restrictions).

    I went through the documentation and understand its features and architecture concepts reasonably well. I am still trying to figure out how to combine these features to build a more robust solution.

    As per my understanding:

    To address requirements 3 and 4, we can use the multi-server operation feature.

    To address requirement 5, we can use the non-stop operation feature.

    To address requirement 6, we can use the NetArea feature.

    For requirement 7, I am not sure this is possible, going by the explanation in the documentation.

    I am also not sure whether the NetArea and non-stop operation features work together in practice.

    Any opinions, advice, or references are appreciated in advance. Thank you.

    Rgds,

     

    Indika Peiris

    Supporting architecture diagrams

    Diagram1

    https://us.v-cdn.net/5019921/uploads/editor/1b/buagap44us15.png

    Diagram2

    https://us.v-cdn.net/5019921/uploads/editor/po/c8p19v2727kc.png

    Diagram3

    https://us.v-cdn.net/5019921/uploads/editor/kt/7mgf9ixx89bt.png

    Diagram4

    https://us.v-cdn.net/5019921/uploads/editor/jf/7s0t7u2f251b.png



  • 2.  Expert’s opinions on AWA high availability, fault tolerant and Zero downtime upgrade feature

    Posted Oct 24, 2016 05:19 AM
    FYI
    1 -> 4  :  basic functionality of the HA configuration (active-active)

    5  :  No hot-standby status is possible for a CP; a CP is either active or stopped. To get this behavior, use the Service Manager command line to stop the CPs, and pair it with a watchdog that checks the active CPs and starts the stopped ones when needed. It is not elegant, but it is also not really necessary, given that agent connections switch automatically from one CP to another when a CP is shut down, canceled, or unreachable.
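
    To make the watchdog idea in point 5 concrete, here is a minimal sketch of the decision logic only. It assumes nothing about the real Service Manager command line: `probe` and `start_cp` are hypothetical callables you would have to supply yourself (for example, a TCP connect check and a wrapper around whatever start command your installation uses), and the CP names are invented.

```python
from typing import Callable, Dict, Iterable, List


def standby_cps_to_start(active_cp_alive: Dict[str, bool],
                         standby_cps: Iterable[str]) -> List[str]:
    """Return the standby CPs to start.

    All standby CPs are returned when no active CP is reachable;
    otherwise nothing needs to be started.
    """
    if any(active_cp_alive.values()):
        return []
    return list(standby_cps)


def watchdog_tick(probe: Callable[[str], bool],
                  active_cps: Iterable[str],
                  standby_cps: Iterable[str],
                  start_cp: Callable[[str], None]) -> List[str]:
    """Run one watchdog check.

    probe(cp) is a hypothetical health check returning True if the CP answers.
    start_cp(cp) is a hypothetical hook that starts a stopped CP.
    """
    status = {cp: probe(cp) for cp in active_cps}
    to_start = standby_cps_to_start(status, standby_cps)
    for cp in to_start:
        start_cp(cp)
    return to_start
```

    A scheduler or loop would call `watchdog_tick` at a fixed interval. As noted above, this may be unnecessary in practice, since agents already switch to another CP on their own.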

    6  :  NetArea can be used to enforce this behavior. By default, workload balancing tries to connect agents to the CPs that give the shortest response time. Depending on the setup between your two datacentres, this may meet your requirement without NetArea.

    7 & 8  :  standard functionality of the product, as far as I know.
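
    For requirements 6 and 7 together, the intended behavior (prefer a CP in the agent's own datacentre, fall back to the other one) can be modelled as below. This is only a sketch of the requirement, not of how the Automation Engine or NetArea actually selects a CP; the `CommProcess` type, its fields, and the CP names are assumptions made up for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CommProcess:
    """Hypothetical description of one CP for this illustration."""
    name: str
    datacentre: str
    reachable: bool


def pick_cp(agent_dc: str, cps: List[CommProcess]) -> Optional[CommProcess]:
    """Pick a reachable CP, preferring the agent's own datacentre.

    Falls back to any reachable CP in another datacentre, and returns
    None when no CP is reachable at all.
    """
    candidates = [cp for cp in cps if cp.reachable]
    local = [cp for cp in candidates if cp.datacentre == agent_dc]
    pool = local or candidates
    return pool[0] if pool else None


cps = [
    CommProcess("CP_AE_1A", "DC1", reachable=False),
    CommProcess("CP_AE_1B", "DC2", reachable=True),
]
# The DC1 CP is down, so an agent in DC1 falls back to the DC2 CP.
chosen = pick_cp("DC1", cps)
```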

    Hope this can help you.

    Alain


  • 3.  RE: Expert’s opinions on AWA high availability, fault tolerant and Zero downtime upgrade feature

    Posted Oct 14, 2019 10:31 AM
    Hello.
    “1 -> 4  :  basic functionality of the HA configuration (active-active)”
    Could you tell me where I can find information about this functionality and its implementation? Thank you.