Automic Workload Automation


Atomic Engine: System fault tolerance

  • 1.  Atomic Engine: System fault tolerance

    Posted Oct 14, 2019 10:31 AM
    Hello.

    Are there tools or solutions for Automation Engine (AE) system fault tolerance?
    I use AE to run jobs in SAP.



  • 2.  RE: Atomic Engine: System fault tolerance

    Posted Oct 14, 2019 10:44 AM
    Hi.

    Given sufficient licenses, you can run two or more AE servers. All of them share the load, and when the one running the so-called Primary Worker Process (PWP) fails, another one usually notices quickly and takes over. There used to be another option called "NonStop Server", but it is no longer sold or supported (and it also didn't work very well), so disregard anything you may find about it.
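
    To make the takeover behaviour concrete, here is a toy Python model. It is not Automic code: the `Server` and `Cluster` classes and the `elect` rule are hypothetical, and how the AE actually detects a failed PWP host and promotes another one is internal to the product. The sketch only shows the idea that exactly one live server holds the primary role and that another server takes it over on failure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    """Hypothetical stand-in for one AE server host."""
    name: str
    alive: bool = True


@dataclass
class Cluster:
    """Toy cluster: all live servers do work, exactly one is primary."""
    servers: List[Server]
    primary: Optional[str] = None

    def elect(self) -> Optional[str]:
        # Keep the current primary while it is alive; otherwise promote
        # the first live server. Returns None if no server is alive.
        live = [s.name for s in self.servers if s.alive]
        if self.primary not in live:
            self.primary = live[0] if live else None
        return self.primary


if __name__ == "__main__":
    ae1, ae2 = Server("ae1"), Server("ae2")
    cluster = Cluster([ae1, ae2])
    print("primary:", cluster.elect())
    ae1.alive = False  # simulate the primary host failing
    print("primary after failure:", cluster.elect())
```

    Note that in this model a recovered server does not reclaim the primary role; whether the real product behaves that way is not stated in this thread.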

    You can create failover setups for the database and web interface using their respective native mechanisms, such as a load balancer.

    Hth,


  • 3.  RE: Atomic Engine: System fault tolerance

    Posted Oct 15, 2019 02:31 AM
    Hello.
    Can you tell me where I can read about how to implement and configure this?


  • 4.  RE: Atomic Engine: System fault tolerance

    Posted Oct 14, 2019 02:17 PM
    Hi

    Here's some reading for you:

    https://docs.automic.com/documentation/webhelp/english/AA/12.3/DOCU/12.3/Automic%20Automation%20Guides/help.htm#AWA/Admin/admin_multi_server_operation.htm%3FTocPath%3DAdministering%2520and%2520Configuring%7CConfiguring%2520Automic%2520Automation%7CMulti%2520Server%2520Operation%7C_____0

    My tip: at least 2 AE servers and a split DB environment.

    cheers, Wolfgang

    ------------------------------
    I know I do really know it!
    ------------------------------



  • 5.  RE: Atomic Engine: System fault tolerance

    Posted Oct 18, 2019 04:28 AM
    If you have an old-style license with an active and a non-stop server license, and you load it in V12.3 and start the two instances, both will show up and act as active instances. You won't see NWP (non-stop WP) processes as you did in older versions. I have experienced this behaviour at a customer site. We loaded the license, although license checking was removed in some 12.2.2 version, just to check this non-stop behaviour.

    ------------------------------
    Senior Consultant
    setis GmbH
    ------------------------------



  • 6.  RE: Atomic Engine: System fault tolerance

    Posted Oct 21, 2019 02:00 PM
    By active and non-stop server, do you mean active/passive?

    Given that 12.2.2 seems to have removed license checking and 12.3 changes the AE from active/passive to active/active, does running it that way still violate the license with Broadcom? Or does Broadcom no longer have an active/passive license, so that there is no additional charge when the upgrade to 12.3 switches the AE to active/active?


  • 7.  RE: Atomic Engine: System fault tolerance

    Posted Oct 21, 2019 03:27 PM
    Yes, a non-stop instance was in fact a passive one. And your question concerning licensing is exactly the point. I have advised every customer who has this old-style license to get in contact with Automic sales.

    ------------------------------
    Senior Consultant
    setis GmbH
    ------------------------------



  • 8.  RE: Atomic Engine: System fault tolerance

    Posted Oct 21, 2019 03:52 PM
    I can only say that if Broadcom changed active/passive to active/active, it was their choice. We didn't ask for it. They should have read the license in the database to make sure that it remains active/passive after the upgrade.

    Even if we wanted to correct it to match our license, how could it be done, since they no longer require a license file to be imported into the database?


  • 9.  RE: Atomic Engine: System fault tolerance

    Posted Oct 22, 2019 04:56 AM
    Edited by Carsten Schmitz Oct 22, 2019 04:57 AM
    "Passive" or "non-stop" servers have been discontinued, and all such former licenses now allow "active-active" usage. David Ainsworth said as much at the FOKUS conference in front of about 150 people, and on various other occasions.

    We have also had this confirmed in writing. In case no. 1329429, Broadcom wrote to us:
    "Broadcom accepts the fact that due to the removal of the licensing module in 12.2.2. and higher, non-stop clusters can now be used as full clusters."

    If in doubt, contact your key account manager and get such a statement in writing. But due to the public nature of the statements, any claim to the contrary wouldn't have a leg to stand on, in my humble opinion.

    Disclaimer: I am not a lawyer, I sometimes just pretend to be one.

    Br,
    Carsten


  • 10.  RE: Atomic Engine: System fault tolerance

    Posted Oct 22, 2019 09:06 AM
    Now we are using Automation Engine API 12.0.3 build.9313.
    Does this version have a cluster solution?
    Do you think we need to upgrade to version 12.3?

    Where can I find documentation on clusters, upgrade and recommendations from the vendor?


  • 11.  RE: Atomic Engine: System fault tolerance
    Best Answer

    Posted Oct 22, 2019 10:01 AM
    Hey Valentina.

    > Now we are using Automation Engine API 12.0.3 build.9313.
    > Does this version have a cluster solution?

    If you are on 12.0.3, you can either use the old cluster mechanism ("non-stop cluster"), if you have a license for it. It was only made defunct with 12.2.2 and onward, as far as I know. You should probably note, however, that the "non-stop" failover didn't work all that well and sometimes took up to a minute or two to switch servers, and this contributed to the decision to retire it from 12.2.2 onwards.

    Or you can use an active-active setup (both servers running CP and WP processes, but only one running the Primary Worker Process, PWP). Running active-active has always been an option, even with 12.0.3 and prior, if you have the license for it.

    If you don't have either license, you'd need to talk to your sales people and I strongly suspect they will not sell any "non-stop" licenses anymore. They will, however, probably happily sell active-active licenses to be used with any Automic release.

    > Where can I find documentation on clusters, upgrade and recommendations from the vendor?

    There is this:

    https://docs.automic.com/documentation/webhelp/english/AA/12.3/DOCU/12.3/Automic%20Automation%20Guides/help.htm

    but I can find barely anything on clustering. You can look at the sizing guidelines and the "cluster" chapter of the installation guidelines, but it's very little. That part of the documentation historically sucks. My best advice is to talk to your Sales or Account Management people. Maybe they have some white papers or some better information. Or, size your servers according to the above link; clustering then simply consists of setting up multiple servers and putting them into the same naming realm (as per the ini file). Start all processes on all servers, observe that only one server takes the PWP role, done.
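
    The "same naming realm" condition can be checked with a small script before starting the processes. This is a sketch under assumptions: it presumes the system name is a `system` key in a `[GLOBAL]` section of each server's ini file, so verify the section and key names against your own files before relying on it. The function names are made up for illustration.

```python
import configparser
from typing import Iterable


def system_name(ini_text: str, section: str = "GLOBAL", key: str = "system") -> str:
    """Return the system name from the text of one server ini file.

    ASSUMPTION: the name lives under [GLOBAL] / system=. Adjust the
    section and key arguments if your files differ.
    """
    # strict=False tolerates repeated keys; interpolation=None keeps
    # any '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, key).strip()


def same_system(ini_texts: Iterable[str]) -> bool:
    """True if every given ini text names the same system."""
    names = {system_name(text) for text in ini_texts}
    return len(names) == 1


if __name__ == "__main__":
    server_a = "[GLOBAL]\nsystem=PROD\n"
    server_b = "[GLOBAL]\nsystem = PROD\n"
    print("same system:", same_system([server_a, server_b]))
```

    In practice you would read the text from each server's ini file, for example with `pathlib.Path(path).read_text()`, and pass the results to `same_system`.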

    For clustering the interface ("AWI"), you'd be relying on third-party best practices anyway (e.g. a load balancer), and ditto for clustering the DB (Oracle or MS SQL native clustering mechanisms).

    Hth,
    Carsten






  • 12.  RE: Atomic Engine: System fault tolerance

    Posted Nov 01, 2019 12:21 PM
    I want to follow up with a question regarding this statement: "Broadcom accepts the fact that due to the removal of the licensing module in 12.2.2. and higher, non-stop clusters can now be used as full clusters."

    I tend to read between the lines. I am planning on upgrading from 12.2.0 to 12.2.2 HF3. After the upgrade to 12.2.2 HF3, do I need to do anything to make it active/active, or will that be the end result after upgrading the first AE server?


  • 13.  RE: Atomic Engine: System fault tolerance

    Posted Nov 01, 2019 05:29 PM
    My experience so far is that the Automic system runs in active/active mode without any further configuration. It simply ignores the non-stop license, even if it has been loaded, which is no longer necessary.

    ------------------------------
    Senior Consultant
    setis GmbH
    ------------------------------



  • 14.  RE: Atomic Engine: System fault tolerance

    Posted Nov 04, 2019 06:18 AM
    Yep, that's my experience too; I can second this.


  • 15.  RE: Atomic Engine: System fault tolerance

    Posted Nov 06, 2019 11:18 AM
    Thank you Siegfried and Carsten. :-)


  • 16.  RE: Atomic Engine: System fault tolerance

    Posted Oct 22, 2019 03:54 PM
    Great information Carsten! Thank you.