AutoSys Workload Automation

Dual Event Server Recovery question

  • 1.  Dual Event Server Recovery question

    Posted 05-23-2018 03:22 AM

    Hi,

    We are running AutoSys 11.3.5 INCR2 in high-availability mode with dual event servers.

     

    If the primary event server crashes and we fail over to a single event server environment, I would like to know how to get back into dual event server mode.

     

    First, we have to stop the primary, secondary, and tie-breaker schedulers and the application servers.

    Second, we have to amend the config file.

    Then we have to synchronize the databases.

    Finally, we have to start the schedulers and application servers again.

     

    My question is: which scheduler must be started first?

    Do we have to start the tie-breaker first, then the primary, and the secondary last?

    Or do we have to start the primary first, then the tie-breaker, and the secondary last?

    Which order is correct?

     

    Thanks,

    José



  • 2.  Re: Dual Event Server Recovery question

    Posted 05-23-2018 03:35 AM

    Hello José

    Whenever the HA configuration is changed, the only requirement is to start the primary scheduler first. The order of the other components (shadow scheduler, application servers, or tie-breaker) does not matter.

    You can start them in any order.
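
    The ordering rule above can be sketched in a few lines of Python. This is only an illustration of the rule (primary scheduler first, everything else in any order); the component labels and the `start_order` helper are hypothetical and are not AutoSys commands or identifiers.

```python
from typing import List

# Hypothetical label used only for this illustration; not an AutoSys identifier.
PRIMARY = "primary_scheduler"


def start_order(components: List[str]) -> List[str]:
    """Return a start order that puts the primary scheduler first.

    The remaining components keep their given relative order, since
    any order is acceptable for them.
    """
    if PRIMARY not in components:
        raise ValueError("the primary scheduler must be part of the restart")
    others = [c for c in components if c != PRIMARY]
    return [PRIMARY] + others


if __name__ == "__main__":
    plan = start_order(
        ["tie_breaker", "shadow_scheduler", "primary_scheduler", "application_server"]
    )
    for step, component in enumerate(plan, start=1):
        print(f"{step}. start {component}")
```

    How each component is actually started depends on your installation; the sketch only captures the ordering constraint.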

     

    Regards

    Jean Paul



  • 3.  Re: Dual Event Server Recovery question

    Posted 05-23-2018 04:00 AM

    Hi Jean Paul,

     

    Thanks for your answer. I have already added it to our documentation.

     

    Let me ask another question.

     

    We have a primary server with the primary scheduler and event server, a second server with the secondary scheduler and event server, and a third server with the tie-breaker.

     

    There are different options for whether the primary scheduler takes over from the shadow scheduler or not when it is restarted after an outage.


    We have set the primary scheduler to PrimaryFailBackmode 2 (the primary scheduler resumes working as soon as it can).

     

    This matters in particular if there is an outage of the primary server, because the DB is on the same server.
    In that case the DB rollover has happened and the shadow has taken over.

    But if the primary server died, the primary config file will not have been updated for the DB rollover. Therefore, if the primary server is restarted and the primary scheduler restarts automatically, it can take control of the scheduling.
    But because its config file still references the primary database, can this make a mess?

     

    Thanks, José



  • 4.  Re: Dual Event Server Recovery question
    Best Answer

    Posted 05-25-2018 03:23 AM

    Hello José

     

    In the HADS configuration you describe, where the Event Server (database) also sits on the primary scheduler machine, PrimaryFailBackmode should not be set to "immediate", because if the entire machine crashes, the scheduler cannot update the configuration file to disable its corresponding Event Server entry.

    And if, upon reboot, the database and the scheduler start straight away, the database won't be in sync with the other one sitting on the shadow machine.

    You can also set up SNMP traps to be notified of such a failure and take appropriate action to resynchronize the database.

    If neither database sits on the AutoSys servers, or if this is an Oracle RAC architecture, this issue cannot occur.
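
    To make the reasoning above easy to check against your own setup, here is a small Python sketch of the decision. It only encodes the conditions described in this post; the `HaLayout` fields and the `immediate_failback_is_risky` helper are hypothetical names for illustration and do not correspond to AutoSys configuration parameters.

```python
from dataclasses import dataclass


@dataclass
class HaLayout:
    """Hypothetical description of an HA setup, for illustration only."""

    # The event server (database) runs on the same machine as the primary scheduler.
    event_server_on_primary_host: bool
    # The primary scheduler resumes scheduling as soon as it restarts.
    immediate_failback: bool
    # The databases are an Oracle RAC architecture.
    oracle_rac: bool = False


def immediate_failback_is_risky(layout: HaLayout) -> bool:
    """Return True when a restarted primary could run against an out-of-sync database."""
    if layout.oracle_rac or not layout.event_server_on_primary_host:
        # Per the post: no such issue with RAC or with databases off the AutoSys servers.
        return False
    return layout.immediate_failback


if __name__ == "__main__":
    layout = HaLayout(event_server_on_primary_host=True, immediate_failback=True)
    print("risky" if immediate_failback_is_risky(layout) else "ok")
```

    With the layout from this thread (database on the primary machine, immediate failback), the check flags the setup as risky, which is why the failback setting should be changed.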

     

    Regards

    Jean Paul



  • 5.  Re: Dual Event Server Recovery question

    Posted 05-25-2018 04:26 AM

    Thanks very much, Jean Paul.

     

    We will fix this setting in our environment.

     

    Best regards,

    José