AutoSys Workload Automation

  • 1.  WAAE 11.3.5 > 11.3.6 migration question

    Posted Jun 05, 2019 06:49 AM
    Hello all,

    This is a question about migrating R11.3 Autosys agents from a R11.3.5 WAAE scheduling environment to a new R11.3.6 WAAE Scheduling environment where the
    3-character instance name of both the R11.3.5 and R11.3.6 schedulers is the **same**.

    Has anyone else done this yet?

    The challenge that I am trying to overcome is that CA has confirmed that if a given agent's agentparm.txt file contains *two* Manager entries
    (i.e. for the R11.3.5 and R11.3.6 WAAE Schedulers respectively) and both of those entries refer to the **same** 3-character instance name, then the agent will only communicate
    with the Scheduler referred to in the 2nd Manager entry.

    Possible obvious solutions to this issue are to (i) reinstall the R11.3.6 WAAE with a different instance name or (ii) install a second agent on every server in the network to talk to the new R11.3.6 Scheduling instance.
    However, we would prefer not to do either of these if at all possible.

    Our problem therefore is to design a migration plan from R11.3.5 > R11.3.6 that lets us easily back out from R11.3.6 > R11.3.5 in the event of issues.
    It seems that if we have to back out and point every agent back to the original R11.3.5 WAAE, this will require considerable manual intervention to revert each agent's
    agentparm.txt file to its original pre-migration state.

    Has anyone else encountered this issue and devised a fast, elegant backout procedure, should it be needed?

    Best regards,
    James.


  • 2.  RE: WAAE 11.3.5 > 11.3.6 migration question

    Broadcom Employee
    Posted Jul 01, 2019 03:47 AM
    Hi James,

    This comes up every now and again.  While I don't have a quick and easy answer, I have some suggestions that may help.
    Do you need to define the agents to the new instance prior to the cutover?  Would being able to telnet or nc to the agents on the agent port be enough to satisfy connectivity on the input side?  The same should be done from the agent side, which depending on the OS may have a different solution on how to test.  This is the safest as no agentparm.txt would be updated, but the most manual.
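
    As a rough illustration of the telnet/nc style check described above, here is a minimal Python sketch that simply attempts a TCP connection to each agent port. The function names are made up for this example, and the host names and port number in the usage comment are placeholders, not values from this thread. It only proves that the port accepts a connection; it does not talk the agent protocol.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_agents(agents, timeout=3.0):
    """Map each (host, port) pair to True/False reachability."""
    return {(host, port): port_is_open(host, port, timeout)
            for host, port in agents}


# Usage (placeholder host names and port, replace with your own agent list):
#   results = check_agents([("agent01.example.com", 7520)])
#   for (host, port), ok in results.items():
#       print(host, port, "reachable" if ok else "UNREACHABLE")
```

    The same function can be run from an agent machine against the scheduler host and port to cover the other direction.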

    Another possible way is to insert the agents into the new instance (SCHEDULER MUST BE DOWN) and then do an autoping of the agents.  The app server managerid is not persisted and has the hostname included, which prevents the agentparm from getting updated.  I would switch the config file to have the app server use the scheduler port (7507) and rerun the test to verify that the other port is open as well.  That should prevent the agentparm files from getting updated and not be a lot of effort to complete.  The danger is if the scheduler starts and starts contacting agents.  You may want to rename the scheduler binary after shutting it down to prevent even an accidental start, for example if the machine were restarted.
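
    If you want some evidence afterwards that a test really did leave the agentparm files untouched, one option is to record a checksum of each file before the test and compare it afterwards. The sketch below is only an illustration using Python's standard library; the function names are invented for this example, and how you reach the files on each agent machine (share, copy, local job) is left open.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(paths):
    """Record a checksum for every file, keyed by its path as a string."""
    return {str(p): file_sha256(p) for p in paths}


def changed_files(before, paths):
    """Return the paths whose checksum differs from the earlier snapshot."""
    after = snapshot(paths)
    return sorted(p for p in after if before.get(p) != after[p])
```

    Take the snapshot before the autoping test, run the test, then call changed_files with the saved snapshot; an empty list means none of the checked files were rewritten.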

    Regards,
    Mike


  • 3.  RE: WAAE 11.3.5 > 11.3.6 migration question

    Posted Jul 01, 2019 10:59 AM
    All,

    I have the same problem. We currently have R11.3.5 AutoSys managers. We have built an R11.3.6 SP8 environment and have more than 500 agents at different agent versions, because the operating systems
    range from the oldest to the newest: Windows NT4, Win2000, Win2003, Win2008R2, Win2012R2, Win 2016, Linux and HP-UX. How can I point all of these agents to the new R11.3.6 SP8 in one day, in a single big-bang cutover, and revert to 11.3.5 as a rollback procedure if a problem is encountered?
    Appreciate your tips and advice. TIA. 

    Cheers,

    Liz







  • 4.  RE: WAAE 11.3.5 > 11.3.6 migration question
    Best Answer

    Posted Jul 02, 2019 08:39 AM
    Eliza

    I was involved in a similar project (11.3.5 -> 11.3.6sp6 with about 200 agents that had grown organically, with OS from every flavour)

    As Mike said, we planned to shut down the old scheduler and then start the new one with the same instance ID (so we didn't have to change a lot of the back-end issue management systems).

    In the test environment we had 11.3.5, which we scripted to shut down, rename, and disable as a service (everything we could think of to stop an accidental restart).  In prod we created a new 11.3.6sp6 scheduler/WCC infrastructure with knowledge of only a couple of agents, and tested functionality with existing scripts and processes.

    We then got approval for a test in prod at a low job time (I remember it being around 05:00 - 06:00 on a Tuesday).  In the new prod we defined 2 jobs for every type of OS we knew about (1 to get the OS + patches, and 1 to get the workload agent version).  So at test time we:
    • exported agent definitions from old 11.3.5 prod
    • shut down old 11.3.5 prod using scripts developed in test
    • loaded agent definitions into new 11.3.6sp6 prod and ran our 2 jobs on each node
    • removed all agent definitions from new 11.3.6sp6 prod and shut it down
    • restarted old 11.3.5 prod with scripts developed in test

    After about a month of doing this each Tuesday, we had a picture of the OS and agent version on every node (which we pushed into a web site) and had all the firewalls etc. sorted out so the jobs ran (I don't remember if we tested port 7507 as well).

    Then we defined D-day and a maximum roll-back time (2 days, but evaluated every 12 hours), so if we were still running the new environment after 2 days we would not go back (loss of history would start to be a problem).  So at D-day we:
    • exported agent definitions from old 11.3.5 prod
    • exported job definitions + global variables
    • shut down old 11.3.5 prod using scripts developed in test
    • loaded agent definitions into new 11.3.6sp6 prod and ran our 2 jobs on each node
    • loaded prod jobs and global variables into new prod (and waited for the excitement to start)

    We nearly rolled back at the 2nd 12 hour mark but other than that we never considered going back.
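
    For anyone planning a similar window, the "2 days, evaluated every 12 hours" rule above can be written down as a small Python sketch so the go/no-go review times are agreed in advance. The function names and the example date in the usage comment are hypothetical; only the 2 day window and 12 hour interval come from the description above.

```python
from datetime import datetime, timedelta


def rollback_checkpoints(d_day,
                         window=timedelta(days=2),
                         interval=timedelta(hours=12)):
    """Return the review times from the first interval after cutover
    up to and including the end of the rollback window."""
    points = []
    t = d_day + interval
    while t <= d_day + window:
        points.append(t)
        t += interval
    return points


def can_still_roll_back(now, d_day, window=timedelta(days=2)):
    """True while 'now' is inside the agreed rollback window."""
    return d_day <= now <= d_day + window


# Usage (hypothetical cutover time):
#   for t in rollback_checkpoints(datetime(2019, 1, 1, 5, 0)):
#       print("review at", t)
```

    With the defaults this gives four review points, the last one being the moment after which going back is no longer an option.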

    I have left the team now, but they still use the 2 jobs to build the status portal and are moving agents to the latest supported version on each OS. I believe that is done now, but that part took 18 months. They are also running SP7 (probably SP8 by now) so that they do not get into the nightmare of a huge upgrade step.

    Good luck. It was a stressful ride on D-day, but the testing and planning really paid off.


  • 5.  RE: WAAE 11.3.5 > 11.3.6 migration question

    Posted Jul 10, 2019 10:01 AM
    Thanks Andrew! Appreciate your tips. I was able to use PowerShell to get the OS for each of the agents. I just need to find a way to get the version of the agent installed.

    Liz


  • 6.  RE: WAAE 11.3.5 > 11.3.6 migration question

    Posted Jul 11, 2019 02:30 AM
    Liz

    Either parse the agentparm.txt file (which has a lot of other info in it as well, but is not always accurate for the agent version)
    or
    use cybagent -v, which gives the running agent version and what the agent sees the OS version as (I think we did a check when matching the data to the status portal or the info from both scripts).
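
    For the first option, a minimal Python sketch of reading an agentparm.txt into a dictionary could look like the following. It assumes the file is made of simple key=value lines with # comments; the function name is invented here and the keys shown in the usage comment are illustrative only, so check your own files for which parameter (if any) carries the version you need.

```python
def parse_agentparm(text):
    """Parse key=value lines into a dict, skipping blanks and # comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            params[key.strip()] = value.strip()
    return params


# Usage (illustrative content, not taken from a real agent):
#   with open("agentparm.txt") as fh:
#       params = parse_agentparm(fh.read())
#   print(params.get("agentname"))
```

    Because the file is not always accurate for the agent version, treat anything read this way as a hint and confirm it against what the running agent reports.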

    Talked with the team, and they have expanded the OS collection script to give them other info from the OS, so they know where the system is in the overall scheme of things, and have a 3rd job running weekly/monthly to get the agentparm.txt which is added to the portal as additional agent info.

    As a side note, they also push some AutoSys operational data into the portal so other parts of the ops team can see status without having to log in to WCC.

    ------------------------------
    Knows a little about UIM/DXim
    ------------------------------



  • 7.  RE: WAAE 11.3.5 > 11.3.6 migration question

    Broadcom Employee
    Posted Jul 15, 2019 04:04 PM
    Hi Liz,

    I'm curious as to why you need the OS before the upgrade.  The 11.3 system agents are still supported with 11.3.6 SP8, so that should not be a concern.
    If it can wait until after the upgrade, you can run an autorep -M ALL -p and get the versions returned as part of the output.
    If you install the web services component with your 11.3.6 SP7+ you can use the agent inventory to export a file with your agent info (OS, Version,...) as well.

    Regards,
    Mike