AutoSys Workload Automation

  • 1.  Update of the agent is impossible because of running jobs.

    Posted Feb 25, 2021 01:41 PM
    Hi!

    I'm trying to update an AutoSys agent to a newer version. I ran ./cybAgent -s and the agent stopped, but I get the following message and can't continue the installation:
    To perform the upgrade, stop the agent and/or end the running jobs
    There are a lot of filewatcher jobs running on this server. Changing their state to "FAILURE" doesn't help. The MACH_OFFLINE event also can't be sent because of the running jobs.
    ps ax | grep CA shows me about two hundred processes belonging to running filewatcher jobs. I don't want to kill every job manually, and I don't think that is the right way.
    What is the best way to stop all these jobs?


  • 2.  RE: Update of the agent is impossible because of running jobs.

    Posted Feb 25, 2021 02:13 PM
    The best way is to capture all the PIDs in a file by running "ps ax | grep /opt/CA" and then run a for loop to kill them.
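
    A rough sketch of that approach, in case it helps. The function names are made up for illustration, /opt/CA is just the path from the command above (adjust it to your install), and it only prints what it would do unless DRY_RUN=0 is set. Taking the ps snapshot first keeps the filter process itself out of the list.

```shell
# Read `ps ax` output on stdin and print the PID (first column) of every
# process whose line mentions the given path.
pids_matching() {
    awk -v pat="$1" 'index($0, pat) { print $1 }'
}

# Send SIGTERM to every PID listed in a file, one PID per line.
# With DRY_RUN=1 (the default) only print what would be done.
kill_pids_in_file() {
    while read -r pid; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would kill $pid"
        else
            kill -TERM "$pid"
        fi
    done < "$1"
}

# Usage (file names are examples only):
#   ps ax > /tmp/ps_snapshot.txt
#   pids_matching /opt/CA < /tmp/ps_snapshot.txt > /tmp/agent_pids.txt
#   kill_pids_in_file /tmp/agent_pids.txt             # dry run
#   DRY_RUN=0 kill_pids_in_file /tmp/agent_pids.txt   # really send SIGTERM
```

    Review the PID file before the real run, since anything else started from under that path would be matched too.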

    ------------------------------
    Sandeep Veetil
    Application Administrator
    ------------------------------



  • 3.  RE: Update of the agent is impossible because of running jobs.

    Broadcom Employee
    Posted Feb 26, 2021 12:07 PM
    Hi,

    If this is a Linux machine, you can try the "killall -9 processname" command. It does the finding and looping for you.
    Here is a link that gives you some additional syntax options: https://www.howtoforge.com/linux-killall-command/ 
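
    A minimal sketch of how that could be wrapped, assuming Linux killall. It tries SIGTERM first and only then falls back to -9 (SIGKILL). The helper names and the 5 second grace period are arbitrary choices for the example, "processname" is still a placeholder, and nothing is executed unless DRY_RUN=0 is set. Be careful on other Unix systems: on Solaris, for example, killall signals all active processes rather than matching by name.

```shell
# Print the command when DRY_RUN=1 (the default); run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Ask every process with the given name to exit, wait briefly,
# then force-kill whatever is left.
stop_by_name() {
    run killall -TERM "$1"
    run sleep "${GRACE_SECONDS:-5}"
    run killall -KILL "$1"
}

# Usage:
#   stop_by_name processname             # dry run, prints the commands
#   DRY_RUN=0 stop_by_name processname   # really sends the signals
```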

    Regards,
    Mike


  • 4.  RE: Update of the agent is impossible because of running jobs.

    Posted Feb 26, 2021 02:50 PM
    I go through this each month for patching.
    1. send Machine Offline event so no new jobs start on that machine
    2. kill all filewatch jobs manually and restart those that don't automatically (so they start/run on a different machine)
    3. let running jobs finish
    4. perform machine maintenance/updates/reboot
    5. send Machine Online event to put back into rotation
    6. repeat on all other servers
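
    Steps 1, 2 and 5 could be scripted roughly like this with sendevent. The machine name and job names are hypothetical, the function names are just for the example, and the event names and flags should be checked against your AutoSys release before use. It only prints the commands unless DRY_RUN=0 is set.

```shell
# Print the command when DRY_RUN=1 (the default); run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1-2: take the machine out of rotation, then kill the file watcher
# jobs passed as the remaining arguments.
take_offline() {
    machine="$1"
    shift
    run sendevent -E MACH_OFFLINE -N "$machine"
    for job in "$@"; do
        run sendevent -E KILLJOB -J "$job"
    done
}

# Step 5: put the machine back into rotation after maintenance.
bring_online() {
    run sendevent -E MACH_ONLINE -N "$1"
}

# Usage (hypothetical names):
#   take_offline agent01 fw_job_a fw_job_b
#   ... wait for running jobs, patch, reboot ...
#   bring_online agent01
```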

    For our 4 machines, cycling each one through offline, job completion, machine maintenance, and online usually takes several days in total.

    Yes, this means the filewatch jobs need a manual kill/start (several times as I cycle through the machines), but it is the only way unless you take the machine out of the pool for an extended time so that everything can complete gracefully, which can take a long time.