Automic Workload Automation

Bash scripts for the Automation Engine

  • 1.  Bash scripts for the Automation Engine

    Posted 03-04-2021 08:13 AM
    Edited by Michael A. Lowry 03-04-2021 05:41 PM
    Over the past few years of working with the Automation Engine, I have been working on a set of scripts to simplify working with the Automation Engine and the Service Manager that controls it. Today I'd like to share these scripts with the broader AE community, in the hopes that others may find them useful.

    Attached are three scripts:
    • server_env.sh - defines lists of servers
    • uc4_env.sh - sets global variables including pathnames
    • uc4_functions.sh - defines functions including smcl and ae
    The first two scripts must be customized for your environment.
    The smcl and ae functions are the interesting parts.
    These scripts are meant to be sourced from your user's .bashrc file, like this:
    . server_env.sh
    . uc4_env.sh
    . uc4_functions.sh
    Once everything is set up, you should be able to run the smcl and ae functions directly from the command line.
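
    If you would rather not have a missing file produce errors in every new shell, the sourcing can be wrapped in a small guard. This is only a sketch: the helper name source_if_readable and the variable UC4_SCRIPT_DIR are made up for illustration and are not part of the attached scripts.

```shell
# Hypothetical helper (not in the attached scripts): source a file only
# if it exists and is readable, and warn on stderr otherwise.
source_if_readable() {
  if [ -r "$1" ]; then
    . "$1"
  else
    echo "warning: cannot read $1" >&2
    return 1
  fi
}

# Possible use in .bashrc, assuming UC4_SCRIPT_DIR points at the scripts:
#   for f in server_env.sh uc4_env.sh uc4_functions.sh; do
#     source_if_readable "$UC4_SCRIPT_DIR/$f"
#   done
```

    The loop keeps the same order as the three lines above.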

    Documentation of these functions follows.

    smcl is a wrapper function for the Service Manager Command Line program ucybsmcl. It is specifically adapted for use with the Service Managers that control the Automation Engine (UC4). Because it automatically reads some information from configuration files, it is simpler to use than ucybsmcl. When using smcl, you do not need to specify three ucybsmcl options:

    • -h <hostname:port>
    • -n <phrase>
    • -p <password>

    The smcl function automatically fills in these details based on context - mostly based on the server where you run it.

    If a password is required, you will be prompted for it.
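
    To illustrate the idea (not the actual implementation), a wrapper of this kind could look roughly like the sketch below. The variable names SMGR_HOST, SMGR_PORT, SMGR_PHRASE, SMGR_PASSWORD and UCYBSMCL are assumptions for this example; the real smcl reads its values from configuration files and prompts for the password when needed.

```shell
# Sketch only: SMGR_HOST, SMGR_PORT, SMGR_PHRASE, SMGR_PASSWORD and UCYBSMCL
# are hypothetical names, not variables defined by the attached scripts.
smcl_sketch() {
  local host port phrase
  host="${SMGR_HOST:-$(hostname)}"
  port="${SMGR_PORT:-}"
  phrase="${SMGR_PHRASE:-}"
  if [ -z "$port" ] || [ -z "$phrase" ]; then
    echo "smcl_sketch: SMGR_PORT and SMGR_PHRASE must be set" >&2
    return 1
  fi
  # Fill in -h and -n (and -p only when a password is known), then pass
  # the caller's own options through unchanged.
  if [ -n "${SMGR_PASSWORD:-}" ]; then
    "${UCYBSMCL:-ucybsmcl}" -h "$host:$port" -n "$phrase" -p "$SMGR_PASSWORD" "$@"
  else
    "${UCYBSMCL:-ucybsmcl}" -h "$host:$port" -n "$phrase" "$@"
  fi
}
```

    With those variables set, smcl_sketch -c GET_PROCESS_LIST would run ucybsmcl with -h, -n and (if known) -p already filled in.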

    Usage

    Abbreviated syntax

    This is a shortened syntax adapted to make commonly-used functions quicker to type.

    smcl [list]             List all services (AE server processes)
    smcl start service      Start a service
    smcl stop service       Stop a service
    smcl shutdown service   Shut down the AE system (provide the name of a running WP)


    If no action is specified, the default is list.

    Classic syntax

    This is the classic syntax used by ucybsmcl.

    smcl -c GET_PROCESS_LIST                         List all services (AE server processes)
    smcl -c START_PROCESS -s service                 Start a service
    smcl -c STOP_PROCESS -s service [-m stop mode]   Stop a service or stop the system
    smcl -c SET_DATA -s service -d property value    Change the properties of a service
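
    The relationship between the two syntaxes can be sketched as a simple translation step. The function name smcl_translate is hypothetical, and mapping shutdown to stop mode S is an assumption based on the description of ae shutdown further down; the real function may handle stop modes differently.

```shell
# Sketch: translate the abbreviated syntax into classic ucybsmcl options.
# smcl_translate is a made-up name; "-m S" for shutdown is an assumption.
smcl_translate() {
  local action service
  action="${1:-list}"      # no action given: default to list
  service="${2:-}"
  case "$action" in
    list)
      echo "-c GET_PROCESS_LIST"
      return 0
      ;;
    start|stop|shutdown)
      if [ -z "$service" ]; then
        echo "smcl_translate: '$action' needs a service name" >&2
        return 1
      fi
      ;;
    *)
      echo "smcl_translate: unknown action '$action'" >&2
      return 1
      ;;
  esac
  case "$action" in
    start)    echo "-c START_PROCESS -s $service" ;;
    stop)     echo "-c STOP_PROCESS -s $service" ;;
    shutdown) echo "-c STOP_PROCESS -s $service -m S" ;;
  esac
}
```

    For example, smcl_translate start followed by a service name prints the -c START_PROCESS -s form shown in the classic syntax.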

    The smcl function is defined in uc4_functions.sh, a script containing shared functions that are also used in other scripts like ae_server.sh. The uc4_functions.sh script is sourced by the .bashrc of all UC4 technical users, so smcl should be available to all of these users.

    Limitations

    smcl works only with the AE server Service Manager running on the same node where you run it. It cannot be used to interact with the Agent Service Manager, or with Service Managers running on other nodes. The script was developed primarily for one AE implementation. It may need to be customized for your environment.

    Implementation

    The script is implemented as a Bash function in uc4_functions.sh. It relies on many other functions in that script.



    AE control script (ae)

    ae is a function that can be used to quickly start or stop the Automation Engine (UC4), or to display Service Manager information for the AE.

    Here's a bit more detail about how these actions work:

    • start starts the Automation Engine, optionally in a specified start mode.
    • kill kills the AE server Service Manager process, after first asking for confirmation.
    • status displays information about any running AE server Service Manager:
      • ps listing of the SMgr process
      • number of child processes
    • list, stop and shutdown use the smcl wrapper around the ucybsmcl program to interact directly with the Service Manager.
      • list: runs the GET_PROCESS_LIST command and prints tab-delimited output.
      • stop: sends STOP_PROCESS -m C to all running AE Server processes, after first asking for confirmation.
      • shutdown: sends STOP_PROCESS -m S to the WP with the highest CPU time (usually the PWP). This will stop all AE server processes on both nodes.

    The script performs lots of error-checking and prompts for confirmation before performing potentially destructive actions. It also prompts for the Service Manager password, if required.
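
    The confirmation step used in the examples below (type 1 to confirm) can be sketched as follows. The function name confirm_action and the prompt wording are assumptions; the real prompt in uc4_functions.sh may look different.

```shell
# Sketch of a "type 1 to confirm" prompt. confirm_action is a made-up name.
# Returns 0 only if the user types exactly 1; anything else cancels.
confirm_action() {
  local answer
  printf '%s\nType 1 to confirm, anything else to cancel: ' "$1" >&2
  read -r answer || return 1
  [ "$answer" = "1" ]
}

# Possible use:
#   confirm_action "Stop all AE server processes?" || return 1
```

    Treating every answer other than 1 as a cancellation keeps the destructive path opt-in.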

    Examples

    Start the Automation Engine

    To start the Automation Engine normally, just run ae start by itself or run ae start normal.

    Display status of the Automation Engine

    Run ae by itself or run ae status. This will list details of the AE server Service Manager process and display the number of child processes.
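
    Counting the child processes of the Service Manager can be done with ps and awk alone. The sketch below is one way to do it, not necessarily how ae status does it; count_children is a made-up name and smgr_pid stands for the Service Manager's PID, however you obtain it.

```shell
# Sketch: count the direct children of a given parent PID.
# $1    = parent PID
# stdin = lines of "PID PPID", e.g. from: ps -e -o pid= -o ppid=
count_children() {
  awk -v parent="$1" '$2 == parent { n++ } END { print n + 0 }'
}

# Possible use (smgr_pid is a placeholder):
#   ps -e -o pid= -o ppid= | count_children "$smgr_pid"
```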

    List running AE server processes

    Run ae list to list all of the running AE server processes.

    This lists only the AE server processes that the Service Manager knows about. If the AE is experiencing problems, there may be a breakdown in the communication between the AE and the Service Manager. In this case, the list of processes displayed may be incomplete or not up-to-date. In this situation, you should use the ps command to make sure you identify all running AE server processes, including those that have lost communication with the Service Manager.
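
    A ps-based check along those lines could look like the sketch below. The default pattern assumes the AE server binaries are named ucsrvcp, ucsrvwp and ucsrvjp; verify this against your installation and override AE_PROC_PATTERN (a hypothetical variable) if your names differ.

```shell
# Sketch: keep only the ps lines whose command looks like an AE server process.
# stdin = lines of "PID PPID COMMAND [ARGS...]",
#         e.g. from: ps -e -o pid= -o ppid= -o args=
# Assumption: binaries are named ucsrvcp / ucsrvwp / ucsrvjp.
list_ae_server_processes() {
  awk -v pat="${AE_PROC_PATTERN:-ucsrv(cp|wp|jp)}" '$3 ~ pat { print }'
}

# Possible use:
#   ps -e -o pid= -o ppid= -o args= | list_ae_server_processes
```

    Matching on the command field rather than the whole line avoids picking up unrelated processes that merely mention those names in their arguments.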

    Stop all running AE server processes

    Run ae stop and type 1 to confirm.

    The stop action does not stop the Service Manager. You will need to run ae kill before trying to start the AE again. (See the following example.)

    Kill the AE server Service Manager

    Just run ae kill and type 1 to confirm.

    If any child processes of the Service Manager are still running, a kill signal will be sent to them when the Service Manager process ends. Usually this will cause the child processes to end too. However, if the Automation Engine is experiencing problems, some AE server processes may refuse to die. In this situation, it may be necessary to list hung AE server processes using the ps command, and then kill them one by one, or all at once using a command like killall.

    Shut down the AE server on both (all) nodes

    Run ae shutdown and type 1 to confirm.

    This will completely shut down the AE server on all nodes. This is done by sending STOP_PROCESS with stop mode S (shutdown) via the SMgr to the WP with the highest CPU time. (This is often the PWP.) This WP will in turn instruct all other running AE server processes to stop.
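
    As a rough illustration of picking the WP that has consumed the most CPU, the sketch below reads ps output and returns the PID of the busiest one. It makes two assumptions: that WP processes can be recognized by the name ucsrvwp, and that CPU time is taken from the ps TIME column. The real script may get this information elsewhere, and this sketch does not map the PID back to a Service Manager service name.

```shell
# Sketch: print the PID of the work process with the most accumulated CPU time.
# stdin = lines of "PID TIME COMMAND [ARGS...]",
#         e.g. from: ps -e -o pid= -o time= -o args=
# TIME is expected in the [DD-]HH:MM:SS form printed by ps.
# Assumption: WP binaries are named ucsrvwp.
busiest_wp() {
  awk '
    function secs(t,    d, p, n, i, s, days) {
      days = 0
      if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
      n = split(t, p, ":")
      s = 0
      for (i = 1; i <= n; i++) s = s * 60 + p[i]
      return days * 86400 + s
    }
    $3 ~ /ucsrvwp/ {
      s = secs($2)
      if (best == "" || s > max) { max = s; best = $1 }
    }
    END { if (best != "") print best }
  '
}
```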

    Start the AE server with a minimal set of processes

    For troubleshooting, it can be useful to start the AE Server gradually, initially starting just a few processes. This makes it much simpler to find errors in the AE server logs, because there are fewer active logs to search through.

    Run ae start with start mode mini.

    The start modes rely on a particular naming convention for Service Manager Command (SMC) files. Take a look at the uc4_functions.sh script to learn more or to customize this for your needs.

    Limitations

    ae was developed primarily for one particular AE implementation. It may need to be customized for your environment.

    Implementation

    The script is implemented as a Bash function in uc4_functions.sh. It relies on many other functions in that script, including the smcl function described above.

    Attachment(s)

    AE Bash scripts.zip   9 KB 1 version


  • 2.  RE: Bash scripts for the Automation Engine

    Posted 03-04-2021 10:34 AM
    Edited by Michael A. Lowry 03-04-2021 10:34 AM
    Just a few additional pieces of information to help you get started:

    • The paths where the Service Manager and AE Server are installed are defined in uc4_env.sh. You will almost certainly need to update these to point to where you have these programs installed.
    • The server_env.sh script contains lists of servers by environment, e.g., experimental, development, testing, and production. You'll need to update these lists based on your host names and environments.
    • The functions in uc4_functions.sh are designed based on several assumptions regarding the AE server installation, including a naming convention for configuration files and Service Manager destination names. This naming convention is based on the environment of the server (e.g., DEV), and whether the server is the primary or secondary in the AE server cluster (i.e., A or B). Both of these details are specified in the aforementioned server_env.sh script.
    • You will have to adhere to the expected naming convention, or customize the script to use a different convention.
    • The script is designed to do plenty of error checking, and to print informative messages if an unexpected condition arises.
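
    To give an idea of what a host-to-environment lookup might look like, here is a minimal sketch. The variable names, host names and the function env_of_host are all invented for this example; the real server_env.sh defines its own lists and may structure them differently (for instance as Bash arrays).

```shell
# Sketch only: hypothetical host lists in the spirit of server_env.sh.
# Replace the names with your own hosts and environments.
DEV_SERVERS="aedev01 aedev02"
PROD_SERVERS="aeprod01 aeprod02"

# Print the environment (DEV or PROD) that a host belongs to.
env_of_host() {
  case " $DEV_SERVERS " in
    *" $1 "*) echo DEV; return 0 ;;
  esac
  case " $PROD_SERVERS " in
    *" $1 "*) echo PROD; return 0 ;;
  esac
  echo "env_of_host: unknown host '$1'" >&2
  return 1
}

# Possible use:
#   env_of_host "$(hostname)"
```

    Padding both the list and the host name with spaces makes the match exact, so aedev0 does not match aedev01.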
    Enjoy.