AutoSys Workload Automation

  • 1.  AutoSys - remember systemd

    Posted Jun 20, 2019 06:28 PM
    So it seems the excitement about systemd has come and gone.   I think there was a lot of general advice that has gone around, and some folks have figured out various solutions, while others have yet to discover these pitfalls.

    This post is a friendly FYI to anyone who may benefit - it comes as is, with no warranties.  I figure there's enough information here to at least enhance someone's google experience.

    If you don't know what systemd is, you can find out here

    My personal objective with systemd was to achieve much of the same behavior that we've come to expect from the old sysvinit (aka System V init).   Given that systemd is a giant series of solutions in search of various problems, your mileage may vary as to the usefulness of this post.  Regardless, at the very least, I hope to point out some discrete issues that might stem from past advice that I've seen floating around, or possibly from blind faith in the default installers (or chkconfig compatibility).

    Anyhow, getting into the examples:

    1.  Systemd commands - there are more, of course, but these are ones I typically use:
            systemctl daemon-reload   # Makes systemd re-read its unit files so attribute changes are picked up (it does not restart any service)
            systemctl start <service>    # As you guessed, starts the desired service (substitute stop/restart for start as needed)
            systemctl enable <service>  # Creates the links for the unit/service file so it starts at boot (default is disabled)
            systemctl reset-failed  <service>  # Resets the failed state of the service - i.e. "someone sank my cybAgent"
            systemctl set-property <service> <attribute=value> # Really only useful in a pinch, and on a default install - not preferred
                example: "systemctl set-property waae_agent-WA_AGENT.service TasksMax=infinity"
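For what it's worth, the commands above can be strung together into a small helper. This is only a sketch: apply_unit and the SYSTEMCTL override are names I made up for illustration, not anything shipped with AutoSys or systemd. Setting SYSTEMCTL=echo gives a dry run that prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: pick up unit file changes, enable the unit at boot,
# start it, and show its status. Stops at the first command that fails.
# SYSTEMCTL can be overridden (for example SYSTEMCTL=echo) for a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

apply_unit() {
    unit="$1"
    "$SYSTEMCTL" daemon-reload &&
    "$SYSTEMCTL" enable "$unit" &&
    "$SYSTEMCTL" start "$unit" &&
    "$SYSTEMCTL" status --no-pager "$unit"
}
```

A dry run would look like "SYSTEMCTL=echo apply_unit waae_agent-WA_AGENT.service"; drop the override (and run as root) to do it for real.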

    2.  Agent context: 
           The agent's oversimplified mission is to spawn a bunch of processes, and be able to watch those processes.
           The following systemd configuration is an example/answer to default assumptions made during install.
            #note, there are different locations where these configurations may reside - some default, others custom
            #this example lives in the system space: /etc/systemd/system - yours may reside elsewhere
    [Unit]
    ## This section sets up the dependencies - essentially putting agent startup at the end
    ## By the time the agent starts, all prerequisite services (ldap/remote fs/network/etc) are running
    Description=waae_agent-WAAE_AGENT
    Requires=multi-user.target
    After=multi-user.target

    [Service]
    ##The forking allows the parent (cybAgent.bin) to spawn its children, aka jobs
    Type=forking
    ##Tasks count threads as well as processes - defaults on some systemd versions are 512, with an idle agent consuming 34
    ##Needless to say, you'll want more than 512, otherwise your jobs may hang or fail
    TasksMax=infinity
    ##Begin self explanatory section
    ExecStart=/opt/CA/WorkloadAutomationAE/SystemAgent/WAAE_AGENT/cybAgent -a
    ExecStop=/opt/CA/WorkloadAutomationAE/SystemAgent/WAAE_AGENT/cybAgent -s
    ##End self explanatory section
    ##If you want the agent to restart after an abnormal end, set this to yes.  sysvinit behavior is "no"
    Restart=no
    ##The GuessMainPID allows systemd to determine what the parent is, and is helpful when there is no pidfile to track
    GuessMainPID=1
    ##The cybAgent is using "setuid" to execute your jobs, so you need to run it as the root user
    User=root
    ##Very important to have KillMode set to "process", otherwise shutting down the agent will result in every child pid being killed (the default is control-group)
    ##I'm not sure why they chose that as a default, but I can only imagine the poor soul who discovers losing every job after cycling an agent
    KillMode=process

    [Install]
    #more sequencing that has the agent startup with the multi user env (i.e. init 3)
    WantedBy=multi-user.target
    So this pretty much concludes the agent example, again saved in /etc/systemd/system as "waae_agent-WA_AGENT.service".
    If chkconfig was run by the agent installer, or if you used systemctl set-property, you may have a waae_agent-WA_AGENT.service.d directory containing individual configuration attributes.   If the service file is to be used, you can remove that directory so its drop-in settings do not override the service file.   To use the new service (unit) file, you simply run "systemctl enable waae_agent-WA_AGENT.service" and away you go.   You can use "systemctl start waae_agent-WA_AGENT.service" to start it, or any of the other verbs (stop/restart/status).   One difference you might observe is the process tree, which should have cybAgent.bin at the top.   Lastly, remember to run "systemctl daemon-reload" any time you modify the service file.
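One way to confirm that systemd actually picked up the agent settings is to ask it. The sketch below assumes the Key=Value output of "systemctl show -p ..."; check_unit_props is a hypothetical helper name, and it only covers the KillMode and TasksMax points made above.

```shell
#!/bin/sh
# Sketch: sanity-check the properties systemd applied to a unit.
# Reads "Key=Value" lines on stdin and prints a warning for risky values.
# Returns 0 when nothing was flagged, 1 otherwise.
check_unit_props() {
    rc=0
    while IFS='=' read -r key value; do
        case "$key" in
            KillMode)
                if [ "$value" != "process" ]; then
                    echo "WARN: KillMode=$value (stopping the unit may kill running jobs)"
                    rc=1
                fi ;;
            TasksMax)
                # Some systemd versions print infinity as 18446744073709551615.
                case "$value" in
                    infinity|18446744073709551615) ;;
                    *) echo "WARN: TasksMax=$value (jobs may hang or fail at the cap)"
                       rc=1 ;;
                esac ;;
        esac
    done
    return "$rc"
}
```

Typical use would be "systemctl show -p KillMode -p TasksMax waae_agent-WA_AGENT.service | check_unit_props"; no output means both settings look the way the unit file intends.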

    3.  Server context:
    The server components (scheduler/appserver/webserver/wcc/eem) have a different mission, but much of what I show in this example can still be applied.
    My objective here is to protect my processes and avoid any conflicts.   We have a lot of transactions in our shop, and have high demands on our schedulers.   It's important to understand that there are limits imposed by systemd that are in addition to, or potentially in conflict with, the system limits (/etc/security/limits.conf - or ulimit).   Conflicts with these limits can cause core dumps, hangs, and other strange behavior.
            #note pretty much like the agent, but I'm setting process properties that correspond with system/ulimits
            #this example also lives in the system space: /etc/systemd/system
    [Unit]
    Description=waae_sched.ACE
    Requires=multi-user.target
    After=multi-user.target

    [Service]
    Type=forking
    TasksMax=infinity
    ##The init scripts are well written - why let them go to waste?   Alternatively, "event_demon -A ACE" could be used, but with full path
    ExecStart=/etc/init.d/waae_sched.ACE start
    ExecStop=/etc/init.d/waae_sched.ACE stop
    Restart=no
    ##The init scripts generate a lock file, but no pidfile - so we go with guess.  Systemd does just fine guessing
    GuessMainPID=1
    ##Run as autosys or your install owner
    User=autosys
    KillMode=process
    ##These are your process limits - they have correlating /etc/security/limits.conf and ulimit equivalents.
    ##Having conflicting values will make for disastrous results - scheduler crashes, hangs, etc
    ##I'm not interested in systemd controlling these limits as I have other means...
    ##It is worth mentioning that this is a significant feature in systemd, as annoying and potentially dangerous as it is
    ##With these parameters, one can set limits via systemd that are different than that of the user running the process
    ##In an agent context, that means you won't have to inherit root's limits for all children spawned - your shop might find these useful
    LimitNOFILE=8192
    LimitCPU=infinity
    LimitFSIZE=infinity
    LimitDATA=infinity
    LimitCORE=infinity
    LimitRSS=infinity
    LimitAS=infinity
    LimitNPROC=61863
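    ##Note: systemd reads LimitMEMLOCK in bytes unless a suffix is given, while "ulimit -l" reports kbytes - use 64K if the intent is to match "ulimit -l 64"
    LimitMEMLOCK=64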
    LimitLOCKS=infinity
    LimitSIGPENDING=126498
    LimitMSGQUEUE=819200

    [Install]
    WantedBy=multi-user.target
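To see which limits actually won, it can help to look at what the running process has rather than what any config file says. The sketch below reads a /proc/<pid>/limits style file; soft_limit and unit_main_pid are hypothetical helper names, and the unit name in the usage note is an assumption about how the file above was saved.

```shell
#!/bin/sh
# Sketch: read the limits a running process really has, to compare them with
# limits.conf/ulimit and with the Limit* values in the unit file.

# Print the soft limit for a named resource from a /proc/<pid>/limits style file.
# $1 = resource name exactly as the kernel prints it (e.g. "Max open files")
# $2 = path to the limits file
soft_limit() {
    sed -n "s/^$1  *\([^ ]*\)  *.*/\1/p" "$2"
}

# Print the main PID systemd tracks for a unit (0 if it is not running).
# cut is used rather than "systemctl show --value" so older systemd still works.
unit_main_pid() {
    systemctl show -p MainPID "$1" | cut -d= -f2
}
```

For example, pid=$(unit_main_pid waae_sched.ACE.service) followed by soft_limit "Max open files" "/proc/$pid/limits" should print 8192 if the LimitNOFILE value above took effect.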

    So this concludes my contribution.   The short takeaway here (which I've saved for the end) is that you might be unhappy with default systemd settings... I'm surprised I've not seen more about this topic, but I hope it finds you well - please share your experiences or feel free to point out anything noteworthy based on experiences in your shop.

    Viva la AutoSys!



  • 2.  RE: AutoSys - remember systemd

    Posted Jun 21, 2019 08:46 AM
    thank you Lee!
    Steve C.


  • 3.  RE: AutoSys - remember systemd

    Posted Jun 21, 2019 11:03 AM
    Very welcome, Steve.   Sorry for the formatting - maybe I'll throw it in pdf and attach it.