Automic Workload Automation

Automation Engine startup sequence & Service Manager configuration

  • 1.  Automation Engine startup sequence & Service Manager configuration

    Posted Dec 05, 2019 06:05 AM
    Edited by Michael A. Lowry Dec 07, 2019 06:05 AM
    We run the Automation Engine server on two nodes, and start it via the Service Manager. The processes started are:
    • 60 work processes (30 per node)
    • 10 communications processes (5 per node)
    • 10 Java work processes (5 per node)
    • 2 Java communications processes (1 per node)
    Traditionally we have run odd-numbered AE server processes on Node A of the cluster, and even-numbered ones on Node B. We have tried to start server processes in such a way that the process names assigned by the PWP are the same as the names in the Service Manager GUI. This aids troubleshooting because processes appear in order in the Service Manager GUI. Where there are a large number of AE server processes, this is actually important. How do we ensure that the names line up? We use WAIT statements in the SMC files to ensure that each process already has its name assigned before the next process starts.

    We have noticed that the speed at which process names are assigned depends greatly on the process type & the level of server activity:
    • Primary work process: 1-3 minutes
    • Work processes: 20-60 seconds
    • Communications processes: 5-15 seconds
    • Java work processes: 2-5 seconds
    • Java communications processes: 2-5 seconds
    We start the processes in this order:
    • Start the first WP (the PWP). Wait 10 seconds.
    • Start 1 CP. Wait 10 seconds.
    • Start several more WPs, with a 10-second delay after each one.
    • Start the remainder of the CPs, with a 10-second delay after each one.
    • Start the remainder of the WPs, with a 10-second delay after each one.
    • Wait 2 minutes to give all the WPs sufficient time to have their names assigned.
    • Start the JWPs, with a 5-second delay after each one.
    • Start the JCP.
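
    As a rough illustration, the sequence above can be modelled as a timeline of start offsets for one node. This is only a sketch: the counts are the per-node figures given earlier in this post, the value of 4 for "several more WPs" is an assumption (the post gives no number), and the names `ORIGINAL_ORDER`, `build_timeline` and `first_start` are hypothetical helpers, not part of any Automic tool.

```python
from typing import List, Tuple

# (process type, how many to start, delay in seconds after each start)
Step = Tuple[str, int, int]

# Per-node counts from the post: 30 WPs, 5 CPs, 5 JWPs, 1 JCP.
# ASSUMPTION: "several more WPs" is taken to mean 4.
ORIGINAL_ORDER: List[Step] = [
    ("WP", 1, 10),     # first WP (becomes the PWP)
    ("CP", 1, 10),     # first CP
    ("WP", 4, 10),     # "several more WPs" (assumed 4)
    ("CP", 4, 10),     # remainder of the CPs
    ("WP", 25, 10),    # remainder of the WPs
    ("WAIT", 1, 120),  # pause so the WPs can have their names assigned
    ("JWP", 5, 5),
    ("JCP", 1, 0),
]


def build_timeline(steps: List[Step]) -> List[Tuple[int, str]]:
    """Return (offset_seconds, process_type) for every process start, in order."""
    timeline = []
    clock = 0
    for kind, count, delay in steps:
        for _ in range(count):
            if kind != "WAIT":  # a WAIT step only advances the clock
                timeline.append((clock, kind))
            clock += delay
    return timeline


def first_start(timeline: List[Tuple[int, str]], kind: str) -> int:
    """Offset in seconds at which the first process of the given type is started."""
    return next(offset for offset, k in timeline if k == kind)


if __name__ == "__main__":
    tl = build_timeline(ORIGINAL_ORDER)
    for kind in ("WP", "CP", "JWP", "JCP"):
        print(kind, "first started at", first_start(tl, kind), "s")
```

    Under these assumptions the first JWP is started about 470 seconds into the sequence and the JCP about 495 seconds in. These are start times only; the time each process needs to have its name assigned comes on top.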
    The goal has been to find a good balance between getting the server up & running quickly, and making operations & troubleshooting easier. This approach has worked pretty well, and the process names assigned by the PWP have usually matched up with the name and order of processes in the Service Manager GUI. This approach does present some problems though:
    • LDAP logins do not work until at least one JWP is up and running.
    • REST API calls do not work until at least one JCP is up and running.
    Moreover, starting with v12.3, we have found that our approach no longer works at all. AE v12.3 introduced some significant changes to the structure of the ucsrv.ini file.
    “All process numbers assigned reflect the order in which the respective process starts.”

    More correctly, AE v12.3 server processes are named based only on the process type and the order in which the processes start. It is no longer possible to influence the assignment of process names (e.g., WP1, CP5) using options in the ucsrv.ini file. Because of this change, the convention of starting odd-numbered AE server processes on Node A of the cluster, and even-numbered ones on Node B, is no longer viable.
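
    To make the consequence concrete, here is a simplified model of the naming behaviour described above: a per-type counter that hands out the next number in start order, regardless of node. It is a sketch of the described behaviour only, not the Automation Engine's actual logic (it ignores, for example, any reuse of names freed by stopped processes). The function name `assign_names` and the three-digit name format are assumptions for illustration.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def assign_names(start_order: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Assign names by process type and start order only.

    start_order: (node, process_type) pairs in the order the processes start.
    Returns (assigned_name, node) pairs in the same order.
    """
    counters = defaultdict(int)  # one running counter per process type
    assigned = []
    for node, kind in start_order:
        counters[kind] += 1
        assigned.append((f"{kind}{counters[kind]:03d}", node))
    return assigned


if __name__ == "__main__":
    # Node A happens to start two WPs before Node B starts its first one.
    order = [("A", "WP"), ("A", "WP"), ("B", "WP"), ("A", "CP"), ("B", "WP")]
    for name, node in assign_names(order):
        print(name, "runs on node", node)
```

    In this example Node A ends up with WP001 and WP002 simply because its processes started first. An odd/even split per node would only emerge if the starts happened to alternate strictly between the two nodes.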

    This raises several questions:
    Q1: Is there any way in v12.3 to restrict the process names assigned on each node?
    Q2: Is there any way in v12.3 to ensure that the process names assigned by the AE line up with the names/order in the Service Manager GUI?
    Q3: Is there a recommended order for starting AE server processes?


  • 2.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Dec 06, 2019 07:39 AM
    Edited by Tim Quakulinsky Dec 06, 2019 09:56 AM
    +1

    @David Ainsworth @Tatjana Radic: I am also interested in the answers to Michael's 3 questions.

    Furthermore - without wanting to hijack this post - I am worried about the start of work processes and ILM, because I could already see that some new work processes were created instead of recycling inactive ones. Example: 60 work processes are started, WP013 and WP027 are inactive and WP061 and WP062 are active instead. However, at least in V11.2, if work processes are inactive, the ILM switch does not work.

    Q4: How is it ensured that there are no problems with ILM even after the significant changes in v12.3 to the work processes?
    Additional question Q5: How is this problem solved if the AE processes are operated in containers and their number can therefore fluctuate deliberately?

    ------------------------------
    Automation Evangelist
    Fiducia & GAD IT AG
    ---
    Mitglied des deutschsprachigen Automic-Anwendervereins FOKUS e.V.
    Member of the German speaking Automic user association FOKUS e.V.
    ------------------------------



  • 3.  RE: Automation Engine startup sequence & Service Manager configuration

    Broadcom Employee
    Posted Dec 16, 2019 10:54 AM
    Hi @Tim Quakulinsky, @Michael A. Lowry,
    I replied to Q1, Q2 and Q3 in a comment.

    Q4: How is it ensured that there are no problems with ILM even after the significant changes in v12.3 to the work processes?
    Additional question Q5: How is this problem solved if the AE processes are operated in containers and their number can therefore fluctuate deliberately?

    Problems with ILM should only occur if the server is registered in the table MQSRV, but is not connected to the Primary Work Process.
    In v12.3, we added functionality so that invalid server entries should be periodically deleted by the PWP.

    David


    ------------------------------
    Senior Product Line Manager - Automation
    CA Technologies, A Broadcom Company
    ------------------------------



  • 4.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Dec 07, 2019 06:38 AM
    Edited by Michael A. Lowry Dec 07, 2019 07:02 AM
    The changes introduced in v12.3 suggest several underlying changes in paradigm:
    • The idea of assigning a particular process name or ID to a particular port, server, or order in a management GUI is deemphasized.
    • Instead, server processes are assigned names/IDs & port numbers dynamically. A process with a given name, say WP10, can run anywhere and grab any available port for its ephemeral outbound connections.
    • Processes that open listening ports grab any available port from a range or list of ports. They are no longer locked to a particular port by process name/ID.
    I have a theory about the underlying reasons for these changes. My guess is that the service manager and its GUI will be significantly changed or even completely replaced in a forthcoming release, and that the changes introduced in v12.3 represent a preparatory move in this direction.

    It's clear that Broadcom is moving in the direction of enabling the Automation Engine for operation in cloud computing environments, with individual parts of the AE server running in separate containers. Here are my guesses about how this affects the service manager:

    • The lowest unit of division will be the AE server process. Each AE server process will run in its own container, e.g., a Docker container. The options currently read from the ucsrv.ini file will instead be set via environment variables or an external configuration server. Server process roles (e.g., PWP, DWP, WP) will be set dynamically as today in v12.3.
    • The AE server will comprise a collection of containers, e.g. a Kubernetes kube. Like with the individual AE server processes, global server configuration options will be set via environment variables or an external configuration server.
    • The service manager will either run in its own container, or will be replaced by an off-the-shelf orchestrator like Kubernetes or Helm.
    • Either way, the service manager or its replacement will be cloud-enabled and easy to integrate into common cloud computing environments.
      • If the SMgr is preserved in something like its current form, the current SMD and SMC files will probably also still be around.
      • If the SMgr is replaced, the SMD and SMC files will surely also be replaced by a more open and standards-based configuration approach.

    This is mostly speculation of course, but I hope not entirely uninformed speculation. Perhaps someone from Broadcom could shed some light on this topic, to give us a better idea of the justification for the v12.3 changes, but also the future direction. This would help us plan for what's coming. We don't want to spend a lot of time and energy forcing the product to work in a way in which it's no longer designed to work, and we also don't want to invest effort in an approach that will be made obsolete or incompatible by a forthcoming release.



  • 5.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Dec 16, 2019 08:50 AM
    I opened a support ticket to get answers to these questions.


  • 6.  RE: Automation Engine startup sequence & Service Manager configuration

    Broadcom Employee
    Posted Dec 16, 2019 10:49 AM
    Hi @Michael A. Lowry,
    Sorry this took a few days. I received some answers from engineering.

    Q1: Is there any way in v12.3 to restrict the process names assigned on each node?
    Q2: Is there any way in v12.3 to ensure that the process names assigned by the AE line up with the names / order in the Service Manager GUI?
    The answer to both questions is no. The names of the server processes are no longer specified in the INI file so that the WPs can be started simply in containers.
    The old approach does not work in containers. If the INI file were the same for every container, all WPs would start as WP001.
    But even earlier (before 12.3), there was no association between the name of the entry in the Service Manager and the process name.
    We could improve it, however, and introduce this relationship or restrict the list of process names for each node. We have not yet considered it.
    On-Premises customers could experience disadvantages with the changed approach due to additional operational complexity.
    Q3: Is there a recommended order for starting AE server processes?





  • 7.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jan 02, 2020 05:15 AM
    Yes, the recommended order is described in the documentation: https://docs.automic.com/documentation/webhelp/english/AA/12.3/DOCU/12.3/Automic%20Automation%20Guides/help.htm#AWA/Admin/admin_start_end_server_processes.htm#link1

    The small issue with this, though, is that at present something, likely the Service Manager Dialog, changes any start order you configure, so this is mostly a moot point. We spent an hour or so fine-tuning the launch order with an Automic consultant. Three weeks later the .smc file had been reordered and all changes were invalidated.


  • 8.  RE: Automation Engine startup sequence & Service Manager configuration

    Broadcom Employee
    Posted Jan 03, 2020 04:08 AM
    Hi @Carsten Schmitz
    I checked with engineering.
    If the customer's .smc file has been modified by the Service Manager without adding new items into the .smd file (via the Service Manager Dialog) the customer should open a bug. This should not be the case at all.

    Please let me know when you have created the bug ticket.

    David




  • 9.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jan 03, 2020 04:31 AM
    Hi @David Ainsworth!

    Thank you very much for the response and suggestion, much appreciated.

    However, our experience has shown time and time again that there is no point opening tickets for things one cannot reliably reproduce. Unfortunately, I was not able to reproduce the behaviour consistently, so I shall refrain from opening a ticket about it.

    Carsten


  • 10.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jan 06, 2020 07:50 AM
    Edited by Michael A. Lowry Jan 07, 2020 03:31 AM
    @David Ainsworth:
    Can you give us an idea of what the service manager will look like in a fully container-ready AWA? This would help us answer the question of whether it's important to be able to assign specific process names or restrict the names by node.




  • 11.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jan 06, 2020 08:00 AM
    Edited by Michael A. Lowry Jan 07, 2020 03:29 AM
    Right now, one must define a unique name in the SMD file for each process, even though these names have zero practical impact. This state of affairs seems rather silly.

    The names assigned when the AE starts up are determined solely by timing, and these uncontrollable/unpredictable names are what appear in the Service Manager GUI, the System Overview, and the names of log files.

    I did some tests and found that I could get a decent amount of control over the order of name assignments, if not the particular names, by starting processes in this order:
    1. JCPs
    2. CPs
    3. JWPs
    4. WPs
    I inserted a 10-second delay after each process. I inserted a 120-second delay after the first WP because the first WP to start will become the PWP, and this process always takes a lot longer.

    Starting all other processes before the PWP & other WPs is the exact opposite of what Broadcom recommends. This approach does not appear to cause any problems though. The first three categories of process come up very quickly. Once the first few WPs are up, the system begins to come to life and users can log in.


  • 12.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jan 06, 2020 08:24 AM
    “but these names have almost zero impact on the names assigned when the AE starts up. This state of affairs seems rather silly.”

    True, this.

    Side note: There is, however, a little trick to at least see those names in operation. In Service Manager Dialog, there is a column of 0px width. If one makes the last column smaller to have some space, then aims the mouse cursor at the rightmost pixel of the column divider to the right of the CPU time column, then drags outwards, one can expand that hidden column. This column holds the actual process name, sometimes useful for avoiding confusion.


    But yes, having this hidden behind insider knowledge is not entirely unsilly either :)


  • 13.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jan 06, 2020 10:44 AM
    Edited by Michael A. Lowry Jul 21, 2020 04:19 AM
    @David Ainsworth:
    One more related note: there's currently no way to distinguish WP logs from JWP logs, or CP logs from JCP logs, even though these processes are quite different. I suggest that as long as these distinctions remain relevant, it would be great to be able to specify unique log locations by process type. I'll open an idea for this if you think it's worth the trouble.




  • 14.  RE: Automation Engine startup sequence & Service Manager configuration
    Best Answer

    Broadcom Employee
    Posted Jan 06, 2020 12:05 PM
    Hi @Michael A. Lowry,
    I will discuss with engineering, as designs etc. are in progress.
    I will come back to you on both topics.

    cheers
    David




  • 15.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jan 08, 2020 03:46 AM
    Thanks @David Ainsworth. I'm looking forward to it.


  • 16.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted Jul 21, 2020 08:42 AM
    Edited by Michael A. Lowry Jul 21, 2020 08:45 AM
    We finalized our approach to starting the Automation Engine (v12.3.2):
    • Start the JCP, and then wait 10 seconds.
    • Start the CPs, with a 10-second delay after each one.
    • Start the JWPs, with a 5-second delay after each one.
    • Start the first WP (the PWP). Wait 120 seconds.
    • Start the remainder of the WPs, with a 10-second delay after each one.
    The reasoning behind this sequence is that the processes that are the quickest to start are started first, and the ones that take the longest to start are started last. We have experienced no drawbacks to this approach.
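
    For readers who want to experiment with this ordering, the sequence above can be expressed as a stream of "start" and "wait" actions that is run either for real or as a dry run. This is a sketch only: the default counts are the per-node figures from the first post in this thread, the labels are start positions rather than names assigned by the AE, and `final_sequence`, `run` and the injected `start` callback are hypothetical. How a process is actually started (e.g. via the Service Manager) is deliberately left to the caller.

```python
import time
from typing import Callable, Iterator, Tuple, Union

Action = Tuple[str, Union[str, int]]  # ("start", label) or ("wait", seconds)


def final_sequence(cps: int = 5, jwps: int = 5, wps: int = 30) -> Iterator[Action]:
    """Yield the start/wait actions for one node, in the order listed above."""
    yield ("start", "JCP")
    yield ("wait", 10)
    for i in range(cps):
        yield ("start", f"CP #{i + 1}")
        yield ("wait", 10)
    for i in range(jwps):
        yield ("start", f"JWP #{i + 1}")
        yield ("wait", 5)
    yield ("start", "first WP (becomes PWP)")
    yield ("wait", 120)
    for i in range(wps - 1):
        yield ("start", f"WP #{i + 2}")
        yield ("wait", 10)


def run(actions: Iterator[Action],
        start: Callable[[str], None],
        sleep: Callable[[float], None] = time.sleep) -> int:
    """Execute the actions and return the total number of seconds waited.

    `start` and `sleep` are injected so the plan can be dry-run without
    starting anything or actually sleeping.
    """
    total_wait = 0
    for verb, arg in actions:
        if verb == "start":
            start(arg)
        else:
            sleep(arg)
            total_wait += arg
    return total_wait


if __name__ == "__main__":
    # Dry run: record what would be started, skip the real delays.
    started = []
    seconds = run(final_sequence(), start=started.append, sleep=lambda s: None)
    print(len(started), "processes,", seconds, "seconds of configured delay")
```

    Separating the plan from its execution keeps the ordering easy to review and to change, which matters when the delays need tuning for a particular environment.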




  • 17.  RE: Automation Engine startup sequence & Service Manager configuration

    Posted May 20, 2021 08:48 PM
    @Michael A. Lowry - Good discussion about process startup. Thanks. We do it in a totally different way: PWP, CP, JCP, all WPs, all CPs, all JWPs, and then the processes below, one by one:
    WIN01
    Database-Agent01
    Database-Services01
    Analytics01

    It works, but sometimes the 2nd and 3rd JWPs do not come up correctly, and we then bring them up manually. We will try your sequence and check the results. Also, in what sequence do you start WIN01 and the DB processes? Is it the same as ours or different? I would appreciate your input.

    Thanks,
    Prakash S