We run the Automation Engine server on two nodes, and start it via the Service Manager. The processes started are:
- 60 work processes (30 per node)
- 10 communications processes (5 per node)
- 10 Java work processes (5 per node)
- 2 Java communications processes (1 per node)
Traditionally we have run odd-numbered AE server processes on Node A of the cluster, and even-numbered ones on Node B. We have tried to start server processes in such a way that the process names assigned by the PWP are the same as the names in the Service Manager GUI. This aids troubleshooting because processes then appear in order in the Service Manager GUI. With a large number of AE server processes, this matters a great deal. How do we ensure that the names line up? We use WAIT statements in the SMC files to ensure that each process has its name assigned before the next process starts.
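To make the odd/even convention concrete, here is a minimal Python sketch that lists which process names would land on each node. The process totals come from the list above. The helper `names_for_node` and the unpadded name format (`WP1`, `JWP2`, and so on) are assumptions made for illustration, not output taken from the AE or the Service Manager.

```python
def names_for_node(prefix, total, node):
    """Return the process names one node would hold under the odd/even convention.

    Node "A" gets the odd numbers, node "B" the even ones.
    The unpadded name format is an assumption for illustration.
    """
    first = 1 if node == "A" else 2
    return [f"{prefix}{n}" for n in range(first, total + 1, 2)]


# Cluster-wide totals per process type, as listed in the post.
TOTALS = {"WP": 60, "CP": 10, "JWP": 10, "JCP": 2}

for node in ("A", "B"):
    for prefix, total in TOTALS.items():
        names = names_for_node(prefix, total, node)
        print(f"Node {node}: {len(names)} x {prefix}, first {names[0]}, last {names[-1]}")
```

Each node ends up with half of every process type, which is the layout the WAIT statements were meant to preserve.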
We have noticed that the speed at which process names are assigned depends greatly on the process type & the level of server activity:
- Primary work process: 1-3 minutes
- Work processes: 20-60 seconds
- Communications processes: 5-15 seconds
- Java work processes: 2-5 seconds
- Java communications processes: 2-5 seconds
We start the processes in this order:
- Start the first WP (the PWP). Wait 10 seconds.
- Start 1 CP. Wait 10 seconds.
- Start several more WPs, with a 10-second delay after each one.
- Start the remainder of the CPs, with a 10-second delay after each one.
- Start the remainder of the WPs, with a 10-second delay after each one.
- Wait 2 minutes to give all the WPs sufficient time to have their names assigned.
- Start the JWPs, with a 5-second delay after each one.
- Start the JCP.
The goal has been to find a good balance between getting the server up & running quickly, and making operations & troubleshooting easier. This approach has worked pretty well, and the process names assigned by the PWP have usually matched up with the name and order of processes in the Service Manager GUI. This approach does present some problems though:
- LDAP logins do not work until at least one JWP is up and running.
- REST API calls do not work until at least one JCP is up and running.
Moreover, starting with v12.3, we have found that our approach no longer works at all. AE v12.3 introduced some significant changes to the structure of the ucsrv.ini file.
“All process numbers assigned reflect the order in which the respective process starts.”
More correctly, AE v12.3 server processes are named based only on the process type and the order in which the processes start. It is no longer possible to influence the assignment of process names (e.g., WP1, CP5) using options in the ucsrv.ini file. Because of this change, the convention of starting odd-numbered AE server processes on Node A of the cluster, and even-numbered ones on Node B, is no longer viable.
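A small Python model shows why the odd/even convention breaks under this rule. It assumes one counter per process type, shared across both nodes and incremented in start order. That is our reading of the behavior described above, not the AE's actual implementation, and the function name and sample start order are hypothetical.

```python
from collections import defaultdict


def assign_names(start_events):
    """Name processes by type and start order only.

    start_events is a list of (node, process_type) pairs in the order
    the processes start. Returns a list of (node, assigned_name) pairs.
    """
    counters = defaultdict(int)
    assigned = []
    for node, ptype in start_events:
        counters[ptype] += 1
        assigned.append((node, f"{ptype}{counters[ptype]}"))
    return assigned


# Hypothetical start order: Node A gets three WPs up before Node B starts one.
events = [("A", "WP"), ("A", "WP"), ("A", "WP"), ("B", "WP"), ("A", "CP"), ("B", "CP")]

for node, name in assign_names(events):
    print(f"Node {node}: {name}")
```

In this model Node A ends up holding WP2, an even-numbered process. Keeping odd numbers on Node A would require the two nodes to alternate starts exactly, one process at a time, for every process type.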
This raises several questions:
Q1: Is there any way in v12.3 to restrict the process names assigned on each node?
Q2: Is there any way in v12.3 to ensure that the process names assigned by the AE line up with the names/order in the Service Manager GUI?
Q3: Is there a recommended order for starting AE server processes?