We will be having a discussion with a few of our Automic users and they are looking for information on how we could leverage our existing Automic environment to use job scheduling and automation in the cloud (specifically Azure).
Just wondering if anyone has had any use cases for this, and if so, how did you accomplish it? Did you need other agents? Can we use the REST API Webservices agent that's already included in our system?
I'll have more details in a few days on what exactly they are looking for, but I wanted to find out first whether anyone has already been through this scenario.
Thanks as always for the help and feedback!
I'll provide what I can about my environment and how I have been planning our move to Azure. This is my overall testing/planning with Azure.
My recommendation is to plan the AE move to the cloud for when the databases are moved into the cloud. The application is constantly writing transactions to the database and we want to keep the latency to a minimum. I would avoid a hybrid solution for the actual Automation Engine, i.e. Automic WPs/CPs in the cloud and the database on premise, or vice versa. That seems like a nightmare scenario to me, but I have not tested it yet.
In addition to the above, we do not require much scalability in our infrastructure because we rely so heavily on the Oracle database. App servers should be very consistent in size and usage, so we're not the ideal candidate for cloud benefits aside from cost.
The solution I have been testing for Automic is hybrid in the sense that some of the Agents are in Azure but the AE application itself is on premise. As I am in a big shop I do not have the transparency of exactly how our servers are provisioned. I believe it is an HP enterprise offering. What I do know though is they are provisioned on the same domain/vlan as our onprem servers so the communication allows for direct agent communication from Azure to the onprem AE.
To answer your question, having the Agents in the cloud is transparent to the AE and does not require any additional configuration for the cloud. So the stance I will take is that any applications we automate with the AE can move to Azure at their own leisure, as we can accommodate them with the AE. On the flip side, if we decide to move the entire AE application to the cloud, we can accommodate any application that remains on premise. I have not done any real load testing on these agents, but I've done a proof of concept and run some jobs on Azure Windows and RHEL servers (AIX next). These were standard build Agents that are provided by Automic with the image.
The thing I am worried about is the performance of the AE once we go fully cloud. Currently the response times for my boxes are:
On prem AE -> on prem DB
Cloud AE -> on prem DB
Cloud AE -> cloud DB
With the architecture of the AE, I am concerned about processes like the PWP and RWP that are responsible for writing specific transactions to the database. The single-threaded nature of those might lead to some serious performance hits in my current Azure environment.
All of this is still very much in the planning stage on our side as well, but these are my initial thoughts and observations. I would be very curious what others have experienced with Azure/AWS.
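To put a number on that concern, here is a minimal sketch of the arithmetic, assuming a single-threaded writer waits for each database round trip before starting the next one (I have not confirmed that this is exactly how the PWP behaves). The helper names are my own, and `operation` stands for whatever probe you choose, such as a trivial query against the AE database.

```python
import statistics
import time


def median_round_trip_ms(operation, samples=20):
    """Call `operation` repeatedly and return the median duration in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        operation()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def max_serial_ops_per_second(round_trip_ms):
    """Upper bound on operations per second when each one waits for a full round trip."""
    if round_trip_ms <= 0:
        raise ValueError("round_trip_ms must be positive")
    return 1000.0 / round_trip_ms
```

As pure arithmetic (not a measurement), a 2 ms round trip caps a strictly serial writer at 500 operations per second, while 0.5 ms allows 2000. Running the same probe from each of the three AE/DB placements above would show how much headroom each one leaves.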
Does anyone have the Automic agent installed on ephemeral cloud host instances (AWS, Azure, etc.) and also tear these hosts down at weekends or whenever needed?
If so, how do you deal with the agent disconnecting from the Engine ungracefully (with an alert)? It will still be registered in the engine, which I don't want.
Ideally, I guess, the agent could be marked (or detected) as ephemeral and the engine would disconnect it automatically.
The current alternatives are to disconnect it manually or to kick off an API call immediately before the host is destroyed.
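For the API-call option, a minimal sketch of what a pre-teardown hook could send is below. Everything specific in it is an assumption: the URL, the JSON body, and the bearer token are placeholders, not a documented Automic endpoint, so substitute whatever your engine-side API or wrapper script actually expects.

```python
import json
import urllib.request


def build_teardown_request(url, agent_name, token):
    """Build the POST request a shutdown hook would send before the host is destroyed.

    The URL, body fields, and auth scheme are hypothetical placeholders.
    """
    body = json.dumps({"agent": agent_name, "reason": "host teardown"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def notify_before_teardown(url, agent_name, token,
                           opener=urllib.request.urlopen, timeout=10):
    """Send the request; return True on a 2xx response, False on any network error.

    Errors are swallowed so that a failed call never blocks the teardown itself.
    """
    request = build_teardown_request(url, agent_name, token)
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # urllib's URLError and HTTPError are OSError subclasses
        return False
```

The hook would be wired into whatever runs just before the instance is destroyed (a service stop script or the cloud provider's termination notice). The `opener` parameter exists only so the call can be exercised without a network.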
I would love to hear how that is working out for you. We moved to Azure (Windows) and migrated the database from Oracle to SQL in September (also into Azure). We were supposed to go live with production in October, but we've spent the last three months trying desperately to figure out why Automic is no longer hitting its performance benchmarks. We've got a team on-site trying to work it out, and we are also working with engineers from Microsoft and following recommendations from CA/Broadcom. The executive team is starting to look at new applications because of the mess we're in. I've heard that people are on AWS with no issues, but Azure is simply not working out for us. We have tried everything we can think of.
What kind of benchmark numbers are you talking about specifically?
When you start the WPs, they do a performance test. First they add 1+1 as many times as possible in 2 seconds to test the CPU. Then they insert a dummy table into the database in the same timespan to test the database. The benchmark is set at 470 for the latter test. If you look at the WP logs (except for the primary) and search for "check of data source finished", it will show you the CPU/DB numbers. The highest we were ever able to get the DB number was the low 400s. The lowest was around 28. We just enabled accelerated networking and went from that high point down to the 200s.
Here is an example where our DB benchmark is at 232:
20181206/163412.302 - U00003533 UCUDB: Check of data source finished: No errors. Performance CPU/DB: '63323939'/'232 (1000/4.299241 s)'
20181206/163412.302 - U00003544 UCUDB: Reference values tested with Windows 2003 on XEON 1500 MHz: CPU 813865, DB 470
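If you want to track these numbers across restarts, a minimal sketch for pulling them out of the WP logs follows. The regular expressions are written against the two sample lines above only, so other AE versions may format them differently; the function names are my own. Judging by the sample, the DB figure looks like the operation count divided by the elapsed seconds (1000 / 4.299241 ≈ 232.6), but that reading is inferred from this one line.

```python
import re

# Patterns derived from the two sample log lines; treat them as assumptions.
PERF_RE = re.compile(
    r"Performance CPU/DB: '(?P<cpu>\d+)'/'(?P<db>\d+) "
    r"\((?P<ops>\d+)/(?P<secs>[\d.]+) s\)'"
)
REF_RE = re.compile(r"Reference values .*: CPU (?P<cpu>\d+), DB (?P<db>\d+)")


def parse_performance(line):
    """Return the measured CPU/DB values from a U00003533 line, or None if absent."""
    match = PERF_RE.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match["cpu"]),
        "db": int(match["db"]),
        "ops": int(match["ops"]),
        "seconds": float(match["secs"]),
    }


def parse_reference(line):
    """Return the reference CPU/DB values from a U00003544 line, or None if absent."""
    match = REF_RE.search(line)
    if match is None:
        return None
    return {"cpu": int(match["cpu"]), "db": int(match["db"])}


def db_ratio(performance, reference):
    """Measured DB score as a fraction of the reference score."""
    return performance["db"] / reference["db"]
```

Applied to the two lines above, this gives a DB ratio of about 0.49 (232 against the reference 470), while the CPU figure is roughly 78 times its reference, which points at the database path rather than compute.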
The result of this is that processing gets jammed in a queue and the AE either slows down or grinds to a halt. Our DB and App are in the same place. We've increased and decreased disk compute size (decreasing it helped), increased disk space, bumped up our Azure services to Tier 1, and tried about a dozen other things. We are all putting our heads through the wall at this point.