For out-of-memory events I just use a generic lookup for "java.lang.OutOfMemoryError" on the "vco-app" app. This triggers when vco pods crash. Only recently, with one of the Aria releases, did it finally become possible to set the heap size for vco, so I hope I will not see this event often.
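To illustrate the idea outside the product, the lookup boils down to roughly the filter below. This is only a sketch in Python, not an Operations for Logs query: the `LogEvent` type, its `app` and `text` fields, and the `find_oom_events` helper are all hypothetical names made up for this example. Only the search string and the "vco-app" app name come from the lookup described above.

```python
import re
from typing import Iterable, List, NamedTuple

# The literal string the lookup searches for (dots escaped for the regex).
OOM_PATTERN = re.compile(r"java\.lang\.OutOfMemoryError")


class LogEvent(NamedTuple):
    """Hypothetical, simplified log event: an app name plus the message text."""
    app: str
    text: str


def find_oom_events(events: Iterable[LogEvent], app: str = "vco-app") -> List[LogEvent]:
    """Return events from the given app whose text mentions an OutOfMemoryError."""
    return [e for e in events if e.app == app and OOM_PATTERN.search(e.text)]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        LogEvent("vco-app", "java.lang.OutOfMemoryError: Java heap space"),
        LogEvent("vco-app", "workflow run completed"),
        LogEvent("other-app", "java.lang.OutOfMemoryError: Java heap space"),
    ]
    for event in find_oom_events(sample):
        print(event.app, "|", event.text)
```

Matching on both the app name and the exception string keeps the alert narrow: an OutOfMemoryError logged by some other component would not trigger it.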
Original Message:
Sent: Jan 08, 2025 08:45 AM
From: Simon R
Subject: Aria Automation Monitoring
Thanks for your reply. I'm being cheeky but could you post the alerts you've actually got enabled?
And what actions do you take when these alerts occur?
- Heap memory usage too high
- Storage on vRA nodes gets low: how do you monitor this, given that we can't install the Telegraf agent on the appliances?
Original Message:
Sent: Jan 08, 2025 05:37 AM
From: left_right
Subject: Aria Automation Monitoring
vRA creating tons of log entries is nothing new, sadly.

We just create alerts on events that actually did or could soon cause a service interruption, for example:
- heap memory usage is too high
- certain vRO/ABX workflows fail
- storage on vRA nodes gets low, due to too many java heap dumps following a workflow failure
- problems with user validation/authentication
There is also a limited number of pre-defined alerts under Alert Definitions, some of which we have activated.
Original Message:
Sent: Jan 07, 2025 09:11 AM
From: Simon Rowan
Subject: Aria Automation Monitoring
How are people monitoring their on-prem deployments of Aria Automation and Orchestrator? I have connected them to Operations and Operations for Logs. Logs seems to give the best ability to monitor the service for problems, but the out-of-the-box dashboards and alerts provide too much information. I can see lots of events, warnings and critical alerts, yet everything is still running without issues.
For example, some of the alerts I am getting out of Logs:
CRITICAL: JDBC Connection Error
WARNING - Garbage Collection Failed
CRITICAL - Failed To Establish Connection
WARNING - Configuration File Error
The dashboard shows lots of events but so what?

Operations seems more focussed on providing a view of things deployed via Automation and grouped in a project/deployment rather than monitoring the Automation Service itself - am I wrong?