AutoSys Workload Automation

autosys KPI

1. autosys KPI

Pavel Vaynshtok
Posted Jul 18, 2022 01:40 PM

I'm looking for KPI ideas for AutoSys.

EP uptime

EP latency

number of jobs

number of runs

percentage of failed vs total runs

number of machines

number of job owners

What else can we measure?

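As a rough illustration of the "percentage of failed vs total runs" item, here is a minimal Python sketch. It assumes run records have already been exported into `(job_name, status)` pairs; that record shape, the `failure_rate` helper, and the sample data are hypothetical and not an AutoSys API.

```python
from collections import Counter


def failure_rate(runs):
    """Return the percentage of runs that ended with status FAILURE.

    `runs` is an iterable of (job_name, status) pairs. This record shape is
    an assumed export format, not something AutoSys produces as-is.
    """
    counts = Counter(status for _job, status in runs)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no runs yet: report 0% rather than divide by zero
    return 100.0 * counts["FAILURE"] / total


# Illustrative data only.
sample_runs = [
    ("job_a", "SUCCESS"),
    ("job_a", "FAILURE"),
    ("job_b", "SUCCESS"),
    ("job_c", "SUCCESS"),
]
print(failure_rate(sample_runs))
```

Counting by status rather than only counting failures means the same `Counter` could feed other ratios (for example terminated runs) without a second pass.
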
2. RE: autosys KPI

Jose Lopez
Posted Jul 19, 2022 03:44 AM

Hi,

The number of jobs defined could also be measured.

BR
JR

3. RE: autosys KPI

Broadcom Employee

John Hiett
Posted Jul 19, 2022 03:53 AM

Perhaps another measure worth considering would be number of manual interventions, e.g. sendevents.

4. RE: autosys KPI

Rozanne Smith
Posted Jul 19, 2022 09:18 AM

* Number of job runs per hour of the day
* Number of jobs that won't run (no-exec, on-ice, on-hold)
* Number of stale jobs (haven't run since x)
* Number of cred owners with no jobs
* Number of machines with no jobs
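
Two of the items above (runs per hour of the day, and stale jobs) could be computed along these lines. This is only a sketch: it assumes run start times and each job's last run time have already been pulled out as Python `datetime` values, and the function names and sample data are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def runs_per_hour(start_times):
    """Count job runs per hour of the day (0-23), including empty hours."""
    counts = Counter(ts.hour for ts in start_times)
    return {hour: counts.get(hour, 0) for hour in range(24)}


def stale_jobs(last_run_by_job, now, max_age):
    """Return sorted names of jobs that have not run within max_age.

    A last-run value of None (job never ran) is treated as stale.
    """
    cutoff = now - max_age
    return sorted(
        job
        for job, last_run in last_run_by_job.items()
        if last_run is None or last_run < cutoff
    )


# Illustrative data only.
now = datetime(2022, 7, 19, 12, 0)
starts = [
    datetime(2022, 7, 19, 1, 5),
    datetime(2022, 7, 19, 1, 40),
    datetime(2022, 7, 19, 9, 0),
]
last_runs = {
    "daily_load": datetime(2022, 7, 19, 1, 5),
    "old_report": datetime(2021, 12, 1, 0, 0),
    "never_ran": None,
}
hourly = runs_per_hour(starts)
stale = stale_jobs(last_runs, now, timedelta(days=90))
print(hourly[1], hourly[9], stale)
```
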

5. RE: autosys KPI

Scott Fenton
Posted Jul 19, 2022 12:30 PM

Hey Pavel. I ran through an exercise like this not too long ago. I agree with everything you've mentioned and the things others have mentioned too. In our case my prober (Prometheus, written in Python) also checks:

* How far behind the DB is: how old the oldest unprocessed event in ujo_event is, and how many unprocessed events there are.
* Skew time for each event: since the last scan, how long each event in ujo_proc_event took to go from init_status_stamp to que_status_stamp. Build a histogram and measure SLI skew over time. Good for an SLO.
* Discrepancy between what the config file thinks about DB status and what alamode in each DB thinks about its own status.
* Blackbox user journey: how long it takes to go end to end, from force-starting a /bin/true job to reaching an end state.

We're also using a dashboard that reports the migration ratio (jobs in old instance / jobs in new instance) for instances under migration.
I'm sure there's stuff you can monitor for your GUI as well, like authentication failures and the usual lot.
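
For anyone wanting to try the backlog and skew checks, the arithmetic might look roughly like the sketch below. It is not the actual prober: it assumes the timestamps have already been fetched from the database and converted to epoch seconds, and the function names, bucket bounds, and sample values are all hypothetical.

```python
from bisect import bisect_left


def backlog_stats(unprocessed_stamps, now):
    """Return (count, age in seconds of the oldest unprocessed event)."""
    if not unprocessed_stamps:
        return 0, 0
    return len(unprocessed_stamps), now - min(unprocessed_stamps)


def skew_histogram(events, bounds):
    """Bucket per-event skew (que_status_stamp - init_status_stamp).

    `events` is a list of (init_status_stamp, que_status_stamp) pairs in
    epoch seconds (an assumed representation). `bounds` are ascending upper
    limits in seconds; bucket i counts skews <= bounds[i], and the final
    extra bucket counts everything above the last bound.
    """
    counts = [0] * (len(bounds) + 1)
    for init_stamp, que_stamp in events:
        counts[bisect_left(bounds, que_stamp - init_stamp)] += 1
    return counts


def sli_within(events, threshold):
    """Fraction of events whose skew is at most `threshold` seconds."""
    if not events:
        return 1.0  # no events scanned: treat the objective as met
    good = sum(1 for init, que in events if que - init <= threshold)
    return good / len(events)


# Illustrative data only: skews of 0, 3, 45 and 1 seconds.
events = [(100, 100), (100, 103), (200, 245), (300, 301)]
hist = skew_histogram(events, bounds=[1, 5, 30])
sli = sli_within(events, threshold=5)
backlog = backlog_stats([950, 990, 998], now=1000)
print(hist, sli, backlog)
```

The per-bucket counts here are non-cumulative; a Prometheus histogram exposes cumulative buckets, so an exporter would either sum these up or hand the raw skew values to its client library instead.
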

Good luck!
Scott

6. RE: autosys KPI

Broadcom Employee

Michael Woods
Posted Aug 09, 2022 09:15 PM

Hi All,

These are all good items to track. Should they be divided between the 'service' and what I would call usage data?
For the service, I would consider uptime for the components, the DB, and the agents (going offline/missing), plus the performance of each.
The usage KPIs are throughput, failure rate, manual sendevents, and job alarms (distinct from job failures, since a failure does not always raise an alarm).

Regards,
Mike
