Hi all,
Thanks for the suggestions so far. They are all helpful and I may end up using Carsten's basic framework.
For context, we have been instructed to keep logs on disk rather than in the database, for performance reasons. If I understand Josef correctly, the purpose of the XRO system is to let users develop their own algorithms for deciding which logs to keep (in the database) and which to delete. Essentially, I want to do the same thing, but with the physical files.
The most helpful tool for this would be a DB table that contains the SIZE of the output file on the host, in addition to its location, so that we can initiate disk cleanup jobs only when needed (but immediately, once they are).
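To make the idea concrete, here is a minimal sketch of what that tracking table and threshold check could look like. The table name, column names, and the 100 GB threshold are all illustrative assumptions on my part, not the actual XRO schema (shown with SQLite just to keep the example self-contained):

```python
import sqlite3

# Hypothetical schema -- table and column names are placeholders, not the
# real XRO/production schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE log_files (
        run_id      INTEGER PRIMARY KEY,
        host        TEXT NOT NULL,
        path        TEXT NOT NULL,
        size_bytes  INTEGER NOT NULL,
        created_at  TEXT NOT NULL      -- ISO-8601 timestamp
    )
""")
conn.execute(
    "INSERT INTO log_files VALUES (?, ?, ?, ?, ?)",
    (101, "agent-host-01", "/var/log/app/run101.log", 52_428_800,
     "2024-01-15T08:30:00"),
)

# Kick off a cleanup job only when a host's total tracked size crosses a
# threshold -- no physical scan of the agents required.
total, = conn.execute(
    "SELECT COALESCE(SUM(size_bytes), 0) FROM log_files WHERE host = ?",
    ("agent-host-01",),
).fetchone()
THRESHOLD = 100 * 1024**3  # assumed per-host limit, e.g. 100 GB
needs_cleanup = total > THRESHOLD
```

The point is that the decision to clean up is driven entirely by the DB, so the 6000+ agents are only touched when a job is actually dispatched.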
We have over 6000 agents, many of which are idle for days at a time, and some of which have thousands of log files. For both of those reasons, we don't want to constantly scan them all physically. The agents are mostly grouped by application, and we want to give the (dozens of) application owners the ability to set custom retention settings, such as a maximum age in days or a maximum number of logs. But we might want to override their settings (for example, remove the oldest x GB of files) if the disk is getting too full.
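A rough sketch of how those two layers of policy could combine, working purely off the DB metadata. The function name, the tuple layout, and the policy parameters are assumptions for illustration:

```python
from datetime import datetime, timedelta

def files_to_delete(files, max_age_days=None, max_count=None, reclaim_bytes=None):
    """Return the set of paths slated for deletion.

    files: iterable of (path, size_bytes, mtime) tuples, presumably loaded
    from the tracking table. max_age_days and max_count are the owner's
    retention settings; reclaim_bytes is the size-based override applied
    when the disk is filling up.
    """
    newest_first = sorted(files, key=lambda f: f[2], reverse=True)
    doomed = set()
    cutoff = datetime.now() - timedelta(days=max_age_days) if max_age_days else None
    for i, (path, size, mtime) in enumerate(newest_first):
        if cutoff and mtime < cutoff:
            doomed.add(path)          # older than the owner's age limit
        if max_count is not None and i >= max_count:
            doomed.add(path)          # beyond the owner's count limit
    # Override: keep dropping the oldest remaining files until we have
    # freed at least reclaim_bytes, regardless of the owner's settings.
    if reclaim_bytes:
        freed = sum(size for path, size, _ in newest_first if path in doomed)
        for path, size, _ in reversed(newest_first):  # oldest first
            if freed >= reclaim_bytes:
                break
            if path not in doomed:
                doomed.add(path)
                freed += size
    return doomed
```

Because it only needs (path, size, mtime), the whole selection can run against the DB rows, and the agent just receives a delete list.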
We have some large shared hosts running diverse applications with distinct requirements, so we'd prefer a job-based cleanup algorithm to a host-based algorithm. Having the file size tied to the job object would also allow us to identify which applications are using more than their share of disk on the shared hosts.
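With the size tied to the job object, the "who is hogging the shared host" question becomes a simple rollup. A sketch, again with an assumed record layout of (host, application, size_bytes):

```python
from collections import defaultdict

def usage_by_app(records):
    """Total tracked log size per (host, application) pair.

    records: iterable of (host, application, size_bytes) tuples, e.g. a
    join of the tracking table against the job/application metadata.
    """
    totals = defaultdict(int)
    for host, app, size in records:
        totals[(host, app)] += size
    return dict(totals)

# Example: two billing runs and one search run on the same shared host.
demo = usage_by_app([
    ("shared-01", "billing", 100),
    ("shared-01", "billing", 50),
    ("shared-01", "search", 25),
])
```

Sorting that result descending per host would surface the over-quota applications directly.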
I considered a post-process step to insert the run ID and file size into a variable object, but that seems like too much overhead given our volume.