Use case: an agent generates far too many metrics (e.g. unique SQL statements). What are the possible ways to stop the agent from flooding the cluster?
1. We do not want to reduce the clamp on the collector(s) and impact all agents.
2. There can be many agents on the same remote host, so filtering on IP address is not practical.
3. The cluster has 2 MoMs and 10 collectors.
4. Managers are v9.7.1.
5. Agents are v9.7, v9.5, and v8.2.
The v9.7 agent is now cluster aware and can thus jump from one collector to another without checking with the MoM.
+ What are the different possibilities to stop the agent "now" from sending its metrics?
<the person who has access does not have the skill to analyse the root cause>
+ Same question, but what are the ways to set the MoM / Collectors to refuse the agent metrics?
For Florian: will the ACC manage that in the v9.8 release?
One possible option is the agent-side metric clamp. It requires a JVM restart, though...
# Agent Metric Clamp Configuration
# The following setting configures the Agent to approximately clamp the number of metrics sent to the EM
# If the number of metrics passes this metric clamp value, then no new metrics will be created. Old metrics will still report values.
# The value must be equal to or larger than 1000 to take effect. A lower value will be rejected.
# The default value is 50000.
# You must restart the managed application before changes to this property take effect.
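For reference, the profile line itself would look like the following (property name quoted from memory of IntroscopeAgent.profile; double-check the exact spelling against your agent version):

```
introscope.agent.metricClamp=50000
```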
Did you try a simple shut-off on the Agent or the problematic metric tree node? (Right-click the Agent or a metric node in the Investigator and select Shut off.) This is even preserved across EM and Agent restarts. That's the quick fix.
The long term fix would consist of:
1) Disable SQL metrics entirely for this Agent. A bit brute force, and you lose SQL metrics, but it solves the problem once and for all. You can still keep them in Transaction Traces, though.
2) Use the SQL normalizer to aggregate the problematic SQL queries. (Usually the culprit is a single SQL query that shows up in unique variants, like "SELECT FROM TEMP001", "SELECT FROM TEMP002", etc., or comments automatically generated by Hibernate. The SQL normalizer works well for this kind of problem.)
3) Clamp the problematic Agent. If your Agent is generating, say, 50,000 unique metrics and 45,000 of them are SQL, you can probably clamp your Agent at 10,000, for example.
4) To "reduce" the load, you could disable some of the fancier SQL detail metrics: only keep Average Response Time and Responses Per Interval, for example, and remove Stalls, Concurrent Invocations, and Errors. This doesn't solve the metric leak problem, but it attenuates it.
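For option 2, here is a sketch of the regex-based normalizer configuration in the agent profile. The property names below are the RegexSqlNormalizer settings as I recall them from the Java Agent profile, and the TEMP pattern is only a hypothetical example for the "SELECT FROM TEMP001 / TEMP002" case, so verify everything against the Java Agent Implementation Guide for your version:

```
introscope.agent.sqlagent.normalizer.extension=RegexSqlNormalizer
introscope.agent.sqlagent.normalizer.regex.matchFallThrough=false
introscope.agent.sqlagent.normalizer.regex.keys=key1
introscope.agent.sqlagent.normalizer.regex.key1.pattern=TEMP[0-9]+
introscope.agent.sqlagent.normalizer.regex.key1.replaceAll=true
introscope.agent.sqlagent.normalizer.regex.key1.replaceFormat=TEMPnnn
introscope.agent.sqlagent.normalizer.regex.key1.caseSensitive=false
```

With something like this in place, the TEMP001/TEMP002 variants all report under one TEMPnnn metric instead of one metric per table name.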
In general, I think our OOTB clamp settings are too high. I believe they're set at 50,000 metrics; I would advise you to reduce that to at least 20,000, possibly 10,000, at the EM level. If you look at your average number of metrics per Agent, I'm pretty sure it's much smaller than 3,000 anyhow.
Thank you all.
- When you "Shut Off" the agent or the metric tree, what happens?
Is the MoM sending a message to all the collectors telling them not to store the related metrics, or is the collector telling the agent not to send the metrics?
- Also, why don't we see these "Shut Off" actions listed in the Enterprise Manager Map?
- About the clamps: which one, or both?
introscope.enterprisemanager.agent.metrics.limit -> 20000
introscope.enterprisemanager.metrics.live.limit -> 20000
I'm trying to get a definitive answer on your first question.
About the clamps, the one you want to change is introscope.enterprisemanager.agent.metrics.limit
This one is per Agent. The other one is per EM, so you clearly don't want to touch this one.
Got an answer to your first question.
"It won’t apply to all collectors. It will find the collector of the agent or metric that is “shut off”, then MOM sends the “shut off” request to that collector only."
Thank you very much.
Ouch!! That explains what I see: when I look at the historical data, I saw the metric disappear and then reappear for maybe 55 minutes while collector 1 was down and the agent had reconnected to the next collector. Can we call this a bug?
Also, I do see the downed collector in the "Enterprise Manager Map" picture, but nothing in the "Important Events" pane. Is there some configuration missing in my v9.7.1 managers?
Shouldn't we also see these "Shut Off" events listed in that Map?
Yes, you can open a support ticket about it.
I've hit a similar issue with an application building dynamic SQL statements that are not normalized, so each statement string generates a metadata write. The metadata write is for the metric label, i.e. the SQL statement, and at the peak the collectors buckled under the metadata write load even with the agent clamp at 20,000 metrics.
There is a SQL normalizer setting with which you can group the statements into a single metric, such as all the selects, deletes, updates, and counts. How broad to make each metric depends on the application's need for and use of the SQL, and is more of a performance question.
CA APM .NET Agent Implementation Guide
CA APM Java Agent Implementation Guide
For our issue with the number of dynamic SQL statements, and given what we would lose if we shut down the agent, the development staff adjusted their SQL generation to use prepared statements, limited the end users' ability to search across every single field, and ordered their WHERE clauses so there are now only about a hundred statements instead of thousands.
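To illustrate why the prepared-statement change collapses the metric count, here is a small Python sketch (the table and query are hypothetical, not the actual application code): the concatenated style produces one distinct SQL string per value, so one agent metric each, while the parameterized style produces a single string no matter how many values are bound.

```python
import sqlite3

def distinct_statements(ids):
    """Return the set of SQL strings an APM agent would see for each style."""
    # String concatenation: every distinct id yields a distinct statement string.
    concatenated = {f"SELECT status FROM orders WHERE id = {i}" for i in ids}
    # Parameterized: one statement string regardless of the bound value.
    parameterized = {"SELECT status FROM orders WHERE id = ?" for _ in ids}
    return concatenated, parameterized

# Sanity check: the parameterized form is real, runnable SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.execute("SELECT status FROM orders WHERE id = ?", (1,))
```

With 1,000 ids, the concatenated set holds 1,000 unique strings (1,000 metrics), while the parameterized set holds exactly one.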
Hope this helps,
I keep thinking about your post. Unfortunately, I do not get much support from the teams concerned to correct the SQL.
So I was thinking that one day I should study the Python script Mike Sydor wrote to generate the KPIs for the analytics server, and apply the same approach to read the SQL strings for an agent and generate the normalized data.
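As a starting point for that script idea, here is a minimal sketch (my own regexes, not Mike Sydor's code) that folds unique SQL strings into normalized metric names and counts how many raw statements land in each bucket:

```python
import re
from collections import Counter

def normalize_sql(sql: str) -> str:
    """Collapse literal variants so unique SQL strings fold into one key."""
    s = sql.strip()
    s = re.sub(r"/\*.*?\*/", "", s, flags=re.DOTALL)            # strip /* generated */ comments
    s = re.sub(r"'[^']*'", "?", s)                              # string literals -> ?
    s = re.sub(r"\b\d+\b", "?", s)                              # standalone numbers -> ?
    s = re.sub(r"\b(TEMP)\d+\b", r"\g<1>nnn", s, flags=re.I)    # TEMP001, TEMP002 -> TEMPnnn
    return re.sub(r"\s+", " ", s).strip().upper()

def metric_counts(statements):
    """Count how many raw statements fold into each normalized metric name."""
    return Counter(normalize_sql(s) for s in statements)
```

Running a day's worth of agent SQL metric names through `metric_counts` would show which handful of normalized buckets the thousands of unique statements actually belong to.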
You can deny agents using loadbalancing.xml in /config/loadbalancing.xml. I hope it will solve your problem.
You can find more details in chapter 7 of the Admin and Configuration Guide.
Exclude an Agent from a Particular Collector or Set of Collectors
The following example shows how to exclude an agent named MyAgent from a Collector named MyHost.
<agent-collector name="Exclude MyAgent from a Collector Example">
  <agent-specifier>.*\|.*\|MyAgent</agent-specifier>
  <exclude>
    <collector host="MyHost" port="5001"/>
  </exclude>
</agent-collector>
The following example shows how to exclude multiple agents from multiple Collectors.
<agent-collector name="Exclude Two Agents Assigned from Two Collectors Example">
  <agent-specifier>.*\|.*\|AgentOne</agent-specifier>
  <agent-specifier>.*\|.*\|AgentTwo</agent-specifier>
  <exclude>
    <collector host="MyHost" port="5001"/>
    <collector host="MyOtherHost" port="5001"/>
  </exclude>
</agent-collector>
Thank you. I very much like the idea of shutting off part of the agent metric tree (i.e. the SQL detail) as opposed to the complete agent.
However, for good measure, and certainly needed in case of emergency, I tried your idea (with v9.7.1). So I added an <include> followed by an <exclude> to loadbalancing.xml. As per the notes in that file, I also edited the MoM config with introscope.apm.agentcontrol.agent.allowed=false
I restarted the MoM just to make sure all these changes were active, and everything appears to work.
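For anyone repeating this emergency procedure, the two pieces fit together roughly like this (a sketch only; GoodAgent and the specifier regex are placeholders, and the exact agent-specifier syntax is documented in the comments of loadbalancing.xml itself):

```
# IntroscopeEnterpriseManager.properties on the MoM:
introscope.apm.agentcontrol.agent.allowed=false
```

```
<agent-collector name="Allow only known agents">
  <agent-specifier>.*\|.*\|GoodAgent.*</agent-specifier>
  <include>
    <collector host="MyHost" port="5001"/>
  </include>
</agent-collector>
```

With allowed=false, any agent that matches no <include> block is denied.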
Actually, I had not thought of this: the 'allowed' collectors list is sent by the MoM to the related agents themselves, and the resulting denied-agents list appears in the APM Status Console ~ pretty cool.
The challenge for the non-initiated will be to vi the file without making a mistake. Hence, I am back to preferring the 'shut off', which should be made cluster-wide and also reported in this APM Status Console (Case #70317). It should be easy to fix.