SQL Agent Job failure Monitoring

1. SQL Agent Job failure Monitoring

vineesha413
Posted Oct 03, 2018 12:06 PM

Hi Folks,

We have received SQL job failure alerts for a job that failed yesterday, 2nd October, at 10:55 AM.

The check interval is 5 minutes.

The QoS is still being appended to the old alerts in the alarm history tab of the UMP console, even though the job failed again afterwards (that is, after 10:55 AM on 2nd October) and new alerts were triggered for the latest job failures.

I am not able to understand why the QoS is being populated in old alerts that are no longer relevant. Could anyone let me know whether this is the expected behaviour? We have a requirement to acknowledge these alerts from CA UIM, but new alerts keep getting triggered because the QoS is generated continuously.

Many Thanks,
Vineesha.
2. Re: SQL Agent Job failure Monitoring

hitesh sehgal
Posted Oct 03, 2018 12:48 PM

Hi Vineesha,

This sounds a bit confusing. Can you elaborate with some screenshots?

What is the schedule interval of the failing job? Is it possible that the job is succeeding in between and failing at random intervals? And what is the suppression key you are getting in both alerts?
3. Re: SQL Agent Job failure Monitoring

vineesha413
Posted Oct 03, 2018 01:17 PM

Hi Hitesh,

This is the suppression key for the job failure alerts:

Profile $profile, instance $instance, job $job_name (category $category_name), has failed. Run time of job: $rundate

The job was failing every 5 minutes yesterday, and we got bulk alerts in our queue. The database team has disabled the job from their end, as it is now running on another node.
Now the team is asking us to close the alerts, since the job is now running on a different node.

So when I acknowledge the alarm, I get a new alarm stating that the job failed, with yesterday's date and time as the job run time, which should not be the case.

Regards,
Vineesha.
4. Re: SQL Agent Job failure Monitoring

vineesha413
Posted Oct 03, 2018 01:18 PM

The suppression key is the same for all the job failure alerts.
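
To make the symptom easier to reason about, here is a rough Python sketch of one way it could arise. It is only a toy model, not the actual logic of the sqlserver probe or the alarm server: it assumes that every check interval re-evaluates the last recorded job run, that alarms with the same suppression key update the open alarm instead of creating a new one, and that acknowledging an alarm simply removes it. The names `AlarmConsole`, `check_job` and the job name `nightly_load` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    suppression_key: str
    message: str
    count: int = 1


class AlarmConsole:
    """Toy alarm store that de-duplicates on suppression key (hypothetical)."""

    def __init__(self):
        self.open_alarms = {}
        self.created = 0  # number of distinct alarms ever created

    def raise_alarm(self, key, message):
        alarm = self.open_alarms.get(key)
        if alarm is not None:
            # Same suppression key: update the open alarm instead of adding one.
            alarm.count += 1
            alarm.message = message
            return alarm
        alarm = Alarm(key, message)
        self.open_alarms[key] = alarm
        self.created += 1
        return alarm

    def acknowledge(self, key):
        # Assumption: acknowledging closes the alarm and forgets its key.
        self.open_alarms.pop(key, None)


def check_job(console, job_name, last_run_status, last_run_date):
    """One check interval: alarm if the most recent recorded run failed."""
    if last_run_status != "failed":
        return None
    key = f"job_failure/{job_name}"
    message = f"job {job_name} has failed. Run time of job: {last_run_date}"
    return console.raise_alarm(key, message)


console = AlarmConsole()

# Three check intervals see the same failed run from 2nd October, 10:55.
for _ in range(3):
    check_job(console, "nightly_load", "failed", "2018-10-02 10:55")
count_before_ack = console.open_alarms["job_failure/nightly_load"].count

# The operator acknowledges the alarm, but the job is disabled, so the
# last recorded run is still the old failure and the next check re-raises it.
console.acknowledge("job_failure/nightly_load")
reraised = check_job(console, "nightly_load", "failed", "2018-10-02 10:55")
```

Under these assumptions, the acknowledged alarm comes back as a new alarm that still carries yesterday's run time, because a disabled job never records a newer run to replace the failed one. Whether the probe really behaves this way would need to be confirmed from the probe logs.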
5. Re: SQL Agent Job failure Monitoring

hitesh sehgal
Posted Oct 03, 2018 01:30 PM

Can you check the threshold values? I hope you are using the latest version of the sqlserver probe.
Please try disabling and re-enabling the probe. If you are still getting alerts, log a support case with logs captured at loglevel 5, since this needs deep investigation.

DX Unified Infrastructure Management
