In APM, each alert listed under a management module can be set to either Active or disabled using a checkbox. Kindly check whether one more option, 'Disable', could be added along with a custom duration (in hrs/mins) or a time range until which the alert is to be disabled.
We currently have 20 to 30 or more management modules in APM configured for monitoring each of 3 environments, which totals more than 800 alert monitors watching the processes and services on our servers. We usually have code deployments / patching happening on the servers at random times every day, during which the monitored processes get stopped.
Since we are aware of the changes going on, we go to each monitor associated with the servers under change and disable the alert monitor to prevent email alerts from reaching us. After the change, once the processes are started again, we need to go back to each individual monitor to enable the alerts.
We are requesting you to consider an enhancement in upcoming versions: an option in the alert configuration section to disable alerts for a custom duration. The monitor should resume sending alerts as usual once the entered duration is over. This would save us the time spent going to each monitor to re-enable the disabled alerts after a change on the servers.
We have used other monitoring tools that have such a feature, as displayed in the screenshot below. Kindly check the feasibility of having such an option in upcoming releases.
Posted as question by mistake. Updating the same as Idea.
We have had an Alert Downtime Schedule feature in the product for quite a few years.
This is a link to the documentation from the latest release:
Please have a look and see if that fulfils your requirement.
Thanks for the useful info. The problem is that the mentioned section offers only recurring schedule options and does not allow us to schedule a one-time disabling of alerts.
What I mean is this: today we could have a change on server 'xyz', for which we need to disable the related alerts. If we set the schedule to disable alerts for a certain number of minutes from the start time of today's change, we are also compelled to select a recurring schedule. The alerts would then be disabled again at the next occurrence, when no server change is taking place and the monitor is not supposed to be disabled.
You can run the CLW command for the alerts you would like to deactivate at whatever time you prefer.
This can also be wrapped in a script if you would like to bundle multiple deactivate commands together.
Please see CLW section in the APM Configuration and Administration Guide for your APM Version.
The Deactivate Alerts command deactivates one or more Alerts in one or more Management Modules. You supply two regular expressions—one that specifies the Alerts to deactivate, and one that specifies the Management Modules in which to deactivate those Alerts.
Command syntax:
deactivate alerts matching <REGULAR EXPRESSION> in management modules matching <REGULAR EXPRESSION>
Example:
deactivate alerts matching (.*) in management modules matching Sample
The actual CLW command would look something like this
Navigate to <APM_Home>\lib
java -Xmx128M -Duser=jdoe -Dpassword=mypassword -Dhost=jdoeDT -Dport=5001 -jar CLWorkstation.jar deactivate alerts matching (.*) in management modules matching Sample
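If you want to bundle several deactivate commands, a small wrapper along these lines might do. This is only a sketch: the credentials, host and port are the placeholders from the example command, the module name "Other" and the ECHO_ONLY switch are made up for illustration, and the regular expressions are quoted so that a Unix shell does not interpret the parentheses.

```shell
#!/bin/sh
# Sketch: bundle several CLW deactivate commands into one script.
# Run it from <APM_Home>/lib so that CLWorkstation.jar is found.

# clw ARGS...: runs one CLW command. With ECHO_ONLY=1 (a hypothetical
# switch for this sketch) the command is printed instead of executed.
clw() {
    if [ "${ECHO_ONLY:-0}" = "1" ]; then
        printf '%s\n' "CLW> $*"
    else
        # Placeholder connection values taken from the example command.
        java -Xmx128M -Duser=jdoe -Dpassword=mypassword \
            -Dhost=jdoeDT -Dport=5001 -jar CLWorkstation.jar "$@"
    fi
}

# deactivate_all ALERT_REGEX MODULE_REGEX...
# Issues one deactivate command per management module pattern.
deactivate_all() {
    alert_re=$1
    shift
    for module_re in "$@"; do
        clw deactivate alerts matching "$alert_re" \
            in management modules matching "$module_re"
    done
}

# Only act when called with arguments, for example:
#   ./deactivate_all.sh '(.*)' 'Sample' 'Other'
if [ "$#" -ge 2 ]; then
    deactivate_all "$@"
fi
```

Running it once with ECHO_ONLY=1 shows which commands would be sent before anything is actually deactivated.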
Hope that helps
Thanks for the update. I see that this approach would deactivate specific alerts in specific management modules matching the regular expressions. Can it be modified in any way to specify the disable duration as well?
You can follow the steps below to achieve your requirement.
1) Write the CLW commands to deactivate the alerts you want. Let's call this file 1
2) Write another set of CLW commands to activate the alerts you deactivated. Let's call this file 2
3) Create a Windows or Unix script that calls CLW command file 1
4) Use the sleep command (on Unix; Windows and other OSes should have an equivalent) to pause the script for the duration you want the alerts deactivated
5) As soon as the sleep ends, call CLW command file 2 from the script
You need to run this script on demand whenever you want to deactivate the alerts and reactivate them after the specified duration.
On Linux you can schedule the script executions via cron; on Windows, via Task Scheduler.
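The steps above could be sketched for Unix roughly as follows. Treat it as an illustration, not a tested procedure: the connection defaults are the placeholders from the earlier example, the DRY_RUN switch and the function names are invented here, and the activate command is assumed to mirror the deactivate syntax, so please confirm it in the CLW section of the guide for your APM version.

```shell
#!/bin/sh
# Sketch: deactivate alerts, wait, then activate them again.
# Connection defaults are placeholders; override them via the environment.
CLW_USER="${CLW_USER:-jdoe}"
CLW_PASSWORD="${CLW_PASSWORD:-mypassword}"
CLW_HOST="${CLW_HOST:-jdoeDT}"
CLW_PORT="${CLW_PORT:-5001}"
CLW_JAR="${CLW_JAR:-CLWorkstation.jar}"   # run from <APM_Home>/lib or give a full path

# run_clw ACTION ALERT_REGEX MODULE_REGEX
# ACTION is "deactivate" or "activate" (activate syntax is an assumption).
# With DRY_RUN=1 a shortened command line, without credentials, is printed.
run_clw() {
    action=$1
    alert_re=$2
    module_re=$3
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "java -jar $CLW_JAR $action alerts matching $alert_re in management modules matching $module_re"
        return 0
    fi
    java -Xmx128M "-Duser=$CLW_USER" "-Dpassword=$CLW_PASSWORD" \
        "-Dhost=$CLW_HOST" "-Dport=$CLW_PORT" -jar "$CLW_JAR" \
        "$action" alerts matching "$alert_re" \
        in management modules matching "$module_re"
}

# blackout ALERT_REGEX MODULE_REGEX MINUTES
blackout() {
    run_clw deactivate "$1" "$2" || return 1   # step 3: run "file 1"
    sleep $(( $3 * 60 ))                       # step 4: wait for the change window
    run_clw activate "$1" "$2"                 # step 5: run "file 2"
}

# Only act when called with three arguments, for example:
#   ./blackout.sh '(.*)' 'Sample' 30
if [ "$#" -eq 3 ]; then
    blackout "$1" "$2" "$3"
fi
```

Note that if the script is killed while it sleeps, the alerts stay deactivated until someone activates them by hand.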
Thank you for the detailed steps using CLW. The problem is that not every admin in our team has privileged access for running commands on the Windows/Linux servers. They do, however, have Admin access to the Management Module (the front end, basically).
We were rather looking for this option in the Management Module itself. How about replacing these background commands with an option in the Alert Downtime Schedule section itself? Or perhaps the compulsory recurring schedule could be removed, so that a user can enter just a start time and duration for a one-time disabling of alerts.
This may be an ugly workaround, but you could deploy the scripts on your MOM and create "script alert actions". An admin could then "test" that action to run the script. You would need a script per service (group of alerts) and duration (e.g. 1hr by default). This could even be a single script with different parameters. Unfortunately you cannot manually pass a parameter from the APM UI.
It is possible, but really put some thought into whether you want to go down that road and how you will maintain it!
Another idea: do you have some sort of automation solution (Chef, Puppet, CA Server Automation, ...) that could run the CLW scripts? Admins would then only need permission to start a process/workflow in your automation solution to black out a group of alerts. And you could probably pass parameters like the duration! This would be a much more elegant, maintainable and better documented approach.
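To make the "single script with different parameters" idea a bit more concrete, the parameter handling could look something like this. The group names and module patterns are purely hypothetical, and the sketch only resolves and validates the parameters; the actual CLW calls are left out.

```shell
#!/bin/sh
# Sketch: one script, different fixed parameters per "script alert action".
# Group names and module patterns below are hypothetical placeholders.

# module_regex_for GROUP: prints the management module regex for a group.
module_regex_for() {
    case "$1" in
        web)   echo 'Web.*' ;;
        batch) echo 'Batch.*' ;;
        *)     echo "unknown group: $1" >&2; return 1 ;;
    esac
}

# plan_blackout GROUP [MINUTES]
# Validates the input and prints "<module regex> <seconds>".
plan_blackout() {
    regex=$(module_regex_for "$1") || return 1
    minutes=${2:-60}    # default of one hour
    case "$minutes" in
        ''|0?*|*[!0-9]*)
            echo "minutes must be a whole number without leading zeros" >&2
            return 1 ;;
    esac
    echo "$regex $(( minutes * 60 ))"
}

# Only act when called with arguments, for example:
#   ./plan_blackout.sh web 30
if [ "$#" -ge 1 ]; then
    plan_blackout "$@"
fi
```

Each alert action would then be configured with its own fixed arguments (for example "web 30"), since the parameters cannot be entered from the APM UI at run time.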
Thank you for the workaround. Really appreciate that. We will configure separate scripts on the MOM for the alert groups and have them triggered whenever a user with privileged access to the APM server is available. We currently don't have an automation solution for running CLW scripts.
However, what keeps me thinking is this: if the validation that makes a recurring schedule compulsory in the Alert Downtime Schedule section could be disabled once a valid start time and duration are entered, we would be able to achieve the requirement from the UI itself.