Do we have any option to monitor the NAS and alarm_enrichment queue issue?
You can explore the CA UIM Hub Queue Statistics Probe.
You can also use a small Lua script: QueueCheck LUA script v2.1 (QueueCheck LUA script v2.2). This tool creates the alarms and QoS entries needed for alerting and follow-up (a sample list view is included).
Another thing to do in parallel is to monitor the nas logs with the logmon probe and alarm on failure and timeout messages.
Another good practice is to monitor the disk space and number of files in the probes/hub/q/nas and probes/hub/q/alarm_enrichment directories. The Hub Queue Statistics Probe is definitely something to use in addition to this.
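As a rough illustration of the directory check described above, here is a minimal Python sketch that counts the files and total bytes under each queue directory and reports when either exceeds a threshold. The install root (/opt/nimsoft), the threshold values, and the function names are all assumptions made for the example, not anything shipped with UIM; adjust them to your environment.

```python
import os

def queue_dir_stats(path):
    """Return (file_count, total_bytes) for all regular files under path."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
                count += 1
            except OSError:
                # Queue files can vanish between listing and stat; skip them.
                continue
    return count, total

def check_queue_dir(path, max_files, max_bytes):
    """Return a list of human-readable problems; empty means within limits."""
    count, total = queue_dir_stats(path)
    problems = []
    if count > max_files:
        problems.append(f"{path}: {count} files exceeds limit of {max_files}")
    if total > max_bytes:
        problems.append(f"{path}: {total} bytes exceeds limit of {max_bytes}")
    return problems

if __name__ == "__main__":
    # Hypothetical install root and thresholds; change for your hub.
    install_root = "/opt/nimsoft"
    for queue in ("nas", "alarm_enrichment"):
        queue_path = os.path.join(install_root, "probes", "hub", "q", queue)
        for problem in check_queue_dir(queue_path, max_files=50,
                                       max_bytes=500 * 1024 * 1024):
            print(problem)
```

A script like this could be run on a schedule and its output fed to whatever alerting you already have. A directory that does not exist simply reports zero files, so a wrong path will look healthy; verify the path before relying on it.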
If you are on Microsoft SQL, you can monitor the NAS_TRANSACTION_LOG table in the database using a SQL maintenance plan.
Go into SQL Management Studio
Configure Database Mail as described at the following Microsoft article:
Configure Database Mail | Microsoft Docs
Start the SQL Server Agent at the bottom by right clicking and choosing Start. (Note the confirmation window may pop up in the back behind your Management Console window)
After it starts, right click the node and choose Properties
Click Alert System
Check Enable mail profile
Choose the Mail profile you configured above
Expand SQL Server Agent
Right click Operators
Click New Operator
Name: Alarm Failure
Email name: (Your internal distribution list for getting these failure emails)
Right click on Jobs and choose New Job
Enter a name like "Alarm Count"
Step name: Query
Database: CA_UIM (or the name of your DB if different)
DECLARE @minutes INT = -30;
DECLARE @rows INT;
SELECT @rows = COUNT(time)
FROM NAS_TRANSACTION_LOG
WHERE time > DATEADD(mi, @minutes, GETDATE())
GROUP BY SUBSTRING(CAST(time AS VARCHAR), 1, 1);
-- With GROUP BY, no matching rows leaves @rows NULL, so treat NULL as zero.
IF ISNULL(@rows, 0) = 0
BEGIN
    RAISERROR (51000, -1, -1, 'No Alarms');
END
Name: Alarm Count Schedule
Occurs every: 30 minutes
Choose Alarm Failure
At this point, SQL Server will check every 30 minutes whether any alarms were logged in the previous 30 minutes.
Bear in mind that the email is not very verbose, but you can modify the message above to include a little more information. This is intended as a starting point.
Obviously, this is not officially supported by CA and is only intended as a community-aided solution to a common problem.
The method here is interesting, but I want to point out that it doesn't primarily monitor queues. It monitors whether alarms are going through nis_bridge properly. If the alarm_enrichment or nas queues are having a problem, it will of course be impacted and you won't get any new records in the SQL database. But everything can be fine on the queues while you have a nis_bridge issue, which will only affect the NAS_ALARMS and NAS_TRANSACTION_* tables. Also note that if you disable NAS transaction logging in the nas settings, you won't get records in the NAS_TRANSACTION_LOG table.
However, if you use UMP, I recommend monitoring the NAS_ALARMS table to be sure it's moving.
You're right, it doesn't monitor queues directly. However, anyone monitoring the alarm_enrichment/nas queues most likely wants to be alerted to failures. And if the alarm_enrichment/nas queues are not processing, any alerts generated about that lack of processing will not be received either. So the above method provides an "out of band" way of alerting when alarms suddenly stop arriving.
Is it bulletproof? No. Again, you are correct that if nis_bridge synchronization fails or is disabled, this will generate alerts as well. Ideally you would have these features enabled, and then you'd have alerting if your nis_bridge failed too.
I recommend NAS_TRANSACTION_LOG vs NAS_ALARMS because it is more comprehensive. A clear alarm will not be represented in NAS_ALARMS, nor will an acknowledge, but you will find evidence of them in NAS_TRANSACTION_LOG.