DX Unified Infrastructure Management


we have received an alert repeatedly in CA UIM .

  • 1.  we have received an alert repeatedly in CA UIM .

    Posted Jul 19, 2019 02:25 AM
    We have repeatedly received alerts in CA UIM for CPU, disk, and memory. What is the reason for this, and how can we resolve it?




  • 2.  RE: we have received an alert repeatedly in CA UIM .

    Broadcom Employee
    Posted Jul 19, 2019 02:47 AM
    Hi

    Check whether the thresholds are set correctly and are valid in the cdm -> Status tab.
    Change the thresholds as required, based on your expected baseline values, to prevent unwanted alerts.


  • 3.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 19, 2019 05:23 AM
    Hi Franklin,

    The thresholds we have set are as follows:

    80% --- major
    90% --- critical

    Whenever a threshold is breached, we get alarms (static alarms).

    No dynamic alarms are configured. That means there is no baseline here, right?

    But for one robot we are receiving alerts repeatedly, for example the CPU alert 3 times and the disk alert 3 times.

    So what could be the issue, and what is the resolution?
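For readers unfamiliar with static thresholds, the pair quoted in this post can be sketched as a simple classification. This is only an illustration: the function name is hypothetical, and treating a value exactly at the threshold as a breach is an assumption, not documented cdm behavior.

```python
from typing import Optional

# Thresholds quoted in the post: 80% -> major, 90% -> critical.
MAJOR_THRESHOLD = 80.0
CRITICAL_THRESHOLD = 90.0


def static_severity(usage_percent: float) -> Optional[str]:
    """Return the severity a fixed threshold pair would assign, or None.

    Assumption: a value equal to the threshold counts as a breach.
    """
    if usage_percent >= CRITICAL_THRESHOLD:
        return "critical"
    if usage_percent >= MAJOR_THRESHOLD:
        return "major"
    return None


for sample in (75.0, 85.0, 95.0):
    print(sample, static_severity(sample))
```

A static threshold compares each sample against fixed numbers only, so no baseline is involved, which matches the poster's understanding.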



  • 4.  RE: we have received an alert repeatedly in CA UIM .

    Broadcom Employee
    Posted Jul 19, 2019 07:11 AM
    Hi

    Are you getting separate alerts, or are the alerts clearing and then being generated again?
    Is the CPU/disk usage actually high on the server? You can check the performance reports in UMP for that interval.
    If it is high, fix whatever is causing the high usage on the server; otherwise, change the thresholds in cdm.





  • 5.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 19, 2019 05:27 PM
    Maybe you have 3 notification rules that apply to that server, and for that reason you receive three different mails for the same alarm.

    You need to filter the server in your rules so that it matches only one rule; then you will receive only one mail for that alarm.



  • 6.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 22, 2019 03:07 AM
    Hi Oscar Mendoza,

    We have only one rule configured for this in nas.


    Regards
    Amar


  • 7.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 22, 2019 10:41 AM
    Hi Madanraj,
    Is it the same alarm but with a new count?
    Have you configured nas to send an email only when the alarm count is lower than 2?
    If not, and the server's CPU is too high, the same alarm will send an email every time it is polled.


  • 8.  RE: we have received an alert repeatedly in CA UIM .

    Broadcom Employee
    Posted Jul 22, 2019 09:11 PM
    Hi Amar

    Also, please check whether the articles below apply to your use case:

    Article title: How to prevent duplicate email alerts from nas

    https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=34261

    Article title: nas sends duplicate emails for alarm messages despite message suppression

    https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=33630


  • 9.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 02:05 AM
    Hi Franklin,

    I have followed the article below but am still unable to resolve the issue.

    Article title: How to prevent duplicate email alerts from nas

    https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=34261

    Regards
    Amar



  • 10.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 02:16 AM
    Article title: nas sends duplicate emails for alarm messages despite message suppression

    https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=33630

    Please note that message suppression is also set up in our nas settings. The article above states that "skip numeric characters" needs to be ticked; it is already ticked in the nas settings in our environment, and we are still unable to resolve the duplicate email alarms, Franklin.



  • 11.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 23, 2019 01:54 AM
    - Do you have more than one nas? (They could play ping-pong with your messages.)
    - Did you deploy the "setup cdm (extended)" profile via MCS and customize profiles (= thresholds) via the 9.0.2 Operator Console "alarm policies"? (I had a case where, for a critical threshold, I also received an alarm for major and another one for warning.)


  • 12.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 23, 2019 03:34 PM
    Sorry, I don't understand.
    Is the problem that the alarms are coming in when there isn't a breach of CPU, memory, or disk?
    Or is it that the alarms are valid but there are three at a time?
    If the latter, double-click the alarm in the IM console and provide an image of the Details popup for each of the 3.

    ------------------------------
    Support Engineer
    Broadcom
    ------------------------------



  • 13.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 01:42 AM
    Hi David,

    Alarms in the alarm subconsole of IM are not repeating. Only the emails arrive repeatedly whenever the CPU/disk thresholds are breached.

    Regards
    Amar


  • 14.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 02:20 AM
    No, only one nas.

    We have not deployed MCS for cdm.



  • 15.  RE: we have received an alert repeatedly in CA UIM .

    Broadcom Employee
    Posted Jul 25, 2019 02:28 AM
    Hi

    Can you post the AO profile configuration used to send email for this cdm probe?
    Also, do you have multiple nas AO profiles that send email for this server?


  • 16.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 04:22 AM

    Please check them and let me know, Franklin. Thanks.






  • 17.  RE: we have received an alert repeatedly in CA UIM .

    Broadcom Employee
    Posted Jul 25, 2019 05:56 AM
    Hi Amar,

    But for one robot we are receiving alerts continuously like cpu alert 3 times disk 3 times ...

    The cpu profile still shows on_arrival; please set an overdue age on it.

    For the disk, are you getting exactly the same emails, with the same breached threshold values, or different values? It's possible the values differ and the emails only look like duplicates.
    It's difficult to analyze without additional logs or snapshots.
    If the issue persists, I suggest opening a ticket.


  • 18.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 08:55 AM
    Hi Franklin ,

    For disk and memory, I have set the overdue age to 30s in the AO rule, but the emails are still repeated 3 times for a disk threshold breach.

    Please tell me which logs are required and I will provide them.


  • 19.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 08:58 AM

    Severity  : critical
    Host Name : CHEHADDAT06
    IP        : 192.168.183.79
    Element   : Disk
    Message   : Disk free on chehaddat06.npci.org.in on /Hadoop is now 0%, which is below the error threshold (10%) out of total size 2367.1 GB
    Time      : 07/23/19 19:10:24
    Probe     : cdm

    This alert came 3 times.




  • 20.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 25, 2019 04:20 PM
    There can be tremendous value in detail and clarity.
    Your update helps in that regard:
    alarms in alarm sub console of IM are not coming in repeated mode .
    Only the emails are coming repeatedly whenever the cpu/disk threshold are breached .

    Based on that, the probe-side configuration is correct, since only one alarm shows up.
    So the problem is:
    three email notifications are being received for the same alarm.

    This sets the focus to:
    What is sending the emails?
    Are all three from the same source?
    How is nas configured to process these alarms?

    For example, nas can be configured to:
    - repost the alarm with a subject 'email' so the emailgtw probe picks it up and processes it
    - email the alarm itself

    If both are set, there will be two emails.

    You need to check the nas configuration for these alarms, and the configuration of emailgtw or whatever else is sending the emails.

    ------------------------------
    Support Engineer
    Broadcom
    ------------------------------
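One way to answer "are all three from the same source?" is to group the received mails by alarm text and list which path delivered each copy. The sketch below is only an illustration: the `ReceivedEmail` record, the `sender_path` labels, and the sample messages are hypothetical and would have to be filled in by hand from the mail headers and bodies.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class ReceivedEmail(NamedTuple):
    sender_path: str  # hypothetical label, e.g. "nas" or "emailgtw"
    body: str         # alarm message text as it appears in the mail


def group_by_alarm(emails: List[ReceivedEmail]) -> Dict[str, List[str]]:
    """Map each distinct alarm text to the paths that delivered it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for mail in emails:
        groups[mail.body.strip()].append(mail.sender_path)
    return dict(groups)


# Illustrative data only, not taken from the poster's environment.
same = "Disk free on hostA on /data is now 0%"
other = "Disk free on hostA on /data is now 1%"
inbox = [
    ReceivedEmail("nas", same),
    ReceivedEmail("emailgtw", same),
    ReceivedEmail("nas", other),
]

for text, paths in group_by_alarm(inbox).items():
    print(len(paths), paths, text)
```

A message text that shows up under two different paths points at double delivery, while texts that differ only in the reported value are separate notifications rather than true duplicates.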



  • 21.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 26, 2019 01:25 AM
    Edited by MADANRAJ SIVAGNANAM Aug 22, 2019 03:26 AM


  • 22.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 26, 2019 07:12 AM
    Hi David ,

    I have shared nas.cfg and emailgtw.cfg. Please let me know what needs to be done.


    Regards
    Amar


  • 23.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 26, 2019 10:01 AM
    The expectation from my side is to provide guidance, to point you in the right direction, for you to investigate.

    ------------------------------
    Support Engineer
    Broadcom
    ------------------------------



  • 24.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 28, 2019 08:49 AM
    Hi Madanraj,

    Do you have nis bridge checked on HA too? If so, disable it; it could also result in duplicate alarms.

    ------------------------------
    System Analyst
    DXC Technology
    ------------------------------



  • 25.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 28, 2019 08:49 AM
    I mean, try unchecking nis bridge on HA.

    ------------------------------
    System Analyst
    DXC Technology
    ------------------------------



  • 26.  RE: we have received an alert repeatedly in CA UIM .

    Posted Jul 29, 2019 12:29 PM
    I briefly looked over the nas.cfg that you shared, specifically the profiles referencing the cdm probe. These are configured to execute/evaluate "on_arrival" with the count equal to 1 (overdue = on_arrival, counter = eq 1). This configuration results in a first email being sent at the first alarm occurrence and a second for a subsequent recurring alarm. This happens because the count configuration is based on the suppcount (suppression count) for the alarm, which is initially 0 for the first occurrence. The on_arrival setting causes the profile to be evaluated and executed before the suppression algorithm is processed.

    I recommend that you set the overdue age to something like 15 seconds (15s), leave the count configuration blank, and see if that helps. This should result in only one email (first occurrence) being sent for any given alarm that matches the profile.
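The review described above can be sketched as a small check over one profile. This is not a nas.cfg parser: the profile is assumed to have been copied by hand into a plain dict, the function name is hypothetical, and only the two settings quoted in the post (overdue, counter) are examined.

```python
from typing import Dict, List


def review_profile(profile: Dict[str, str]) -> List[str]:
    """Return warnings for one AO profile represented as a plain dict.

    Assumption: keys mirror the settings quoted in the post
    ("overdue" and "counter"); this is an illustration only.
    """
    warnings: List[str] = []
    if profile.get("overdue", "").strip() == "on_arrival":
        warnings.append(
            "overdue is on_arrival: the action runs before suppression; "
            "try a short overdue age such as 15s"
        )
    if profile.get("counter", "").strip():
        warnings.append("counter is set: consider leaving it blank")
    return warnings


print(review_profile({"overdue": "on_arrival", "counter": "eq 1"}))
print(review_profile({"overdue": "15s", "counter": ""}))
```

The first call reports both settings; the second, which follows the recommendation, returns an empty list.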




  • 27.  RE: we have received an alert repeatedly in CA UIM .

    Posted Aug 22, 2019 03:48 AM
    Hi James Christensen,

    If I set count = 1 and "on arrival", you are saying that the suppression count algorithm does not work properly, right?

    So is "on arrival" advisable or not?

    Does overdue mean postponing the alert? What exactly is its purpose?

    Regards
    Amar





  • 28.  RE: we have received an alert repeatedly in CA UIM .

    Broadcom Employee
    Posted Aug 22, 2019 03:50 AM
    Hi 

    In the nas Auto-Operator profile, when you set the Action mode to "On message arrival", nas performs the action immediately, before suppression is performed.
    Ensure that the 'Action Mode' setting for the nas Auto Operator profile is not set to "On message arrival."
    If it is, change the setting to "On overdue age" and set a short period, e.g. "30s".

    https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=34261
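The difference between the two action modes can be illustrated with a deliberately simplified toy model. It is not how nas is implemented: the function names, the suppression key, and the arrival times are hypothetical, and count filters and alarm clears are ignored. It only assumes what the post states, namely that "On message arrival" acts before suppression, while an overdue age acts on the already suppressed alarm.

```python
from typing import Dict, List, Set, Tuple

Message = Tuple[int, str]  # (arrival time in seconds, suppression key)


def count_on_arrival(messages: List[Message]) -> int:
    """Toy model: the action fires for every incoming message."""
    return len(messages)


def count_on_overdue_age(messages: List[Message], overdue_s: int,
                         end_time: int) -> int:
    """Toy model: the action fires once per suppressed alarm, as soon as
    that alarm is at least overdue_s seconds old."""
    first_seen: Dict[str, int] = {}
    emailed: Set[str] = set()
    emails = 0
    pending = sorted(messages)
    for now in range(end_time + 1):
        # Suppression: repeats with the same key fold into one alarm.
        while pending and pending[0][0] <= now:
            _, key = pending.pop(0)
            first_seen.setdefault(key, now)
        for key, seen in first_seen.items():
            if key not in emailed and now - seen >= overdue_s:
                emailed.add(key)
                emails += 1
    return emails


# Hypothetical: the same disk alarm is reported on three polls.
polls = [(0, "disk:/data"), (300, "disk:/data"), (600, "disk:/data")]
print(count_on_arrival(polls))               # 3
print(count_on_overdue_age(polls, 30, 900))  # 1
```

In this model the three repeated messages produce three emails on arrival but a single email with a 30s overdue age, which is consistent with the symptom reported in the thread.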