DX NetOps


Spectrum SANM - Sending a lot of Occurrences

  • 1.  Spectrum SANM - Sending a lot of Occurrences

    Posted Nov 18, 2019 09:56 AM

    Hello team,

    I have my Spectrum environment integrated with BMC TrueSight to open tickets via SANM.

    But I have an issue that is affecting my team:

    The integration via SANM works fine. My issue is that when an alarm is raised, it opens a ticket in TrueSight, but if the same alarm gets another occurrence before the first one is closed, a new ticket is opened for the same problem.



    If an alarm has 5 occurrences, 5 tickets are opened for the same problem.

    Is there any way to limit this ticket creation?
    For example, if I have 5 occurrences of the same alarm, can it open only one ticket for that alarm?

    Thank you all in advance,
    Bruno



  • 2.  RE: Spectrum SANM - Sending a lot of Occurrences

    Broadcom Employee
    Posted Nov 18, 2019 02:06 PM
    Edited by Silvio Okamoto Nov 18, 2019 02:10 PM
    Hi Bruno,

    Out of the box (OOB), Spectrum does not generate more than one occurrence of the "DEVICE HAS STOPPED RESPONDING TO POLLS" alarm.
    Did you customize the 0x10d35 event code?

    Please read this section of the CA Spectrum guide:
    https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/it-operations-management/spectrum/10-3-2/managing-network/event-configuration/event-and-alarm-customization.html

    That section warns against customizing the event codes it lists, because doing so adversely affects Spectrum's fault isolation algorithms and leads to undesirable results.

    ------------------------------
    Technical Support Engineer IV
    Broadcom Inc
    ------------------------------



  • 3.  RE: Spectrum SANM - Sending a lot of Occurrences

    Posted Nov 19, 2019 07:17 AM
    Silvio,

    There are no modifications to the 0x10d35 event code.





  • 4.  RE: Spectrum SANM - Sending a lot of Occurrences

    Broadcom Employee
    Posted Nov 19, 2019 07:30 AM

    Bruno,

    Check whether the multiple occurrences of "DEVICE HAS STOPPED RESPONDING TO POLLS" originated from the same event code.

    As you can see below, there are three event codes that generate the same Cause Code 0x10009 alarm.

     



    ------------------------------
    Technical Support Engineer IV
    Broadcom Inc
    ------------------------------



  • 5.  RE: Spectrum SANM - Sending a lot of Occurrences

    Posted Nov 19, 2019 07:44 AM
    Silvio,

    Looking at one case with multiple occurrences here, I could see that the alarm was generated by the same event code, 0x10d35.
    I checked, and there are no modifications to the events either.



  • 6.  RE: Spectrum SANM - Sending a lot of Occurrences

    Broadcom Employee
    Posted Nov 19, 2019 01:14 PM
    Edited by Silvio Okamoto Nov 19, 2019 01:14 PM
    Hi Bruno,

    The Critical alarm that was generated on 05/11/2019 16h24min8s BRST was not cleared. And a new Critical alarm was generated on 07/11/2019 18h41min10s BRST.

    Every time the 0x10d35 event code is generated, it is preceded by the 0x10d30 event code, which will clear the previous 0x10009 cause code alarm (generated by the 0x10d35 event code). 
    But in your screenshot I don't see the 0x10d30 event code. I am wondering why it is not visible. Did you customize it?
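
The pairing of 0x10d30 and 0x10d35 events described above can be checked mechanically once the events are in a text file. The following is only a minimal sketch under stated assumptions: it expects a hypothetical plain-text list with one event per line in the form `<timestamp> <model> <event_code>` (no spaces inside a field, oldest event first). That layout is an assumption for illustration, not a native Spectrum export format, and `find_unpaired_events` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: report every 0x10d35 event that was not preceded,
# since the previous 0x10d35 on the same model, by a 0x10d30 event.
# Assumed input on stdin (NOT a native Spectrum format):
#   <timestamp> <model> <event_code>
find_unpaired_events() {
    awk '
        # Remember that a clearing event was seen for this model.
        $3 == "0x10d30" { cleared[$2] = 1; next }
        # A 0x10d35 without a remembered 0x10d30 is reported.
        $3 == "0x10d35" {
            if (!cleared[$2]) print "unpaired 0x10d35 on " $2 " at " $1
            cleared[$2] = 0
        }
    '
}

# Example call (events.txt is a placeholder file name):
#   find_unpaired_events < events.txt
```

Any line it prints points at a 0x10d35 event whose matching 0x10d30 is missing from the list, which is the situation described for the screenshot.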

    I also noticed your OneClick Console is displaying the Daylight Saving Time (DST) (BRST - Brasilia Summer Time). Please follow this KB article to fix it: https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=137948 (Spectrum - 2019 Brazil has canceled DST and will stay on standard time indefinitely)


    ------------------------------
    Technical Support Engineer IV
    Broadcom Inc
    ------------------------------



  • 7.  RE: Spectrum SANM - Sending a lot of Occurrences
    Best Answer

    Broadcom Employee
    Posted Nov 19, 2019 07:36 AM
    Hi Bruno,

    Did you implement AlarmNotifier in a fault-tolerant environment?
    If so, did you follow this script example?
    https://community.broadcom.com/enterprisesoftware/communities/community-home/digestviewer/viewthread?MessageKey=9ba5190d-73fd-4858-9301-906e6d232c54

    ------------------------------
    Technical Support Engineer IV
    Broadcom Inc
    ------------------------------



  • 8.  RE: Spectrum SANM - Sending a lot of Occurrences

    Posted Nov 19, 2019 07:52 AM

    Silvio,

    Thank you for the information.
    I'll take a look at this post that you shared and I'll apply the procedure.

    I'll report as soon as I finish.




  • 9.  RE: Spectrum SANM - Sending a lot of Occurrences

    Posted Nov 28, 2019 09:02 AM
    Silvio,


    Sorry for the late reply.
    I had to open a change request to apply the fault-tolerance script.

    It turns out that this was one of the reasons multiple tickets were being opened. The other reason was some misconfiguration in TrueSight.

    After applying the fault-tolerance script on both servers and adjusting TrueSight, the problem was solved.

    Thank you for your help, as usual.
    Bruno
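
For readers without access to the linked post, the general idea of a fault-tolerance guard can be sketched as follows. This is not the script from that post. It is a hypothetical illustration that assumes each server knows its own role through a `LOCAL_ROLE` variable and that a single ping to an assumed `PRIMARY_HOST` is an acceptable stand-in for "the primary is up" (a reachable host does not prove that the SpectroSERVER process is running, so a real check would need to be stronger). All names here are invented for the example.

```shell
#!/bin/sh
# Hypothetical guard; none of these names come from Spectrum or the linked post.
PRIMARY_HOST="${PRIMARY_HOST:-primary.example.com}"  # assumed primary host name
LOCAL_ROLE="${LOCAL_ROLE:-secondary}"                # assumed: "primary" or "secondary"

# Assumed probe: one ICMP echo request (Linux-style ping options).
# Replace with a check that fits your environment.
primary_is_up() {
    ping -c 1 "$PRIMARY_HOST" >/dev/null 2>&1
}

# Returns 0 when this server should forward the alarm, 1 otherwise.
should_forward() {
    if [ "$LOCAL_ROLE" = "primary" ]; then
        return 0          # the primary always forwards
    fi
    if primary_is_up; then
        return 1          # secondary stays quiet while the primary answers
    fi
    return 0              # secondary takes over when the primary is unreachable
}

# Example use near the end of a notification script:
#   if should_forward; then
#       ... call msend here ...
#   fi
```

With a guard of this kind on both servers, only one of the two AlarmNotifier instances forwards a given alarm at any time.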


  • 10.  RE: Spectrum SANM - Sending a lot of Occurrences

    Posted Nov 19, 2019 05:46 AM
    Hi Bruno,

    Spectrum generates only one alarm for each device that doesn't respond; it keeps increasing the occurrence count, not the number of alarms. This looks like an issue with the SANM policies.

    Could you describe how you integrated BMC TrueSight using Spectrum SANM? That would help us understand where the issue is.

    ------------------------------
    Thank you.
    Rajashekar
    ------------------------------



  • 11.  RE: Spectrum SANM - Sending a lot of Occurrences

    Posted Nov 19, 2019 07:26 AM

    Hello Rajashekar,


    Yes, I can.

    The integration was built with a script and a binary file (msend) provided by BMC, which basically takes the information provided by the script and places a trap in TrueSight.

    #!/bin/sh
    ###############################################################################
    #
    #  CA Incorporated
    #  273 Corporate Drive
    #  Portsmouth, NH 03801
    #  Copyright (c) 2010 CA, Inc.
    #  All rights reserved.
    #
    #  IN NO EVENT SHALL CA INCORPORATED BE LIABLE FOR
    #  ANY INCIDENTAL, INDIRECT, SPECIAL, OR CONSEQUENTIAL DAMAGES
    #  WHATSOEVER (INCLUDING BUT NOT LIMITED TO LOST PROFITS) ARISING OUT
    #  OF OR RELATED TO THIS SOFTWARE, EVEN IF CA INCORPORATED HAS BEEN
    #  ADVISED OF, KNOWN, OR SHOULD HAVE KNOWN, THE POSSIBILITY OF SUCH
    #  DAMAGES.
    #
    ###############################################################################
    
    ###############################################################################
    #
    #  File: /sablime/sablime5_2/sdb/Sanm/clapi/prod/s.SetScript
    #
    #  Version: 1.1.1.16 - 01/11/10 04:17:57
    #
    ###############################################################################
    ###############################################################################
    #
    #  SetScript - default script executed by AlarmNotifier for an alarm set.
    #
    ###############################################################################
    ###############################################################################
    #
    #  MAIL Facility
    #
    #  If a user wishes to send a mail message for the alarm then set SENDMAIL to
    #  "True" and set VARFORMAIL to "RepairPerson" or "NotificationData" (or 
    #  "Both") depending on who you want the mail sent to.
    #
    #  Note: You can only send mail to users listed in $NOTIFDATA if the
    #  SPECTRUM Alarm Notification Manager (SANM) is enabled. If VARFORMAIL is
    #  set to "NotificationData" and SANM is not enabled then mail is not sent.
    #
    #  Note: The argument $REPAIRPERSON( actually $TROUBLE_SHOOTER_EMAIL ) and/or
    #        $NOTIFDATA MUST be valid Login IDS and email addresses respectively 
    #        in order for the script to send mail to them.
    #
    #  Note: Ensure Mail is configured as described in the AlarmNotifier User
    #  Guide if you are running AlarmNotifier on Windows with mail enabled.
    #
    ##############################################################################
    SENDMAIL=False              #True or False
    VARFORMAIL=RepairPerson     #RepairPerson, NotificationData, or Both
    
    case `/bin/uname` in
            "Windows_NT") MAIL="mail";;
    
            "Linux") MAIL="mail";;
    
            "SunOS") MAIL="mailx";;
    esac
    
    DATE=$1
    TIME=$2
    MTYPE=$3
    # use quotes to avoid mis-interpreting special chars - like '
    MNAME="$4"
    AID=$5
    SEV=$6
    CAUSE=$7
    REPAIRPERSON="$8"
    
    if [ "$USE_NEW_INTERFACE" = "TRUE" ]
    then
        # ALARMSTATUS now being passed via environment variable
        shift 8
    else
        STATUS=$9
        shift 9
    fi
    
    SERVER=$1
    LANDSCAPE=$2
    MHANDLE=$3
    MTHANDLE=$4
    IPADDRESS=$5
    SECSTR=$6
    ALARMSTATE=$7
    ACKD=$8
    CLEARABLE=$9
    
    shift 9
    
    #FLASH_GREEN=$1
    
    if [ "$USE_NEW_INTERFACE" = "TRUE" ]
    then
        # PCAUSE and EVENTMSG now passed via environment variables
        LOCATION="$2"
        AGE=$3
        NOTIFDATA=$4
        PID=$5
        SANM=$6
        shift 6
    else
        PCAUSE=`echo "$2" | tr '\350' '\012' | tr '\351' '"'`
        LOCATION="$3"
        AGE=$4
        NOTIFDATA=$5
        EVENTMSG=`echo "$6" | tr '\350' '\012' | tr '\351' '"'`
        PID=$7
        SANM=$8
        shift 8
    fi
    
    # Information on specifying additional attributes, which requires
    # USE_NEW_INTERFACE=true :
    
    # Two ways have been added that allows the specification of additional 
    # attributes for AlarmNotifier.  You have the option of passing the 
    # attributes as environment variables or as arguments, which is reflected
    # in the two new config parameters:
    #
    # EXTRA_ATTRS_AS_ENVVARS and EXTRA_ATTRS_AS_ARGS
    #
    # For most attributes, either method is acceptable, but for multi-line text
    # attributes or for very long attribute values it is recommended to specify
    # these as EXTRA_ATTRS_AS_ENVVARS because of command-line length limitations
    # and Windows behavior that truncates the command-line at the first newline
    # character.
    
    # If EXTRA_ATTRS_AS_ENVVARS have been specified, they can be referenced
    # by prepending SANM_ to the value in the config file, i.e.:
    #
    #  EXTRA_ATTRS_AS_ENVVARS=0x100c5,0x11f84  means that
    #
    #  the $SANM_0x100c5 and $SANM_0x11f84 environment variables will be set,
    #  which can then be referenced in this script like:
    #
    # IFDESC=$SANM_0x100c5 
    # IFALIAS=$SANM_0x11f84
    #
    #  Note: Windows will uppercase these variables, so they need to be referenced
    #        that way - ie. $SANM_0X100C5 and $SANM_0X11F84.
    #
    # Alternatively, if EXTRA_ATTRS_AS_ARGS have been specified, they will be
    # added to the command-line.  For example:
    #
    #  EXTRA_ATTRS_AS_ARGS=0x100c5,0x11f84   means that
    #
    #  the values of these attributes will be added to the argument list passed
    # to this script, and can be referenced like this:
    #
    # IFDESC=$1
    # IFALIAS=$2
    
    DTYPE="$1"
    
    echo_info()
    {
    echo ""
    echo "============================================================"
    echo " "
    echo "Alarm Notification from SPECTRUM"
    echo " "
    echo "Alarm SET:"
    echo ""
    echo "Date:            " $DATE
    echo "Time:            " $TIME
    echo "DeviceType:      " $DTYPE
    echo "Mtype:           " $MTYPE
    echo "ModelName:       " $MNAME
    echo "AlarmID:         " $AID
    
    # If you wish to see the global alarm ID printed out you need to set
    # ENABLE_CORRELATION to "true" and USE_NEW_INTERFACE to "true" in the
    # configuration file.
    if [ "$ENABLE_CORRELATION" = "TRUE" ] && [ "$USE_NEW_INTERFACE" = "TRUE" ]
    then
        echo "Global AlarmID:  " $GLOBAL_ALARM_ID
    fi
    
    # If you wish to see correlation related information, you need to set
    # ENABLE_CORRELATION to "true" and SHOW_SYMPTOM_ALARMS to "true" in
    # the configuration file.
    if [ "$SYMPTOM_ALARM_LIST" ]
    then
        echo "CorrelationAlarmType:       " CAUSE
        echo "SYMPTOMGlobalAlarmIDList:   " $SYMPTOM_ALARM_LIST
    fi
    
    if [ "$CAUSE_ALARM_LIST" ]
    then
        echo "CorrelationAlarmType:       " SYMPTOM
        echo "CAUSEGlobalAlarmIDList:     " $CAUSE_ALARM_LIST
    fi
    
    # If you wish to see the UNIX alarm time printed out you need to set
    # USE_NEW_INTERFACE to true in the configuration file and uncomment
    # the following line.
    #echo "Raw Alarm Time: "  $RAW_ALARM_TIME
    
    echo "Severity:        " $SEV
    echo "ProbableCauseID: " $CAUSE
    echo "RepairPerson:    " $REPAIRPERSON
    echo "AlarmStatus:      $STATUS"
    echo "SpectroSERVER:   " $SERVER
    echo "Landscape:       " $LANDSCAPE
    echo "ModelHandle:     " $MHANDLE
    echo "ModelTypeHandle: " $MTHANDLE
    echo "IPAddress:       " $IPADDRESS
    echo "SecurityString:  " $SECSTR
    echo "AlarmState:      " $ALARMSTATE
    echo "Acknowledged:    " $ACKD
    echo "UserClearable:   " $CLEARABLE
    
    ###########################################################################
    #  When notifying on management lost (0x12dc7), customer (0x12bf6) and/or
    #  service (0x12bf7) impact attributes please pass them as environment variables
    #  and print in quotes, as they will usually contain multiple
    #  lines:
    #
    #  echo "ManagementImpactLost:   $SANM_0x12dc7"
    #  echo "ServiceImpact:          $SANM_0x12bf7"
    #  echo "CustomerImpact:         $SANM_0x12bf6"
    #
    #  Or on Windows:
    #
    #  echo "ManagementImpactLost:   $SANM_0X12DC7"
    #  echo "ServiceImpact:          $SANM_0X12BF7"
    #  echo "CustomerImpact:         $SANM_0X12BF6"
    ###########################################################################
    
    ###########################################################################
    #  The following parameters contain values only when
    #  the SPECTRUM Alarm Notification Manager is enabled.
    ###########################################################################
    if [ "$SANM" ]
    then
    echo "Location:        " $LOCATION
    echo "AlarmAge:        " $AGE
    echo "NotificationData:" $NOTIFDATA
    echo ""                                         # insert blank line
    fi
    
    # This variable has substituted placeholder chars, echo them inside quotes
    echo "ProbableCause:    $PCAUSE"
    echo ""                                         # insert blank line
    # This variable has substituted placeholder chars, echo them inside quotes
    echo "EventMessage:     $EVENTMSG"
    echo ""
    
    echo "============================================================"
    }
    
    parse_notifdata()
    {
       LIST=`echo "$NOTIFDATA" | tr ',:' '  '`
       SKIP=false
       for ITEM in $LIST
       do
          if [ "$ITEM" = "or" ]
          then
             SKIP=true
          else
             if [ "$SKIP" = "false" ]
             then
                OUTLIST="$OUTLIST $ITEM"
             else
                SKIP=false
             fi
          fi
       done
       echo $OUTLIST
    }
    
    if [ "$SENDMAIL" = "True" ]
    then
       RECIPIENTS=$VARFORMAIL
       if [ "$VARFORMAIL" = "NotificationData" ]
       then
          RCVRS=`parse_notifdata`
       fi
       if [ "$VARFORMAIL" = "RepairPerson" ]
       then
          RCVRS="$TROUBLE_SHOOTER_EMAIL"
       fi
       if [ "$VARFORMAIL" = "Both" ]
       then
          RCVRS=`parse_notifdata`
          RCVRS="$RCVRS $TROUBLE_SHOOTER_EMAIL"
          RECIPIENTS="NotificationData/RepairPerson"
       fi
    
       if [ "$RCVRS" -a "$RCVRS" != " " ]
       then
          echo " "
          echo "*******************************************************************"
          echo "Sending mail to $RECIPIENTS:" 
          echo ""
          echo "($RCVRS)"
          echo "*******************************************************************"
          echo_info | tee -i /tmp/set_alarm.$PID
          $MAIL -s "A $SEV alarm has occurred on $SERVER (Model Name=$MNAME)(Model Type=$MTYPE)" $RCVRS < /tmp/set_alarm.$PID
          rm -f /tmp/set_alarm.$PID
       else
          echo " "
          echo "*****************************************************"
          echo "NO $RECIPIENTS assigned - no mail sent"
          echo "*****************************************************"
          echo_info
       fi
    
    else
       echo_info
    fi
    
    ###############################################################################
    # # BMC ProactiveNet Performance Manager integration.                       # # 
    # # Written by Karlis Peterson, BMC Software Consultant                     # #  
    ###############################################################################
    
    BMC_PC=`echo $PCAUSE`
    num=`nawk -v pc="$BMC_PC" 'BEGIN{print index(pc,"SYMPTOMS:")}'`
    
    SHORTTEXT=`nawk -v pc="$BMC_PC" -v n="$num" 'BEGIN{print substr(pc,1,n-2)}'`
    
    # Send the creation of an alarm to BPPM
    /sphinxv01/spectrum/bmc/msend -n @cfs-010/1828#mc -a SPECTRUM -b "SPECTRUM_MNAME=$MNAME;SPECTRUM_SEV=$SEV;SPECTRUM_MTYPE=$MTYPE;SPECTRUM_MTHANDLE=$MTHANDLE;SPECTRUM_CAUSE=$CAUSE;SPECTRUM_CLEARABLE=SET;SPECTRUM_AID=$AID;SPECTRUM_LANDSCAPE=$LANDSCAPE;SPECTRUM_DATE=$DATE;SPECTRUM_TIME=$TIME;SPECTRUM_EVENTMSG=$EVENTMSG;SPECTRUM_IPADDRESS=$IPADDRESS;SPECTRUM_DTYPE=$DTYPE;SPECTRUM_SHORTTEXT=$SHORTTEXT;SPECTRUM_MHANDLE=$MHANDLE"


    There's a SANM policy attached to the SANM application that basically sends only critical alarms, excludes a few alarm types, and has an age time of 20 minutes.

    This environment is fault-tolerant, and the AlarmNotifier configuration exists on both SpectroSERVERs.
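
One way to make a script like the one above call msend at most once per alarm ID is to keep a small marker per alarm. The sketch below is a hypothetical illustration, not part of the CA or BMC scripts: `STATE_DIR` and `first_time_seen` are invented names, and it assumes the alarm ID contains no `/` characters. The markers are local to one server, so on their own they would not prevent both servers of a fault-tolerant pair from sending the same alarm.

```shell
#!/bin/sh
# Hypothetical de-duplication guard for the msend call.
# STATE_DIR is an assumed location, not something AlarmNotifier provides.
STATE_DIR="${STATE_DIR:-/tmp/sanm_sent}"

# Returns 0 the first time an alarm ID is seen, non-zero on every later call.
# If the state directory cannot be created it also returns non-zero,
# which means the caller would NOT send; adjust if you prefer the opposite.
first_time_seen() {
    aid="$1"
    mkdir -p "$STATE_DIR" || return 1
    # mkdir without -p fails if the marker already exists, so only the
    # first caller for a given alarm ID succeeds.
    mkdir "$STATE_DIR/$aid" 2>/dev/null
}

# Example use around the msend line:
#   if first_time_seen "$AID"; then
#       ... call msend here ...
#   fi
```

A matching clear script would need to remove the marker (for example with `rmdir "$STATE_DIR/$AID"`) so that a later alarm with a reused ID is forwarded again.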