DX Application Performance Management


Shell script to report a problem with APM

  • 1.  Shell script to report a problem with APM

    Posted 03-26-2018 01:49 PM

    Hello guys,

    I would like to ask for your help with a problem in my environment whose cause I still cannot identify.
    The situation is as follows: we implemented a script that runs every time a threshold is reached and sends a command to our workflow tool to register a problem ticket. However, when the alert returns to normal, the script is not executed again with the normalization command, and I do not know why.

    Here is some information:
    Total critical events: 4428
    Total normalization events: 1089
    In the action the parameters that are being passed are:

       - Pass Text to Shell

       - Alert Status

       - Agent Name(s)

    In the Alert the parameters that are being configured are:

       - Metric: Stall

       - Trigger Alert Notification: Report Only Final State Whenever Severity Changes

       - Combination: Any

       - Notify by individual metric: Checked

       - Comparison Operator: Greater Than

       - Resolution: 15 seconds

       - Threshold: 250

       - Periods over Threshold: 16

       - Observed Periods: 16
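
As a rough check on the settings above: with a 15-second resolution and 16 observed periods, the alert evaluates a 4-minute window. Assuming the alert changes state only when the value exceeds the threshold in at least "Periods over Threshold" of the last "Observed Periods" intervals, all 16 intervals would have to be above 250. A minimal sketch of the arithmetic (the variable names are illustrative, not APM setting keys):

```shell
#!/bin/sh
# Values taken from the alert configuration listed above.
RESOLUTION_SECONDS=15
OBSERVED_PERIODS=16
PERIODS_OVER_THRESHOLD=16

# Length of the window the alert looks at, in seconds.
WINDOW_SECONDS=$((RESOLUTION_SECONDS * OBSERVED_PERIODS))

echo "Observation window: ${WINDOW_SECONDS}s ($((WINDOW_SECONDS / 60)) min)"
echo "Intervals that must exceed the threshold: ${PERIODS_OVER_THRESHOLD} of ${OBSERVED_PERIODS}"
```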

     

    Script:

    #!/bin/sh
    #
    # Determine automatically where our home is.
    # resolve links - $0 may be a softlink
    PRG="$0"
    while [ -h "$PRG" ] ; do
      ls=`ls -ld "$PRG"`
      link=`expr "$ls" : '.*-> \(.*\)$'`
      if expr "$link" : '.*/.*' > /dev/null; then
        PRG="$link"
      else
        PRG=`dirname "$PRG"`/"$link"
      fi
    done
    PRGDIR=`dirname "$PRG"`

     

    # Set environment
    if [ -f "$PRGDIR/setIscEnv.sh" ] ; then
            source "$PRGDIR/setIscEnv.sh"
    else
            echo "There is no environment setup file: setIscEnv.sh"
            echo "Please re-run configureWily in installation directory!"
            exit 1
    fi

     

    #---------------Define the log file-----------------------------------------#
    LOG_FILE=/opt/programas/apm/10.5/p_07667_apmca/mom/logs/chamado.log

     

    #---------------Receive the call arguments----------------------------------#
    ALERT=$1
    ALERT_STATUS=$2
    AGENT_NAME=$3

     

    #---------------Define the agent name and the instance name----------------------#
    AGENT_INSTANCIA=$AGENT_NAME

     

    NOME_AGENTE=`echo $AGENT_NAME|awk -F"\|" '{print $(NF-2)}'`
    NOME_AGENTE=`echo $NOME_AGENTE | sed -e "s/]\"//g"`

     

    NOME_INSTANCIA=`echo $AGENT_INSTANCIA|awk -F"\|" '{print $(NF)}'`
    NOME_INSTANCIA=`echo $NOME_INSTANCIA | sed -e "s/]\"//g"`

     

     

     

    #--------------Treat the data to get the service code and the name of the application-#
    TT_AGENTS=`echo $AGENT_NAME|awk -F"\|" '{print $(NF)}'`
    TT_AGENTS=`echo $TT_AGENTS | sed -e "s/]\"//g"`
    DOMAIN=`echo $AGENT_NAME|awk -F"\|" '{print $(NF-3)}'`
    DOMAIN=`echo $DOMAIN | sed -e "s/\"\[//g"`

     

    DOMAIN_ARRAY=( $(echo $DOMAIN | tr "-" "\n") )

     

    COD_SERV="00000"
    COD_SERVV="00000"
    APLIC=$TT_AGENTS

     

    if [ "X${DOMAIN_ARRAY[1]}" != "X" ]; then
            COD_SERV=${DOMAIN_ARRAY[0]}
            COD_SERVV=`echo $COD_SERV | awk -F"\/" '{print $(NF)}'`
            APLIC=${DOMAIN_ARRAY[1]}
    fi

     

    #------------------Define whether the call is alert, normal or critical----------------------#
    STATUS="indefinido"
    if [ "$ALERT_STATUS" = "\"3\"" ]; then
            STATUS="critical"
    elif [ "$ALERT_STATUS" = "\"2\"" ]; then
            STATUS="major"
    else
            STATUS="normal"
    fi

     

    #-----------------Take the accent off the text of the call------------------------------------#
    ALERT_TRATADO=`echo $ALERT | sed y/áÁàÀãÃâÂéÉêÊíÍóÓõÕôÔúÚüÜçÇ/aAaAaAaAeEeEiIoOoOoOuUuUcC/`

     

    #----------------Includes Custom Message for support---------------------#
    grep -q "Erros Por Intervalo" <<< "$ALERT_TRATADO" && RECOMENDACAO="O APM detectou uma quantidade anormal de erros, para detalhar quais os erros entrar no APM e verificar a aba de erros no investigator."
    grep -q "Garbage" <<< "$ALERT_TRATADO" && RECOMENDACAO="Em casos de execucao elevada de Garbage Collector, a recomendacao eh que a instancia seja reiniciada."
    grep -q "Tempo Medio de Resposta" <<< "$ALERT_TRATADO" && RECOMENDACAO="O APM detectou que o Tempo Medio de Resposta da aplicacao atingiu um nivel anormal do que era esperado, normalmente isso acontece devido a uma integracao da aplicacao que esta demorando muito para responder"
    grep -q "Stall" <<< "$ALERT_TRATADO" && RECOMENDACAO="Essa metrica determina a quantidade de transacoes que demoram mais de 30s para concluir. Normalmente isso acontece devido a uma integracao da aplicacao que esta demorando muito para responder. Para maiores detalhes veja a aba de erro da aplicacao no investigator."

     


    #----------------Record a copy of the ticket command in the log file------------#
    echo `date` /opt/OV/bin/opcmsg a=APM o=\"$APLIC - $COD_SERVV\" msg_g=Aplicacao s=$STATUS -option NOME_INSTANCIA=$NOME_INSTANCIA -option IC_NOME=$NOME_AGENTE -option REGIONAL="BSA" -option AMBIENTE="PRO" -option RECOMENDACAO="$RECOMENDACAO" msg_t="BSA $ALERT_TRATADO" >> $LOG_FILE

     

    #---------------Ticket-opening command--------------------------------------#
    /opt/OV/bin/opcmsg a=APM o="$APLIC - $COD_SERVV" msg_g=Aplicacao s=$STATUS -option NOME_INSTANCIA=$NOME_INSTANCIA -option IC_NOME=$NOME_AGENTE -option REGIONAL="BSA" -option AMBIENTE="PRO" -option RECOMENDACAO="$RECOMENDACAO" msg_t="BSA $ALERT_TRATADO"
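
One thing worth checking in the script above: it declares `#!/bin/sh` but uses `source`, arrays, and `<<<` here-strings, which are bash extensions. On a host where `/bin/sh` is a strict POSIX shell such as dash, those constructs are syntax errors. Whether that applies to this environment is not confirmed in the thread. As a sketch under that assumption, the domain split can be written with plain parameter expansion. The function name `split_domain` and the sample domain value are hypothetical, and only the simple "code-application" case is covered:

```shell
#!/bin/sh
# Hypothetical helper: POSIX-only approximation of the DOMAIN_ARRAY logic.
# $1 = domain string, $2 = fallback application name (TT_AGENTS in the script).
split_domain() {
    domain=$1
    COD_SERVV="00000"
    APLIC=$2
    rest=${domain#*-}                 # text after the first "-" (unchanged if none)
    app=${rest%%-*}                   # second "-"-separated field
    if [ "$rest" != "$domain" ] && [ -n "$app" ]; then
        cod_serv=${domain%%-*}        # text before the first "-"
        COD_SERVV=${cod_serv##*/}     # last "/"-separated component
        APLIC=$app
    fi
}

# Example with an assumed domain format "SuperDomain/<code>-<application>".
split_domain "SuperDomain/07667-Portal" "agent01"
echo "code=$COD_SERVV app=$APLIC"
```

When the domain contains no hyphen, the defaults (`00000` and the fallback name) are kept, as in the original `if` block.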



  • 2.  Re: Shell script to report a problem with APM

    Posted 03-26-2018 04:45 PM

    Hi Felix,

     

    Just some general questions:

     

    • Is the script located in a directory accessible to your Introscope instance (e.g. under the root EM directory)?
    • What does the EM log say when the alert is triggered?
    • Are you using a wrapper script to call the one you pasted in your post?


  • 3.  Re: Shell script to report a problem with APM

    Posted 03-27-2018 07:55 AM

    Hi Haruhiko Davis,

    1. Yes, the script is in an EM directory and has 777 access permissions.

    2. Here is an example from the EM log: 2/19/18 02:25:10.351 PM BRT [INFO] [Alarm Pooled Worker] [Manager.Action] Action "chamado" successfully executed shell command "/opt/approtinas/p_07667_apm/chamado.sh"
    3. No, I call this script directly.

     

    P.S.: Now that you mention it, I checked the EM log for error messages from the script's execution and found the following error:

    2/23/18 11:21:10.339 AM BRT [ERROR] [Alarm Pooled Worker] [Manager.Action] Action "chamado" failed to execute shell command "/opt/approtinas/p_07667_apm/chamado.sh" with bad process exit value "2"
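
The exit value in that message is the exit status of the script itself. For a script that ends with the `opcmsg` call, that is normally the status of that last command, unless the shell aborts earlier. A minimal sketch of recording the status next to each command, so that the source of a non-zero value can be narrowed down; `run_and_log` and the `DEBUG_LOG` path are hypothetical names introduced for illustration:

```shell
#!/bin/sh
# Assumption: DEBUG_LOG is a hypothetical path; point it at any writable file.
DEBUG_LOG=${DEBUG_LOG:-/tmp/chamado_debug.log}

# Run a command, append its exit status to DEBUG_LOG, and pass the status on.
run_and_log() {
    "$@"
    rc=$?
    echo "$(date) rc=$rc cmd=$*" >> "$DEBUG_LOG"
    return "$rc"
}

# Demonstration with a stand-in command that exits with status 2.
run_and_log sh -c 'exit 2' || echo "command failed with status $?"
```

In the posted script, the final line could be wrapped the same way, for example `run_and_log /opt/OV/bin/opcmsg a=APM ...`, leaving the arguments unchanged.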



  • 4.  Re: Shell script to report a problem with APM

    Posted 03-31-2018 09:49 PM

    Hi Felix,

    If you run this shell script manually by passing it the two parameters you want, what happens?

    Also, please change '#!/bin/sh' to '#!/bin/sh -x' so that the script prints debug output, which will help us.
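
For a manual run, note that the posted script compares Alert Status against values that contain literal double quotes (`"\"3\""`), so the test arguments need to carry those quotes as well. The sketch below mirrors the script's status mapping to show the effect; `map_status` is a hypothetical helper and the sample argument values are assumptions:

```shell
#!/bin/sh
# Hypothetical helper that mirrors the status mapping in the posted script.
map_status() {
    case $1 in
        '"3"') echo "critical" ;;
        '"2"') echo "major" ;;
        *)     echo "normal" ;;
    esac
}

# A bare 3 (without the literal quotes) falls through to "normal".
for status in '"3"' '"2"' '"1"' '3'; do
    echo "ALERT_STATUS=$status -> $(map_status "$status")"
done

# A traced manual run could then look like this (argument values are assumed):
#   sh -x /opt/approtinas/p_07667_apm/chamado.sh "Stall test" '"1"' '"[Domain|host|process|agent]"'
```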



  • 5.  Re: Shell script to report a problem with APM

    Posted 04-02-2018 08:37 AM

    Converted to a discussion