Automic Workload Automation

  • 1.  AE - Scripting to monitor/alert when total memory used worker processes reaches % of total memory available

    Posted Sep 15, 2020 08:42 AM
    Hi All,

    We recently had an issue with our AE version 12.1.2 where the worker processes together used 100% of system memory on the server, which stopped job submission and caused login problems. To fix it, we ran the top command at the Linux prompt to identify which processes were using over 1 GB of memory, then stopped and restarted each one.

    Does anyone have an alert script to share that monitors the memory usage of the worker processes and sends an alert when a predetermined threshold is reached?
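
A minimal sketch of such a check, in Python rather than anything Automic-specific. It assumes a Linux host, that the worker-process PIDs are already known, and that resident set size (RSS) as reported by ps is an acceptable measure. Summed RSS counts shared pages more than once, so the percentage is an upper bound. The function names, the 80% threshold, and the 1 GB per-process limit are illustrative assumptions, not part of any Automic tooling.

```python
import subprocess


def parse_mem_total_kb(meminfo_text):
    """Return MemTotal in kB from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def check_memory(rss_kb_by_pid, mem_total_kb,
                 threshold_pct=80.0, per_process_kb=1024 * 1024):
    """Compare summed RSS with total memory and list oversized processes.

    threshold_pct and per_process_kb are assumed example limits.
    """
    total_rss_kb = sum(rss_kb_by_pid.values())
    used_pct = 100.0 * total_rss_kb / mem_total_kb
    offenders = sorted(pid for pid, rss in rss_kb_by_pid.items()
                       if rss > per_process_kb)
    return {
        "used_pct": used_pct,
        "alert": used_pct >= threshold_pct,
        "offenders": offenders,
    }


def read_rss_kb(pids):
    """Ask ps for the RSS (kB) of each PID; PIDs that no longer exist are skipped."""
    if not pids:
        return {}
    # Separate -o options keep the empty headers unambiguous across ps personalities.
    out = subprocess.run(
        ["ps", "-o", "pid=", "-o", "rss=", "-p", ",".join(str(p) for p in pids)],
        capture_output=True, text=True, check=False,
    ).stdout
    result = {}
    for line in out.splitlines():
        pid, rss = line.split()
        result[int(pid)] = int(rss)
    return result
```

A cron job or an external monitoring agent could call read_rss_kb with the worker PIDs, pass the result to check_memory together with parse_mem_total_kb(open("/proc/meminfo").read()), and send a mail or page when "alert" is true or "offenders" is non-empty.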

    Assistance would be appreciated.

    regards,
    Attila


  • 2.  RE: AE - Scripting to monitor/alert when total memory used worker processes reaches % of total memory available

    Posted Sep 15, 2020 08:54 AM
    Edited by Frank Muffke Sep 15, 2020 08:54 AM
    Hi
    we have a self-developed script for our system healthcheck.

    But I think this should be part of an external monitoring system rather than a periodic job in Automic :-)

    cheers, Wolfgang

    PS: This could be an indication that your server does not have enough RAM...


  • 3.  RE: AE - Scripting to monitor/alert when total memory used worker processes reaches % of total memory available

    Posted Sep 15, 2020 09:44 AM
    Wolfgang,

    Thanks for the note on the memory. We sized memory based on 1 GB per worker process. During our issue, two of the worker processes were using > 6 GB, plus a couple more were using around 2 GB. This is most likely related to a memory leak issue in the current version. Until we can upgrade, we have to monitor.

    If you don't mind me asking, what health checks does your self-developed script perform? May I have a copy to review?

    thanks,
    Attila


  • 4.  RE: AE - Scripting to monitor/alert when total memory used worker processes reaches % of total memory available

    Posted Sep 15, 2020 09:59 AM
    Hi
    regarding OS values, we check memory and CPU values.

    This is based on the SQLI VARA (ORA) below, which lists all work processes...

    select AH_NAME, AH_PROCESSID, ah_hostdst, ah_timestamp1, ah_timestamp4,
           Host_hostattrtype, host_tcpipaddr, host_tcpipport, host_version,
           host_smphrase, host_smtcpipport, host_smdisplayname, MQSRV_TcpIpPort,
           (CASE mqsrv_type
              WHEN 1 THEN 'CP'
              WHEN 2 THEN 'WP'
              WHEN 4 THEN 'PWP'
              WHEN 16 THEN 'JWP'
              WHEN 32 THEN 'JCP'
              ELSE to_char(mqsrv_type)
            END) "Process Type"
    from AH, host, mqsrv
    where ah_otype = 'SERV'
    and ah_timestamp4 is null
    and ah_oh_idnr = host_oh_idnr
    and AH_NAME = MQSRV_NAME
    order by ah_name

    I basically do a PREP_PROCESS with the OS command "CMD=ps -p &PROCID# -o %cpu=,%mem=,pid=,cmd= |awk '{print $1, $2, $3, $4, $5, $6}'" plus some scripting around it to get the OS values for every server process we have.

    We have a lot of checks/tables that are displayed in our healthcheck, but as I stated, it's just a healthcheck, not monitoring...
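
For anyone post-processing that ps output outside Automic, the line format (%cpu, %mem, pid, then the command) can be parsed as in the following sketch. This is not the healthcheck script described above. ProcSample and parse_ps_line are hypothetical names, and the sample line in the usage note is made up.

```python
from typing import NamedTuple


class ProcSample(NamedTuple):
    cpu_pct: float
    mem_pct: float
    pid: int
    command: str


def parse_ps_line(line):
    """Parse one '%cpu %mem pid cmd...' line into a ProcSample."""
    # Split on whitespace at most three times so the command keeps its arguments.
    parts = line.split(None, 3)
    if len(parts) < 3:
        raise ValueError(f"unexpected ps output: {line!r}")
    command = parts[3].strip() if len(parts) == 4 else ""
    return ProcSample(float(parts[0]), float(parts[1]), int(parts[2]), command)


def total_mem_pct(samples):
    """Sum the %mem column over all samples."""
    return sum(s.mem_pct for s in samples)
```

Calling parse_ps_line(" 2.5 37.1 4711 /path/to/worker --flag") gives a sample with pid 4711 and mem_pct 37.1. Summing mem_pct over all server processes with total_mem_pct gives a rough figure to compare against an alert threshold.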

    cheers, Wolfgang


    ------------------------------
    Support Info:
    If you are using one of the latest versions of UC4 / AWA / One Automation, please get in contact with Support to open a ticket.
    Otherwise update/upgrade your system and check if the problem still exists.
    ------------------------------



  • 5.  RE: AE - Scripting to monitor/alert when total memory used worker processes reaches % of total memory available

    Posted Sep 15, 2020 09:59 AM
    Oops, here's the screenshot of that part...


