Automic Workload Automation

  • 1.  Bash shell functions to aid AAKE troubleshooting

    Posted Jul 23, 2025 05:36 AM
    Edited by Michael A. Lowry 8 days ago

    Updated 2025.11.11 20:49 CST: Moved script to attachment. See the reply.

    I have written a script with a bunch of Bash functions that can be useful in troubleshooting AAKE problems when the AWI is not available. The script is also capable of labeling the AAKE pods in the Kubernetes cluster. This can be very helpful when troubleshooting problems specific to a particular process type, e.g., looking at REST process logs to investigate a problem with object search.

    The latest version of the script can be found on GitHub: https://github.com/michael-lowry/aake-debug-tools

    The script comprises a set of functions. It is intended to be sourced in a Bash shell in a dedicated Linux pod running in the same Kubernetes cluster as the AAKE server processes. The AE server logs must be stored in a PVC or NFS file system, and the pod where you use these functions must have access to this file system. The path to the logs is specified by a variable at the top of the script.
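    Before sourcing the script, it can help to confirm that the log file system is actually reachable from the debug pod. The following is only a minimal sketch: the variable name AE_LOG_DIR and the function check_ae_log_dir are hypothetical and are not taken from the script, which defines its own log path variable.

```bash
# Hypothetical variable name; the real script sets its own log path variable.
AE_LOG_DIR="${AE_LOG_DIR:-/usr/server/tmp/log}"

# Return 0 if the log directory exists and is readable from this pod.
check_ae_log_dir() {
    local dir="${1:-$AE_LOG_DIR}"
    if [[ -d "$dir" && -r "$dir" ]]; then
        echo "OK: $dir is readable"
    else
        echo "ERROR: cannot read $dir; check the PVC or NFS mount" >&2
        return 1
    fi
}
```

    Calling check_ae_log_dir with no argument checks the default path; pass a directory to check a different mount point.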

    Here is an example of the output of the ae function.

    ubuntu@aake-debug-tools-c85476cf6-8bnz6:/$ ae
    AAKE Debug Tools version: 1.3.5
    Server               Version
    🟨 AE_EXP            24.4.1+hf.1.build.1751975620307
    
    AE Proc  Type/Role   Host name                                  Log file
    CP001    JCP         jcp-ws-0-7576f79d9f-pwcxz                  /usr/server/tmp/log/CPsrv_log_001_00.txt
    CP002    JCP         jcp-ws-0-7576f79d9f-g6ftk                  /usr/server/tmp/log/CPsrv_log_002_00.txt
    CP003    JCP         jcp-ws-0-7576f79d9f-9mm7r                  /usr/server/tmp/log/CPsrv_log_003_00.txt
    CP004    JCP         jcp-ws-0-7576f79d9f-mn4lr                  /usr/server/tmp/log/CPsrv_log_004_00.txt
    CP005    JCP         jcp-ws-0-7576f79d9f-kj2tz                  /usr/server/tmp/log/CPsrv_log_005_00.txt
    CP006    JCP         jcp-ws-0-7576f79d9f-94m5x                  /usr/server/tmp/log/CPsrv_log_006_00.txt
    CP007    JCP         jcp-ws-0-7576f79d9f-zxk8p                  /usr/server/tmp/log/CPsrv_log_007_00.txt
    CP008    JCP         jcp-ws-0-7576f79d9f-jz629                  /usr/server/tmp/log/CPsrv_log_008_00.txt
    CP009    REST        jcp-rest-0-65dd4699d5-mpsx4                /usr/server/tmp/log/CPsrv_log_009_00.txt
    CP010    REST        jcp-rest-0-65dd4699d5-btpx8                /usr/server/tmp/log/CPsrv_log_010_00.txt
    CP011    REST        jcp-rest-0-65dd4699d5-crb5s                /usr/server/tmp/log/CPsrv_log_011_00.txt
    CP012    JCP         jcp-ws-0-7576f79d9f-w852t                  /usr/server/tmp/log/CPsrv_log_012_00.txt
    CP013    JCP         jcp-ws-0-7576f79d9f-thjn4                  /usr/server/tmp/log/CPsrv_log_013_00.txt
    CP014    REST        jcp-rest-0-65dd4699d5-cvxfw                /usr/server/tmp/log/CPsrv_log_014_00.txt
    CP015    CP          cp-0-6b8997b86b-2zv5l                      /usr/server/tmp/log/CPsrv_log_015_00.txt
    CP016    CP          cp-0-6b8997b86b-mhtgp                      /usr/server/tmp/log/CPsrv_log_016_00.txt
    WP001    JWP         jwp-0-584fd7bb7f-26thh                     /usr/server/tmp/log/WPsrv_log_001_00.txt
    WP002    JWP         jwp-0-584fd7bb7f-2jx29                     /usr/server/tmp/log/WPsrv_log_002_00.txt
    WP003    JWP         jwp-0-584fd7bb7f-8zqc7                     /usr/server/tmp/log/WPsrv_log_003_00.txt
    WP004    JWP         jwp-0-584fd7bb7f-hf9nq                     /usr/server/tmp/log/WPsrv_log_004_00.txt
    WP005    JWP         jwp-0-584fd7bb7f-2zdld                     /usr/server/tmp/log/WPsrv_log_005_00.txt
    WP006    JWP-AUT     jwp-0-584fd7bb7f-kxklw                     /usr/server/tmp/log/WPsrv_log_006_00.txt
    WP007    JWP-IDX     jwp-0-584fd7bb7f-fw525                     /usr/server/tmp/log/WPsrv_log_007_00.txt
    WP008    JWP-PER     jwp-0-584fd7bb7f-q2j75                     /usr/server/tmp/log/WPsrv_log_008_00.txt
    WP009    JWP-UTL     jwp-0-584fd7bb7f-5nh82                     /usr/server/tmp/log/WPsrv_log_009_00.txt
    WP010    JWP         jwp-0-584fd7bb7f-csxfw                     /usr/server/tmp/log/WPsrv_log_010_00.txt
    WP011    PWP*        wp-0-7b895f69dd-xmnrb                      /usr/server/tmp/log/WPsrv_log_011_00.txt
    WP012    RWP+        wp-0-7b895f69dd-gg9np                      /usr/server/tmp/log/WPsrv_log_012_00.txt
    WP013    OWP+        wp-0-7b895f69dd-2z49h                      /usr/server/tmp/log/WPsrv_log_013_00.txt
    WP014    DWP         wp-0-7b895f69dd-rgxsc                      /usr/server/tmp/log/WPsrv_log_014_00.txt
    WP015    DWP         wp-0-7b895f69dd-wdjvv                      /usr/server/tmp/log/WPsrv_log_015_00.txt
    WP016    WP          wp-0-7b895f69dd-kll4x                      /usr/server/tmp/log/WPsrv_log_016_00.txt
    WP017    WP          wp-0-7b895f69dd-xfbzb                      /usr/server/tmp/log/WPsrv_log_017_00.txt
    WP018    WP          wp-0-7b895f69dd-wjxkt                      /usr/server/tmp/log/WPsrv_log_018_00.txt
    WP019    WP          wp-0-7b895f69dd-92hl4                      /usr/server/tmp/log/WPsrv_log_019_00.txt
    WP020    DWP         wp-0-7b895f69dd-q4bbb                      /usr/server/tmp/log/WPsrv_log_020_00.txt
    🚀 Applying labels…
    ✅ Done
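
    The log file names in this output follow a regular pattern: the two-letter process kind, the three-digit process number, and a two-digit generation suffix. A small helper along these lines could build the path for a given process. This is a sketch inferred from the sample output only; the names AE_LOG_DIR and ae_log_path are hypothetical and do not come from the script.

```bash
# Hypothetical variable name; the path is the one shown in the sample output.
AE_LOG_DIR="${AE_LOG_DIR:-/usr/server/tmp/log}"

# Map an AE process name such as CP001 or WP011 to a log file path.
# The optional second argument is the log generation (default 00).
ae_log_path() {
    local proc="$1" generation="${2:-00}"
    local kind="${proc:0:2}"    # CP or WP
    local number="${proc:2}"    # e.g. 011
    printf '%s/%ssrv_log_%s_%s.txt\n' "$AE_LOG_DIR" "$kind" "$number" "$generation"
}

ae_log_path WP011
# prints /usr/server/tmp/log/WPsrv_log_011_00.txt (with the default AE_LOG_DIR)
```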

    [Screenshot: K9s showing a cluster with labeled AAKE pods.]

    These functions are based on my older Bash functions to identify AE server processes.

    If some processes display without complete information, you may need to tweak the parameters of the search_logs function in the various places this function is called, so that it searches older (or higher generation) logs. The right settings will depend on the particular system and its level of activity.
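
    To illustrate what searching older generations means, here is a rough sketch of a search that walks through successive log generations until it finds a match. It is not the script's search_logs function, whose parameters may differ. The function name search_generations, its arguments, and the assumption that older generations use the suffixes _01, _02, and so on are all hypothetical.

```bash
# Hypothetical helper, not the script's actual search_logs implementation.
# Checks generations 00..max_gen of one process's log, newest first, and
# prints the first matching line from the first generation that has a match.
search_generations() {
    local log_dir="$1" proc="$2" pattern="$3" max_gen="${4:-3}"
    local kind="${proc:0:2}" number="${proc:2}" gen file
    for ((gen = 0; gen <= max_gen; gen++)); do
        printf -v file '%s/%ssrv_log_%s_%02d.txt' "$log_dir" "$kind" "$number" "$gen"
        [[ -r "$file" ]] || continue      # skip generations that do not exist
        if grep -m 1 -e "$pattern" -- "$file"; then
            return 0                      # stop at the newest generation with a match
        fi
    done
    return 1                              # nothing found within max_gen generations
}
```

    Raising max_gen widens the search at the cost of reading more files, which is the trade-off described above.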

    The current method I'm using to distinguish DWPs from non-dialog WPs is very inefficient, and is the reason the ae function takes a long time to run the first time. When logging changes, WPs do not print a message indicating whether they are running in normal or dialog mode. Because of this, it's usually necessary to look through lots of old logs to find this information. If anyone knows of a more efficient approach, I would be glad to learn about it.

    Enjoy!



  • 2.  RE: Bash shell functions to aid AAKE troubleshooting

    Posted Oct 16, 2025 04:30 PM
    Edited by Michael A. Lowry Oct 16, 2025 04:30 PM

    I updated the script in the original post. The new version has improved identification of process types and roles. The new version can also label your AE pods in Kubernetes.

    1. Run the script in a pod in the same Kubernetes cluster as your AE system.
    2. Make sure the pod includes kubectl.
    3. Enable the default service account for the pod where you run the script, so that kubectl can connect to the cluster.
    4. Use RBAC to assign the required authorizations to the service account (list, get, patch, watch).
    5. Modify the script and change the system names and K8s cluster host names according to your environment.
    6. Set the label_pods environment variable to true.
    7. Source the script and run the ae function.
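
    For steps 4 and 6, the kubectl commands involved might look roughly like the ones printed by the sketch below. The functions only print the commands so that they can be reviewed before being run. The namespace, the role name pod-labeler, and the label key ae-role are hypothetical; the script may use different names. Characters such as * and + are stripped from the role because they are not allowed in Kubernetes label values.

```bash
# All names below (namespace, role name, label key) are hypothetical.

# Print commands that grant the default service account the pod permissions
# listed in step 4 (get, list, watch, patch).
print_rbac_commands() {
    local ns="$1" role="${2:-pod-labeler}"
    echo "kubectl create role $role --verb=get,list,watch,patch --resource=pods -n $ns"
    echo "kubectl create rolebinding $role --role=$role --serviceaccount=$ns:default -n $ns"
}

# Print a command that labels one pod with its AE role.
print_label_command() {
    local ns="$1" pod="$2" role="$3"
    local value="${role//[^A-Za-z0-9_.-]/}"   # drop characters invalid in label values
    echo "kubectl label pod $pod ae-role=$value --overwrite -n $ns"
}

print_label_command ae wp-0-7b895f69dd-xmnrb 'PWP*'
# prints: kubectl label pod wp-0-7b895f69dd-xmnrb ae-role=PWP --overwrite -n ae
```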

    See this post for a demonstration of the benefit of labeling AE pods.



  • 3.  RE: Bash shell functions to aid AAKE troubleshooting

    Posted 24 days ago
    Edited by Michael A. Lowry 8 days ago

    I've made some significant improvements to the script. Rather than post it here, I put the script and related files in a new GitHub repository:

    https://github.com/michael-lowry/aake-debug-tools

    The updated version is much faster and requires fewer custom hard-coded values. We now rely on this tool to label our AAKE pods whenever we need to do maintenance or troubleshooting on the cluster.

    Enjoy!



  • 4.  RE: Bash shell functions to aid AAKE troubleshooting

    Posted 23 days ago
    Edited by Michael A. Lowry 23 days ago

    I moved the original version of the script to an attachment.

    Attachment(s)

    uc4_functions_k8s_1.1.5.sh   14 KB 1 version