Automic Workload Automation

Stability of Automic Platform

  • 1.  Stability of Automic Platform

    Posted Feb 14, 2020 06:18 AM
    Edited by Michael TMO8Y2tl Feb 14, 2020 06:25 AM
    Dear Community,

    Are there any mechanisms to prevent damage to the Automic engine?
    For example: preventing infinite loops, limiting the number of allowed activities, or a query that identifies objects causing more than X RT or AH records.

    If so, how can this be done?


  • 2.  RE: Stability of Automic Platform

    Posted Feb 14, 2020 06:57 AM
    Hi.

    Good question.

    Disclaimer: The following is my opinion.

    There are certain built-in protections, e.g. against certain infinite loops. The engine throttles the execution of an object if it detects an infinite loop and prints a message to that effect in the message window. This is, however, not all-encompassing: there are various ways to circumvent it and still generate infinite loops. Some of these have been declared not to be malfunctions, so they will not be mitigated.

    So while there are some protections, a malicious actor can still find various ways to fill up db tables or overload the engine.

    The only full solution is to limit access to the engine to non-malicious actors with sufficient training. In our organisation, we have created a best practice guidebook which (also) aims to educate users how to avoid certain infinite loop scenarios and some other pitfalls that create excessive load on the system and/or database.

    Hth,


  • 3.  RE: Stability of Automic Platform

    Posted Feb 14, 2020 08:48 AM
    Hi

    In our company we handle it similarly to Carsten's approach: (almost) no limits for the AE, but as many restrictions for the users as are suitable, possible, and easy to handle.

    No access for users without UC4 basics training, and no write access for users without full UC4 training.

    The system itself is redundant (2 servers, Linux) with a mirrored DB (Oracle RAC). This has saved our lives several times so far :-)

    cheers, Wolfgang

    ------------------------------
    Support Info:
    If you are using one of the latest versions of UC4 / AWA / One Automation, please get in contact with Support to open a ticket.
    Otherwise update/upgrade your system and check if the problem still exists.
    ------------------------------



  • 4.  RE: Stability of Automic Platform

    Posted Feb 14, 2020 11:39 AM
    "limit the amount of allowed activities"

    There are multiple tricks available.  You can throttle concurrent activities via QUEUE object restrictions.  You can throttle an individual task by limiting how many of them can run at the same time.  I've also used SYNC objects to throttle a family of tasks.  There may be more ways(?)

    To find objects that cause a large number of RT records, I use this SQL Server query:

    select oh_name as job_name
         --, ah_timestamp1 as activation_time
         , dateadd(hour, datediff(hour, getutcdate(), getdate()), ah_TimeStamp2) as Start_time
         --, ah_runtime as runtime
         , CONVERT(varchar, DATEADD(ms, ah_runtime * 1000, 0), 114) as runtime
         , count(*) as report_size
    from uc4.dbo.rt
       , uc4.dbo.ah
       , uc4.dbo.oh
    where ah_timestamp1 > cast('20191201 00:00:00:000' as DATETIME) -- how far back in time
    and   ah_oh_idnr = oh_idnr
    and   rt_ah_idnr = ah_idnr
    and   not oh_name in ('APPUTILP', 'AE_PROD#WP001') -- Objects to exclude from the selection
    group by oh_name, ah_timestamp1, ah_timestamp2, ah_runtime
    having count(*) > 500 -- MAX_REPORT_SIZE is set in UC_HOSTCHAR_DEFAULT
    order by 2;


    ------------------------------
    Pete
    ------------------------------



  • 5.  RE: Stability of Automic Platform

    Posted Feb 15, 2020 10:14 AM

    Unfortunately, this query doesn't work for me because we're using Oracle (SQL Developer).

    Ideally, I'd like something like this:

    Client | Job_Name | Count of RT records

    Sorted so that the jobs with the most RT records come first, and filtered by a specific date range.




  • 6.  RE: Stability of Automic Platform
    Best Answer

    Posted Feb 16, 2020 06:45 AM
    Hi,

    You can try something along these lines:

    SELECT AH_Client, AH_NAME, COUNT (RT_AH_IDNR)
    FROM uc4.AH, uc4.RT
    WHERE AH_Idnr= RT_AH_Idnr
    GROUP BY AH_NAME, AH_CLIENT
    ORDER BY 3 DESC;

    However, it's pretty slow and can surely be improved. As RT is one of the biggest tables, it may make sense to check or rebuild its indexes before running this SQL.

    For the time frame, you can use the timestamps from either AH or RT. Those from AH are more informative because they tell you when the job was activated, when it started, when it ended, and when it was deactivated. RT, on the other hand, has only one timestamp, presumably the time when the report line was written to the DB.
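
    A minimal sketch of the date-range filter described above, in Oracle syntax. It assumes that AH_TimeStamp1 is the activation time (as labelled in the SQL Server query earlier in this thread) and that the timestamps are stored in UTC; the date range itself is only an example.

    ```sql
    -- Sketch only: RT record count per client and job for tasks
    -- activated within an example date range (January 2020).
    SELECT   ah.AH_Client AS client,
             ah.AH_Name   AS job_name,
             COUNT(*)     AS rt_records
    FROM     uc4.AH ah
    JOIN     uc4.RT rt ON rt.RT_AH_Idnr = ah.AH_Idnr
    WHERE    ah.AH_TimeStamp1 >= DATE '2020-01-01'   -- assumed: activation time
    AND      ah.AH_TimeStamp1 <  DATE '2020-02-01'   -- upper bound is exclusive
    GROUP BY ah.AH_Client, ah.AH_Name
    ORDER BY rt_records DESC;
    ```

    Filtering on AH rather than RT should keep the date predicate on the smaller table, although the join to RT is still likely to dominate the run time.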

    Best regards,
    Antoine


  • 7.  RE: Stability of Automic Platform

    Posted Feb 17, 2020 11:41 AM
    Thanks, this is exactly what I was looking for! I just found a job with more than 500,000 report pages. Is there a general option to limit the number of report lines allowed?


  • 8.  RE: Stability of Automic Platform

    Posted Feb 17, 2020 11:46 AM
    Hi,

    See MAX_REPORT_SIZE, it limits how many pages end up in the DB vs. in the agent file system.

    https://docs.automic.com/documentation/webhelp/english/AA/12.3/DOCU/12.3/Automic%20Automation%20Guides/help.htm#AWA/Variables/UC_HOSTCHAR_DEFAULT.htm?Highlight=MAX_REPORT_SIZE

    Various job types have different options, e.g. for SAP jobs you can limit the amount of report output that is kept directly in the job object.

    Hth,




  • 9.  RE: Stability of Automic Platform

    Posted Feb 17, 2020 12:04 PM
    Edited by Michael TMO8Y2tl Feb 17, 2020 12:31 PM

    MAX_REPORT_SIZE is already set to 20. However, there is a REST job with a very large number of post-processing (POST) report pages (more than 4,000,000). Can this be limited at a global level instead of at the object level?

    For this particular job Write agent log  




  • 10.  RE: Stability of Automic Platform

    Posted Jun 05, 2020 09:11 AM

    Is there an option to restrict POST / LOG reports?

    I'm also looking for a query that identifies the RunIDs of jobs with infinite loops.




  • 11.  RE: Stability of Automic Platform

    Posted Jun 05, 2020 09:41 AM
    Edited by Michael TMO8Y2tl Jun 05, 2020 11:31 AM

    Sometimes the message window (MELD) shows the following error message:

    U00010037 Object: 'XYZ', client: 'XX': processing interrupted, possible loop! Script processing will continue in '128' seconds.

    Can someone provide an SQL query to identify the RunIDs of those jobs? I've written the query below, but maybe there is a more elegant way to get this information. (It currently does not cover the case where the loop has been fixed in the meantime.)

    SELECT ah_idnr       RunID, 
           ah_name       JobName, 
           ah_client     UC4_Client, 
           ah_timestamp1 JobStart 
    FROM   uc4.ah 
    WHERE  ah_name IN (SELECT DISTINCT jobname 
                       FROM   (SELECT Substr(meld_msginsert, 1, 
                                      Instr(meld_msginsert, '|') 
                                      - 1) AS 
                                      Jobname, 
                                      meld_client 
                               FROM   uc4.meld 
                               WHERE  meld_msgnr = '10037') 
                       WHERE  meld_client != '0') 
           AND ( ah_timestamp4 IS NULL 
                 AND ah_timestamp3 IS NULL );
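
    One possible variant of the query above, as a sketch only. The original matches on the object name alone, so an object with the same name in another client would also be reported. The version below additionally matches the client and only considers messages logged after the task was activated. It assumes that MELD_Client is the client of the affected object and that MELD has a MELD_TimeStamp column; neither is confirmed in this thread, so check both against your schema first.

    ```sql
    -- Sketch only: tasks that have not ended yet and for which a
    -- U00010037 "possible loop" message was logged after activation.
    SELECT ah.AH_Idnr       AS run_id,
           ah.AH_Name       AS job_name,
           ah.AH_Client     AS uc4_client,
           ah.AH_TimeStamp1 AS activation_time
    FROM   uc4.AH ah
    WHERE  ah.AH_TimeStamp3 IS NULL
    AND    ah.AH_TimeStamp4 IS NULL
    AND    EXISTS (SELECT 1
                   FROM   uc4.MELD m
                   WHERE  m.MELD_MsgNr = '10037'
                   AND    m.MELD_Client <> '0'
                   -- assumption: MELD_Client is the client of the looping object
                   AND    m.MELD_Client = ah.AH_Client
                   -- object name is the text before the first '|' in the insert
                   AND    SUBSTR(m.MELD_MsgInsert, 1,
                                 INSTR(m.MELD_MsgInsert, '|') - 1) = ah.AH_Name
                   -- assumption: MELD_TimeStamp exists and is comparable to AH timestamps
                   AND    m.MELD_TimeStamp >= ah.AH_TimeStamp1);
    ```

    The timestamp condition only narrows the result to runs that were active when the message was written; a loop that has since been fixed within the same run would still be listed.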