As most of you know, we use an automated batch deployment system to promote batches (folders) of AE objects from one system to another (e.g., from TEST to PROD). One of the steps in batch deployment involves removing objects from the target folder that were not part of the deployment. In broad strokes, it does this in the following way:
This is an oversimplified description of the process, but it will serve for the purposes of this discussion.
We discovered today that on rare occasions, SearchObject will return an incomplete list of objects. In our case, it happens in step 4. (And because the list returned omits some objects, the deployment program sometimes mistakenly thinks that some folders are empty when they are not. It then removes these folders, and the child objects end up in <No Folder>.)
We suspect that this problem is likely to happen when these circumstances are present:
It seems possible that ucybdbld is completing before all of the database changes (e.g., changes to the OFS table) have been committed. I have reported the problem to Automic in INC00219765. I will update this thread when I have more information.
Automic Development provided a response. I will paraphrase it here.
The AE DB Load program (ucybdbld) loads objects in a single database transaction. When the ucybdbld program exits with return code 0, this means all of the objects were successfully loaded into the target AE system. Work processes, including the DWPs that handle Java API requests, maintain an object cache that is used to process SearchObject operations. When the DB Load program loads objects, it notifies the WPs that they need to refresh their caches. It does so by adding entries to the internal task list (ITL) table. If a SearchObject request is processed between the DB Load and the refresh of the object cache, the results may be incomplete or incorrect.
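To make the timing issue concrete, here is a toy model of the behavior Automic described. It is only an illustration and not AE code: the class and method names are made up, and the "database", "ITL", and "cache" are plain in-memory collections standing in for the real things.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model only: hypothetical names, no real AE classes involved.
class StaleCacheDemo {
    // Stand-in for the AE database (the authoritative object list).
    private final List<String> database = new ArrayList<>();
    // Stand-in for the ITL table (pending "refresh your cache" notifications).
    private final Queue<String> pendingRefreshes = new ArrayDeque<>();
    // Stand-in for the object cache of a single work process.
    private final List<String> wpCache = new ArrayList<>();

    // Models the DB load: commit the objects, then queue a refresh notification.
    void dbLoad(List<String> objects) {
        database.addAll(objects);
        pendingRefreshes.add("refresh");
    }

    // Models the work process picking up one notification and refreshing its cache.
    void processRefresh() {
        if (pendingRefreshes.poll() != null) {
            wpCache.clear();
            wpCache.addAll(database);
        }
    }

    // Models SearchObject: answers from the cache, not from the database.
    List<String> searchObject() {
        return new ArrayList<>(wpCache);
    }

    public static void main(String[] args) {
        StaleCacheDemo demo = new StaleCacheDemo();
        demo.dbLoad(List.of("JOBS.A", "JOBS.B"));
        // The load has completed, but the cache has not been refreshed yet.
        System.out.println("before refresh: " + demo.searchObject());
        demo.processRefresh();
        System.out.println("after refresh:  " + demo.searchObject());
    }
}
```

In this model the first search returns an empty list even though the load has already finished, which is the "unlucky" ordering.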
This approach is error-prone, because the following two steps may execute either before or after the SearchObject request is handled:
If you’re lucky, it will run like this:
However, if you’re unlucky, it will run like this:
To ensure that the DWPs have an up-to-date object cache when the SearchObject request is sent, it is necessary to insert an extra step in the process.
Automic confirmed that because of the design of the AE, there is no way to guarantee correct results from SearchObject using the Java APIs alone. It is necessary to insert this additional step that performs a direct query of the AE DB. In very busy systems, where multiple DB loads may run simultaneously or in quick succession, it is conceivable that another DB load is run between the ITL check and the execution of SearchObject. So really, there is no 100% foolproof way to guarantee correct results from SearchObject.
However, in most cases, as long as SearchObject is executed quickly after the ITL check, it should return correct results. Preventing multiple simultaneous DB loads, and better yet, adding a delay between DB loads, will also help mitigate the problem. (See also this discussion: Poor AE performance following DB load of transport case.)
This problem raises the more general topic of API completeness & reliability. To get some clarity on the topic, I have posed these questions to Automic:
I will post an update here when I have answers to these questions.
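For anyone who wants to try the ITL check, here is a minimal sketch of how the gate could be structured. It is not our actual deployment code. The class name, method name, and parameters are all made up for illustration. The ITL check and the search are passed in as callbacks: in real use the first would wrap a direct SQL query against the ITL table (the exact SQL is not shown, and how you connect to the AE DB is up to you), and the second would wrap the SearchObject request.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical helper: waits until the ITL check reports "empty", then searches.
class ItlGatedSearch {

    static <T> List<T> searchWhenItlEmpty(BooleanSupplier itlIsEmpty,
                                          Supplier<List<T>> search,
                                          Duration timeout,
                                          Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        // Poll until the ITL check passes or the timeout expires.
        while (!itlIsEmpty.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("ITL still has entries after " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for ITL", e);
            }
        }
        // Run the search right after the check to keep the remaining window small.
        return search.get();
    }

    public static void main(String[] args) {
        int[] checks = {0};
        // Simulated ITL check: reports "empty" on the third call.
        List<String> result = searchWhenItlEmpty(
                () -> ++checks[0] >= 3,
                () -> List.of("JOBS.A", "JOBS.B"),
                Duration.ofSeconds(5),
                Duration.ofMillis(10));
        System.out.println(result + " after " + checks[0] + " ITL checks");
    }
}
```

Failing with an exception on timeout, rather than searching anyway, is a deliberate choice here: a cleanup step that acts on an incomplete list is worse than one that does not run. Note that this only narrows the window. As mentioned above, another DB load can still land between the check and the search.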
I opened an enhancement request about this, asking Automic to improve the Java APIs:
Some AE Java API operations do not reliably return correct and up-to-date results. For example, SearchObject may return incorrect or incomplete results if the DWP handling the SearchObject request has a stale object cache. To work around this limitation, Automic currently recommends running an SQL SELECT against the ITL table prior to running any SearchObject operation: SearchObject can be relied upon to return correct results only if the ITL table is currently empty.
The AE Java Application Interface should be able to provide guaranteed correct results without relying on direct calls to the DB. This applies not only to SearchObject, but also to FolderTree, FolderList, ActivityList, ObjectStatistics, and all other API classes. In other words, it should be possible to build a reliable, production-quality application whose only interface to the Automation Engine is the Application Interface.
If you like it, please vote for it.
Well, apparently 15 seconds was not sufficient. The problem happened again about an hour ago. I have increased the delay to 30 seconds. I have also asked Automic in the incident for answers to the questions in comment 36662, above.
Oh, and by the way, there is a related problem, documented in PRB00220244, that can cause SearchObject to return the following error when listing the contents of a folder that does exist:
U00011700 Cannot find information for path '&01'.
I'm guessing the root cause of the two problems is the same.