Automic Workload Automation


Best practices for file transfer polling?

  • 1.  Best practices for file transfer polling?

    Posted 07-21-2014 04:23 PM

    So I was recently tasked with fixing up a flow for a file transfer process.  The first step in the flow is a free-form FTP job that downloads files from an FTP server not running a UC4 agent.  Then it checks if a file exists on the intermediary FTP server (via UC4 condition) which is running an agent with FTP RA before ultimately transferring it offsite.  If there are no files, it cancels the flow.  After making some corrections, testing and approving the flow to be migrated to production, I learn they want to check for files to transfer every 5 minutes.

    The way it is set up right now, the whole flow is on a schedule to run every 5 minutes.  Most of the time there are no files, so it just cancels itself.  But that's 288 requests a day, with multiple cancelled jobs per requested flow.  And that's just for one file transfer process.  I feel like this pollutes our history, and it seems like there must be a more elegant way to do this.  Below are a couple of my ideas, but surely this is a common scenario, so I was wondering what the general consensus is on the best way to do this.

    - A "monitoring" job that runs continuously, regularly checks for files on the remote FTP server somehow - either putting an agent on the source FTP server or integrating FTP commands into UC4 logic somehow?

    - Offloading the initial part of the flow to a script called regularly from cron and if files are retrieved requesting the flow from the host OS?

    It just feels kind of kludgy and like I must be missing something obvious but maybe it has to be that way because of requirements/limitations.  Solutions/recommendations?



  • 2.  Best practices for file transfer polling?

    Posted 07-21-2014 05:52 PM

    Perhaps something along the lines of what we do for some of our FT processing that kicks off when the files are available might work for you.  I guess this is similar to your proposed “monitoring” job, though all of our hosts have an active Agent.  We are on V8.

    We have a number of Time Events that have logic in their Interval Process that does a PREP_PROCESS_VAR for an associated Variable.  That variable contains Validity keywords of the Hosts (Agent name) and Values for files and their corresponding process flows that will be activated should a file be found.  These Events usually are executing throughout the day or during certain time periods.

    Keyword      Value
    HOSTA@1      THE_PROCESS_FLOW_FOR_1
    HOSTA@2      THE_PROCESS_FLOW_FOR_2
    HOSTB@1      THE_PROCESS_FLOW_FOR_OTHER
    HOSTA#1      /file/spec/thefile1
    HOSTA#2      /file/spec/thefile2
    HOSTB#1      \file\spec\some.other.file

    The interval’s logic processes all of the keywords with a filter of “*#*” and does a GET_FILESYSTEM with the appropriate parameters.  If the file is found, then the # in the keyword is changed to an @ and that keyword is retrieved with a GET_VAR.  Then an ACTIVATE_UC_OBJECT is performed with the associated process flow.
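
    For readers who want to see the lookup logic spelled out, here is a rough sketch in Python rather than Automic script.  It only mirrors the keyword table and the interval steps described above.  The names WATCH_TABLE and poll_once, and the file_exists and activate callbacks, are made up for illustration; they stand in for the Variable object, GET_FILESYSTEM, GET_VAR and ACTIVATE_UC_OBJECT and are not the poster's actual code.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the Variable object: keyword -> value.
# "HOST#n" rows hold the file to look for; "HOST@n" rows hold the flow to start.
WATCH_TABLE: Dict[str, str] = {
    "HOSTA@1": "THE_PROCESS_FLOW_FOR_1",
    "HOSTA@2": "THE_PROCESS_FLOW_FOR_2",
    "HOSTB@1": "THE_PROCESS_FLOW_FOR_OTHER",
    "HOSTA#1": "/file/spec/thefile1",
    "HOSTA#2": "/file/spec/thefile2",
    "HOSTB#1": r"\file\spec\some.other.file",
}


def poll_once(
    table: Dict[str, str],
    file_exists: Callable[[str, str], bool],
    activate: Callable[[str], None],
) -> List[str]:
    """Run one interval: start the flow for every watched file that exists."""
    activated: List[str] = []
    for keyword, file_spec in table.items():
        if "#" not in keyword:                  # same effect as the "*#*" filter
            continue
        host, _, index = keyword.partition("#")
        if not file_exists(host, file_spec):    # stands in for GET_FILESYSTEM
            continue
        flow = table.get(f"{host}@{index}")     # stands in for GET_VAR
        if flow is None:
            continue                            # no flow configured for this file
        activate(flow)                          # stands in for ACTIVATE_UC_OBJECT
        activated.append(flow)
    return activated
```

    Keeping the hosts, files and flows in one table means a new file watch is a data change rather than a new object, which seems to be the point of the design described above.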




  • 3.  Best practices for file transfer polling?

    Posted 07-21-2014 06:52 PM

     

    We are on V9 and I struggle with this too.  I've got thoughts on the subject but every case is different so I don't have a best practice.

     

    So far we have been fairly successful in telling our customers that checking once (maybe twice) an hour is the most we will allow, and the customers usually learn to accept that.  Quite often the reality is that they ask for high frequency only because it would be "nice to have", not because it is an actual business requirement.  Then we put it in our schedule 24 times and we are done.  We also tell the customers at what time in the hour it will run, so they can adjust their workflow accordingly.

     

    I have implemented remote FTP listener programs for systems that can have minor delays.  I build an FTP object that just returns a file list and in the post-process I read the file list.  If the files we are looking for exist in the list then it finishes normally and the workflow proceeds.  Otherwise the FTP listener restarts itself after a specified time interval.
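
    The listener idea above (list the remote directory, look for the wanted names, otherwise retry after an interval) could look roughly like this in plain Python using the standard ftplib module.  This is a sketch under assumptions, not the FTP object or post-process described in the post: list_remote and listen are hypothetical names, and the retry count and wildcard matching are my own choices.

```python
import fnmatch
import time
from ftplib import FTP, error_perm
from typing import Callable, List, Sequence


def list_remote(host: str, user: str, password: str, directory: str) -> List[str]:
    """Return the file names in one remote directory (plain FTP, no agent)."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(directory)
        try:
            return ftp.nlst()
        except error_perm as exc:
            # Some servers answer 550 for an empty directory; treat as "no files".
            if str(exc).startswith("550"):
                return []
            raise


def listen(
    list_files: Callable[[], List[str]],
    wanted: Sequence[str],
    interval_s: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until a listed name matches one of the wanted patterns.

    Returns the matching names, or an empty list if max_attempts is used up.
    """
    for attempt in range(1, max_attempts + 1):
        names = list_files()
        matches = [
            name for name in names
            if any(fnmatch.fnmatchcase(name, pattern) for pattern in wanted)
        ]
        if matches:
            return matches          # "finishes normally", workflow proceeds
        if attempt < max_attempts:
            sleep(interval_s)       # the listener "restarts itself" later
    return []
```

    Capping the attempts is one way to keep the listener from sitting in the activity window all day; a call such as listen(lambda: list_remote(host, user, password, directory), ["*.csv"], 300, 12) would give up after about an hour.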

     

    However using a listener program for an all day process suggests it would be in the activity window 24/7, and when the workflow finishes it would restart itself so it can continue listening.  But I don't like the idea of a "permanent" resident in the activity window so I would not entertain this idea. (I would favor 288 runs over this idea.)

     

    Another possibility would be an external listener process that uses CALLAPI to tell Automic when to run the workflow.

     

    Pete



  • 4.  Best practices for file transfer polling?

    Posted 07-22-2014 01:12 PM
    This is probably akin to your "monitoring job" idea...

    In our company, we start by building a "polling" workflow containing a single job to do an FTP list on the remote FTP server.  If any (relevant) files are found, it'll then trigger a second "processing" workflow to actually download/process those files.  (Typically we'll kick off a distinct "processing" workflow for each file we find, passing the name of the file to be processed.)
    The polling workflow is driven by a recurring time event object.

    It's admittedly a lot of objects for what should be a fairly simple task, but by separating the polling and the processing, you can isolate your statistics of when files were actually found by just looking at the statistics of your "processing" workflow.
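
    As a rough illustration of the polling/processing split, the fan-out step might be sketched like this.  The names dispatch and start_processing are hypothetical, and the in_flight set (used to avoid starting a second run for a file that is still being processed) is my own assumption, not something the post mentions.

```python
import fnmatch
from typing import Callable, Iterable, List, Set


def dispatch(
    listing: Iterable[str],
    pattern: str,
    start_processing: Callable[[str], None],
    in_flight: Set[str],
) -> List[str]:
    """Start one "processing" run per relevant file found by the poll."""
    started: List[str] = []
    for name in sorted(listing):
        if not fnmatch.fnmatchcase(name, pattern):
            continue                 # not a relevant file
        if name in in_flight:
            continue                 # already handed to a processing run
        start_processing(name)       # pass the file name to the processing run
        in_flight.add(name)
        started.append(name)
    return started
```

    Because each file gets its own processing run, the history of the processing side then shows only the polls that actually found something.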

     



  • 5.  Best practices for file transfer polling?

    Posted 07-22-2014 02:04 PM
    I've been trying to figure out what a "Time Event" is (in context) all morning and cannot do so for the life of me.  I am running AM V8 SP10 and there does not appear to be an "Event" object type at all.  Figure I might save myself some grief and just ask - is such a thing really available in V8 (SP10) and how do I use it?  I can't find it in the included documentation or on the web; the closest (or probably the actual) thing is mentioned in the V10 documentation.

    I agree that having a monitor process residing in the backlog is more of a trade-off than a clear solution.  Generally I like my displays to only present information that needs my attention - I don't care if the monitoring process is running, only if it's not because of some problem or it has done some work/produced some output.  I suppose this can be partially mitigated through the use of queues for organizing running tasks into "sub-backlogs" (although kind of limited to one queue or all queues, no combination/exclusion).  So, this is a question of one task in a queue/the backlog that you don't really need to see, or a lot of entries in the history you don't need to see.

    My initial reaction was actually one of Pete's suggestions - 5 minutes is too often, is it really necessary?  Hourly/semi-hourly should be fine.  But at the same time it just lessens the undesired entries and I'm kind of conflicted as I prefer finding creative solutions rather than essentially saying "no, your request is a bit absurd".

    Thanks!

     



  • 6.  Best practices for file transfer polling?

    Posted 07-22-2014 02:09 PM
    A "time event" is an OM object.  I've only worked with OM, so I can't speak to AM...

     



  • 7.  Best practices for file transfer polling?

    Posted 07-22-2014 02:11 PM

    Bill:

    Sorry, I assumed Operations Manager, not Applications Manager.  The Time Event is an object type in OM and AE.  I have no AM knowledge or experience.




  • 8.  Best practices for file transfer polling?

    Posted 07-23-2014 03:05 PM

    I have a shell script in a job called WAIT_FOR_FILE, which could be adapted to check an FTP location as well. It lets us monitor how often files are late (and how late), and it also provides flexibility for notifying the requestors, with escalation. Also, the job only has to run once per file, no matter how long it takes to arrive.

    It takes several parameters:

    • Where to look, for what file(s)
    • How often to recheck
    • Minutes after which to send a warning email (but not exit), and an email address
    • Minutes after which to send a final email (and exit with error or keep waiting), and an email address
    • Seconds to wait for the files to be stable (checksum) after the file is found, before exiting with success
    • What subject to use in the final email, which depends on how your workflow will react to the error exit (for example, "workflow will be canceled until tomorrow" or "workflow will continue to wait indefinitely, but no more alerts will be sent")

    If the file is found after a warning email is already sent but before the final cutoff, then it also sends a "hey, found the file, no problemo" email as well.

    I like the idea of file events, but they don't do the two-tier runtime escalation, and I really like having one single object that we can use and query for all our file-waiting needs and information (TBH, the whole team doesn't use it consistently, so it's more a theoretical benefit).
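
    The actual WAIT_FOR_FILE shell script isn't shown in the thread, so the following is only a guess at its control flow, written in Python from the parameter list above.  Everything named here (read_file, wait_for_file, the notify callback and its "warning"/"final"/"found" messages) is hypothetical, and the two email addresses and the subject line are collapsed into the single notify callback for brevity.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional


def read_file(path: Path) -> Optional[bytes]:
    """Return the file's bytes, or None if it is not there (yet)."""
    try:
        return path.read_bytes()
    except FileNotFoundError:
        return None


def wait_for_file(
    read: Callable[[], Optional[bytes]],
    recheck_s: float,
    warn_after_min: float,
    final_after_min: float,
    stable_s: float,
    notify: Callable[[str], None],
    exit_on_final: bool = True,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the file is present and stable, False on final cutoff."""
    start = clock()
    warned = final_sent = False
    while True:
        data = read()
        if data is not None:
            before = hashlib.sha256(data).hexdigest()
            sleep(stable_s)                  # give a writer time to finish
            after = read()
            if after is not None and hashlib.sha256(after).hexdigest() == before:
                if warned and not final_sent:
                    notify("found")          # the "no problemo" email
                return True
        elapsed_min = (clock() - start) / 60
        if not final_sent and elapsed_min >= final_after_min:
            notify("final")
            final_sent = True
            if exit_on_final:
                return False                 # caller exits with an error
        elif not warned and elapsed_min >= warn_after_min:
            notify("warning")                # warn, but keep waiting
            warned = True
        sleep(recheck_s)
```

    A call might look like wait_for_file(lambda: read_file(Path("/some/dir/some.file")), 60, 30, 120, 10, print), where the path is a placeholder.  Passing the clock and sleep functions in makes the escalation timing easy to test without actually waiting.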



  • 9.  Best practices for file transfer polling?

    Posted 07-25-2014 07:42 AM

    Bill, there is no Time Event in AM. We have been using AM since 2006 (we are in the process of migrating to V10), and we also have FTP jobs that check every 15 minutes at our company (it seems every company has this kind of process). The way we did it is to put a schedule on the module that fires the job every 15 minutes. Then, with an output scan, we change the status string to "file not found" if there was no file. So, with a history query, it's easy to find the jobs that actually transferred files.

    Hope this helps...
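
    To make the output-scan idea concrete, here is a small Python sketch of the same principle: look through the job output for a known phrase and record a distinct status string, so that runs which found nothing are easy to filter out of the history later.  The patterns, the status names other than "file not found", and the function scan_output are all invented for illustration; this is not how AM's output scan is configured.

```python
import re
from typing import List, Pattern, Tuple

# Hypothetical rules: (pattern in the job output, status string to record).
RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"no files? found", re.IGNORECASE), "file not found"),
    (re.compile(r"transfer complete", re.IGNORECASE), "FINISHED"),
]


def scan_output(
    output: str,
    rules: List[Tuple[Pattern[str], str]] = RULES,
    default: str = "FINISHED",
) -> str:
    """Return the status for the first output line that matches a rule."""
    for line in output.splitlines():
        for pattern, status in rules:
            if pattern.search(line):
                return status
    return default


def transferred_only(history: List[Tuple[str, str]]) -> List[str]:
    """Keep the run ids whose status shows that a file was actually moved."""
    return [run_id for run_id, status in history if status != "file not found"]
```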