DX Infrastructure Management

Tech Tip: NFA Reaper Service Dies Periodically 

08-13-2014 11:33 AM

Description:
The "CA NFA Reaper" service dies periodically on an NFA Harvester machine, and the ReaperInput directory fills up with unprocessed files that consume disk space. The "RealTimeReaperErrors*.log" file from the day the problem occurred contains an error message similar to the following:


03:00:00 Severity 2 - b_test failure at d:\buildagent\work\nfa_913_production\include\nqMemoryMappedFile.h 50 because D:\NETQOS\netflow\datafiles\ReaperInput\1406789940-9995_0.tbn
03:00:00 Severity 2 - 32 The process cannot access the file because it is being used by another process.
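
To find these entries quickly, a log can be scanned for the two-line pattern shown above. The following is a minimal Python sketch, not part of NFA. The names `LockError` and `find_lock_errors` are hypothetical, and the line format is assumed from the two sample lines only.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class LockError(NamedTuple):
    time: str  # HH:MM:SS exactly as logged
    path: str  # file the Reaper could not open


# Assumed format: "<time> Severity <n> - b_test failure at <source> <line> because <path>"
_FAILURE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}) Severity \d+ - b_test failure at .+? because (.+)$"
)
# Windows error code 32 is ERROR_SHARING_VIOLATION.
_SHARING = re.compile(r"^\d{2}:\d{2}:\d{2} Severity \d+ - 32 ")


def find_lock_errors(lines: Iterable[str]) -> Iterator[LockError]:
    """Yield a LockError for each b_test failure line that is
    immediately followed by an error-32 line."""
    pending = None
    for raw in lines:
        line = raw.strip()
        match = _FAILURE.match(line)
        if match:
            pending = LockError(match.group(1), match.group(2).strip())
            continue
        if pending is not None and _SHARING.match(line):
            yield pending
        pending = None


# Usage (the log file name here is only an example):
# with open("RealTimeReaperErrors.log") as fh:
#     for err in find_lock_errors(fh):
#         print(err.time, err.path)
```

Each result gives the logged time and the locked file, which can then be compared against other activity on the Harvester at that time.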

Solution:

This error occurs when a third-party process opens a file in the ReaperInput directory and thereby prevents the Reaper service from processing that file. This can cause the Reaper service to die.

Problems like this are typically caused by software such as real-time antivirus scanners and backup utilities. Make sure that the ReaperInput and ReaperWork directories are excluded from access by such utilities.

Sometimes it is not obvious which third-party process is interfering with the Reaper service. The timestamp of the error message in the RealTimeReaperErrors log can provide a clue. Check the Windows Event Viewer for messages indicating that an antivirus scan or a backup job began just before the Reaper process died. Also check whether Windows has any Scheduled Tasks defined to run at that time. Once you have identified the interfering process, prevent it from accessing the ReaperInput and ReaperWork directories under \Netflow\datafiles.
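
The timestamp comparison described above can be sketched in Python as follows. This is only an illustration: the function name `started_just_before`, the 10-minute window, and the job names and times in the usage example are all hypothetical, and the start times would have to be collected by hand from Event Viewer or Task Scheduler.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def started_just_before(
    error_at: datetime,
    job_starts: Dict[str, datetime],
    window: timedelta = timedelta(minutes=10),  # assumed window, adjust as needed
) -> List[str]:
    """Return the names of jobs that started at or up to `window`
    before the Reaper error was logged."""
    return [
        name
        for name, start in job_starts.items()
        if timedelta(0) <= error_at - start <= window
    ]


# Hypothetical example data, not taken from a real system:
# error_at = datetime(2014, 7, 31, 3, 0, 0)
# jobs = {"nightly backup": datetime(2014, 7, 31, 2, 55)}
# print(started_just_before(error_at, jobs))
```

Any job the function returns is a candidate to exclude from the Reaper directories; jobs that started after the error or long before it are filtered out.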

Comments

06-10-2015 12:46 PM

Update to this:

As per the Release Notes for NFA 9.3.1, the Reaper service has been modified in the following way to mitigate this issue:

The Reaper code was modified to log messages as in the past, but to continue processing instead of exiting. The rationale is that the locking issue is temporary and should resolve itself fairly quickly.

Development also made the following suggestions for when this behavior is seen:

In addition to making sure that no antivirus or backup software runs against the datafiles directory, the customer should also turn off Windows indexing for it. To do this on the Harvester:

1. In Windows Explorer, go to the CA\NFA\Netflow directory, right-click the datafiles directory, and select Properties.

2. On the General tab, click "Advanced", then uncheck "Allow files in this folder to have contents indexed in addition to file properties". When asked whether the changes should affect all subfolders and files, select Yes.

3. Click OK -> OK to apply these changes.
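
For Harvesters where clicking through Explorer is inconvenient, the same setting can probably be scripted. The sketch below is an assumption, not something from the Release Notes: it relies on the Windows `attrib` command's `+I` switch, which sets the "not content indexed" file attribute, and the function names and the example path are hypothetical.

```python
import subprocess
from typing import List


def not_indexed_commands(datafiles_dir: str) -> List[List[str]]:
    """Build the attrib commands that mark the datafiles directory,
    and everything below it, as not content indexed."""
    root = datafiles_dir.rstrip("\\")
    return [
        # The directory itself.
        ["attrib", "+I", root],
        # All files (/S) and subdirectories (/D) beneath it.
        ["attrib", "+I", root + "\\*", "/S", "/D"],
    ]


def apply_not_indexed(datafiles_dir: str) -> None:
    """Run the commands; Windows only, from an elevated prompt."""
    for cmd in not_indexed_commands(datafiles_dir):
        subprocess.run(cmd, check=True)


# Usage (example path; use the datafiles directory of your own install):
# apply_not_indexed(r"D:\NETQOS\netflow\datafiles")
```

It is worth checking the folder's Advanced attributes in Explorer afterwards to confirm that the checkbox is cleared.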

08-14-2014 11:02 AM

While you are trying to find out which process may be scanning your Reaper directories, you can use the workaround below, as many customers have, to make sure the Reaper service doesn't stay down for longer than 5 minutes at a time.

Here are some steps for Windows 2008 R2 to set up a scheduled task for the Reaper service on the Harvester server.

This makes sure the Reaper service starts up again if it goes down. If the Reaper service is already up and running, the scheduled task has no impact.

To set up the scheduled task, go to ‘Start->Administrative Tools->Task Scheduler’.

Then select "Create Task" on the right-hand side.

Give the task a descriptive name.

Then click on the ‘Triggers’ tab and click ‘New’.

Set the task to begin ‘On a schedule’.

Set it to run ‘Daily’.

Select to repeat the task every 5 minutes.

Set the duration to “Indefinitely”.

Next, click “Actions” and “New”, and set the action to “Start a program”.

Then, in the ‘Program/script’ section, enter just the word ‘net’.

In the ‘Add arguments (optional):’ section, enter ‘start’ followed by the name of the service, that is, ‘start NetQosReaper’.

This should prevent the service from being down for more than 5 minutes at a time. If the service is already started, the task does nothing.
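
The same task can likely be created from a script with the Windows `schtasks` command instead of the Task Scheduler GUI. The Python sketch below only builds and runs that command. The task name "Restart NFA Reaper", the function names, and the choice to run as SYSTEM are assumptions, and the sketch uses a plain every-5-minutes schedule rather than the daily trigger with repetition described above.

```python
import subprocess
from typing import List


def reaper_watchdog_command(
    task_name: str = "Restart NFA Reaper",  # hypothetical task name
    service: str = "NetQosReaper",          # service name given in the steps above
    every_minutes: int = 5,
) -> List[str]:
    """Build a schtasks command that runs 'net start <service>'
    every `every_minutes` minutes."""
    if not 1 <= every_minutes <= 1439:
        raise ValueError("schtasks accepts 1 to 1439 minutes for /SC MINUTE")
    return [
        "schtasks", "/Create",
        "/TN", task_name,
        "/TR", f"net start {service}",
        "/SC", "MINUTE",
        "/MO", str(every_minutes),
        "/RU", "SYSTEM",  # assumption: run under the SYSTEM account
        "/F",             # overwrite an existing task with the same name
    ]


def create_reaper_watchdog() -> None:
    """Create the task; Windows only, from an elevated prompt."""
    subprocess.run(reaper_watchdog_command(), check=True)
```

As with the GUI version, `net start` simply reports that the service is already running when the Reaper is up, so the task is harmless in that case.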

08-14-2014 10:54 AM

This is very important, as this is one of the most common issues Support sees.

In addition to the checks above, you can run Sysinternals Process Monitor to identify the interfering process. The access is not easy to catch, and you may have to keep an eye on Process Monitor while it runs, as it can use up resources fast. Use a filter on the path to narrow down what is captured.

Using Kahlil's example from above, your filter would be on

D:\NETQOS\netflow\datafiles\ReaperInput\
