Restarting data_engine probe

1. Restarting data_engine probe

Anon Anon
Posted Sep 11, 2009 05:11 AM

Every now and then the data_engine probe has trouble writing info to my SQL server. It still writes the data, but it slows down so much that the queue for the data_engine starts to fill up and I get an alarm:

The queue 'data_engine' is 349 MB. Check the hub configuration.

The size continues to increase until I restart the data_engine probe. I'd like to have an auto operator watch for this alarm and issue the pu command to restart the probe. I have the syntax to deactivate and reactivate the probe, but I see a -R option that is for restarting a probe. I can't seem to figure out the syntax to use it. Any ideas?

Also, is there a better way to do this? Is there a way to just set a weekly restart of the probe, so that I don't have to wait for it to slow down and generate the alarm?

Thanks.

2. Restarting data_engine probe

Anon Anon
Posted Sep 11, 2009 10:57 AM

When I want a probe to do a cold restart, I have found that sending the probe a _stop command is the simplest method. When it exits, the controller restarts it automatically. It is simpler than deactivating and activating, and I think it is safer because it is less likely to leave the probe down accidentally. This only works on probes that listen on a TCP port and have callbacks, but this includes most Nimsoft-provided probes.

You can stop the data_engine from a NAS script like this:
rc = nimbus.request(robot_addr.."/data_engine", "_stop")
If you want to do this in a script fired off by an AO profile, this should do the trick:
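-- Fetch the alarm that triggered this AO profile
al = alarm.get()
-- Build the address of the robot that raised the alarm
robot_addr = "/"..al.domain.."/"..al.hub.."/"..al.robot
-- Ask data_engine to exit; the controller restarts it automatically
rc = nimbus.request(robot_addr.."/data_engine", "_stop")
if not rc then
    nimbus.alarm(NIML_MAJOR, "Failed to restart data_engine")
end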
You could also hardcode the robot_addr variable if you have only one data_engine, which is usually the case. If you hardcode the robot_addr, you can run this daily/weekly/hourly by using the Schedule tab in the NAS. I would recommend upgrading to NAS 3.28 if you want to schedule a script; we had an issue with scheduled scripts not being started reliably, and that is supposed to be fixed in 3.28.
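For the scheduled case, a hardcoded version might look roughly like the sketch below. The domain, hub, and robot names are placeholders, not real values, and the probe_address helper is just something made up here to keep the string building in one place. The request and the failure check mirror the AO script above. Outside the NAS the nimbus table does not exist, so the script only prints the address it would have used.

```lua
-- Placeholder values (assumptions): replace with the domain, hub, and robot
-- that host your data_engine.
local DOMAIN, HUB, ROBOT = "MyDomain", "MyHub", "MyRobot"

-- Hypothetical helper: builds "/domain/hub/robot/probe".
local function probe_address(domain, hub, robot, probe)
    return "/" .. domain .. "/" .. hub .. "/" .. robot .. "/" .. probe
end

local target = probe_address(DOMAIN, HUB, ROBOT, "data_engine")

if nimbus then
    -- Inside the NAS: ask data_engine to exit so the controller restarts it.
    local rc = nimbus.request(target, "_stop")
    if not rc then
        nimbus.alarm(NIML_MAJOR, "Failed to restart data_engine at " .. target)
    end
else
    -- Outside the NAS: dry run only, nothing is sent.
    print("Would send _stop to " .. target)
end
```

Saved as a NAS script, this is the kind of thing you would attach to a schedule instead of an alarm-triggered AO profile, since it does not call alarm.get().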

-Keith
