DX Infrastructure Manager


Restarting data_engine probe

  • 1.  Restarting data_engine probe

    Posted 09-11-2009 05:11 AM
    Every now and then the data_engine probe has trouble writing data to my SQL server.  It still writes the data, but it slows down so much that the data_engine queue starts to fill up and I get this alarm:

    The queue 'data_engine' is 349 MB. Check the hub configuration.

    The size continues to increase until I restart the data_engine probe.  I'd like to have an auto operator watch for this alarm and issue the pu command to restart the probe.  I have the syntax to deactivate and reactivate the probe, but I see a -R option that is for restarting a probe, and I can't figure out the syntax to use it.  Any ideas?


    Also, is there a better way to do this?  Is there a way to schedule a weekly restart of the probe, so that I don't have to wait for it to slow down and generate the alarm?

    Thanks.


  • 2.  Restarting data_engine probe

    Posted 09-11-2009 10:57 AM
    When I want a probe to do a cold restart, I have found that sending the probe a _stop command is the simplest method.  When it exits, the controller restarts it automatically.  It is simpler than deactivating and activating, and I think it is safer because it is less likely to leave the probe down accidentally.  This only works on probes that listen on a TCP port and have callbacks, but this includes most Nimsoft-provided probes.

    You can stop the data_engine from a NAS script like this:

        rc = nimbus.request(robot_addr.."/data_engine", "_stop")

    If you want to do this in a script fired off by an AO profile, this should do the trick:

        al = alarm.get()
        robot_addr = "/"..al.domain.."/"..al.hub.."/"..al.robot
        rc = nimbus.request(robot_addr.."/data_engine", "_stop")
        if not rc then
           nimbus.alarm(NIML_MAJOR, "Failed to restart data_engine")
        end

    You could also hardcode the robot_addr variable if you have only one data_engine, which is usually the case.  If you hardcode the robot_addr, you can run this daily/weekly/hourly by using the Schedule tab in the NAS.  I would recommend upgrading to NAS 3.28 if you want to schedule a script; we had an issue with scheduled scripts not being started reliably, and that is supposed to be fixed in 3.28.
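
    For the scheduled case, the hardcoded variant might look like the sketch below.  The address /MyDomain/MyHub/MyRobot is a made-up placeholder, not a real path; substitute the /domain/hub/robot address of the robot that runs your data_engine.  The script assumes it runs inside the NAS, which supplies the nimbus table and the NIML_MAJOR constant, and it keeps the same failure check as the AO script, so verify how nimbus.request reports errors on your NAS version before relying on it.

```lua
-- Hypothetical address: replace with the /domain/hub/robot path
-- of the robot that hosts your data_engine probe.
local robot_addr = "/MyDomain/MyHub/MyRobot"

-- Ask the probe to exit; the controller should start it again on its own.
local rc = nimbus.request(robot_addr .. "/data_engine", "_stop")

-- Same check as the AO script: raise an alarm if the request gave nothing back.
if not rc then
   nimbus.alarm(NIML_MAJOR, "Failed to restart data_engine")
end
```

    Because nothing here depends on alarm.get(), the script does not need an alarm to trigger it, which is what makes it usable from the Schedule tab.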

    -Keith