DX Infrastructure Manager


Probe Non responding Alerts

  • 1.  Probe Non responding Alerts

    Posted 08-06-2015 10:31 AM

    In most cases the robot status shows Up, but the probes (spooler, hdb) are down. How can admins identify this scenario?



  • 2.  Re: Probe Non responding Alerts

    Posted 08-06-2015 11:44 AM

    This question's been asked at least a couple of times, I'm pretty sure something will turn up if you do a search. However, I suspect you won't find any ready solutions.

     

    I have made a probe that does a "getrobots" to a hub and then does "port_list" to every robot that is in "normal" status. The probe is configured to expect that certain probes have a port in a certain range. If a probe doesn't have a port, an alarm is raised. If a robot doesn't respond to port_list, an alarm is raised. If a robot has a faulty IP (from config file), an alarm is raised. I deploy this probe to a robot on every hub, typically the hub itself. Unfortunately I can't share the probe at the moment, but you can do something similar with either scripting or a custom probe.
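    A rough Lua sketch of that approach, not the actual probe. It only checks that each expected probe has a port at all, not that the port falls in a given range. Several things here are assumptions the thread does not confirm: that "getrobots" entries carry a status field where 0 means normal, that "port_list" is answered at the robot address plus "/controller", and that each returned entry has name and port fields. The function name find_missing_probes and the injected request parameter are made up for illustration; inside NAS you would pass nimbus.request.

```lua
-- Sketch only: fields and addresses marked "assumed" are not confirmed by the thread.
-- `request` is injected so the logic can be tried outside NAS with a stub.
local function find_missing_probes(request, hub_addr, expected)
  local problems = {}
  local robots = request(hub_addr, "getrobots")
  if robots == nil or robots.robotlist == nil then
    return problems
  end
  for _, robot in pairs(robots.robotlist) do
    local addr = tostring(robot.addr)
    -- assumed: status 0 means the robot is in "normal" state
    if tonumber(robot.status) == 0 then
      if tostring(robot.ip) == "127.0.0.1" then
        table.insert(problems, addr .. ": loopback IP")
      end
      -- assumed: the controller answers port_list at <robot addr>/controller
      local ports = request(addr .. "/controller", "port_list")
      if ports == nil then
        table.insert(problems, addr .. ": no response to port_list")
      else
        -- assumed: each entry is a table with "name" and "port"
        local seen = {}
        for _, entry in pairs(ports) do
          if type(entry) == "table" and entry.name and tonumber(entry.port) then
            seen[entry.name] = tonumber(entry.port)
          end
        end
        for _, probe in ipairs(expected) do
          if seen[probe] == nil then
            table.insert(problems, addr .. ": " .. probe .. " has no port")
          end
        end
      end
    end
  end
  return problems
end

-- Hypothetical usage inside the NAS script editor:
-- for _, line in ipairs(find_missing_probes(nimbus.request, "hub", {"spooler", "hdb"})) do
--   print(line)
-- end
```

    Instead of printing, each entry in the returned list could be turned into an alarm, which is what the probe described above does.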

     

    -jon



  • 3.  Re: Probe Non responding Alerts

    Posted 08-06-2015 12:47 PM

    How are these probes created, Jon? Which programming language is used?

    But these probes should be provided by the UIM team. Are there any plans to implement this in the future?



  • 4.  Re: Probe Non responding Alerts

    Posted 08-06-2015 03:52 PM

    I don't know whether the core team has plans to improve the self-monitoring aspect; maybe someone from the team can shed light on that. I don't remember who's responsible for core probes, maybe Jim.Perkins?

     

    You could use any language that UIM has an SDK for: Lua, Perl, C, C++, C# or Java

     

    -jon



  • 5.  Re: Probe Non responding Alerts

    Posted 08-06-2015 06:00 PM

    No.  That is not me.  You are thinking of the other Jim PM, cooja09.



  • 6.  Re: Probe Non responding Alerts

    Posted 08-06-2015 04:20 PM

    The issue I hit is when a robot starts up and binds to 127.0.0.1. That is a broken probe situation, and we cannot reach the robot on that box. We have to RDC to the box to fix and update robot.cfg. I wish they would ship a fix for this situation out of the box, instead of us having to specify the two settings and the IP config filter as well.

    Does anyone know offhand how we can check our environment for robots in this situation? Yes, we can manually go through each hub, but something more programmatic would be very helpful. No alerts are triggered when the robot starts up in this state.



  • 7.  Re: Probe Non responding Alerts

    Posted 08-07-2015 02:09 AM

    In this situation the robot reports its IP to the hub as usual, so it will show up as 127.0.0.1. You can get this information from the hub callback "getrobots" as well. My probe also checks for this, and raises an alarm like the following:

     

    "XXYYZZNN has ip 127.0.0.1 which is not allowed. Triggered by IP profile ip is not loopback"

    -jon



  • 8.  Re: Probe Non responding Alerts

    Posted 08-07-2015 11:46 AM

    Here is a Lua script I wrote which goes through all hubs, does a "getrobots" call on each of them, and prints the full UIM address of any robot that has 127.0.0.1 listed as its IP. You can run it directly in the NAS script editor on the primary hub, and it will print the list in the message window.

     

    ----------------------

    -- Ask the local hub for the list of all known hubs
    local gethubs = nimbus.request("hub", "gethubs")
    if gethubs ~= nil then
      for i, hub in pairs(gethubs.hublist) do
        -- Ask each hub for the robots registered with it
        local getrobots = nimbus.request(hub.addr, "getrobots")
        if getrobots ~= nil then
          for j, robot in pairs(getrobots.robotlist) do
            -- Print robots that registered with the loopback address
            if tostring(robot.ip) == "127.0.0.1" then
              print(tostring(robot.addr) .. " " .. tostring(robot.ip))
            end
          end
        end
      end
    end

    --------------------



  • 9.  Re: Probe Non responding Alerts

    Posted 08-07-2015 11:52 AM

    Jason, does the above script extract the devices with a loopback address?



  • 10.  Re: Probe Non responding Alerts

    Posted 08-07-2015 12:21 PM

    Yes, I believe it should help identify these robots.

     

     

    -Jason



  • 11.  Re: Probe Non responding Alerts

    Posted 08-07-2015 02:05 PM

    awesome thank you!!!