DX Unified Infrastructure Management


Server Availability (PING )Monitoring using Robot - How are you managing ?

  • 1.  Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 12, 2020 01:32 PM
    Hey all,

    Just another feature that we miss from eHealth and that has been in the pipeline for years.

    We have thousands of robots, and for each one we need the net_connect probe to monitor ICMP status. This is a tedious process to follow and prone to human error when on-boarding a device. Is there any plan from the product team to add this to the controller probe or the basic cdm probe itself?

    Also, how are you folks managing this?

    We have 10K robots and are not sure whether all of them have been added to the net_connect probe. Is there any way to find this out?


  • 2.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 12, 2020 01:38 PM
    Did a search and found this doc which may be of help:
    How to add profiles to net_connect en masse using a file
    Article Id: 34267
    https://ca-broadcom.wolkenservicedesk.com/external/article?articleId=34267

    ------------------------------
    Support Engineer
    Broadcom
    ------------------------------



  • 3.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 12, 2020 01:40 PM
    May want to review this as well:
    How many profiles can net_connect handle on a single robot
    Article Id: 115479
    https://knowledge.broadcom.com/external/article?articleId=115479

    ------------------------------
    Support Engineer
    Broadcom
    ------------------------------



  • 4.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 12, 2020 01:46 PM
    Yes David, I am using the above process for bulk configuration. My question is how we can ensure compliance, i.e. that every robot present is also configured in net_connect.


  • 5.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 12, 2020 02:03 PM
    Just to make it clear, my best approach with queries is trial, error, research, try again.
    To get a list of the servers being monitored:
    select target from S_QOS_DATA where probe = 'net_connect'

    To see whether there is current data, a join with S_QOS_SNAPSHOT is necessary.
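
    As a rough illustration of that join, here is a minimal Python sketch that runs it against an in-memory SQLite stand-in. The column names (table_id, sampletime) and the toy rows are assumptions for illustration only; check them against your own UIM database schema before reusing the query.

```python
import sqlite3


def targets_with_recent_data(conn, cutoff):
    """Return net_connect targets that have a snapshot sample at or after cutoff."""
    rows = conn.execute(
        """
        SELECT DISTINCT d.target
        FROM S_QOS_DATA d
        JOIN S_QOS_SNAPSHOT s ON s.table_id = d.table_id
        WHERE d.probe = 'net_connect' AND s.sampletime >= ?
        ORDER BY d.target
        """,
        (cutoff,),
    ).fetchall()
    return [row[0] for row in rows]


# Toy stand-in tables (assumed, simplified schema), not the real UIM database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE S_QOS_DATA (table_id INTEGER PRIMARY KEY, probe TEXT, source TEXT, target TEXT);
    CREATE TABLE S_QOS_SNAPSHOT (table_id INTEGER, sampletime TEXT);
    INSERT INTO S_QOS_DATA VALUES
        (1, 'net_connect', 'hub01', 'server-a'),
        (2, 'net_connect', 'hub01', 'server-b'),
        (3, 'cdm', 'server-c', 'server-c');
    INSERT INTO S_QOS_SNAPSHOT VALUES
        (1, '2020-10-12 13:00:00'),
        (2, '2020-09-01 08:00:00'),
        (3, '2020-10-12 13:00:00');
    """
)

print(targets_with_recent_data(conn, "2020-10-12 00:00:00"))
```

    In this toy data only server-a has a recent net_connect sample, so server-b would show up as configured but stale.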

    ------------------------------
    Support Engineer
    Broadcom
    ------------------------------



  • 6.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 13, 2020 07:49 AM
    Edited by Luc Christiaens Oct 13, 2020 11:36 AM
    You could try something like this (it works in my small lab environment).
    This query will:
    - select all robots
    - select all sources that have a QoS coming from net_connect (= normally all monitored devices)
    - as a result, list all robots that don't have a QoS from net_connect

    SELECT ip,os_major,os_minor,domain,hub,robot,origin,is_hub,robot_active,user_tag_1,user_tag_2,address
    FROM CM_NIMBUS_ROBOT with(nolock)
    where robot not in (select distinct(source) from s_qos_data where probe = 'net_connect')
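
    The same comparison can be sketched outside the database as a set difference, for example when the two lists come from separate exports. This is only an illustration in Python; the function name and the sample robot names are hypothetical. It also skips empty source values, because in SQL a NULL returned by the subquery would make NOT IN return no rows at all.

```python
def robots_missing_ping(robots, net_connect_sources):
    """Return robots that do not appear as a source of any net_connect QoS.

    Names are compared case-insensitively and with surrounding whitespace
    removed; empty or None sources are ignored.
    """
    monitored = {s.strip().lower() for s in net_connect_sources if s}
    return sorted(r for r in robots if r.strip().lower() not in monitored)


# Hypothetical sample data, e.g. exported from CM_NIMBUS_ROBOT and S_QOS_DATA.
all_robots = ["web01", "DB02", "app03"]
qos_sources = ["WEB01", None, "db02 "]

print(robots_missing_ping(all_robots, qos_sources))
```

    With the sample data above, only app03 is reported as missing a ping check.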


  • 7.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 13, 2020 08:20 AM
    Luc,
    Nice query and a good one to save.
    thanks
    david

    ------------------------------
    Support Engineer
    Broadcom
    ------------------------------



  • 8.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 13, 2020 02:02 PM
    Hi Luc ,

     Thanks !!

    If we do not enable QoS from net_connect for ping monitoring, will the query below still work?

    SELECT ip,os_major,os_minor,domain,hub,robot,origin,is_hub,robot_active,user_tag_1,user_tag_2,address FROM CM_NIMBUS_ROBOT with(nolock) where robot not in (select distinct(source) from s_qos_data where probe = 'net_connect')


  • 9.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 14, 2020 12:45 AM
    Without a net_connect QoS metric you will need the solution Garin proposed: writing a Lua script (or using any other supported language) to get the devices monitored by net_connect.


  • 10.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 13, 2020 09:22 AM
    We were managing it via MCS and USM groups.
    The port check profile used to offer "Assign Dynamic" for the host, so it could work out, based on the hub (origin), which robots/devices belong to that origin, and then create their ping profiles in net_connect using MCS.

    Unfortunately, Assign Dynamic is also missing from the 20.3 MCS profiles.

    ------------------------------
    Chris
    ------------------------------



  • 11.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 13, 2020 10:18 AM
    We effectively do what Luc recommends, comparing the list of known robots with the list of installed probes. That handles at least the probe install issues. We then use Lua to script checks for the existence of the various net_connect profiles using the controller get/set configuration calls. That way we can filter out the human-error issues, like having net_connect installed but no profiles, or never going back to add the default gateway check.
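
    The classification step described above could look roughly like the following. This is a hedged Python sketch rather than the Lua script used by the poster: it assumes the installed probes and net_connect profile names for a robot have already been retrieved by some other means, and the profile name "default_gateway" is a hypothetical placeholder for whatever naming convention your site uses.

```python
def audit_robot(installed_probes, profiles, required=("default_gateway",)):
    """Classify one robot's ping-monitoring state.

    installed_probes: iterable of probe names found on the robot.
    profiles: iterable of net_connect profile names found for the robot.
    required: profile names that must exist (hypothetical naming convention).
    """
    if "net_connect" not in installed_probes:
        return "net_connect not installed"
    profiles = list(profiles)
    if not profiles:
        return "net_connect installed but no profiles"
    missing = [name for name in required if name not in profiles]
    if missing:
        return "missing profiles: " + ", ".join(missing)
    return "ok"


# Hypothetical inventory: robot -> (installed probes, net_connect profiles).
inventory = {
    "web01": ({"cdm"}, []),
    "db02": ({"cdm", "net_connect"}, []),
    "app03": ({"net_connect"}, ["self_ping"]),
    "hub04": ({"net_connect"}, ["self_ping", "default_gateway"]),
}

for robot, (probes, profs) in sorted(inventory.items()):
    print(robot, "->", audit_robot(probes, profs))
```

    Keeping the check as a pure function over already-collected data makes it easy to rerun against a fresh export, regardless of how the probe and profile lists were gathered.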


  • 12.  RE: Server Availability (PING )Monitoring using Robot - How are you managing ?

    Posted Oct 13, 2020 02:05 PM
    Looking for an update from the product team on whether this is planned for a future release.