DX Unified Infrastructure Management


Tech Tip: robot/probe version checker and report tool

  • 1.  Tech Tip: robot/probe version checker and report tool

    Posted Apr 06, 2022 01:41 AM
    Edited by Luc Christiaens Jul 04, 2022 03:59 AM
    If there is one thing all the clients I visited in the past had in common, it is the difficulty of getting a clear view of which probes/packages, and which versions, were installed in their environment.
    Via the good old Infrastructure Manager we could drag and drop a new version of a probe, but it was not always easy to see/check the overall result.
    ---
    The attached tool, available as Perl source and in compiled format, tries to get the highest available version from your LOCAL archive and compares it with the version information stored locally on each robot.
    Input:
    To help you create targeted reports you have several filters:
    - SQL filter parameters: -lo, -lh, -lr and -lp
    - regex parameters applied to the output of the generated SQL query, with which you can filter on: origin, hub, robot, probe, os_major_os_minor, user_tag1 and user_tag2

    Output:
    - HTML report that shows in red every robot/probe with a lower version installed
    - PU txt file with commands ready to be executed to upgrade the probes (they are NEVER executed/deployed/upgraded automatically)
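    The regex filtering step described under Input can be pictured with a minimal Python sketch. This is not the tool's Perl code: the function name, the row layout and the sample values are assumptions made for illustration, and only the field names come from the post.

```python
import re

# Field names as listed in the post; everything else here is illustrative.
FILTER_FIELDS = ("origin", "hub", "robot", "probe",
                 "os_major_os_minor", "user_tag1", "user_tag2")


def filter_rows(rows, **patterns):
    """Keep the rows where every given field matches its regex."""
    unknown = set(patterns) - set(FILTER_FIELDS)
    if unknown:
        raise ValueError(f"unsupported filter field(s): {sorted(unknown)}")
    compiled = {field: re.compile(rx, re.IGNORECASE)
                for field, rx in patterns.items()}
    return [
        row for row in rows
        if all(rx.search(str(row.get(field, "")))
               for field, rx in compiled.items())
    ]


# Made-up rows standing in for the output of the SQL query.
sample_rows = [
    {"origin": "prod", "hub": "hub-eu", "robot": "web01", "probe": "cdm",
     "os_major_os_minor": "Windows", "user_tag1": "web", "user_tag2": ""},
    {"origin": "prod", "hub": "hub-us", "robot": "db01", "probe": "cdm",
     "os_major_os_minor": "Linux", "user_tag1": "db", "user_tag2": ""},
    {"origin": "test", "hub": "hub-eu", "robot": "app01", "probe": "logmon",
     "os_major_os_minor": "Linux", "user_tag1": "app", "user_tag2": ""},
]

for row in filter_rows(sample_rows, hub=r"-eu$", probe=r"^cdm$"):
    print(row["robot"], row["probe"])
```

    All filters must match for a row to be kept, which is one plausible reading of how several regex parameters combine; the real tool may behave differently.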


    This example was run with parameters -ex"n"
    (-ex"n" also reports the probes that have a correct version installed)
    In the PU txt file you can find:

    A doc file is also included in the package/attachment.
    Remarks, suggestions, problems and questions are very welcome.

    Version 2.1 takes a totally different approach from previous versions.
    Initially, in 1.0, we started from a list of probes installed on robots coming from cm_nimbus_probe, but this had the limitation that we could not see/report the installed non-probe packages (java, app_disco_, c++ distrib, ...).

    Version 2.0:
    - Via the 4 available SQL selection parameters we create a list of servers 
    - for each selected server we issue a controller callback: inst_list_summary
    - on this package list we apply the regex filters and create a report and, when needed, PU commands (in a txt file)
    Version 2.3:
    - some additional code to recognize test fixes and hot fixes
    - -dp: delete package parameter, this makes it possible to remove packages from your environment in a controlled way
    - nimsoft_generic.pm can now be located in the same directory as the tool or in perl/lib

    The latest version is posted further down (nimsoft_check_packages_version_v2.3.zip)

    #uim #package #check #tool #perl #commandline #report #version #installed

    Attachment(s)



  • 2.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 06, 2022 09:51 AM
    Very nice.

    I invented a similar wheel a couple months ago when I discovered a significant inconsistency that occurs in the installed version information. There are several sources to identify the version (_status callback, get_probes callback, getrobots, local log files, vesions.txt, installed.pkg, cm_nimbus_robot, cm_nimbus_probe, etc.). And, while Nimsoft isn't an outlier in this regard, any time the same information is stored in many places it is bound to be wrong at times. I have observed (not proven but observed) three events that confound this: if you downgrade a probe install, the version information will get out of sync; if you install a package with a dependency and the install of the dependency fails, the version of the dependency will sometimes be recorded as a successful install; and if you have a package with multiple tabs where the first tab is successful and a subsequent tab is not, the package will get recorded as a successful install.

    Because of this, in my tool, I added the additional check where for each installed probe I issued the _status callback and compared that version (the one returned by the running probe) to what discovery_server populated in cm_nimbus_probe. 

    The results were a little disturbing as I found, across about 7k robots, about 250 cases where the installed version inventory value didn't match the running probe.

    Might not be a big deal for some, but consider the recent log4j issue, where you have to identify the systems that have a probe version with an exploit. When you generate the list of vulnerable systems you have to ask the question: is an error in that list of about one in 300 probes (250 failures in 7,000 robots with an average of 10 probes), or about 1 in 30 affected robots, acceptable? Probably not.
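    The arithmetic behind those two ratios, as a small Python check. The per-robot figure assumes every mismatch sits on a different robot, so it is an upper bound on the share of affected robots.

```python
robots = 7_000
probes_per_robot = 10      # average quoted in the post
mismatches = 250

total_probes = robots * probes_per_robot            # 70,000 probes
probes_per_mismatch = total_probes / mismatches     # one mismatch per N probes
robots_per_mismatch = robots / mismatches           # one mismatch per N robots

print(f"1 in {probes_per_mismatch:.0f} probes")     # 1 in 280 probes
print(f"1 in {robots_per_mismatch:.0f} robots")     # 1 in 28 robots
```

    So "one in 300" and "1 in 30" are rounded forms of 1 in 280 and 1 in 28.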

    Fixing the inconsistency is a matter of redeploying the mismatched probe again (except in the downgrade scenario, where it was necessary to install the latest version).

    Back to the tool, a question, how do you deal with version strings like 9.34HF1? Most places in Nimsoft, this data is stored as a number and so the fix number is lost.


  • 3.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 07, 2022 05:19 AM
    Hi Garin,
    Thanks for your valuable input.
    - adding a callback to the controller, like probe_list, seems to return the correct version info with the HF included. Very good tip, I will add this
    - for the version info from the local archive I was/am struggling with this HF info because the build number does not always increment correctly (for some probes it does and for others it doesn't)
    Therefore I opened an idea.
    But this morning I had a new idea to solve this problem: when replacing version 9.32hf2 by 9.321 and 9.32_hf3 by 9.323, the version to deploy would be correct (now I only need to add the callback to obtain the correct installed version).
    So I hope to have an updated version soon, with the extra callback and better documentation of the (strange) logic in obtaining probe/package versions
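    One way to picture the hot fix idea from this post is the Python sketch below. It is not the tool's code. It assumes the rule is "append the hot fix number as an extra digit", so 9.32_hf3 becomes 9.323 as in the post. Under that assumption 9.32hf2 becomes 9.322, which differs from the 9.321 written above, so treat the exact mapping as unconfirmed.

```python
import re

# Accepts "hf", "_hf" or "-hf" between the base version and the hot fix number.
_HOTFIX = re.compile(r"^(?P<base>\d+(?:\.\d+)*)[_-]?hf(?P<hf>\d+)$", re.IGNORECASE)


def normalize_version(version: str) -> str:
    """Fold a hot fix suffix into the version string; leave other strings alone."""
    cleaned = version.strip()
    match = _HOTFIX.match(cleaned)
    if match is None:
        return cleaned
    return match["base"] + match["hf"]


for raw in ("9.32", "9.32hf2", "9.32_hf3", "9.34HF1"):
    print(raw, "->", normalize_version(raw))
```

    A single appended digit stops being unambiguous once a probe reaches hot fix 10 or higher, so a real implementation would need a fixed-width or separate hot fix field.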


  • 4.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 06, 2022 12:17 PM

    Hello,
    We've created a grafana dashboard with that information

    You select the HUB, then the robot, and it shows you the probes installed and their versions.

    We also present the robot versions in a graph

    You could also modify the query to select a specific probe, and it could show you the same graph for the cdm probe instead of the robot.

    You can filter on every column and export it as csv if you want.

    We use the /uimapi endpoint to build the dashboard.

    Thanks




  • 5.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 06, 2022 12:24 PM

    Running curl -X GET "https://<operatorconsole>/uimapi/archive/list" -H "accept: application/json" will list every package inside the archive, so you can compare it with the result of running curl -X GET "https://<operatorconsole>/uimapi/hubs/<domain>/<hub>/<robot>" -H "accept: application/json"

    Or build a grafana dashboard that does that for you.
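    The comparison itself could look like the Python sketch below. It works on already-parsed JSON and does not call the API. The "name" and "version" keys are hypothetical, since the thread does not show the response layout of either endpoint, and versions are compared as tuples of integers, which is an assumption about how the version strings should be ordered.

```python
import re


def parse_version(text):
    """Turn '7.10' into (7, 10); non-numeric characters are ignored."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def find_outdated(archive_packages, installed_packages):
    """Return (name, installed, highest_in_archive) for every lagging package."""
    highest = {}
    for pkg in archive_packages:
        current = highest.get(pkg["name"])
        if current is None or parse_version(pkg["version"]) > parse_version(current):
            highest[pkg["name"]] = pkg["version"]

    outdated = []
    for pkg in installed_packages:
        best = highest.get(pkg["name"])
        if best is not None and parse_version(pkg["version"]) < parse_version(best):
            outdated.append((pkg["name"], pkg["version"], best))
    return outdated


# Made-up data standing in for the two API responses.
archive = [
    {"name": "cdm", "version": "6.80"},
    {"name": "cdm", "version": "7.10"},
    {"name": "logmon", "version": "4.20"},
]
installed = [
    {"name": "cdm", "version": "6.80"},
    {"name": "logmon", "version": "4.20"},
    {"name": "custom_probe", "version": "1.0"},
]

print(find_outdated(archive, installed))
```

    Packages that are installed but absent from the archive are skipped rather than flagged, because there is nothing to upgrade them to.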

    Thanks




  • 6.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 07, 2022 05:21 AM
    Hi Guillaume,
    Is this something that you can package so that we can install this?
    Seems to be very cool.


  • 7.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 07, 2022 09:54 AM
    Hello,
    I wrote a post about grafana and UIM a while ago: https://community.broadcom.com/enterprisesoftware/communities/community-home/digestviewer/viewthread?GroupId=1315&MessageKey=f65f96f7-4a22-499d-9fb3-5ff17d38eaa6&CommunityKey=170eb4e5-a593-4af2-ad1d-f7655e31513b&tab=digestviewer

    There's all the information needed to start using it.


  • 8.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 07, 2022 09:04 AM
    Thanks to remarks from Garin there is already a version 1.1
    - new option -cc"y" will add an extra callback list_probe to the robot controller to obtain the installed version
    - when running with -cc"y" and the target robot is not responding no job_add will be created for that robot/probe
    - modified logic to detect (or at least try to detect) Hot Fixes in the probe name (_hf, -hf and hf)

    Attachment(s)



  • 9.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 11, 2022 05:20 AM
    In attachment version 1.2:
    - option -hu gives the possibility to include hub robots (y), no hub robots (n) or only hub robots (o).  This gives you the possibility  to create an upgrade report for only your hubs and later to exclude them (default: y)
    - option -ii/-ie: ip filter include/exclude
    - try to recognize hot fixes and test fixes when executed with option -cc"y"

    Attachment(s)



  • 10.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 11, 2022 07:22 AM
    Luc

    I like the project, but we are moving to MSA/Windows login authentication for the database, so we will not have an account with a password for accessing the database directly. Can you offer a config option in the settings file to use either the SQL client (as is the case now) or UIMAPI to get the data from the database tables? Or move to uimapi fully, so that the DB type and connection become irrelevant?

    Cheers, Andrew

    ------------------------------
    Knows a little about UIM/DXim, AE, Automic
    ------------------------------



  • 11.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 11, 2022 09:12 AM
    Hi Andrew,
    And if you use in nimsoft_generic.dat:
    ---
    # - use Windows authentication with the current logged on user
    sql_user=trusted
    sql_password=
    ---
    Does this work in your environment?


  • 12.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 12, 2022 04:39 AM
    Luc

    MSA (Managed Service Account) is a type of account where AD manages the password, so you cannot log in as this user.  An MSA account is designed specifically for a service/process to have domain privileges but controlled by AD.  So no, I cannot use this option.  I would have to have a normal domain user account created, which we are trying to avoid, and that is why I would like to get the information via the UIM API interface if possible.

    Just as you are doing /archive/list for the packages in the archive you could use the same for /hubs then iterate /hubs/{domain}/{hubname}/robots
    and finally /hubs/{domain}/{hub}/{robot} which gives not only the probes but also the other details of the robot. Which was what I was thinking of, also as it is independent of the DB you do not have to include the different SQL statements in your code (or handling the errors of the different connectors :)
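    The iteration suggested here can be sketched in Python as follows. Only the three URL patterns come from the post. The `fetch` callable (anything that returns parsed JSON for a URL) and the "domain" and "name" keys are assumptions, because the thread does not show the response layout.

```python
from urllib.parse import quote


def walk_robots(base_url, fetch):
    """Yield the detail document of every robot under every hub.

    base_url: e.g. "https://<operatorconsole>/uimapi"
    fetch:    callable taking a URL and returning the parsed JSON response
    """
    for hub in fetch(f"{base_url}/hubs"):
        domain = quote(hub["domain"], safe="")
        hub_name = quote(hub["name"], safe="")
        for robot in fetch(f"{base_url}/hubs/{domain}/{hub_name}/robots"):
            robot_name = quote(robot["name"], safe="")
            yield fetch(f"{base_url}/hubs/{domain}/{hub_name}/{robot_name}")


# Canned responses so the sketch runs without a UIM server.
fake_api = {
    "https://oc.example/uimapi/hubs": [{"domain": "dom", "name": "hub1"}],
    "https://oc.example/uimapi/hubs/dom/hub1/robots": [{"name": "r1"}, {"name": "r2"}],
    "https://oc.example/uimapi/hubs/dom/hub1/r1": {"name": "r1", "probes": ["cdm"]},
    "https://oc.example/uimapi/hubs/dom/hub1/r2": {"name": "r2", "probes": ["logmon"]},
}

for detail in walk_robots("https://oc.example/uimapi", fake_api.__getitem__):
    print(detail["name"], detail["probes"])
```

    Passing `fetch` in keeps authentication and HTTP handling out of the loop, so the same walk works with whatever client and credentials the environment allows.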

    Regards, Andrew




  • 13.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 12, 2022 08:58 AM
    I'd actually suggest issuing the database query via the nas db_query callback maybe? 

    I think the uimapi calls to get hubs and robots give you the list of those that the hub/robot local to UIMAPI knows about at that instant, which might not be representative. These lists are notoriously incorrect if you have a multi tier environment (use tunnel proxy hubs for instance) or if you forward traffic through multiple hub hops to traverse firewalls/etc.




  • 14.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 13, 2022 01:42 AM
    @Garin Walsh is right: even though the availability of the links and nodes has improved (especially across multi-tier), the current state always seems to have some robots out for one reason or another. So it would be better to go against the DB (which is what the db_query callback gives you), unless there is a way to use the api to get the DB data rather than the current infrastructure data, perhaps an idea? :)
    Either way it's getting harder and harder to have direct DB access even with "encrypted" passwords stored in files. Which also brings up the point, if I have to use this method I need to know (provide to the security/auditors) the encryption method used to create the password encryption.





  • 15.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 13, 2022 03:23 AM
    @andrew, what message do you receive when you use the previously proposed integrated security logon (the one with NO password and the keyword "trusted" as user)? Otherwise, do you have a test system available with this sql setup?
    ---
    # - use Windows authentication with the current logged on user
    sql_user=trusted
    sql_password=
    ---
    @garin, it seems to be a good idea to use that db_query nas callback. But do you know in what format the db parameter must be specified?


  • 16.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 13, 2022 09:35 AM
    This is the db_query info:
    The "db" string is not the name of the database to query but the location - a value of "nas" queries the local database.db, "transactionlog" is the local transactionlog.db. I thought there was a "nis" option to hit that database but apparently not.

    I was confusing this with the query capability in dap which had the ability to query an external database. dap is deprecated now though. 

    So, sorry for the misdirect.


  • 17.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 19, 2022 07:31 AM
    Hi Garin,
    No problem, it was a good idea to test.
    The tool works fine as posted (version 1.2); I will only try to get it working with an MSA account (in the future)


  • 18.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 27, 2022 09:00 AM
    Hi Luc

    Thank you for this helpful Tool.

    Is it somehow possible to query the deployed java_jre Versions?


  • 19.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Apr 27, 2022 09:40 AM
    Good idea, will check if/how this java_jre check can be added.


  • 20.  RE: Tech Tip: robot/probe version checker and report tool

    Posted May 03, 2022 05:30 AM
    In attachment version 2.0 (nimsoft_check_package_version) 
    - as the tool is now package oriented instead of probe oriented, we changed the name from check_probe to check_package
    - it will now report on all installed packages from the selected servers
    remarks, tips, questions are very welcome.

    Attachment(s)



  • 21.  RE: Tech Tip: robot/probe version checker and report tool

    Posted May 04, 2022 05:12 AM
    Works very well Luc, like the move to package management (java_jre) among other things. Like the utility a lot :)

    A suggestion for the next version: use the automated_deployment_engine callback deploy_probe rather than the distsrv job_add callback. For a single deployment it doesn't make much difference, I will agree, but for multiple deployments the difference in performance, both resource usage and time, is huge (primarily due to the multi threading and package caching).

    Still looking at DB access when an MSA is used for MS-SQL access, as the UIMAPI does not allow you to specify the source of the hub/robot information.

    Finally, can you include the Nimsoft_crypt perl source? I need to provide the auditors with the encryption method used for "admin" passwords in files.

    Thanks Luc




  • 22.  RE: Tech Tip: robot/probe version checker and report tool

    Posted May 04, 2022 10:56 AM
    Not to go a little sideways on the discussion but with regards to the performance comparison between ADE and distsrv, I've had the exact opposite experience as far as performance is concerned. I've always found (with the exception of the period of time where the defect in distsrv caused it to only deploy serially based on the hub rather than based on the robot) that distsrv is about an order of magnitude faster and far far far less impactful on the hub it is running on. 

    On distsrv I do set block_size = 1048576 to get reasonable network utilization per send and max_inst = 40 which allows for 40 simultaneous install threads.

    On ADE I had to set numThreads = 40 to prevent ADE from starting a thread per package transfer. 

    To compare, using testing I did on versions distsrv 9.34 and ADE 20.42 where I was deploying logmon to 7k robots, distsrv used about 25MB of RAM and an insignificant amount of CPU. ADE would grow up to around 6GB in RAM and a significant amount of CPU (10% of the system hosting it on average).

    It is difficult to compare results because of the behavior of ADE, but looking at it mid progress, distsrv was completing a job about every 2 seconds and ADE was completing one job every 15 to 20 seconds. ADE is a little "bursty" in its flagging of things as complete, as it seems to use a polling methodology rather than notification.

    The disturbing thing is that with this large workload ADE "finished" but left about 5% of the jobs in a "Running" status but over the course of 24 hours following the testing never completed them. On restart, ADE deletes any jobs in a Running status and so these were lost. So now you are left in the situation where you have to find out what ADE failed to do and resubmit it. I also found a significant number (on the order of 2%) where ADE had flagged a package install as successful but no apparent install happened.

    When distsrv restarts with work pending, it resubmits the pending job and re-evaluates the dependencies so it can pick up where it left off.

    There's not much to mess with regarding the configuration of ADE so I am wondering if there's anything that you did that resulted in your positive experience. Based on my personal testing I'm at the point of never trusting ADE.


  • 23.  RE: Tech Tip: robot/probe version checker and report tool

    Posted May 05, 2022 05:04 AM
    @Garin Walsh, based on your comments I went back and had a look at the ADE and distsrv configs.  I was not comparing apples with apples. I am testing on an environment with approx 250 robots over 5 hubs in 2 layers, and I leveled the configs (increased distsrv threads, limited ADE threads).
    ADE java was already constrained with a max memory (512MB), as this is standard policy to limit java instances gobbling all the available memory.
    I also left the distsrv block_size at 32768 (recommended max).
    With the configs leveled I indeed saw similar performance in the case of delivery, and distsrv was able to deliver to a couple of problematic robots that ADE consistently failed on. With the ADE restart issue of deleting all "running" jobs (I agree this is very frustrating, and it undermines a major benefit of this app: reporting on the sync of packages across the environment), I would agree that distsrv seems the best choice for managing the package distribution process within this app.

    Regards, Andrew
    ​​




  • 24.  RE: Tech Tip: robot/probe version checker and report tool

    Posted May 05, 2022 06:26 AM
    In attachment version 2.1 (nimsoft_check_package_version)
    New in 2.1:
    - -du: deploy utility to use: a: ade d: distsrv, default: d
    - -dp: delete package run, n: no; yes: yes, default: n

    Attachment(s)



  • 25.  RE: Tech Tip: robot/probe version checker and report tool

    Posted Jun 08, 2022 08:54 AM
    Version 2.3:
    - some additional code to recognize test fixes and hot fixes
    - -dp: delete package parameter, this makes it possible to remove packages from your environment in a controlled way
    - nimsoft_generic.pm can now be located in the same directory as the tool or in perl/lib

    Attachment(s)