DX Unified Infrastructure Management

  • 1.  cdm not sending some QOS

    Posted May 19, 2022 08:54 AM
    Hello,

    I am encountering a problem at a specific client with about 200 robots.
    About 30 of them do not send some of the configured QOS. The number of CPUs and disk-related QOS are sent, but everything else (CPU usage, memory and network QOS) is not.
    I have verified with DrNimbus and tcpdump that these QOS are not sent.

    The cdm logs suggest the data is being collected, and I cannot see any specific difference between the logs on a robot that sends these QOS and one that does not.

    OS is RHEL 8.4. UIM version is 20.4. The sysstat package is installed and the sysstat service is running. I have tested with cdm versions 6.81-MC, 6.71-MC and 6.71.
    I have tried reinstalling the robot.
    I have tested configuring cdm with MCS profiles and from Admin Console.

    Is there some package required by cdm that I am missing on these servers?

    Thank you,

    Marius


  • 2.  RE: cdm not sending some QOS

    Broadcom Employee
    Posted May 20, 2022 04:24 AM
    Hi
    Are you using MCS, and if so, enhanced or legacy templates?
    If you are using enhanced templates, look at the plugin metrics to check that the target is set correctly.
    Regards,
    Rick


  • 3.  RE: cdm not sending some QOS

    Posted May 23, 2022 11:56 AM
    I haven't looked, but do make sure that version of Linux is supported for the version of CDM you are using.

    CDM essentially scrapes the output of some of the standard OS commands and in past versions there have been occasional issues with unexpected numeric values - especially where the size of a number throws the standard command formatting off. So for instance look at the output of ps -ef on one of these systems that's not reporting QOS and see if there's anything unusual in size. Similarly look at the /proc filesystem and if there's anything unusual there. 
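
A quick way to spot oversized numbers of the kind described above is to scan the relevant text for unusually long digit runs. The following is only a minimal sketch: the function name `find_wide_numbers`, the 10-digit threshold and the choice of /proc files are assumptions made for illustration, not anything CDM itself defines.

```python
import re


def find_wide_numbers(text, max_digits=10):
    """Return (line_number, token) pairs for digit runs longer than max_digits."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for token in re.findall(r"\d+", line):
            if len(token) > max_digits:
                hits.append((lineno, token))
    return hits


if __name__ == "__main__":
    # Assumed set of files to inspect; adjust to whatever you want to compare.
    for path in ("/proc/stat", "/proc/meminfo", "/proc/net/dev"):
        try:
            with open(path) as fh:
                content = fh.read()
        except OSError as exc:
            print(f"{path}: cannot read ({exc})")
            continue
        for lineno, token in find_wide_numbers(content):
            print(f"{path}:{lineno}: {token}")
```

Running it on a robot that reports QOS and on one that does not, then comparing the output, may show whether the failing systems have anything unusual. The same function can be fed the saved output of ps -ef.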

    Otherwise CDM is pretty reliable. 

    If you find nothing of note, my suggestion is to delete the probe through IM, delete the cdm directory from the server, then deploy the original 6.80 (non-template version) and watch what QOS the default configuration generates. If that works, add your existing configuration back one piece at a time, paying attention to where it breaks.

    Log level 5 is a must while doing this, so that you can provide the required evidence to Support when you find what breaks it.


  • 4.  RE: cdm not sending some QOS

    Posted May 23, 2022 03:15 PM
    Mixing probe administration between MCS, IM and AC may cause a problem, especially if the probe is deployed via MCS and any change is then made via IM or AC.
    In such a case, it is best to follow Garin's advice: remove the probe, redeploy it, and set log level 5 to see what it logs for the missing metrics.

    This also might help if you will be using mcs:
    1 open Admin Console
    2 select the robot (srv-rhel-gluster01 in this example)
    3 from the 3 dot menu for controller select View Probe Utility in New Window
    4 select _nis_cache_clean & click the green arrow
    5 select _reset_device_id_and_restart & click the green arrow
    6 close this tab
    7 now in Admin Console select the primary hub
    8 from the 3 dot menu for mon_config_service select View Probe Utility in New Window
    9 select plugin_metric_correction
    10 under robot_names enter the robot name and click the green arrow

    Then check the spooler log on the robot.


  • 5.  RE: cdm not sending some QOS
    Best Answer

    Posted Mar 22, 2023 07:04 AM
    Edited by Marius Nitu Mar 22, 2023 07:17 AM

    I narrowed the problem down to enabling the Nic Monitor enhanced MCS profile. Before it was enabled, the cdm probes were collecting metrics as normal. Afterwards, only the number of CPUs and some of the disk metrics were collected on the robots with the problem.

    I have seen the same problem at another client running Windows 2019, so the specific OS is not the problem.

    The solution I found was:

    1. Disable the Nic Monitor enhanced MCS profile;
    2. Run a script to rename plugin_metric.cfg on the problem robots;
    3. Delete the problem robots from the inventory, while keeping the metrics and the alarms;
    4. Restart the robots in IM.
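
The rename step could be scripted roughly as follows. This is a minimal sketch, not the script actually used: it assumes it runs locally on each affected robot and that the full path to plugin_metric.cfg is passed on the command line (the thread does not give the location, so none is hard-coded). The function name `rename_plugin_metric_cfg` and the `.bak-<timestamp>` suffix are hypothetical choices.

```python
import sys
import time
from pathlib import Path


def rename_plugin_metric_cfg(cfg_path, suffix=None):
    """Rename cfg_path by appending suffix; return the new Path, or None if the file is missing."""
    src = Path(cfg_path)
    if not src.is_file():
        return None
    if suffix is None:
        # Hypothetical naming scheme: timestamped backup suffix.
        suffix = time.strftime(".bak-%Y%m%d%H%M%S")
    dst = src.with_name(src.name + suffix)
    if dst.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(str(dst))
    src.rename(dst)
    return dst


if __name__ == "__main__":
    # Each argument is the path to a plugin_metric.cfg file on this robot.
    for arg in sys.argv[1:]:
        result = rename_plugin_metric_cfg(arg)
        if result is None:
            print(f"{arg}: not found, nothing renamed")
        else:
            print(f"{arg} -> {result}")
```

Renaming rather than deleting keeps the old file available in case it needs to be inspected or restored later.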

    After this adventure I stopped using enhanced MCS profiles.