DX Infrastructure Manager


Method to verify vmware probe working and not in corrupted state??

  • 1.  Method to verify vmware probe working and not in corrupted state??

    Posted 05-16-2016 12:22 PM

    Hello, looking for some guidance here with regards to creating a script or a pu command to check if all my vmware probe instances are working correctly and not in a corrupted state.

     

    What I mean is that, out of the blue, we try to open a vmware probe and get the error: key host not found. The probe's GUI doesn't load, and while in this state the probe isn't monitoring anything.

    We've already implemented the increased Java memory startup change so the vmware probe starts with 256MB, but that doesn't fix it 100% of the time.

    Investigating the probe while it is in this broken state shows that the <properties> section, which contains the username, password, etc., is missing from the <resources> section of the vmware probe.

    So if you open the probe in RAW mode, you'd see the entries under the resources folder, but they would have no sub-folders. When the 'auto_monitors' and 'properties' sub-folders are missing, the probe is in this broken state.

    [Attached image: broken vmware example.jpg]

    We then have to go through the whole process of restoring the probe to a good working state from a backed-up, known-good cfg file.

     

    Anyway, my question: is there a way to check that these sub-entries exist under the <resources> entries? Right now the only way to verify is to manually open each instance in RAW mode and check. Would this be possible through a pu.exe callback or something else? There is no way to know the vmware probe is in this broken state unless you happen to open the probe and get the error pop-up: key host not found



  • 2.  Re: Method to verify vmware probe working and not in corrupted state??

    Posted 05-16-2016 12:46 PM

    Daniel,

    We also face the same issue sometimes; once the probe is restarted it works fine. Ideally the probe is used to monitor the servers, and it would be bad to have to monitor the probe itself. This needs to be fixed permanently on the probe side. It would also be helpful to have a dashboard in UMP for vmware probe health status.



  • 3.  Re: Method to verify vmware probe working and not in corrupted state??

    Posted 05-16-2016 01:11 PM

    Hi,

     

    Maybe you can look at the log file, find anything matching the error, and set up a logmon profile for it. I do this for quite a few critical probes. But again, this only diagnoses the problem, and it is useful when we already know how to fix it. And yes, the vmware probe is often prone to issues, so I always keep a package backup in my archive.

     

     

     

     

    -kag



  • 4.  Re: Method to verify vmware probe working and not in corrupted state??
    Best Answer

    Posted 05-16-2016 01:09 PM

    Okay, I just figured out a way to do this much quicker than manually going through each one, by using the pu.exe command in the \Nimsoft\bin directory.

    In the IM tool, go to Tools > Find, select Probe, and enter vmware. Generate the full list and copy the results to Excel. All we need is the probe address column: delete everything else and leave the probe addresses in column B.

    In Column A put:   pu -u administrator -p password

    Column B: the vmware probe address list entries. Example: /UIM/Hub1/prihub1/vmware

    Column C:  get_node_values NULL >> vm_check.txt

     

    So the full command would be:

    pu -u administrator -p password /UIM/HUB1/Robot1/vmware get_node_values NULL >> vm_check.txt

     

    We're getting each node in the vmware probe, checking its properties, and appending the output to this file: vm_check.txt

    So a good working return looks like:

    ======================================================
    Address: /UIM/HUB2/robot1/vmware Request: get_node_values

    ======================================================

    resources      PDS_PPDS        229

    0              PDS_PDS        220

      port            PDS_PCH          4 443

      host            PDS_PCH          13 HOST1-vc01

      interval        PDS_PCH          6 10min

      name            PDS_PCH          13 HOST1-vc01

      ID              PDS_PCH          13 HOST1-vc01

      active          PDS_PCH          5 true

      user            PDS_PCH          5 root

      msg            PDS_PCH          17 ResourceCritical

      key           PDS_PCH          21 HOST1-vc01.Profile

      pass            PDS_PCH          25 N6nxas2dFvs2wdH8qh0ToGw==

    status          PDS_PCH          3 OK

    May 16 12:45:10:169 pu: SSL - init: mode=0, cipher=DEFAULT, context=OK

     

    A bad entry looks like this. Since there are no user or pass entries, we can tell the probe is in a broken state:

    ======================================================

    Address: /UIM/HUB1/robot1/vmware Request: get_node_values

    ======================================================

    resources      PDS_PPDS        110

    0              PDS_PDS        101

      interval        PDS_PCH          6 10min

      name            PDS_PCH          11 10.1.10.27

      ID              PDS_PCH          11 10.1.10.27

      active          PDS_PCH          5 true

      msg            PDS_PCH          17 ResourceCritical

    status          PDS_PCH          3 OK

    May 16 12:44:59:357 pu: SSL - init: mode=0, cipher=DEFAULT, context=OK
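    Comparing the good and bad output by eye can also be automated. The following is a minimal Python sketch, not part of the original procedure. It assumes that vm_check.txt looks like the two samples above, that each resource starts with a PDS_PDS line, and that a resource without user and pass entries is broken. The function name find_broken_resources is made up for illustration. A probe whose resources section is completely empty would not be reported by this sketch.

```python
import re

# Matches the header line pu prints before each reply, e.g.
# "Address: /UIM/HUB1/robot1/vmware Request: get_node_values"
ADDRESS_RE = re.compile(r"^\s*Address:\s*(\S+)\s+Request:\s*(\S+)")


def find_broken_resources(text, required=("user", "pass")):
    """Map (probe address, resource index) to the required keys it lacks."""
    resources = {}   # (address, resource index) -> set of key names seen
    address = None
    current = None
    for line in text.splitlines():
        match = ADDRESS_RE.match(line)
        if match:
            address, current = match.group(1), None
            continue
        parts = line.split()
        if address is None or len(parts) < 2:
            continue  # separator lines, blank lines
        name, pds_type = parts[0], parts[1]
        if pds_type == "PDS_PDS":
            # Assumption: a PDS_PDS line opens a new resource entry.
            current = (address, name)
            resources[current] = set()
        elif pds_type.startswith("PDS_") and current is not None:
            resources[current].add(name)
    broken = {}
    for key, names in resources.items():
        missing = [k for k in required if k not in names]
        if missing:
            broken[key] = missing
    return broken


if __name__ == "__main__":
    with open("vm_check.txt", encoding="utf-8") as handle:
        for (addr, index), missing in find_broken_resources(handle.read()).items():
            print(f"{addr} resource {index}: missing {', '.join(missing)}")
```

    Run against the two samples above, this would report only the second probe, with both user and pass missing.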

     

    Once you have the full list of all vmware probe instance locations, copy the results out into Notepad++ and replace all the \t (tab) characters with " " (a space). Throw this into a batch file.

    Create a batch file with the results:

    D:

    cd D:\Program Files (x86)\Nimsoft\bin

    pu -u administrator -p password /UIM/Hub1/prihub1/vmware get_node_values NULL >> vm_check.txt

    etc....
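    The Excel and Notepad++ steps above could also be scripted. The following is a minimal Python sketch, not part of the original procedure. It assumes you have pasted the probe addresses from Tools > Find into a list, and the function name build_batch_lines is made up for illustration. It only builds the text of the batch file and does not run pu itself.

```python
def build_batch_lines(addresses, user, password,
                      callback="get_node_values",
                      out_file="vm_check.txt",
                      bin_dir=r"D:\Program Files (x86)\Nimsoft\bin"):
    """Return the lines of a batch file with one pu call per probe address."""
    # cd /d switches drive and directory in one step in cmd.exe.
    lines = [f'cd /d "{bin_dir}"']
    for address in addresses:
        address = address.strip()
        if not address:
            continue  # skip blank rows left over from the copy/paste
        lines.append(
            f"pu -u {user} -p {password} {address} {callback} NULL >> {out_file}"
        )
    return lines


if __name__ == "__main__":
    # Example addresses taken from this thread; replace with your own list.
    probe_addresses = [
        "/UIM/Hub1/prihub1/vmware",
        "/UIM/HUB1/Robot1/vmware",
    ]
    print("\n".join(build_batch_lines(probe_addresses, "administrator", "password")))
```

    Redirect the printed lines into a .bat file and run it from a command prompt, the same way as the hand-built batch file.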

     

    Then you can go through the list and check. This beats having to open and verify each one, one by one. Hope this helps folks.



  • 5.  Re: Method to verify vmware probe working and not in corrupted state??

    Posted 05-16-2016 04:36 PM

    Found a better method: get_status, which returns the last status of each profile set up in a vmware instance. The returned values make it easier to tell good from bad.

    The results now look like this:

    ======================================================

    Address: /UIM/Hub1/robot/vmware Request: get_status

    ======================================================

    loc1vc1 PDS_PCH          7 POLLED

    loc2vc1 PDS_PCH          7 POLLED

    May 16 13:27:37:087 pu: SSL - init: mode=0, cipher=DEFAULT, context=OK

     

    So it's now easier to see which ones are broken or empty.

    Broken entries have a NOK value:

    ======================================================

    Address: /UIM/Hub2/robot1/vmware Request: get_status

    ======================================================

    PROD-VC PDS_PCH          4 NOK

    May 16 13:28:49:341 pu: SSL - init: mode=0, cipher=DEFAULT, context=OK

     

    So the new command would be:

    pu -u administrator -p password /UIM/Hub1/prihub1/vmware get_status NULL >> vm_check.txt
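    The get_status output is simple enough to scan with a short script. The following is a minimal Python sketch, not part of the original procedure. It assumes every profile line has the form "<profile> PDS_PCH <length> <status>" as in the samples above (so profile names containing spaces would be skipped), and that NOK is the status to flag. The function names are made up for illustration.

```python
import re

# Matches the header line pu prints before each reply, e.g.
# "Address: /UIM/Hub1/robot/vmware Request: get_status"
ADDRESS_RE = re.compile(r"^\s*Address:\s*(\S+)\s+Request:\s*(\S+)")


def parse_get_status(text):
    """Return (address, profile, status) tuples from pu get_status output."""
    rows = []
    address = None
    for line in text.splitlines():
        match = ADDRESS_RE.match(line)
        if match:
            address = match.group(1)
            continue
        parts = line.split()
        # Profile lines look like: "<profile> PDS_PCH <length> <status>"
        if address and len(parts) == 4 and parts[1] == "PDS_PCH":
            rows.append((address, parts[0], parts[3]))
    return rows


def broken_profiles(text, bad_status="NOK"):
    """Return only the rows whose status equals bad_status."""
    return [row for row in parse_get_status(text) if row[2] == bad_status]


if __name__ == "__main__":
    with open("vm_check.txt", encoding="utf-8") as handle:
        for address, profile, status in broken_profiles(handle.read()):
            print(f"{address} profile {profile}: {status}")
```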



  • 6.  Re: Method to verify vmware probe working and not in corrupted state??

    Posted 05-19-2016 03:27 PM

    Hi Daniel,

    How and where can we run "get_status" for vmware probes? Could you send an example?

    What do you think about saving the results from "get_status" into a vmware-log.txt file and using the logmon probe to look for the word "nok" in vmware-log.txt?

    Regards,

    Note: I'm about to have CA UIM as a production monitoring system and I'm a beginner UIM user.

    Nelson



  • 7.  Re: Method to verify vmware probe working and not in corrupted state??

    Posted 05-31-2016 09:52 AM

    Hi Nelson,

    So the command would be:

    pu -u administrator -p password /UIM/HUB1/Robot1/vmware get_status NULL >> vm_check.txt

     

    Yes, you could set up a profile in logmon to scan this output file. The above command would have to be run at a regular interval so the logmon probe has fresh output to read.
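    One way to produce that file on a schedule is a small wrapper script. The following is a minimal Python sketch, not a tested part of this setup. It assumes pu is on the PATH (or that pu_path points to it in the Nimsoft bin directory) and that something like Windows Task Scheduler runs the script at the interval you want. The function names build_pu_argv and append_status are made up for illustration.

```python
import subprocess


def build_pu_argv(address, user, password, callback="get_status", pu_path="pu"):
    """Build the same pu call as the command above, as an argument list."""
    return [pu_path, "-u", user, "-p", password, address, callback, "NULL"]


def append_status(addresses, user, password, out_file="vm_check.txt",
                  run=subprocess.run):
    """Call pu once per probe address and append its output to out_file."""
    # Mode "a" appends, like the >> redirect in the batch file.
    with open(out_file, "a", encoding="utf-8") as handle:
        for address in addresses:
            result = run(build_pu_argv(address, user, password),
                         capture_output=True, text=True)
            handle.write(result.stdout)


if __name__ == "__main__":
    # Example address from this thread; replace with your own list.
    append_status(["/UIM/HUB1/Robot1/vmware"], "administrator", "password")
```

    The logmon profile would then watch vm_check.txt for NOK, as discussed above.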