I'm experiencing the following challenge with one of our customers:
1) Devices monitored with the docker_monitor probe are set up with non-unique IP addresses (INSIDE), which are later NATted (Network Address Translation) to a unique IP address (OUTSIDE). The robot is configured with the robotip_alias attribute to state the unique OUTSIDE IP address (this works correctly). The problem is that the IP address reported by the docker_monitor probe is the INSIDE one, which results in multiple devices being over-correlated into one master device.
2) An additional problem caused by the above: from time to time the device (server) is reinstalled and gets set up with a different MAC and IP address (INSIDE), while the OUTSIDE IP address stays the same. The correlation is confused again and does not correlate the old and new device, because of a mismatch in MAC and IP (OUTSIDE).
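To make the over-correlation in 1) concrete, here is a toy model in Python. It is not how discovery_server actually correlates; it only assumes that two devices are merged whenever they share a value for any property used as a correlation key. The host names, addresses and property names are made up for illustration.

```python
from collections import defaultdict

def correlate(devices, keys):
    """Group devices that share a value for any of the given keys (toy model)."""
    parent = {name: name for name in devices}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen = {}  # (key, value) -> first device that reported that value
    for name, props in devices.items():
        for key in keys:
            value = props.get(key)
            if value is None:
                continue
            owner = seen.setdefault((key, value), name)
            parent[find(name)] = find(owner)  # merge the two groups

    groups = defaultdict(set)
    for name in devices:
        groups[find(name)].add(name)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical data: two hosts behind NAT report the same INSIDE address.
devices = {
    "host-a": {"inside_ip": "10.0.0.5", "outside_ip": "192.0.2.10", "label": "host-a"},
    "host-b": {"inside_ip": "10.0.0.5", "outside_ip": "192.0.2.11", "label": "host-b"},
}

# Using the INSIDE address as a key merges both hosts into one group.
over_correlated = correlate(devices, ["inside_ip", "outside_ip"])
# Dropping the INSIDE address and keying on the label keeps them apart.
separated = correlate(devices, ["outside_ip", "label"])
print(over_correlated)
print(separated)
```

In this model the shared INSIDE address is the only thing linking the two hosts, so removing it from the keys is enough to split them.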
I've done some extensive troubleshooting and research, and I have the following idea of how this could be resolved:
1) include either docker.HostName or label device attributes coming from the docker_probe into correlation names
2) exclude docker.PrimaryIPV4Address from source IPAddresses property
3) achieve a correlation using either the name_origin or origin_IP correlation rule
Somehow I'm struggling to set this up correctly (I'm testing this using the dry_run_... probe command with a test cfg file, as recommended in the device discovery troubleshooting guide), so I'm looking for a UIM device correlation expert who would be willing to help me. CA Support case associated with this issue: 00985900
There is an ongoing project to onboard 4000 servers and 2000+ databases into UIM monitoring by the end of March, and this issue is a major blocker for it.
Kindly requesting help with this situation.
At this moment it looks like I have managed to adapt the
to correlate devices correctly in this complex customer's environment.
Under <device><correlation><correlation_names> I have introduced the docker_monitor.label attribute to be included:
<correlation_names>
   included_probe_properties = "any.PrimaryDnsName,any.OtherDnsNames,controller.RobotName,any.SysName,any.VMName,any.ComputerName,niscache.label,docker_monitor.label"
   excluded_probe_properties = "ibmvm.SysName"
</correlation_names>
and excluded the PrimaryIPV4Address coming from the docker_monitor probe under <device><correlation><defined_correlation_property><IpAddresses>:
<IpAddresses>
   included_target_properties = "any.PrimaryIPV4Address,any.OtherIPAddresses"
   included_source_properties = "any.PrimaryIPV4Address,discovery_agent.TargetIpAddresses"
   excluded_target_properties = "docker_monitor.PrimaryIPV4Address"
   excluded_source_properties = "docker_monitor.PrimaryIPV4Address"
   type = IpAddress
</IpAddresses>
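As a rough illustration of what the included/excluded lists above are meant to achieve, the following Python sketch filters probe.property names against them. The matching rules are my assumption, not documented discovery_server behaviour: "any." is treated as a wildcard for the probe name, and an exclusion wins over an inclusion. The function name is_selected is hypothetical.

```python
def is_selected(prop, included, excluded):
    """Return True if 'probe.Property' passes the include/exclude lists.

    Assumed semantics: 'any.X' matches property X from every probe,
    and a match in the excluded list overrides any inclusion.
    """
    probe, _, name = prop.partition(".")

    def matches(pattern):
        p_probe, _, p_name = pattern.strip().partition(".")
        return p_name == name and p_probe in ("any", probe)

    if any(matches(p) for p in excluded.split(",")):
        return False
    return any(matches(p) for p in included.split(","))

# Values copied from the <IpAddresses> source lists above.
ip_included = "any.PrimaryIPV4Address,discovery_agent.TargetIpAddresses"
ip_excluded = "docker_monitor.PrimaryIPV4Address"

# Values copied from the <correlation_names> section above.
names_included = ("any.PrimaryDnsName,any.OtherDnsNames,controller.RobotName,"
                  "any.SysName,any.VMName,any.ComputerName,niscache.label,"
                  "docker_monitor.label")
names_excluded = "ibmvm.SysName"

# The INSIDE address reported by docker_monitor is dropped from IP correlation.
print(is_selected("docker_monitor.PrimaryIPV4Address", ip_included, ip_excluded))
# The same property from another probe still counts.
print(is_selected("controller.PrimaryIPV4Address", ip_included, ip_excluded))
# The docker_monitor label now takes part in name correlation.
print(is_selected("docker_monitor.label", names_included, names_excluded))
```

Under these assumptions the docker_monitor IP no longer links devices together, while the label gives the old and new instance of a reinstalled host a common name to correlate on.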
After saving the changes and restarting the discovery_server, I forced a new correlation run on the over-correlated robots (devices using the same INSIDE IP) using the callback
which resulted in having the over-correlated devices split correctly.
For the under-correlated robots (devices using a different MAC and INSIDE IP address after a restart) I have done the following:
Which resulted in merging the duplicates correctly.
We have also run a test: restarting a robot running the docker probe, which resulted in a change of the robot's MAC and INSIDE IP address. No new duplicates were created, and the old and new robot were correctly correlated into one device.
anyone else care to comment?
what would you like to know? I have found the solution myself.