I would like to start a discussion around how you keep your Spectrum systems up to date with the latest changes to your devices and the connections in your network. Please reply with how you maintain your Spectrum map, whether it is through Spectrum components, other CA tools, other vendors' tools or even your own special processes, scripts and tools.
As you know, Spectrum does a great job breaking down a device and discovering what capabilities the device has. It also does a good job figuring out how devices are connected. But without an accurate, complete modeling of the devices and connectivity, Spectrum can have a difficult time identifying the root cause.
We have struggled with this concept for years. We have tried using AutoDiscovery, custom scripts and internal processes. Still, years later, we have difficulty keeping everything up to date. New cards are added to routers. New devices are added to the network. Vendors mangle standard MIBs or only support their proprietary MIBs, which can make it difficult for Spectrum to learn all the connections.
So, how do you keep your landscape up to date?
To start the discussion off, here is what we do.
To add devices into Spectrum we use two methods:
Connectivity mapping is handled in several ways:
Ideas we have contemplated.
We're doing something very similar (an in-house inventory tool that verifies SNMP, then pushes to Spectrum via Modeling Gateway).
One thing that we have done to try to catch human error is set up an autodiscovery that walks all of the IP ranges that we use for network device management addresses. It's scheduled to run weekly and auto export full results as a CSV, and not model anything. We then have a script that compares the results to the contents of the inventory tool and sends the differences to us. This has helped a ton with undocumented devices, but it hasn't helped as much with undocumented changes to existing devices.
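For anyone who wants to try the same comparison, here is a minimal sketch of the diff step. It assumes both the AutoDiscovery export and the inventory tool can be read as CSV; the column names ("IP Address", "mgmt_ip") and the sample rows are made up for illustration and are not the real export format.

```python
import csv
import io


def load_ips(csv_text, column):
    """Return the set of non-empty values found in `column` of a CSV document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row.get(column) or "").strip() for row in reader} - {""}


def diff_inventory(discovered, inventory):
    """Report addresses that appear on only one side of the comparison."""
    return {
        # Answered discovery but is missing from the inventory tool.
        "undocumented": sorted(discovered - inventory),
        # Documented in the inventory tool but not seen by discovery.
        "not_discovered": sorted(inventory - discovered),
    }


# Hypothetical sample data; real exports will have different columns.
discovery_csv = (
    "IP Address,Device Type\n"
    "10.0.0.1,Router\n"
    "10.0.0.2,Switch\n"
    "10.0.0.9,Unknown\n"
)
inventory_csv = (
    "mgmt_ip,hostname\n"
    "10.0.0.1,core-rtr-01\n"
    "10.0.0.2,acc-sw-01\n"
    "10.0.0.5,acc-sw-02\n"
)

report = diff_inventory(
    load_ips(discovery_csv, "IP Address"),
    load_ips(inventory_csv, "mgmt_ip"),
)
print(report)
```

The "not_discovered" side is worth mailing out too, since it tends to catch devices that were decommissioned without the inventory being updated.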
We've also got the in-house inventory tool doing its own discovery every night. If the sysObjectID of a device changes, it pushes a delete and then an add to Spectrum. If the firmware version, name, or IP changes, it pushes a change.
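The decision rule is roughly the following. This is only a sketch of the logic as described; the `DeviceSnapshot` fields and the action names are placeholders, and the actual push to Spectrum is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSnapshot:
    """What the nightly discovery recorded for one device (hypothetical fields)."""
    sys_object_id: str
    firmware: str
    name: str
    ip: str


def plan_actions(old, new):
    """Return the ordered list of actions to push for one device."""
    # A different sysObjectID means the device itself was replaced,
    # so the old model is removed and a fresh one is created.
    if old.sys_object_id != new.sys_object_id:
        return ["delete", "add"]
    # Same device type, but an attribute changed: update in place.
    if (old.firmware, old.name, old.ip) != (new.firmware, new.name, new.ip):
        return ["change"]
    # Nothing changed, nothing to push.
    return []


before = DeviceSnapshot("1.3.6.1.4.1.9.1.1", "15.1", "core-rtr-01", "10.0.0.1")
upgraded = DeviceSnapshot("1.3.6.1.4.1.9.1.1", "15.2", "core-rtr-01", "10.0.0.1")
replaced = DeviceSnapshot("1.3.6.1.4.1.9.1.2", "15.2", "core-rtr-01", "10.0.0.1")

print(plan_actions(before, upgraded))
print(plan_actions(before, replaced))
```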
The biggest gap we have is with point to point links, especially redundant links. I have yet to find a satisfactory way of handling lack of information about link changes.
WILLIAM BARNES wrote: We have an in-house developed tool that is supposed to validate that a device can be properly managed (tests for SNMP and other configurations). Once it passes, then it can add the device to the various tools in our environment. This is done once when the device is added and again to remove the device at the end of its life.
WILLIAM BARNES wrote:
What logic does the import script use to determine where in the Universe Topology to place the device?
WILLIAM BARNES wrote: Discover connections after link up Events - on large devices this can impact Spectrum performance. Especially if there are ports flapping. We have brought Spectrum servers down with this enabled.
Have you tried this on Spectrum v10.x? I'm wondering whether Spectrum would be able to handle this on that version. Alternatively, maybe "DeviceDiscoveryAfterReconfig (0x11d27)" would work better, along with the scheduled reconfig script you mentioned.
WILLIAM BARNES wrote: using auto discovery to find devices, but not model them. Use it to feed the in-house tool.
This is a good idea. I hadn't thought of passively using the AutoDiscovery in this way.
Thanks for starting this thread. Unfortunately I don't have anything more to add, since it seems we're in about the same place as you. It's refreshing to at least hear this is a common problem, rather than us just overlooking the silver bullet Spectrum feature that solves all of these problems.
I’d like to note that in 188.8.131.52 and above we added the “Discover Connections During Scheduled Discovery” option to the AutoDiscovery GUI. Previously, a scheduled discovery would not discover or update the connections for models that already existed. With this option selected, you can run scheduled discoveries at night to update the model connectivity.
I know this doesn’t solve the issue, but I wanted to throw it out there for those that may not be aware as it has helped to keep Spectrum modeling more accurate in many environments…
If that option is checked, what is the scope of connection discovery?
It should be in context of the models in the discovery (and possibly their connected devices) however every now and again I have seen it jump outside of that (not sure why)…
Bill_Barnes and mwegner are describing their use of the AutoDiscovery as being passive though, where they're only using it for Discovery but not for Modeling - and I'm now considering this approach too.
Does the “Discover Connections During Scheduled Discovery” option take effect if the AutoDiscovery does not go into the Modeling process?
I don’t believe so. I believe it’s only invoked during the modeling phase…
We do something similar. I have some scripts which check the info before I even try to model it in Spectrum, and get additional information such as device model, etc.
I have also written some logic config which determines which landscapes the device needs to be modelled on and then checks that both landscapes can actually poll the device based on some settings.
Container placement is also determined by different logic based on where the device add request is coming from. This is based mainly on naming standards.
The scripts were written so that if the landscape allocation logic changes, the devices will be moved to the new landscape and removed from the old one.
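As an illustration of placement by naming standard, something along these lines would work. The hostname patterns, the site code format and the container paths are all invented for the example; every shop's naming standard will differ.

```python
import re

# Ordered (pattern, container template) rules; the first match wins.
# Both the patterns and the container paths are hypothetical.
PLACEMENT_RULES = [
    (re.compile(r"^(?P<site>[a-z]{3})-rtr-\d+$"), "Universe/{site}/Routers"),
    (re.compile(r"^(?P<site>[a-z]{3})-sw-\d+$"), "Universe/{site}/Switches"),
]

# Devices that do not follow the naming standard land here for manual review.
DEFAULT_CONTAINER = "Universe/Unsorted"


def container_for(hostname):
    """Map a hostname to a container path using the naming-standard rules."""
    name = hostname.lower()
    for pattern, template in PLACEMENT_RULES:
        match = pattern.match(name)
        if match:
            return template.format(site=match.group("site").upper())
    return DEFAULT_CONTAINER


print(container_for("NYC-RTR-01"))
print(container_for("printer7"))
```

Keeping the rules in a plain ordered list makes it easy to add a source-specific rule set per add-request origin without touching the matching code.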
Once the device is added, we look at the ifAlias to extract information such as circuit IDs. We also use this to set whether we want Spectrum to monitor a particular interface on the device.
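A rough sketch of the ifAlias parsing, assuming a convention of `KEY:value` pairs separated by semicolons. The `CID` and `MON` keys are made up for the example; the real tags depend on whatever description standard your network team follows.

```python
import re

# Matches "KEY:value" or "KEY=value" fields; values run up to the next ';'.
FIELD_RE = re.compile(r"(?P<key>[A-Za-z]+)\s*[:=]\s*(?P<value>[^;]+)")


def parse_if_alias(alias):
    """Extract the circuit ID and the monitoring flag from an ifAlias string."""
    fields = {
        m.group("key").upper(): m.group("value").strip()
        for m in FIELD_RE.finditer(alias)
    }
    return {
        # None when the alias carries no circuit ID tag.
        "circuit_id": fields.get("CID"),
        # Interfaces are only monitored when explicitly flagged.
        "monitor": fields.get("MON", "").lower() in {"yes", "true", "1"},
    }


print(parse_if_alias("CID:ABC-12345; MON=yes"))
print(parse_if_alias("uplink to core"))
```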
We use Spectrum's Discover Connections and automatically monitor any connected devices as well.
Once a device is modelled, we do a lot of automated daily checks:
1. Check if there are any devices which have the 'DIFFERENT TYPE MODEL' alarms. These are deleted and re-modelled automagically as this indicates that the device was upgraded/changed out.
2. We also check to see that we are at least getting some traps. We assume we should get at least 1 trap a day if properly configured.
3. We also look for devices which are down for more than X days and query these with Networks.
4. If a device is in Maintenance Mode and down for more than X days we also query it as sometimes our L1 engineers forget to take them out of maintenance mode.
5. Make sure we have recent configs for devices that are using NCM
6. Make sure the backups have run successfully, and that the landscapes are in sync.
7. Flag any 'Initial' state models. We don't like the Initial state for pingable models. This hardly ever happens, but we do the check anyway. Empty containers also change to the 'Initial' state, so these are easy to find and delete.
8. Check for any 'Unknown trap received' events so we can map them using Event Config.
9. We generate a lot of reports from Spectrum e.g. Different types of alarms, e.g. VNM related alarms (Spectrum operation) and other critical alarm types.
10. I have a random checker that does SNMP checks to see if the device in Spectrum has everything right. The randomly chosen devices get tagged so they can't be chosen again until every other device has been checked.
11. Scripts are also checking and alerting if any integration isn't working - like if we don't get a specific dump file when we expect to.
12. Using trap-based auto discovery to find devices not modelled (we hide the 'New Devices' container by using a 'Hidden' security string).
There's more but can't think of much more than that.
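For the random checker in item 10, the selection part looks something like this. It is a simplified sketch: the set of checked names stands in for the tag, and the SNMP checks themselves are left out.

```python
import random


def pick_devices(all_devices, already_checked, count, rng=random):
    """Pick up to `count` devices that have not been checked in this rotation.

    `already_checked` plays the role of the tag and is updated in place.
    Once every device has been checked, the rotation starts over.
    """
    remaining = sorted(set(all_devices) - already_checked)
    if not remaining:
        # Everything has been checked once: clear the tags and start again.
        already_checked.clear()
        remaining = sorted(set(all_devices))
    chosen = rng.sample(remaining, min(count, len(remaining)))
    already_checked.update(chosen)
    return chosen


devices = ["rtr-01", "rtr-02", "sw-01"]
checked = set()
rng = random.Random(1)  # seeded only to make the example repeatable

first = pick_devices(devices, checked, 2, rng)
second = pick_devices(devices, checked, 2, rng)  # only one device is left
third = pick_devices(devices, checked, 2, rng)   # rotation resets here
print(first, second, third)
```

In practice the checked set would need to persist between runs (a file or a model attribute), otherwise the rotation restarts every day.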