I'm keen to hear what kind of performance you guys are getting if you use TrapDirector.
I'm not sure what the maximum traps-per-second limit is these days, but when I run the performance checker I'm told I'm over the limit. I'm averaging 400 traps a second, and was at 1000 before we cleaned up some traps we didn't use.
It seems as though the performance tool (PView) might have the pre-64-bit values in it.
What do people with large domains do to manage traps? I'm using the trap-based auto discovery to find devices which might not be modelled (by hiding the New Devices container and making the devices go into Maintenance mode). This allows us to investigate the devices and either stop them from sending the traps or add them correctly.
I'm interested to know if there are any good tools which you can put in front of Trap Director to analyse traps before they hit it. I know you can get trap replicators, but what I'm looking for is something that tells you what traps are coming in and produces a kind of top-N report: top OIDs, top devices sending traps, etc.
I want to do this before Spectrum processes them as there might be a lot of traps Spectrum hasn't been configured to deal with yet.
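As a rough illustration of the kind of top-N report described above, here is a minimal Python sketch. It assumes the traps have already been captured and decoded into (source IP, trap OID) pairs by some other tool; the `TrapRecord` type, the `top_n_report` function and the sample addresses are all hypothetical names made up for this example, not part of Spectrum or any trap replicator.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class TrapRecord(NamedTuple):
    """One observed trap. Both field names are hypothetical, for this sketch only."""
    source_ip: str
    trap_oid: str


def top_n_report(records: Iterable[TrapRecord], n: int = 10) -> dict:
    """Count traps per OID and per sending device, and return the n busiest of each."""
    by_oid = Counter()
    by_device = Counter()
    for rec in records:
        by_oid[rec.trap_oid] += 1
        by_device[rec.source_ip] += 1
    return {
        "top_oids": by_oid.most_common(n),
        "top_devices": by_device.most_common(n),
    }


# Made-up sample data using the standard SNMPv2 notification OIDs.
sample = [
    TrapRecord("10.0.0.1", "1.3.6.1.6.3.1.1.5.3"),  # linkDown
    TrapRecord("10.0.0.1", "1.3.6.1.6.3.1.1.5.3"),  # linkDown
    TrapRecord("10.0.0.1", "1.3.6.1.6.3.1.1.5.4"),  # linkUp
    TrapRecord("10.0.0.2", "1.3.6.1.6.3.1.1.5.3"),  # linkDown
    TrapRecord("10.0.0.3", "1.3.6.1.6.3.1.1.5.1"),  # coldStart
]

report = top_n_report(sample, n=2)
for oid, count in report["top_oids"]:
    print(f"{count:6d}  {oid}")
for device, count in report["top_devices"]:
    print(f"{count:6d}  {device}")
```

Because this only counts what it is given, it would work on traps Spectrum has no event mapping for yet, provided the capture happens upstream of Trap Director.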
To really tell if your traps are causing an issue with the SS, open the Trap Management view on the VNM of the machine that has Trap Director enabled. On the right-hand side is an entry for Alert Forwarding Queue Length. This should normally sit at 0 and should never be constantly increasing. As traps are processed they are held in memory very briefly while it’s determined where each trap needs to go. During a trap spike this queue may grow. Once the spike is over, the queue should drain and drop back to 0. If the queue is constantly increasing, you are over the limits of trap processing.
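If you record the Alert Forwarding Queue Length at regular intervals, the rule of thumb above can be sketched in a few lines of Python. This is only an illustration: how you collect the readings is up to you, and the `queue_is_backing_up` name and the five-sample window are assumptions made for this example.

```python
from typing import Sequence


def queue_is_backing_up(samples: Sequence[int], min_samples: int = 5) -> bool:
    """Return True if the last `min_samples` readings never reach 0,
    never decrease, and end higher than they started."""
    if len(samples) < min_samples:
        return False  # not enough history to judge
    recent = samples[-min_samples:]
    never_drains = all(v > 0 for v in recent)
    non_decreasing = all(b >= a for a, b in zip(recent, recent[1:]))
    return never_drains and non_decreasing and recent[-1] > recent[0]


# A spike that drains back to 0 is fine; steady growth is not.
spike = [0, 50, 200, 80, 0, 0]
growth = [0, 10, 40, 90, 150, 260]
print(queue_is_backing_up(spike))
print(queue_is_backing_up(growth))
```

The window size is a judgment call: too short and a normal spike looks like a problem, too long and you find out late.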
During testing of an average SS (meaning not highly customized or using every IP service offered) we found the following and posted it to the FAQ:
A: 600 traps/second sustained and 1000 traps/second peak for a window of about 10 minutes. The SpectroSERVER will queue up traps if higher trap rates are experienced; however, this leads to an increase in memory utilization during that time.
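To compare your own numbers against those FAQ figures, something like the following Python sketch may help. It reads the figures as "anything above 1000/s is over, and more than about 10 consecutive minutes above 600/s is over"; that interpretation, the per-minute averaging and the function name are assumptions for this example, not something the FAQ spells out.

```python
SUSTAINED_LIMIT = 600      # traps/second, FAQ figure quoted above
PEAK_LIMIT = 1000          # traps/second, FAQ figure quoted above
PEAK_WINDOW_MINUTES = 10   # "about 10 minutes" per the FAQ


def exceeds_quoted_limits(per_minute_rates):
    """per_minute_rates: average traps/second for each consecutive minute.

    Returns True if any minute is above the peak figure, or if the rate stays
    above the sustained figure for more than the peak window.
    """
    consecutive_over = 0
    for rate in per_minute_rates:
        if rate > PEAK_LIMIT:
            return True
        if rate > SUSTAINED_LIMIT:
            consecutive_over += 1
            if consecutive_over > PEAK_WINDOW_MINUTES:
                return True
        else:
            consecutive_over = 0  # rate dropped back, restart the window
    return False


steady_400 = [400] * 60
ten_minute_burst = [400] * 5 + [900] * 10 + [400] * 5
long_burst = [900] * 11
print(exceeds_quoted_limits(steady_400))
print(exceeds_quoted_limits(ten_minute_burst))
print(exceeds_quoted_limits(long_burst))
```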
I know this doesn’t answer your other question about a good tool but hopefully gives a little insight on whether your Trap Director SS is really bogged down or not.
Yes, that link is pretty good as it has other good questions too. I think we have too many traps for TrapDirector to handle. I've been looking for ways to tweak it, as changing config to remove some trap types isn't an easy option. I'm monitoring things like queue length and number of traps, but it seems the delay might be in the lookups themselves, so I was trying to work out ways to keep using Trap Director as it provides failover.
I've used samplicate sitting in front of a TrapDirector. It had rules which forwarded traps to the landscapes it knew the sending devices belonged to, and only traps from devices that were not known went in via Trap Director. This is a pain, as failover isn't as automatic as it is when we use the Trap Director.
This is one of the areas I think CA could improve on. While TD works for the top 90% of customers, MSPs need to handle a lot more, and having multiple TDs is not ideal.
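The routing idea in that setup boils down to a lookup with a fallback. Here is a small Python sketch of just the decision logic; it is not samplicate's actual rule syntax, and the addresses, the `KNOWN_DEVICES` table and the `route_trap` function are all hypothetical.

```python
from typing import Dict, Tuple

Destination = Tuple[str, int]  # (host, UDP port)

# Hypothetical addresses from documentation ranges, for illustration only.
TRAP_DIRECTOR: Destination = ("192.0.2.10", 162)
KNOWN_DEVICES: Dict[str, Destination] = {
    "10.1.0.5": ("192.0.2.21", 162),  # device known to landscape A
    "10.2.0.9": ("192.0.2.22", 162),  # device known to landscape B
}


def route_trap(
    source_ip: str,
    known: Dict[str, Destination] = KNOWN_DEVICES,
    fallback: Destination = TRAP_DIRECTOR,
) -> Destination:
    """Send traps from known devices straight to their landscape;
    everything else goes to Trap Director."""
    return known.get(source_ip, fallback)


print(route_trap("10.1.0.5"))   # known device, bypasses Trap Director
print(route_trap("172.16.0.1")) # unknown device, falls back to Trap Director
```

The sketch also shows where the failover pain comes from: the static table has no idea whether a landscape's primary server is still up, which is exactly what Trap Director handles for you.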
Hi Frank, no sales pitch here but look no further than Augur - TrapStation: SNMP Trap Manager for a simply amazing trap director replacement.