DX Infrastructure Manager

UIM and UMP 8.1 first feelings

  • 1.  UIM and UMP 8.1 first feelings

    Posted 01-03-2015 10:49 AM

    Hi,

     

    UIM 8.1 was released this week. The release notes looked rather interesting, with a lot of intriguing items such as the UDM, discovery info for interfaces, and new ade features such as synchronization and documentation.

     

    However, I've had some trouble getting things up and running. I upgraded three environments and none of them went smoothly. Here's a brief summary of what happened:

     

    1. UIM 8.0 on 2012R2 and SQL 2012 SP2 latest CU (FIXED, see reply below)

    The installation went fine and all the components were updated as necessary. The installation log looked fine. After installation, most things seem to work, except for USM. It goes through "Initializing", then gets stuck on "Loading" and throws an error with a stack trace that begins like this:

     

    Details: com.firehunter.ump.exceptions.DataFactoryException : null  Stack Trace: java.lang.NullPointerException      at com.firehunter.usm.DataFactory.getRoot(DataFactory.java:3264)

    I thought getRoot might be trying to get the "root" group in USM. It does seem like the CM_GROUP table has changed a bit. UMP (more specifically, udm_manager) also makes calls to a new table in the DB called datomic_kvs, which seems to have a root or parent/child sort of construct as well. For now, I haven't been able to make sense of what the datomic_kvs table actually does.

     

    2. NMS 7.6 on 2008R2 with SQL 2012 SP1

    The installation again went successfully. However, many of the core probes now fail and log a stack trace. They always report something like this:

     

    Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The driver could not establish a secure connection to SQL Server by using Secure Sockets Layer (SSL) encryption. Error: "SQL Server returned an incomplete response. The connection has been closed.".

     

    These are all Java probes, so I guess the library they use to connect to MSSQL has been updated.

     

    3. NMS 8.0 on 2012R2 with SQL 2014 CTP2

    I'm aware this DB is not supported, but it's something that I've been testing. Here the installation doesn't work at all; it never finishes.

     

    -jon



  • 2.  Re: UIM and UMP 8.1 first feelings

    Posted 01-03-2015 02:19 PM

    OK, so I fixed issue 1: there's a new component called udm_manager that installs on your primary hub. By default, it binds to port 4334, so you need to make a firewall rule for that.
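
    For anyone who wants to confirm quickly whether a firewall is in the way, here is a minimal Python sketch that simply attempts a TCP connection to the udm_manager port. Only the port number 4334 comes from this thread; `port_is_reachable` is a made-up helper name, and the check says nothing about whether the service behind the port is healthy.

```python
import socket


def port_is_reachable(host: str, port: int = 4334, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    4334 is the default udm_manager port mentioned in this thread; the
    helper itself is illustrative and not part of UIM.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

    Run from the UMP server, `port_is_reachable("your-primary-hub")` (the host name is a placeholder) returning False would point at a firewall or at udm_manager not listening.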

     

    -jon



  • 3.  Re: UIM and UMP 8.1 first feelings

    Posted 01-04-2015 12:13 AM

    Tentatively planning to do this upgrade in 2 weeks or so.

     

    Been looking at the new hub 7.63 release though as we have many issues with stability related to that piece of UIM.

     

    Similarly hopeful about the list of changes for UMP as there was virtually no reasonable way in 8.0 and older to use the HTML5 dashboard piece because of the lack of any run time determination of user. It sounds like that is there now as well as the ability to propagate the results of SQL queries from one page to the next.

     

    Admittedly though, I'm gun shy about upgrading because there's a history of getting the bad with the good.

     

    -Garin



  • 4.  Re: UIM and UMP 8.1 first feelings

    Posted 01-08-2015 06:18 AM

    I really wish the Wiki site was working. Trying to read the Release Notes for 8.1 and Page Not Found...

    https://wiki.ca.com/display/UIMPGA/Hub+%28hub%29+Release+Notes

     



  • 5.  Re: UIM and UMP 8.1 first feelings

    Posted 01-08-2015 07:00 AM

    I'm on a clean install of 8.1 and I'm running into issue #1 with no firewall on at all... guess I'll open a support case and collect logs that show nothing failing. Grrrrrrrr



  • 6.  Re: UIM and UMP 8.1 first feelings

    Posted 01-08-2015 08:54 AM

    Regarding your issues above with the UMP's USM not loading:

    https://wiki.ca.com/pages/viewpage.action?pageId=144546234

     

    The NOTE: 

    The UDM Manager probe must be active for the USM portlet to function.

     

    Did you check whether this was running or configured? I don't have an 8.1 instance to play around with yet.

     

    Adding the Known Issues URL for 8.1: 

    https://wiki.ca.com/display/UIM81/Known+Issues

     

     



  • 7.  Re: UIM and UMP 8.1 first feelings

    Posted 01-08-2015 01:17 PM

    That would be my bet too, or that something else is binding to 4334 before udm_manager does.
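
    One rough way to test the "something else grabbed the port" theory is to try binding the port yourself on the primary hub while udm_manager is deactivated. This is only a sketch: `port_is_available` is a hypothetical helper, and it treats any bind failure (address in use, insufficient permissions, and so on) as "not available" without telling you which process holds the port.

```python
import socket


def port_is_available(port: int = 4334, host: str = "0.0.0.0") -> bool:
    """Return True if this process can bind host:port right now.

    4334 is the default udm_manager port from this thread. Any bind
    failure is reported as False.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        # The socket is closed on leaving the with-block, releasing the port.
        return True
```

    If this returns False with udm_manager stopped, another process is most likely holding the port; with udm_manager running, False is the expected answer.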

     

    -jon



  • 8.  Re: UIM and UMP 8.1 first feelings

    Posted 01-08-2015 06:05 PM

    I believe you can change the port in the udm_manager probe.



  • 9.  Re: UIM and UMP 8.1 first feelings

    Posted 01-08-2015 09:34 PM

    Was part way into 8.0 upgrades and made the switch to 8.1.

     

    Ran into an issue where using admin-console on the root hub was causing all kinds of slowness and eventual unresponsiveness of the tomcat server for the service-host. This would also cause inf-manager gui to slow down when logged in against the primary hub.  Some browser and log file debugging showed repeating requests that were failing and I think possibly beating up the hub or controller too.

     

    I bumped up debug logging and opened a support case.  Support was no help, but I eventually fixed the issue by reinstalling a bunch of service-host packages while service-host was disabled just in case a late exiting tomcat process was preventing a proper update via the installer.

     

    admin_console
    ids_services
    uimserver_home
    service_host
    monitoring_services
    discovery_server

     

    This fixed the issue, but the console was still a little sluggish.  Bumping the java heap options and disabling compression on the connector within service-host.cfg made it much faster.

     

    Ironic bug.  While trying to fix a bug, I also updated ppm to 3.03, which wasn't included in the updates but is available in the internet archive.  When trying to access ppm probe config via admin console, the browser spins while admin_console waits for ppm, which loses its mind trying to find ctd info for itself until it times out. This is not fixed in 3.03, but 3.03 will give you some updated config options for some probes.

     

    Adminconsole is pretty horribly documented.  Here are some details I've pieced together.

     

    Problem: Adminconsole works on the LAN, but you get a "can't access server" error in the UMP portlet externally.  There really isn't an adminconsole portlet for UMP.  The portlet just tries to find the service host and send you to adminconsole via a re-post of credentials to the service host in an iframe.  This is great if you happen to be on the same LAN as the admin console when accessing UMP.  It can probably be worked around by setting the /host/name parameter in service_host.cfg to a public DNS name and allowing access to the service host through the firewall.  They also have a workaround in a KB article suggesting you manually set up the iframed portlet.

     

    https://na4.salesforce.com/articles/HowToProcedures/How-to-access-admin-console-from-public-IP-address-within-UMP

     

    Documentation bug: PPM documentation is hard to come by.  It's just one of those essential things that they think will just work. Anyway, for adminconsole to work, ppm must be deployed to every hub in the environment.  This requirement has trickled from the software requirements in older versions to the known issues in 8.0, and has been overlooked in 8.1.  8.1 suggests you deploy all components of service-host to all hubs in the environment, which is obviously not what you want if you don't want to promote every remote network to a management network.

     

    Unknown bug: setting /loggers/org.apache.http = INFO shows a repeating error when you browse admin console, indicating that somewhere in the code the admin-console is trying a get_probe REST callback that is not implemented in any of the components of service_host.  I don't know what the impact is.

     

    Jan 08 10:27:32:285 [attach_socket, service_host] getApps: detailLevel: 1
    Jan 08 10:27:32:285 [attach_socket, service_host] callAppRest: adminconsole/callbacks/get_probe
    Jan 08 10:27:32:288 [attach_socket, service_host] service_host app adminconsole doesn't implement callbacks/get_probe
    Jan 08 10:27:32:288 [attach_socket, service_host] failed callbacks/get_probe error calling GET http://192.168.43.147:8080/adminconsole/callbacks/get_probe HTTP/1.1 code: 404
    Jan 08 10:27:32:288 [attach_socket, service_host] callAppRest: ids_services/callbacks/get_probe
    Jan 08 10:27:32:291 [attach_socket, service_host] callAppRest: monitoring_services/callbacks/get_probe
    Jan 08 10:27:32:295 [attach_socket, service_host] callAppRest: ROOT/callbacks/get_probe
    Jan 08 10:27:32:296 [attach_socket, service_host] service_host app ROOT doesn't implement callbacks/get_probe
    Jan 08 10:27:32:296 [attach_socket, service_host] failed callbacks/get_probe error calling GET http://192.168.43.147:8080/ROOT/callbacks/get_probe HTTP/1.1 code: 404
    Jan 08 10:27:32:296 [attach_socket, service_host] callAppRest: umpjslib/callbacks/get_probe
    Jan 08 10:27:32:298 [attach_socket, service_host] service_host app umpjslib doesn't implement callbacks/get_probe
    Jan 08 10:27:32:298 [attach_socket, service_host] failed callbacks/get_probe error calling GET http://192.168.43.147:8080/umpjslib/callbacks/get_probe HTTP/1.1 code: 404
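
    Since these errors repeat constantly, a small script helps to see which apps are affected and how often. The Python sketch below is written against the line format shown in the excerpt above only; the regular expression and the helper name `count_missing_callbacks` are my own, and other service_host versions may word the message differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... service_host app ROOT doesn't implement callbacks/get_probe"
MISSING_CALLBACK = re.compile(
    r"service_host app (?P<app>\S+) doesn't implement (?P<callback>\S+)"
)


def count_missing_callbacks(lines):
    """Count how often each (app, callback) pair is reported as not implemented."""
    counts = Counter()
    for line in lines:
        match = MISSING_CALLBACK.search(line)
        if match:
            counts[(match["app"], match["callback"])] += 1
    return counts


# Three lines copied from the log excerpt in this post.
sample = [
    "Jan 08 10:27:32:288 [attach_socket, service_host] service_host app adminconsole doesn't implement callbacks/get_probe",
    "Jan 08 10:27:32:291 [attach_socket, service_host] callAppRest: monitoring_services/callbacks/get_probe",
    "Jan 08 10:27:32:296 [attach_socket, service_host] service_host app ROOT doesn't implement callbacks/get_probe",
]

for (app, callback), hits in count_missing_callbacks(sample).items():
    print(f"{app}: {callback} not implemented ({hits}x)")
```

    To run it over a whole log, pass an open file object instead of `sample`; the log file name and location depend on your install.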

     

     

    Documentation bug:  The 8.1 post install validation chart references several versions incorrectly.  Specifically, 8.1 installs newer versions than the chart says it does.



  • 10.  Re: UIM and UMP 8.1 first feelings

    Posted 01-08-2015 09:45 PM

    Hi Guys,

    If you find actual bugs can you please share them into the **** Defect Announcements **** thread as well. 

    This will help folks who run into issues find them all in one location. 

    Thank you,

    Dan

     



  • 11.  Re: UIM and UMP 8.1 first feelings

    Posted 01-20-2015 12:08 AM

    I did the UIM/UMP 8.1 upgrade on the 15th and so far it seems like, on average, an improvement over 8.0.

     

    The UIM upgrade went reasonably smoothly. One issue I ran into is that the snmptd probe timed out during the restart for some reason, which added 2-3 minutes to the outage for the upgrade. There were 12 minutes of actual alert processing outage during the upgrade. Snmptd was red in IM; a right click and activate started it. Nothing in the log indicated what the issue was.

     

    UMP took about 2 hours to update. It would be really nice if the notification about that new port was in 70pt text in the upgrade instructions. Most of the fixes that I'd been waiting for to make the new HTML5 dashboards usable didn't make it into this release, which was frustrating. There are new global parameters (apparently undocumented) that finally give you access to information about the account logged in. They don't work with Nimsoft users though, so there are some rough edges. Not sure yet whether that is intentional or a defect. Like all UMP upgrades, the portal configuration file gets obliterated and replaced. If you follow the multi UMP server instructions, you will lose all your customized media. Make sure to follow the recommendations to back up the directories. That way you can use WinMerge or a similar tool to put the changes back the way you need them.

     

    The USM portal also refused to populate with systems until about 6 hours after the end of the upgrade. Not sure what took so long but maybe it was spending a lot of time doing discovery.

     

    Overall there are more probes running on the hubs in this version than in older ones, and the memory and CPU footprints have grown compared to the past. On my central hub, CPU usage in 8.1 is over twice that of 8.0: the system averaged something around 10% on 8.0. Now it's in the mid twenties and frequently into the 45% range. The only time the 8.0 system reached about 40% was during a robot restart.

     

    Hub 7.63 seems to be more stable - uses more CPU but it does the self healing restart way less often than older versions.

     

    alarm_enrichment seems to be less stable than previous versions based on the length of time it stays connected to its input queue but on the other hand, it hasn't hung over the past four days either.

     

    The hub/controller 7.63 combo still exhibits the behavior where the flag to ignore other hubs on the same network for communications purposes is disregarded. This has the unintended benefit that if a hub allows its tunnels to become nonfunctional, the controller will use another hub. Unfortunately, in situations where there are multiple Nimsoft/UIM installs, one's robot might find itself connected to another company's Nimsoft environment, which is pretty embarrassing.

     

    Still don't like the fact that there's no attention being paid to IM. The amount of time spent waiting for the web pages in 8.1 to render and load makes IM look like a godsend. And with the need to distribute ppm and whatever the mystery dependencies of that are everywhere, it's a nonstarter. Add to that the fact that the footprint for a standard UIM robot is now approaching the size of a small server, and there's a lot of existing hardware that won't be able to accept the new demands.

     

    -Garin



  • 12.  Re: UIM and UMP 8.1 first feelings

    Posted 01-20-2015 10:37 AM

    Garin, yeah at least they added the port to the documentation last week or so.

     

    I'm still struggling with getting the interface metrics to actually show in the "interfaces" tab. Also, with interface discovery I'm now getting "duplicates" from multihomed machines. I'd love it if it was mentioned somewhere which probe is actually meant to collect the interface information. It would seem like it should be snmpcollector, but I haven't seen anything to that effect mentioned anywhere.

     

    Also noteworthy: the process for resetting discovery data has changed. This is also mentioned in the release notes (not sure if it was there originally), but they don't describe how to do it. Basically, after emptying cm_computer_system, you now also need to empty the datomic_kvs table (while udm is turned off).
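
    As a rough illustration of the two-table reset, here is a Python sketch that empties both tables in a single transaction through any DB-API connection. The table names come from this post; everything else is an assumption: the helper name `empty_discovery_tables` is made up, plain DELETE is my choice of statement, and the demonstration runs against an in-memory SQLite stand-in with invented columns, not a real UIM database. Stopping udm_manager first remains a manual step.

```python
import sqlite3

# Table names as given in this thread; the order follows the post.
TABLES_TO_EMPTY = ("cm_computer_system", "datomic_kvs")


def empty_discovery_tables(conn):
    """Delete all rows from both tables, committing only if both succeed."""
    cur = conn.cursor()
    try:
        for table in TABLES_TO_EMPTY:
            cur.execute(f"DELETE FROM {table}")
        conn.commit()
    except Exception:
        # Leave both tables untouched if either delete fails.
        conn.rollback()
        raise


# Stand-in database for demonstration only; the columns are invented.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cm_computer_system (id INTEGER)")
demo.execute("CREATE TABLE datomic_kvs (id TEXT)")
demo.execute("INSERT INTO cm_computer_system VALUES (1)")
demo.execute("INSERT INTO datomic_kvs VALUES ('root')")

empty_discovery_tables(demo)
remaining = demo.execute("SELECT COUNT(*) FROM datomic_kvs").fetchone()[0]
print(f"rows left in datomic_kvs: {remaining}")
```

    Doing both deletes in one transaction avoids ending up with one table emptied and the other not if something fails halfway.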

     

    -jon



  • 13.  Re: UIM and UMP 8.1 first feelings

    Posted 01-21-2015 08:14 PM

    I agree with you about the IM versus the Admin Console - and then having some probe configurations ONLY available with Admin Console is to me ridiculous.  If you don't have support for ALL probes in Admin Console, don't force everyone to USE Admin Console.

    It is frustrating having to deal with the slowness/clunkiness of Admin Console when IM works much better - only to find that some probe configurations are only visible in Admin Console view.



  • 14.  Re: UIM and UMP 8.1 first feelings

    Posted 01-22-2015 03:21 AM

    Folks - please note that the transition from using Infrastructure Manager to Admin Console is a multi-release transition during which you will (as a practical matter) need to be familiar with (and use) both tools.

     

    Some probes can only be administered with IM - nas is one of them. Some probes can only be administered with Admin Console (aws, snmpcollector, et al).

     

    There are some functions you can accomplish in either tool - e.g. deploying a probe. Of course, individuals will have their own workflow preferences as to which tool they use.

     

    We know this is not ideal - please know that we are continuing to make functional, performance, and scalability improvements to Admin Console.

     

    Thank you for your ongoing patience!



  • 15.  Re: UIM and UMP 8.1 first feelings

    Posted 01-22-2015 05:13 AM

    I'm pretty sure that everyone has heard that. 

     

    I would suggest that rather than being defensive or providing excuses, that the CA folks read the comments and try to understand the hardship that this transition to Admin Console is causing at least the customers who use the forums. 

     

    Maybe you could take a look into case 153903. Admin Console hasn't worked for me since applying 8.1. Support is without an idea why. At least IM runs. 

     

    Admin Console is a nice tool if you have a handful of hubs and a reasonable number of robots. You get more than 30 hubs and it's useless.

     

    It is truly painful to use. The page failure rate in 8.0 made it useless in any interactive sense. My use of it, and others have commented similarly, was to coerce it to the point of creating a cfg file and then to manually edit that with notepad to get it right and copy it to where it was supposed to go. 

     

    The footprint required by all the probes necessary for it to even work makes it prohibitive to deploy to small servers. And the documentation about what probes are supposed to be deployed where is in the same category as the map to Eldorado.

     

    If the tool worked and scaled, I'd be excited about it. Unfortunately, it falls solidly into the category of web apps that, as so much time at CA World was spent describing, cause customers to go elsewhere. 

     

    -Garin



  • 16.  Re: UIM and UMP 8.1 first feelings

    Posted 02-06-2015 05:08 PM

    I upgraded to 8.1 on my test bed about 2 weeks ago and followed the upgrade sequence exactly as suggested (Prime HUB, Secondary HUBs, Robots, UMP).  The UMP install complained about the discovery_server connection being unstable, which I missed. While the UMP upgrade installer said it completed successfully, it did not. I ended up with a serious error when trying to log in to UMP, which blocked access to the HOME screen (dashboards and other pages did load).

     

    Turns out discovery_server was struggling in some fashion with the new udm_manager, and it all turned out to be related to a special char in our DB password (*). I set the password to a simpler one, and discovery_server stabilized, and the UI error when navigating to the HOME screen in UMP went away.

     

    I'm still seeing probeDiscovery Queue lose/reset connections to discovery_server.  Support had me increase mem on the discovery_server to 4096 which maxed out my Primary HUB memory. I also at the same time enabled VMware probe and Spectrum integration and several probes became unstable due to exhausted resources. I turned off VMWare probe, and Spectrum integration and removed all probes I could from the Primary HUB including disabling snmpcollector and pollagent. Mem usage is down and I reduced discovery_server to 2048. probeDiscovery Queue is still resetting connections to discovery_server and we are still investigating.

     

    Clearly I'll need to increase memory on the Primary HUB which only has 8 Gig. (current Change Order is in approvals for 16 Gig).

     

    I completely agree with the sentiments on Admin Console.  Taking stabs at using it to administer probes is frustrating and its interface is crude and slow.  If you want the community to use it, be more clear about what works with it and what does NOT.

     

    Regards

     

    -Robert



  • 17.  Re: UIM and UMP 8.1 first feelings

    Posted 02-06-2015 06:37 PM

    I can commiserate with the resource demand. 

     

    I ran a trial install of a primary hub with the intent of comparing my current production with a fresh install to look for differences that might have crept in between the upgrades from the 6.0 starting point and support deleting and reinstalling in hopes of fixing things.

     

    The documentation recommends that you have 4GB on a server for the small install.

     

    At 4GB, you can only get about half way through the install before the OS (CentOS 6 in this case) starts killing processes because of the lack of available RAM. That makes for a sketchy install process at best. With 30ish probes installed and all the Java ones asking for some multiple of a GB of RAM for their memory space, you need a lot of resources to get started. Granted, the OS eventually swaps 90% of that Java requested memory out to disk, but it's still a wasteful overhead.

     

    This becomes a huge issue though if you're monitoring systems that have been tuned to have just enough resources for their existing workload, as the addition of UIM becomes a serious perturbation to the system rather than the blip it used to be in the 6.x days. With the current round of wasteful and inefficient probes, I have many cases where the monitoring software uses more of the server being monitored than the task the server is intended to satisfy. 

     

    -Garin

     

     



  • 18.  Re: UIM and UMP 8.1 first feelings

    Posted 02-19-2015 01:57 PM

    Just adding here a few things:

     

    Key to a lot of metrics in USM is S_QOS_SNAPSHOT being up to date.

     

    Also, in the "interfaces" tab there's a bug: if the most recent QOS value (from s_qos_snapshot) is < 1, it doesn't display any metrics. You can still click them and go to PRD. This has been raised as a defect.

     

    -jon