DX Unified Infrastructure Management

  • 1.  NMS Cluster Installation - Is Nimsoft cluster supported in Windows 2012, what is the expected failover duration?

    Posted Apr 07, 2014 09:03 PM

    I see that the Compatibility Matrix shows NMS runs on Windows 2012 R2, but the documentation for the NMS cluster installation only mentions Windows 2008 R2.  I could assume, but I would rather be sure ... I want to use VMware servers, and VMware only supports a Windows cluster using Windows 2012.  So I need to confirm that the Nimsoft Windows cluster feature is certified for Windows 2012 on VMware.

     

    If a VMware Windows 2012 cluster is supported, what is the expected Nimsoft alerting outage during a failover -- how many seconds after the configured failover timeout passes should it take for the Failover Hub to start processing alerts?  VMware itself has a fairly quick (4-6 minutes) server recovery feature, and, assuming a VMware Windows 2012 cluster is supported, I need to decide whether I should use NMS Failover Hub recovery or VMware recovery.



  • 2.  Re: NMS Cluster Installation - Is Nimsoft cluster supported in Windows 2012, what is the expected fa

    Posted Apr 08, 2014 10:30 AM

    We used to run our primary hub as a Windows cluster initially, but moved away
    from that for several reasons:

     

    • Cluster adds complexity. You don't want that in a monitoring solution if 
      you can avoid it.
    • We consistently had problems upgrading, since different probes had 
      problems deciding whether they should bind to the cluster IP or the node IP.
    • Since you move the robot between the cluster nodes, you only have one 
      robot shared on two servers. Meaning, you don't have proper monitoring of 
      the passive node.

    So we ended up installing 2 new primary hubs on the side, and we use the HA
    probe to control queues and probes. We at least feel this is much more
    reliable, easier to debug, easier to configure, and more robust, as you
    actually have 2 separate nodes.

     

    The only downside is that out-of-the-box, there really aren't any tools that
    help with syncing things like the nas config (auto operator things) between
    the nodes, but there are scripts available to do this.
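
    As a rough illustration of what such a sync script might look like (a
    minimal sketch, not one of the scripts mentioned above): it copies a config
    file from one node's directory to the other's only when the contents
    differ, and keeps a backup of the old copy. The function names and the
    paths in the usage note are hypothetical, and the sketch assumes both
    directories are reachable as ordinary file paths. It does not restart any
    probe or talk to the hub.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync_config(source: Path, target: Path) -> bool:
    """Copy source over target when their contents differ.

    Returns True if a copy was made, False if the files already matched.
    The previous target, if any, is kept next to it as <name>.bak.
    """
    if target.exists() and file_digest(source) == file_digest(target):
        return False  # already in sync, nothing to do
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists():
        # Keep the old version so a bad sync can be rolled back by hand.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(source, target)
    return True
```

    Usage would be something like sync_config(Path("/primary/nas/nas.cfg"),
    Path("/standby/nas/nas.cfg")), where both paths are placeholders for
    wherever the two hubs keep their nas configuration.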



  • 3.  Re: NMS Cluster Installation - Is Nimsoft cluster supported in Windows 2012, what is the expected fa

    Posted Apr 15, 2014 10:34 PM

    Anders,

     

    If you don't mind me asking, which probes do you start/stop with the HA probe? And do you keep all of their configs in sync with scripts or only ones that change frequently?

     

    Thanks,

    Keith



  • 4.  Re: NMS Cluster Installation - Is Nimsoft cluster supported in Windows 2012, what is the expected fa

    Posted Apr 16, 2014 02:23 PM

    I disable/enable pretty much every probe except alarm_enrichment and nas. I
    have the UMP running on a dedicated server. I'm currently on vacation, but can
    get the exact list when I get back.

     

    I currently have no scripts to sync configs, and have synced them manually.
    I got a custom probe from Nimsoft a while back called generic_cluster that
    supposedly can sync configs, but I haven't had time to look closer into it.

     

    I guess the main configuration that should be synced is the nas configuration
    (incl. alarm_enrichment). Everything else is fairly stable, and doesn't change
    a lot.

     

    Also, the dashboard_engine and wasp need to be reconfigured when you fail
    over, so that should be automated as well at some point.



  • 5.  Re: NMS Cluster Installation - Is Nimsoft cluster supported in Windows 2012, what is the expected fa

    Posted Apr 17, 2014 04:52 PM

    Good info. Thanks for sharing.

     

    Do you allow the HA probe to start the data_engine? A long time ago I ran into some trouble with having 2 data_engines running at the same time and creating duplicate QoS objects. I think that early version of the HA probe (which was not even on the Internet archive--it was only available by request) worked in a completely different manner than the current version, so that sort of problem may be far less likely.



  • 6.  Re: NMS Cluster Installation - Is Nimsoft cluster supported in Windows 2012, what is the expected fa

    Posted Apr 22, 2014 04:51 PM

    Alright, back at work after the Easter holiday, so I can check more now.

     

    I currently enable the following probes with HA:

     

    data_engine, nis_server, audit, sla_engine, qos_engine, assetmgmt, ace,
    relationship_services, fault_correlation_engine, cm_data_import, ppm,
    topology_agent and service_host.

     

    It's all kinda based on trial and error, and the fact that I want to
    have as many of the probes always running as possible. I consider
    starting a probe a step that could fail and thwart the failover. So the
    more I have always running, the less can fail during failover.
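
    To make that split concrete, here is a small sketch that only models the
    policy described in this thread. It is not the HA probe's configuration
    format or API. The probe names are the ones listed above plus nas and
    alarm_enrichment, which are left running on both nodes; the function
    names and the idea of a drift check are assumptions for illustration.

```python
from typing import Dict

# Probes the HA probe enables only on the node that is currently active.
HA_MANAGED = frozenset({
    "data_engine", "nis_server", "audit", "sla_engine", "qos_engine",
    "assetmgmt", "ace", "relationship_services", "fault_correlation_engine",
    "cm_data_import", "ppm", "topology_agent", "service_host",
})

# Probes that stay running on both nodes regardless of failover state.
ALWAYS_ON = frozenset({"nas", "alarm_enrichment"})


def should_run(probe: str, node_is_active: bool) -> bool:
    """Return whether a probe is expected to be running on a node."""
    if probe in ALWAYS_ON:
        return True
    if probe in HA_MANAGED:
        return node_is_active
    raise KeyError(f"no policy defined for probe {probe!r}")


def state_drift(actual: Dict[str, bool], node_is_active: bool) -> Dict[str, bool]:
    """Map each probe in the wrong state to the state it should be in."""
    return {
        probe: should_run(probe, node_is_active)
        for probe, running in actual.items()
        if running != should_run(probe, node_is_active)
    }
```

    For example, on the standby node state_drift({"nas": True, "data_engine":
    True}, node_is_active=False) reports that data_engine should be stopped.
    Keeping ALWAYS_ON as large as possible is what keeps the failover step
    itself small.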

     

    I believe they added support in the data_engine to run multiple
    instances of it a few versions back. After that it's been working quite
    well, and I haven't had any problems with it. I can't remember if there
    were any specific config keys I set, but I didn't see any when I briefly
    looked at it now.

     

    Other than that, it's just a matter of redirecting your message flow by
    enabling the get queues.

     

    We've done a couple failovers, and so far it seems to be a success.