DX Application Performance Management

  • 1.  Testing to Configure Enterprise Manager Failover to Work on a Single Host   [linux box]

    Posted Nov 10, 2015 11:40 AM

    Testing to Configure Enterprise Manager Failover to Work on a Single Host   [APM 9.6 in linux box]

     

    Goal: Mimic, on a single host, an Enterprise Manager failover setup in which MOM1 and MOM2 run on separate boxes and share the same SmartStor data.

    Validation: After a failover to the secondary MOM2, the primary MOM1 should regain control once it is restarted.

     

    mom1 EM properties

    introscope.enterprisemanager.port.channel1=5001

    introscope.enterprisemanager.webserver.port=8081

    introscope.enterprisemanager.failover.enable=true

    introscope.enterprisemanager.failover.primary=localhost

    introscope.enterprisemanager.failover.secondary=

     

    mom2 EM properties

    introscope.enterprisemanager.port.channel1=6001

    introscope.enterprisemanager.webserver.port=8083

    introscope.enterprisemanager.failover.enable=true

    introscope.enterprisemanager.failover.primary=localhost

    introscope.enterprisemanager.failover.secondary=

     

    collector1 EM properties

    introscope.enterprisemanager.port.channel1=8001

    introscope.enterprisemanager.webserver.port=8082

     

    MOM1 and MOM2 share the following locations (SmartStor data, thread dumps, baselines, traces):

    introscope.enterprisemanager.smartstor.directory=/var/SHARED/data

    introscope.enterprisemanager.threaddump.storage.dir=/var/SHARED/threaddumps

    introscope.enterprisemanager.dbfile=/var/SHARED/data/baselines.db

    introscope.enterprisemanager.smartstor.directory.archive=/var/SHARED/data/archive

    introscope.enterprisemanager.transactionevents.storage.dir=/var/SHARED/traces
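
    A quick way to confirm that both property files really point at the same shared locations is to compare them programmatically. The following is a minimal sketch, not a CA/Broadcom tool: the helper names parse_properties and shared_path_mismatches are hypothetical, and the parser only handles plain key=value lines.

```python
# Hypothetical helper: checks that two EM property sets point at the same
# shared storage locations. The key list is taken from the post above.
SHARED_KEYS = [
    "introscope.enterprisemanager.smartstor.directory",
    "introscope.enterprisemanager.threaddump.storage.dir",
    "introscope.enterprisemanager.dbfile",
    "introscope.enterprisemanager.smartstor.directory.archive",
    "introscope.enterprisemanager.transactionevents.storage.dir",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def shared_path_mismatches(props_a, props_b):
    """Return the shared keys that are missing or differ between the two sets."""
    return [key for key in SHARED_KEYS
            if props_a.get(key) is None or props_a.get(key) != props_b.get(key)]
```

    An empty list from shared_path_mismatches means every shared key is set and identical in both files; any key it returns needs a second look.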

     

    Observation:

     

    1. The primary MOM1 connects to the collector and lists agents as expected.

    2. Started the secondary MOM2 (executed ./Introscope_Enterprise_Manager), but the log shows it starting as another primary MOM:

     

    11/09/15 09:16:44.166 AM PST [INFO] [main] [Manager.HotFailover] The Introscope Enterprise Manager is configured as a Primary EM

    11/09/15 09:16:44.167 AM PST [INFO] [main] [Manager.HotFailover] Acquiring secondary lock...

    11/09/15 09:16:44.168 AM PST [INFO] [main] [Manager.HotFailover] Acquired secondary lock

    11/09/15 09:16:44.168 AM PST [INFO] [main] [Manager.HotFailover] Acquiring primary lock...

    11/09/15 09:16:44.168 AM PST [INFO] [main] [Manager.HotFailover] Acquired primary lock

    11/09/15 09:16:44.168 AM PST [INFO] [main] [Manager.HotFailover] Released secondary lock

    11/09/15 09:16:44.169 AM PST [INFO] [main] [Manager.HotFailover] Proceeding with startup

     

    3. Stopping MOM1 fails over to the secondary MOM2 as expected.

    4. Restarting MOM1 does not take control back; it just starts as another primary MOM, as in step 2.

     

    MOM failover that does not rely on OS or hardware clustering is worth understanding, so I tried it in a single-host scenario.

     

    Hoping to identify my misstep

    cheers!!



  • 2.  Re: Testing to Configure Enterprise Manager Failover to Work on a Single Host   [linux box]

    Broadcom Employee
    Posted Nov 11, 2015 02:44 AM

    Hello,

    These points should hopefully help.

    1. Here is also the relevant 10.1 wiki link: Use Enterprise Manager Failover - CA Application Performance Management - 10.1 - CA Technologies Documentation

     

    2. A fully implemented failover on multiple hosts typically shares a single complete EM installation on a HA NAS.

    The key thing with failover is that the lock files (.lck) also need to be shared by the 2 EMs in the failover setup. These are located in EM_HOME/config/internal/server.

    EM2 regularly checks the status of the primary lock file & will be blocked from acquiring that lock until EM1 goes down.
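
    The blocking behavior described above can be illustrated with ordinary file locking. This is only a sketch under stated assumptions: it uses Python's fcntl.flock on Linux, the file name primary.lck is hypothetical, and it does not claim to reproduce how the EM actually implements its locks. It also shows why two installations with separate lock directories would each acquire "their own" primary lock.

```python
import fcntl
import os
import tempfile


def try_acquire(lock_path):
    """Try to take an exclusive, non-blocking lock on lock_path.

    Returns the open file object on success (keep it open to hold the lock),
    or None if another holder already has the lock.
    """
    handle = open(lock_path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


# A temporary file stands in for a shared primary lock file (name is assumed).
lock_dir = tempfile.mkdtemp()
primary_lck = os.path.join(lock_dir, "primary.lck")

em1 = try_acquire(primary_lck)          # "EM1" starts and takes the lock
em2_blocked = try_acquire(primary_lck)  # "EM2" is refused while EM1 holds it
em1.close()                             # "EM1" goes down; closing releases the lock
em2 = try_acquire(primary_lck)          # "EM2" can now take over

# With separate lock files (separate installations), nothing blocks:
other = try_acquire(os.path.join(lock_dir, "other-install-primary.lck"))
```

    In the sketch, em2_blocked is None while the first holder is alive, and the lock on the second, unshared file is granted immediately, which matches the "another Primary MOM" symptom in the first post.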

     

    3. For a test on the same host you can also use the single installation (1 EM properties file) and start 2 copies of the EM executable Introscope_Enterprise_Manager.

    EM2 will then sit waiting until EM1 goes down. For example, the single EM log will show something like:

    EM1 starts

    11/11/15 05:08:08.550 PM EST [INFO] [main] [Manager.HotFailover] The Introscope Enterprise Manager is configured as a Primary EM

    11/11/15 05:08:08.554 PM EST [INFO] [main] [Manager.HotFailover] Acquiring secondary lock...

    11/11/15 05:08:08.557 PM EST [INFO] [main] [Manager.HotFailover] Acquired secondary lock

    11/11/15 05:08:08.560 PM EST [INFO] [main] [Manager.HotFailover] Acquiring primary lock...

    11/11/15 05:08:08.563 PM EST [INFO] [main] [Manager.HotFailover] Acquired primary lock

    11/11/15 05:08:08.566 PM EST [INFO] [main] [Manager.HotFailover] Released secondary lock

    11/11/15 05:08:08.567 PM EST [INFO] [main] [Manager.HotFailover] Proceeding with startup

    ...

    EM2 starts

    11/11/15 05:12:01.366 PM EST [INFO] [main] [Manager.HotFailover] The Introscope Enterprise Manager is configured as a Primary EM

    11/11/15 05:12:01.370 PM EST [INFO] [main] [Manager.HotFailover] Acquiring secondary lock...

    11/11/15 05:12:01.373 PM EST [INFO] [main] [Manager.HotFailover] Acquired secondary lock

    11/11/15 05:12:01.375 PM EST [INFO] [main] [Manager.HotFailover] Acquiring primary lock...

    EM2 waits until EM1 goes down, and then you see:

    11/11/15 05:14:22.076 PM EST [INFO] [main] [Manager.HotFailover] Acquired primary lock

    11/11/15 05:14:22.098 PM EST [INFO] [main] [Manager.HotFailover] Released secondary lock

    11/11/15 05:14:22.105 PM EST [INFO] [main] [Manager.HotFailover] Proceeding with startup
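
    To measure how long a standby EM waited for the primary lock, the HotFailover lines can be parsed from the log. The sketch below assumes the timestamp layout shown in the excerpts above (MM/DD/YY hh:mm:ss.mmm AM/PM followed by a time zone abbreviation, which is ignored); the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of a log line, ignoring the trailing
# time zone abbreviation (e.g. EST or PST).
STAMP = re.compile(r"^\s*(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [AP]M)")


def parse_stamp(line):
    """Return the line's timestamp as a naive datetime, or None if absent."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%m/%d/%y %I:%M:%S.%f %p")


def primary_lock_wait_seconds(lines):
    """Seconds between 'Acquiring primary lock...' and 'Acquired primary lock'."""
    start = end = None
    for line in lines:
        if "Acquiring primary lock" in line:
            start = parse_stamp(line)
        elif "Acquired primary lock" in line and start is not None:
            end = parse_stamp(line)
            break
    if start is None or end is None:
        return None
    return (end - start).total_seconds()
```

    Fed the EM2 lines above, the function reports the gap between 05:12:01.375 PM and 05:14:22.076 PM, that is, the time EM2 spent waiting for EM1 to go down.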

     

    NOTE: That scenario is PRIMARY-PRIMARY. With PRIMARY-PRIMARY, EM1 (localhost) will not automatically retake control after it restarts, because EM2 (localhost) also starts as PRIMARY.

    If you try to test PRIMARY-SECONDARY on the same host with 2 separate installations that also share the lock files, I am not sure it will be successful.

     

    4. For a multi-host implementation across a NAS you can choose to have either:

    PRIMARY-PRIMARY (introscope.enterprisemanager.failover.primary=Host1,Host2)

    or

    PRIMARY-SECONDARY (introscope.enterprisemanager.failover.primary=Host1 and introscope.enterprisemanager.failover.secondary=Host2).

    For PRIMARY-SECONDARY, EM1 will retake control after it restarts.
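
    The difference between the two modes comes down to which list a host name appears in. The following sketch is a simplification for illustration only: it assumes exact string matching on host names, which may not be how the EM resolves hosts, and the function names are hypothetical.

```python
PRIMARY_KEY = "introscope.enterprisemanager.failover.primary"
SECONDARY_KEY = "introscope.enterprisemanager.failover.secondary"


def parse_hosts(value):
    """Split a comma-separated host list, dropping empty entries."""
    return [host.strip() for host in value.split(",") if host.strip()]


def failover_role(props, hostname):
    """Return 'primary', 'secondary', or None for the given host name."""
    if hostname in parse_hosts(props.get(PRIMARY_KEY, "")):
        return "primary"
    if hostname in parse_hosts(props.get(SECONDARY_KEY, "")):
        return "secondary"
    return None


# PRIMARY-PRIMARY: both hosts are listed as primary.
primary_primary = {PRIMARY_KEY: "Host1,Host2", SECONDARY_KEY: ""}

# PRIMARY-SECONDARY: one host in each list.
primary_secondary = {PRIMARY_KEY: "Host1", SECONDARY_KEY: "Host2"}

# The single-host test from the first post: both EMs see primary=localhost.
single_host = {PRIMARY_KEY: "localhost", SECONDARY_KEY: ""}
```

    Under this reading, failover_role(single_host, "localhost") is "primary" for both EMs in the single-host test, which is consistent with both logs reporting "configured as a Primary EM".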

     

    Others may have more field experience, but I hope this helps.

     

    Regards,

     

    Lynn



  • 3.  Re: Testing to Configure Enterprise Manager Failover to Work on a Single Host   [linux box]

    Posted Dec 08, 2015 01:46 PM

    Thanks Lynn.

    I was finally able to test this and observed most of the things you mentioned.

     

    Except: "... If you try to test PRIMARY-SECONDARY on the same host with 2 separate installations by also sharing the lock files I am not sure it will be successful."

     

    I hope to test on multiple hosts sharing an EM installation on an HA NAS when the opportunity arises.

     

     

    Santosh