I have installed EEM on two servers, a primary and a secondary.
I configured failover using the clustersetup command, and everything looks good in the logs.
I registered some apps using safex on the primary and added a user; both appear on the secondary.
On to testing: I stopped the igateway and dxserver services on the primary, then tried to log in to the secondary.
The apps were no longer in the drop-down, and on trying to log in globally I got the following message:
EE_AUTHFAILED Authentication Failed
ISE_BACKENDDOWN backend is down
Is this FAD? It seems pointless having a backup if so. The wiki I followed doesn't describe how failover works.
One point of note: we are waiting for LDAP credentials, so there is no LDAP configured currently. Would this be a factor?
With or without LDAP configured, EEM failover should work once configured correctly.
You mentioned that the apps do not show in the Application drop-down list. Which apps are registered with EEM?
Do the apps fail to show up in both the primary and secondary EEM UI?
EE_AUTHFAILED Authentication Failed
ISE_BACKENDDOWN backend is down
The error shows that there is a problem with CA Directory (dxserver).
Can you check the dxserver trace logs on both the primary and secondary servers for errors?
The apps are added to the primary (APM and Executive Insight) and they then appear on the secondary automatically. When I simulate failover by stopping the services on the primary, the APM and EI options are no longer available on the secondary, and I get the error above when I try to log in on the secondary.
This seems like a good KB topic since it is a popular issue. Please consider getting someone to write one...
In that case, the secondary server is looking at the primary server database.
When you simulate the failover by shutting down the primary server's services, the secondary server loses its connection to the primary, and hence that error is logged.
Check the secondary server dxserver trace logs for errors.
I need more information to create a KB article.
This could even be a misconfiguration of EEM failover.
Starting the DXadmin daemon
Starting DXservers: itechpoz ...
itechpoz failed to start
 20150407.161103.084 WARN : max-local-ops has no effect
 20150407.161103.084 WARN : max-dsp-ops has no effect
 20150407.161103.157 WARN : Loading cache
 20150407.161103.235 WARN : Datastore was created at: 20150330232027Z
 20150407.161103.235 WARN : Datastore was created for: itechpoz
 20150407.161103.238 WARN : Cache loaded, 0 entries
 20150407.161103.238 WARN : Memory used by cache: 804016 + 0
 20150407.161103.239 WARN : Cannot register address
 20150407.161103.247 WARN : Disabling cache prior to exit
*  20150407.161103.239 DSA_E2220 Cannot register address
*  20150407.161103.247 DSA_I1240 DSA shutting down
So I guess we can't register the address? What is this referring to?
And on the primary box:
*  20150331.100909.539 DSA_I1220 DSA started: DXserver r12.0 SP10 (build 6892) Linux/DXgrid 32-Bit
*  20150331.100909.539 DSA_I3200 License: DXserver r12.0 SP10 (build 6892) Linux/DXgrid 32-Bit prefix cn=iTechPoz entry total 27
*  20150331.100909.539 DSA_I1150 DXgrid file usage: Filesize 256000000 Used bytes 5104 (1%) Reclaimable bytes 12
*  20150331.101209.908 DSA_I1240 DSA shutting down
*  20150331.101236.193 DSA_W2650 Cannot get Multiwrite last update time for 'itechpoz-serverB'
!  DXserver r12.0 SP10 (build 6892) Linux/DXgrid 32-Bit
*  20150331.101236.194 DSA_I1220 DSA started: DXserver r12.0 SP10 (build 6892) Linux/DXgrid 32-Bit
*  20150331.101236.194 DSA_I3200 License: DXserver r12.0 SP10 (build 6892) Linux/DXgrid 32-Bit prefix cn=iTechPoz entry total 27
*  20150331.101236.194 DSA_I1150 DXgrid file usage: Filesize 256000000 Used bytes 5104 (1%) Reclaimable bytes 12
*  20150331.101337.927 DSA_W2650 Cannot get Multiwrite last update time for 'itechpoz-serverB'
*  20150331.101337.940 DSA_E2735 Multiwrite-DISP: Unable to synchronize with peer 'itechpoz-serverB'
*  20150331.101438.873 DSA_W2650 Cannot get Multiwrite last update time for 'itechpoz-serverB'
*  20150331.101539.945 DSA_W2650 Cannot get Multiwrite last update time for 'itechpoz-serverB'
*  20150331.101640.870 DSA_W2650 Cannot get Multiwrite last update time for 'itechpoz-serverB'
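When comparing the logs from both boxes it can help to pull out only the coded alarm lines and count them. The following is a minimal sketch, assuming the line layout seen in the excerpts above (optional `*`, a timestamp, a `DSA_` code, then the message). Reading the letter after `DSA_` as a severity (E, W, I) is an assumption based on the code names, and the function names are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# *  20150331.101337.940 DSA_E2735 Multiwrite-DISP: Unable to synchronize ...
# Group 3 (the letter after "DSA_") is treated as a severity; that reading
# is an assumption, not something taken from the product documentation.
ALARM_RE = re.compile(r"^\*?\s*(\d{8}\.\d{6}\.\d{3})\s+(DSA_([EWI])\d+)\s+(.*)$")


def parse_alarm_lines(lines):
    """Return (timestamp, code, severity, message) for each coded line."""
    entries = []
    for line in lines:
        m = ALARM_RE.match(line.strip())
        if m:
            entries.append(m.groups())
    return entries


def count_by_code(entries, severities=("E", "W")):
    """Count how often each code appears, keeping only the given severities."""
    return Counter(code for _, code, sev, _ in entries if sev in severities)


# Lines copied from the log excerpts in this thread.
SAMPLE = [
    "*  20150407.161103.239 DSA_E2220 Cannot register address",
    "*  20150407.161103.247 DSA_I1240 DSA shutting down",
    "*  20150331.101337.927 DSA_W2650 Cannot get Multiwrite last update time for 'itechpoz-serverB'",
    "*  20150331.101337.940 DSA_E2735 Multiwrite-DISP: Unable to synchronize with peer 'itechpoz-serverB'",
    "*  20150331.101438.873 DSA_W2650 Cannot get Multiwrite last update time for 'itechpoz-serverB'",
    " 20150407.161103.239 WARN : Cannot register address",  # no DSA_ code, skipped
]

if __name__ == "__main__":
    for code, n in count_by_code(parse_alarm_lines(SAMPLE)).most_common():
        print(code, n)
```

Lines without a `DSA_` code, such as the plain WARN lines, are skipped on purpose so that only the coded alarms are counted.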
Are you using the default DSA port, 509?
Can you check whether this port is already being used by another process?
telnet hostname 509
Can you send the dxserver configuration files from both the primary and secondary servers?
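If telnet is not installed on the boxes, the same port check can be scripted. This is only a sketch using the Python standard library; the function names are hypothetical, and a failed bind test only suggests that something else holds the port, since binding a port below 1024 also fails without root privileges.

```python
import socket


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def port_is_free(port, bind_host=""):
    """Return True if this process can bind the local TCP port.

    Note: on most systems binding a port below 1024 (such as 509) needs
    root, so False can mean "in use" or "not permitted".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((bind_host, port))
            return True
        except OSError:
            return False
```

For example, `port_is_listening("serverB", 509)` run from serverA shows whether the peer DSA is reachable, while `port_is_free(509)` run as root on a box where dxserver is stopped shows whether another process already holds the port.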
Yes, I’m using the default port 509. The ports should be open, but I’ll check again.
Nothing else should be using that port.
Sent: Tuesday, 7 April 2015 5:14 PM
To: Andy Erskine
Subject: Re: EEM Failover - How does it work?
Yep, port 509 is accessible in both directions.
On my DR server igateway and WDigateway are running, yet the DSA started with "dxserver start itechpoz" isn't, and that is the case on both boxes.
Which dxserver configuration files are you referring to? There are several files in my ..CA/Directory/dxserver/config folder.
I'm assuming it's the knowledge folder?
In both we have ...
# eiam repository
set dsa "itechpoz-serverB" =
prefix = <cn iTechPoz>
dsa-name = <cn iTechPoz><cn PozDsa><cn "serverB">
dsa-password = ""
#for failover configuration
address = tcp "serverB" port 509
snmp-port = 509
#for dxconsole debugging. info: make sure that the port is not used
#console-port = 10510
auth-levels = clear-password
dsp-idle-time = 120
dsa-flags = multi-write
link-flags = ssl-encryption-remote
address = tcp "serverA" port 509
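To compare what the two boxes actually have, the DSA names and their addresses can be pulled out of the knowledge files with a small script. This is a rough sketch, assuming only the `set dsa "..." =` and `address = tcp "..." port N` line shapes shown in the paste above; it is not a full parser for the knowledge file format, and `dsa_addresses` is a hypothetical helper name.

```python
import re

# Line shapes taken from the pasted knowledge entry; anything else is ignored.
DSA_RE = re.compile(r'^set\s+dsa\s+"([^"]+)"\s*=')
ADDRESS_RE = re.compile(r'^address\s*=\s*tcp\s+"([^"]+)"\s+port\s+(\d+)')


def dsa_addresses(knowledge_text):
    """Map each DSA name to the (host, port) pairs listed under it."""
    result = {}
    current = None
    for raw in knowledge_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments such as "#console-port = ..."
        m = DSA_RE.match(line)
        if m:
            current = m.group(1)
            result[current] = []
            continue
        m = ADDRESS_RE.match(line)
        if m and current is not None:
            result[current].append((m.group(1), int(m.group(2))))
    return result


# A fragment of the entry pasted in this thread.
SAMPLE = '''\
# eiam repository
set dsa "itechpoz-serverB" =
prefix = <cn iTechPoz>
dsa-name = <cn iTechPoz><cn PozDsa><cn "serverB">
address = tcp "serverB" port 509
snmp-port = 509
'''

if __name__ == "__main__":
    for name, addresses in dsa_addresses(SAMPLE).items():
        print(name, addresses)
```

Running it against the knowledge file from each server and comparing the two results is a quick way to confirm that both boxes list the same DSAs, hosts and ports.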
Looks like we've found the solution: we re-ran the failover setup and re-synced, the services are now starting, and we have tested a failover scenario successfully. Thanks for pointing us in the right direction.
I have the same problem on my EEM failover node.
After seeing your post I removed the failover node and added it again.
Finally I re-synced with the primary node, but got the same issue on the failover node again.
You're welcome... The problem is resolved now.
Please open a case if you need help to resolve it.
I have already opened a case with CA; it seems to have been pending for a week.
I thought I might get a solution from the community.
I hope that you get a response.