We occasionally (about once a week) see "Unable to load SiteMinder agent configuration object." in the Web Agent error log.
When this happens, smps.log shows one of the following errors: "Status: Error 91 . Can't connect to the LDAP server" or "Handshake error: Failed to receive client hello. Socket error". Sometimes there is no error in smps.log at all. I have read "http://www.ca.com/us/support/ca-support-online/product-content/knowledgebase-articles/tec491614.aspx" and I understand the cause of the error, but I would like some suggestions from you.
When we see "Status: Error 91 . Can't connect to the LDAP server", we restart the Policy Server and the error goes away. When we see the handshake error, we do nothing, and after some time the "Unable to load SiteMinder agent configuration object." error disappears on its own.
We do not restart any SiteMinder components on a schedule. Could that be why these errors occur intermittently? If we do need to restart SiteMinder components, does CA provide a scheduler for this, or what approach would you suggest for restarting?
Could this error be caused by overload on the SiteMinder Policy Server? [We have 15K logins per hour with 2 Policy Servers and 2 LDAP servers.]
Do we need to increase any cache or pool settings on the Policy Server side?
The error "Unable to load SiteMinder agent configuration object." suggests that either the specific ACO was not found in the policy store or there is a connection issue with the Policy Server. Since the error is intermittent, I suspect it relates to the connection to the Policy Server.
When the error recurs, try to telnet to the Policy Server's authentication port (default: 44442) from the Web Agent machine to validate the connection between the Web Agent and the Policy Server. Also check whether you have a scheduled shared secret rollover.
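If telnet is not available on the Web Agent machine, the same reachability check can be sketched in a few lines of Python. This is only a minimal sketch: the function names can_connect and count_failures are made up for illustration, and it tests nothing beyond whether a TCP connection to the port can be opened. It does not perform the SiteMinder handshake.

```python
import socket
import time


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def count_failures(host: str, port: int, attempts: int = 10,
                   interval: float = 1.0, timeout: float = 5.0) -> int:
    """Probe host:port several times and return how many attempts failed.

    Repeating the probe helps with intermittent problems, where a single
    successful connection proves little.
    """
    failures = 0
    for i in range(attempts):
        if not can_connect(host, port, timeout):
            failures += 1
        if i < attempts - 1:
            time.sleep(interval)
    return failures
```

For example, count_failures("policyserver.example.com", 44442) would probe the default authentication port ten times, one second apart. The host name here is a placeholder, so substitute your own Policy Server. The same function can be pointed at the LDAP server's port when the "Error 91" message shows up.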
To check Policy Server resources, run "smpolicysrv -stats" on the Policy Server when the error recurs. It writes current server runtime statistics, such as the thread pool limit, thread pool messages, and the number of connections, to smps.log.
Thank you all for your suggestions. I will enable the Web Agent trace log to learn more about what is happening.
Try increasing the number of Web Agent to Policy Server threads, and also check whether any network packets are being dropped.
I would check the user store at the time you get the error. Is there any abnormal CPU usage on the Policy Server during this time? See whether some activity on the user store is consuming Policy Server resources. Restarting the Policy Server may be helping in this situation.
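To line up user store activity with the failures, it helps to know exactly when each error was logged. Below is a rough sketch that scans log lines for the error messages quoted in this thread. The labels and function names are invented for illustration, and it makes no assumption about the log's timestamp format: it simply returns the matching lines as they are, so you can read the times off them.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Error texts quoted earlier in this thread; the labels are arbitrary.
SIGNATURES = {
    "ldap_error_91": "Can't connect to the LDAP server",
    "handshake_error": "Handshake error: Failed to receive client hello",
    "aco_load_failure": "Unable to load SiteMinder agent configuration object",
}


def find_errors(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (label, line) pairs for lines containing a known error text.

    Matching is case-insensitive because the capitalization of the
    messages may differ between logs.
    """
    hits = []
    for line in lines:
        lowered = line.lower()
        for label, text in SIGNATURES.items():
            if text.lower() in lowered:
                hits.append((label, line.rstrip("\n")))
    return hits


def count_errors(lines: Iterable[str]) -> Counter:
    """Count how many lines matched each known error text."""
    return Counter(label for label, _ in find_errors(lines))
```

You could pass an open file straight in, for example find_errors(open("smps.log", errors="replace")), with the path adjusted to wherever your logs live. The first two messages are the ones reported in smps.log and the third is the one from the Web Agent error log, so both logs are worth scanning.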
Thank you, Bbhushan. Memory usage on the LDAP server is at 75%. Will that be a problem? CPU usage looks normal.