We are on Siteminder 12.51 and use MS-SQL as our user store (CA Directory for policy store). Over the weekend the MS-SQL database server crashed and had to be restarted.
I had thought that once MS-SQL was back running, Siteminder would just start working. Instead, Siteminder kept throwing errors until we restarted the policy servers. Then all was good.
Does anyone know if this is expected behavior? If the user store (MS-SQL) fails and has to be restarted, is it expected that we would also need to restart the policy server?
Here are some of the errors we got until we restarted the policy server.
[2418/3271781232][Sun Jul 17 2016 07:15:49][Sm_Auth_Message.cpp:288][ERROR][sm-LoginLogout-00130] Password Message could not be parsed
[2418/4099529584][Sun Jul 17 2016 07:15:55][CServer.cpp:1679][ERROR][sm-Server-01050] Failed to initialize TCP client connection. Socket error 107
[2418/4099529584][Sun Jul 17 2016 07:16:03][CServer.cpp:1728][INFO][sm-Server-01760] Closing Idle connection for session # 809
No, this is not expected behavior for the policy server.
The policy server has a polling mechanism that checks the health of each ODBC connection at a regular interval (15 seconds) and re-establishes the connection if it hangs.
The polling mechanism within a connection begins as soon as the connection is created. The connection starts in an initialized state and, when the policy server, the network and the database server are behaving normally, it should go from initialized to available. The available state is where a connection should spend most of its time. When a statement needs to be executed, it obtains a connection and the connection transitions into the active state, so that nothing else can use it. A connection may not remain in the active state for more than fifteen seconds. If it does, it is flagged as hung, and a hung connection is disconnected and reconnected to the database server.
Internally, the database layer grabs a connection from the connection pool for each statement that it obtains. If there are no available connections in the pool and the maximum number of connections has not been exceeded, a new connection is created and placed in the pool. After the statement has finished executing, it releases the connection, the connection transitions back into the available state, and another statement may use it.
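To make the lifecycle described above easier to follow, here is a rough sketch in Python. It is only an illustration of the states and pool behavior as I described them, not the actual policy server code: the class names, the `poll` method and the injectable `clock` are all made up for the example, and the real ODBC connect call is replaced by a placeholder.

```python
import enum
import time


class State(enum.Enum):
    INITIALIZED = "initialized"
    AVAILABLE = "available"
    ACTIVE = "active"
    HUNG = "hung"


# Limit taken from the description above: a connection may not stay
# active for more than fifteen seconds.
HUNG_AFTER_SECONDS = 15.0


class Connection:
    """Hypothetical model of one pooled database connection."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.state = State.INITIALIZED
        self.active_since = None
        self.reconnects = 0

    def connect(self):
        # Placeholder for the real ODBC connect; here it always succeeds.
        self.state = State.AVAILABLE
        self.active_since = None

    def acquire(self):
        if self.state is not State.AVAILABLE:
            raise RuntimeError("connection is not available")
        self.state = State.ACTIVE
        self.active_since = self._clock()

    def release(self):
        self.state = State.AVAILABLE
        self.active_since = None

    def poll(self):
        # Flag a connection that has been active for too long as hung,
        # then disconnect and reconnect it.
        if (self.state is State.ACTIVE
                and self._clock() - self.active_since > HUNG_AFTER_SECONDS):
            self.state = State.HUNG
        if self.state is State.HUNG:
            self.reconnects += 1
            self.connect()


class ConnectionPool:
    """Hypothetical pool: reuse an available connection or grow up to a cap."""

    def __init__(self, max_connections, clock=time.monotonic):
        self.max_connections = max_connections
        self._clock = clock
        self.connections = []

    def acquire(self):
        for conn in self.connections:
            if conn.state is State.AVAILABLE:
                conn.acquire()
                return conn
        if len(self.connections) < self.max_connections:
            conn = Connection(self._clock)
            conn.connect()
            conn.acquire()
            self.connections.append(conn)
            return conn
        raise RuntimeError("no available connections and pool is at maximum")

    def poll_all(self):
        # In this sketch a single poller visits every connection. If this
        # poller itself stopped running, hung connections would never be
        # recycled.
        for conn in self.connections:
            conn.poll()
```

In this model, recovery depends entirely on `poll_all` being called regularly. A connection left in the active state when the database goes away only returns to available once the poller notices it has exceeded the fifteen-second limit.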
That said, if the polling thread that performs the regular ping itself gets into a hung state, my guess is that this could lead to the behavior you are experiencing.
If you can reproduce the issue and provide the policy server trace logs, I suggest opening a support case for root cause analysis.