1. SDM problems when restarting a database cluster node

Fabio Galarraga
Posted Jul 10, 2018 09:01 AM

Hi all.
I'm using SDM 14.1 with CUM3 running on Windows Server 2012 R2 and accessing Oracle 12c RAC.
Oracle 12c RAC works within a cluster of servers.
We have noticed that each time an Oracle cluster node is reset, SDM stops working and we need to restart the Windows service.

In the SDM log file we found entries like this:
SIGNIFICANT
orcl_agent.c
SHUTDOWN of orcl_agent:mdbadmin:bpvirtdb_srvr

We think SDM should be able to continue working because the other Oracle cluster nodes are still up.
How can I avoid the need to restart SDM? Do you know of a workaround or setting? Or is this expected behaviour, with nothing to be done?

Any comments are welcome. Thanks in advance.

Regards,
Fabio.
2. Re: SDM problems when restarting a database cluster node

Broadcom Employee

Chi Chen
Posted Jul 10, 2018 10:29 AM

Fabio, what happens to OTHER client connections to the cluster when it is reset? For example, from the SDM server, manually open a SQL*Plus connection using the same parameters (Oracle cluster name, same user credentials). If this other client connection loses its connection when the Oracle cluster is reset, then you should work with your DBA to find out why; that would be an Oracle cluster configuration issue rather than an SDM issue. By the way, as a "workaround", you can try running "pdm_d_refresh" instead of recycling SDM. Thanks _Chi
3. Re: SDM problems when restarting a database cluster node

Fabio Galarraga
Posted Jul 10, 2018 11:52 AM

Thank you Chi. The DBA says other applications keep working after a cluster node reset. I think SDM stores or uses some kind of cache of database connections, and this causes the problem.
The next time a reset occurs, I will use "pdm_status" and "pdm_d_refresh" as you suggest.
4. Re: SDM problems when restarting a database cluster node

Grant Bruneau
Posted Jul 10, 2018 09:40 PM

We have a similar setup with SQL Server Always On. We have noticed that when you switch between database nodes, SDM may lose its connection if the move takes longer than 45 seconds. After that point SDM stops trying to reconnect and pdm_d_refresh is necessary.
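
To illustrate the behaviour described above, here is a minimal sketch of a reconnect loop that gives up after a fixed window. It is not SDM's actual implementation: the 45-second window comes only from the observation in this post, and the function names, the 5-second retry interval, and the simulated clock are all hypothetical.

```python
import time
from typing import Callable


def reconnect_with_deadline(
    connect: Callable[[], bool],
    window_seconds: float = 45.0,    # assumed from the observation in this post
    interval_seconds: float = 5.0,   # hypothetical retry interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry connect() until it succeeds or the retry window expires."""
    deadline = clock() + window_seconds
    while True:
        if connect():
            return True
        if clock() + interval_seconds > deadline:
            # No further attempt fits in the window: give up.
            # In the scenario above, a manual refresh would now be needed.
            return False
        sleep(interval_seconds)


def simulate_failover(failover_seconds: float) -> bool:
    """Simulate a database that becomes reachable after failover_seconds."""
    now = [0.0]  # simulated clock, so the example runs instantly

    def fake_clock() -> float:
        return now[0]

    def fake_sleep(seconds: float) -> None:
        now[0] += seconds

    def connect() -> bool:
        return now[0] >= failover_seconds

    return reconnect_with_deadline(connect, clock=fake_clock, sleep=fake_sleep)
```

With these assumed values, `simulate_failover(30)` reconnects on its own, while `simulate_failover(60)` gives up, which matches the pattern of a slow node switch leaving the application disconnected.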
5. Re: SDM problems when restarting a database cluster node

Fabio Galarraga
Posted Jul 10, 2018 11:02 PM

Thank you Grant. Your environment looks similar to mine. In addition, we have other CA products such as PAM, Service Catalog and USS running in the same Oracle database (PDB) with different schemas. This behaviour has only been seen with SDM. The other CA products keep running without problems after the node restart, as do other non-CA applications.
6. Re: SDM problems when restarting a database cluster node

Grant Bruneau
Posted Jul 11, 2018 12:12 AM

Yep, we have seen the same. Service Catalog and PAM can always recover. SDM recovers most of the time.
7. Re: SDM problems when restarting a database cluster node

Broadcom Employee

Ferdinand Roehrl
Posted Jul 11, 2018 04:30 AM

Hello Fabio,
please keep in mind that CA Service Desk Manager is cluster tolerant but not cluster aware.
This means that when a failover happens on the DB server, you must restart the application (to keep it simple) to stop the SDM SQL connections and processes.

_r Ferdinand
8. Re: SDM problems when restarting a database cluster node

Fabio Galarraga
Posted Jul 11, 2018 01:24 PM

OK. In an AA environment, do you think running the pdm_d_refresh command on the background server is sufficient to restore operation?
9. Re: SDM problems when restarting a database cluster node

Grant Bruneau
Posted Jul 11, 2018 01:36 PM

You would need to run that command on each server. In AA, servers have their own connections to the database.
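
As a rough sketch of what "on each server" could look like if scripted: the helper below loops over a list of SDM servers and records whether the refresh succeeded on each one. How a command is actually executed on a remote server depends on your environment, so that part is left as a caller-supplied function. The `run_on` callable, the host names, and the assumption that exit code 0 means success are all hypothetical; only the pdm_d_refresh command name comes from this thread.

```python
from typing import Callable, Dict, List

REFRESH_COMMAND = "pdm_d_refresh"  # command named in this thread, run without arguments


def refresh_all(
    servers: List[str],
    run_on: Callable[[str, str], int],
) -> Dict[str, bool]:
    """Run the refresh command on every server and report success per host.

    run_on(host, command) is a hypothetical, environment-specific function
    that executes a command on a host and returns its exit code.
    Exit code 0 is assumed to mean success.
    """
    results: Dict[str, bool] = {}
    for host in servers:
        exit_code = run_on(host, REFRESH_COMMAND)
        results[host] = exit_code == 0
    return results


def failed_hosts(results: Dict[str, bool]) -> List[str]:
    """Return the hosts where the refresh did not succeed, in input order."""
    return [host for host, ok in results.items() if not ok]
```

Any server that shows up in `failed_hosts` would still be holding its old database connections and would need attention separately.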
10. Re: SDM problems when restarting a database cluster node

Broadcom Employee

Raghu Rudraraju
Posted Jul 13, 2018 10:20 AM

Fabio,

SDM's architecture basically involves a whole set of native-client SQL connections. Not all of them are active at any given time (though all of them are connected to the database).

Take an example. You're trying to look at a list of tickets, and that may have caused SDM to use DBAgent#1 to run a query against the database.

If DB connectivity is lost at that moment, our use of the DB client APIs lets DBAgent#1 recognize the loss of connection and attempt to reconnect immediately. Your query is then re-issued.

However, DBAgent#2, 3, 4... may be idle, as there's not much activity on your system. Another SDM user then runs another query which, let's say, is sent to DBAgent#2. Only at that point will DBAgent#2 detect the loss of connection to the DB and retry.

So yes, we do have enough support to detect a loss of connection and retry the queries. But this may happen over a period of time, depending on how the agents are being used.

Hope this gives you an idea.

The same applies to both SQL Server and Oracle.
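
The lazy detection described above can be pictured with a small simulation. This is only an illustration of the idea, not SDM code: the `Agent` class, the "epoch" counter standing in for a database failover, and the message strings are all invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Agent:
    """A hypothetical DB agent holding one long-lived database session."""
    name: str
    session_epoch: int = 0  # the database "epoch" this session was opened in
    reconnects: int = 0

    def run_query(self, db_epoch: int) -> str:
        # A stale session is only noticed when the agent is actually used.
        if self.session_epoch != db_epoch:
            self.session_epoch = db_epoch  # reconnect
            self.reconnects += 1
            return f"{self.name}: reconnected, query re-issued"
        return f"{self.name}: query ok"


def stale_agents(agents: List[Agent], db_epoch: int) -> List[str]:
    """Names of agents that still hold a session from before the failover."""
    return [a.name for a in agents if a.session_epoch != db_epoch]


if __name__ == "__main__":
    agents = [Agent(f"DBAgent#{i}") for i in range(1, 5)]
    db_epoch = 1  # a failover happened: every existing session is now stale
    print(agents[0].run_query(db_epoch))   # the busy agent recovers at once
    print(stale_agents(agents, db_epoch))  # the idle agents have not noticed yet
```

In this model the idle agents recover one by one as work is routed to them, which is why recovery can be spread over a period of time.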

_R
11. Re: SDM problems when restarting a database cluster node

Fabio Galarraga
Posted Jul 14, 2018 01:37 PM

I understand DBAgents will try to reconnect in both cases, whether the agent is active or idle, but when the retry happens depends on agent usage. So, can increasing the number of agents help reduce the probability of database connection problems?
12. Re: SDM problems when restarting a database cluster node

Broadcom Employee

Raghu Rudraraju
Posted Jul 16, 2018 09:48 AM

In a way, yes. If you have 50 agents, all 50 agents now have to go through the connection retry.

_R
