Just for the sake of completeness: my colleague found the cause. He opened the explanation with the words "you won't believe it..."
And yeah, I did not believe it...
What you need to know beforehand:
Before the DB export, CP5 was an ordinary, real CP.
After the DB import, CP5 was a JCP.
We did a COLD start.
We renewed all transfer keys, but some agents stayed offline (in AWI), although their logs looked OK at first glance.
The cause: the CP (new CP2) remembered the old connection to the previous CP (old CP5) and tried to hand over to the CP with the same name (new CP5 = JCP).
But CP5 was now a JCP, which was unable to handle the logon attempt. It wrote a memory dump and did nothing else, and there was no entry in the agent log.
RA WS Agent log snip:
20200420/130309.493 - U02000072 Connection to system 'UC4server' initiated.
20200420/130309.493 - U02000011 Connection to Server 'UC4server:Port' initiated.
20200420/130309.509 - U02000004 Connection to Server 'UC4server#CP002' successfully created.
20200420/130309.510 - U02000075 CP Server 'UC4server#CP002' has '1' client connections.
20200420/130309.547 - U02000020 Environment: Hardware = 'CIT'.
20200420/130309.547 - U02000021 Environment: Software = 'CIT'.
20200420/130309.547 - U02000022 Environment: SW version = '7.2.0+build.1'.
20200420/130515.013 - U02000017 The check interval for 'Jobs' has been set to '60' seconds.
20200420/130515.013 - U02000017 The check interval for 'Server' has been set to '660' seconds.
20200420/130515.013 - U02000017 The check interval for 'Reconnect' has been set to '600' seconds.
20200420/130515.014 - U02000017 The check interval for 'Report' has been set to '60' seconds.
20200420/130515.023 - U02013327 Timestamp of the Solution in the local file system: 'Thu Feb 14 13:32:38 CET 2019'
20200420/130515.023 - U02013326 Timestamp of the Solution in the database: 'Thu Feb 14 13:32:38 CET 2019'
CP002 log snip:
20200420/130309.527 - U00003412 Agent 'RAWS02' logged on (Client connection='68').
20200420/130309.547 - U00003366 Connection to agent 'RAWS02' already exists (old connection '*CP005#00000049', new connection '*CP002#00000068').
20200420/130309.548 - U00003490 Connection to 'UC4server:Port' initiated, client connection '69(30)'
20200420/130309.557 - U00003489 Server 'UC4server#CP005' logged on (Client connection='69').
20200420/130514.501 - U00003316 Zero Downtime information: MixedMode='N', base MQSet='1', active MQSet='1', own MQSet='1', MQSet PWP='1'.
20200420/130514.553 - U00003472 Connection to Server 'UC4server#CP005' has been closed.
20200420/130514.553 - U00003407 Client connection '69(29)' from 'UC4server:Port' has logged off from the Server.
20200420/130530.403 - U00003397 Agent 'RAWS02' logged off (client connection='68').
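For anyone who wants to check their own CP logs for the same pattern, here is a minimal Python sketch. It only assumes the U00003366 message format exactly as it appears in the CP002 log above; the function name `find_stale_handovers` and the sample list are made up for illustration and are not part of any Automic tooling.

```python
import re

# Matches the U00003366 message as it appears in the CP002 log above.
ALREADY_EXISTS = re.compile(
    r"U00003366 Connection to agent '(?P<agent>[^']+)' already exists "
    r"\(old connection '\*(?P<old_cp>CP\d+)#(?P<old_id>\d+)', "
    r"new connection '\*(?P<new_cp>CP\d+)#(?P<new_id>\d+)'\)"
)


def find_stale_handovers(log_lines):
    """Return one dict per U00003366 line where the old and new CP differ."""
    hits = []
    for line in log_lines:
        m = ALREADY_EXISTS.search(line)
        if m and m.group("old_cp") != m.group("new_cp"):
            hits.append(m.groupdict())
    return hits


# Two lines copied from the CP002 log in this post.
sample = [
    "20200420/130309.527 - U00003412 Agent 'RAWS02' logged on (Client connection='68').",
    "20200420/130309.547 - U00003366 Connection to agent 'RAWS02' already exists "
    "(old connection '*CP005#00000049', new connection '*CP002#00000068').",
]

for hit in find_stale_handovers(sample):
    print(f"{hit['agent']}: {hit['new_cp']} still references {hit['old_cp']}")
```

If the CP named in the old connection is no longer a real CP after the clone (here: CP005 became a JCP), the affected agents are good candidates for the problem described above.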
CP005 Log snip:
20200420/130309.549 - 25 U00003406 Client connection '16' from 'UC4server:Port' has logged on to the Server.
20200420/130309.564 - 25 U00009907 Memory dump 'Unknown message from 16' (Address='n/a', Length='000099')
20200420/130309.566 - 25 00000000 F4600052 41575330 32202020 20202020 >ô`.RAWS02 <
20200420/130309.566 - 25 00000010 20202020 20202020 20202020 20202020 > <
20200420/130309.566 - 25 00000020 2020202A 43503030 35233030 30303030 > *CP005#000000<
20200420/130309.566 - 25 00000030 34392020 20202020 20202020 20202020 >49 <
20200420/130309.566 - 25 00000040 2020202A 43503030 32233030 30303030 > *CP002#000000<
20200420/130309.566 - 25 00000050 36382020 20202020 20202020 20202020 >68 <
20200420/130309.566 - 25 00000060 202020 > <
20200420/130309.568 - 29 U00003489 Server 'UC4server#CP002' logged on (Client connection='16').
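The memory dump is easier to read once the hex column is decoded. The following Python sketch is only a rough helper: it assumes the dump line layout shown above (timestamp, a numeric column, an 8-digit offset, hex groups, then the `>` that opens the text column), and the names `dump_bytes` and `printable_fields` are invented for this example.

```python
import re

# Dump line layout assumed from the CP005 log above:
# "<timestamp> - <nn> <8-digit offset> <hex groups> ><text column><"
DUMP_LINE = re.compile(r" - \d+ ([0-9A-F]{8}) ((?:[0-9A-F]+ )+)>")
# Runs of at least four printable, non-space ASCII characters.
PRINTABLE_RUN = re.compile(rb"[\x21-\x7e]{4,}")


def dump_bytes(log_lines):
    """Concatenate the hex columns of all dump lines into one byte string."""
    payload = bytearray()
    for line in log_lines:
        m = DUMP_LINE.search(line)
        if m:
            payload += bytes.fromhex(m.group(2).replace(" ", ""))
    return bytes(payload)


def printable_fields(payload):
    """Return the printable ASCII fields contained in the payload."""
    return [run.decode("ascii") for run in PRINTABLE_RUN.findall(payload)]


# Hex columns copied from the CP005 dump in this post (text column left empty).
sample = [
    "20200420/130309.566 - 25 00000000 F4600052 41575330 32202020 20202020 > <",
    "20200420/130309.566 - 25 00000010 20202020 20202020 20202020 20202020 > <",
    "20200420/130309.566 - 25 00000020 2020202A 43503030 35233030 30303030 > <",
    "20200420/130309.566 - 25 00000030 34392020 20202020 20202020 20202020 > <",
    "20200420/130309.566 - 25 00000040 2020202A 43503030 32233030 30303030 > <",
    "20200420/130309.566 - 25 00000050 36382020 20202020 20202020 20202020 > <",
    "20200420/130309.566 - 25 00000060 202020 > <",
]

payload = dump_bytes(sample)
print(len(payload), printable_fields(payload))
# prints: 99 ['RAWS02', '*CP005#00000049', '*CP002#00000068']
```

The 99 bytes match the Length='000099' in the U00009907 line, and the three fields are the agent name plus the old and new connection from the U00003366 message in the CP002 log. So the "unknown message" the JCP dumped looks like the handover request from CP002.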
After stopping the new CP5 (JCP) and restarting the agents, everything worked well.
I think a feature is missing in the connection handling of the CPs (or JCPs), and a log message is missing in the agent logs.
KR Wolfgang
------------------------------
Support Info:
If you are using one of the latest versions of UC4 / AWA / One Automation, please get in contact with Support to open a ticket.
Otherwise update/upgrade your system and check if the problem still exists.
------------------------------
Original Message:
Sent: 04-17-2020 04:14 AM
From: Wolfgang Brueckler
Subject: Agents not shown active after DB clone
Hi guys,
A question for those in the know: why do the agents not show up as running in AWI?
We performed a DB clone from the test environment to the sandbox, which went fine (COLD start performed).
We have done this successfully several times in the past (V11.2.X).
Now I am restarting the agents as usual after renewing the transfer key, but some SQL Agents are still shown as inactive in AWI (AE V12.3.0).
In SMGR and in the log I can see that they connect to the system, as shown below.
20200417/093931.489 - ------------------------------------------------------------------------------------------
20200417/093931.490 - U02000071 Current directory: /opt/uc4/V12_3/agents/postgres/bin
20200417/093931.490 - U02000066 Host information: Host name='XYZ', IP address='XYZ'
20200417/093931.569 - U02000153 The JVM Option HeapDumpOnOutOfMemoryError is enabled.
20200417/093931.577 - U02000072 Connection to system 'GMUC4' initiated.
20200417/093931.578 - U02000011 Connection to Server 'XYZ' initiated.
20200417/093931.591 - U02000004 Connection to Server 'GMUC4#CP004' successfully created.
20200417/093931.591 - U02000075 CP Server 'GMUC4#CP004' has '1' client connections.
20200417/093931.658 - U02000020 Environment: Hardware = 'SQLPOSTGRESQL'.
20200417/093931.659 - U02000021 Environment: Software = 'SQLPOSTGRESQL'.
20200417/093931.659 - U02000022 Environment: SW version = '1.0'.
Here the log stops.
What exactly I did:
1st try:
renewed transfer key
stopped & started agent
2nd try:
deleted ucxjsqlx.kstr
renewed transfer key
stopped & started agent
3rd try:
deleted ucxjsqlx.kstr
deleted agent object in Clt0
started agent
The agents that came up correctly have the following additional entries in their logs:
20200416/124148.820 - U02000017 The check interval for 'Jobs' has been set to '60' seconds.
20200416/124148.820 - U02000017 The check interval for 'Server' has been set to '660' seconds.
20200416/124148.820 - U02000017 The check interval for 'Reconnect' has been set to '600' seconds.
20200416/124148.821 - U02000017 The check interval for 'Report' has been set to '60' seconds.
20200416/124148.878 - U07001001 Charset used by the Agent: 'ISO-8859-15'
Only SQL Agents are affected. Although I deleted their keystore file and the agent object in Clt0 itself,
I am unable to start them so that they show up in Clt0 in AWI.
Any hints?
many THX
Wolfgang