Automic Workload Automation

Connection to AE system could not be established

  • 1.  Connection to AE system could not be established

    Posted Oct 14, 2019 11:53 AM
    Hello,

    We are not able to connect to our AE system. The error message in the AWI is as follows:

    Connection to AE system could not be established.

    <service-address>:2217 - TimeoutException


    We have restarted our Tomcat web server and also the Automation Engine (cold and warm starts), but the connection to the CP still cannot be established.


    We've pointed the AWI on this Tomcat web server at another AE system (DEV) to try a different connection, and a login is possible there. So the AWI and the Tomcat server shouldn't be the problem.

    The CP tries to connect to all installed agents (roughly 2200). It takes a long time until new connections are established.
    In the CP log, a lot of socket call errors are listed, e.g. "U00003413 Socket call 'connect(xx.xx.xx.xx,8873)' returned error code '115'."

    Does anyone have an idea how to solve this issue?

    Versions:

    Automation Engine: 12.2.3+build.1558123282415

    DB: Oracle 12.2.0.1.0 - 64bit

    AWI: 12.2.3.GA01-dev-feature-12.2.3-GA01-89509

    Kind regards
    Stephan Schiller

    ------------------------------
    Stephan Schiller
    Debeka
    ------------------------------


  • 2.  RE: Connection to AE system could not be established

    Posted Oct 14, 2019 12:38 PM
    Hi.

    That connect error 115 is defined as EINPROGRESS in the Linux kernel. For connect(), it means the socket is non-blocking and the connection attempt has been started but has not completed yet. I suspect that when the CP (re-)connects to your roughly 2200 agents, this may be benign and possibly expected.
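To look up what a numeric error code from a log line means on the machine at hand, a small Python sketch like the following may help. The helper name `describe_errno` is made up for illustration; note that errno numbers are platform-specific, so 115 maps to EINPROGRESS on common Linux systems but not necessarily elsewhere.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Translate a numeric errno value into its symbolic name and OS message.

    Hypothetical helper for illustration; the mapping is that of the
    platform this script runs on, so run it on the engine host itself.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} = {name} ({os.strerror(code)})"


if __name__ == "__main__":
    # 115 is the code reported in the U00003413 message of the CP log.
    print(describe_errno(115))
```

Run on the engine server, this shows how that host interprets code 115.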

    Does the connection that does not work (tomcat to engine CP, I presume) run across different hosts, i.e. is your AWI on a different machine from your engine? If so, have you done the telnet test? Do a

    telnet ip port

    from the source to the target. If that doesn't work, you're probably firewalled or the target isn't listening on the proper port (check with e.g. netstat -an). If it does work, tcpdump/wireshark/tshark may be one (albeit a bit advanced) avenue.
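If telnet is not installed on the source host, roughly the same check can be sketched in Python. The function name `can_connect` and the placeholder host name are assumptions for illustration; port 2217 is the one from the AWI error message above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened.

    This only proves that something accepts connections on that port;
    it says nothing about whether the CP behind it answers logins.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # "ae-host.example" is a placeholder; use your own CP host here.
    print(can_connect("ae-host.example", 2217))
```

A result of False matches the "firewalled or not listening" case; True means the TCP path is open and the problem lies further up.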

    Hth,


  • 3.  RE: Connection to AE system could not be established

    Posted Oct 15, 2019 02:57 AM
    Hi Stephan,

    I would suggest that you:

    1) Telnet to all existing / running CPs and their ports from the web server (to make sure the network is open).
    2) Check on Linux whether any limits apply on the engine server (ulimit). If there is a lot of traffic and a limit is hit, that may cause problems.
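As a rough sketch of the ulimit check in point 2, the open-file limit can also be read from Python on Linux/Unix. The function names and the `reserve` value are made up for illustration, and the idea that each agent connection costs about one file descriptor is an assumption, not something confirmed for the Automation Engine.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_sufficient(expected_connections, soft_limit, reserve=256):
    """Rough check: is the soft limit above the expected connection count?

    Assumes one descriptor per connection plus a hypothetical reserve
    for log files, libraries and so on.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= expected_connections + reserve


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft:", soft, "hard:", hard)
    # Roughly 2200 agents, as mentioned in the original post.
    print("enough for 2200 connections:", limit_is_sufficient(2200, soft))
```

The values apply to the process that runs the script, so it should be started as the same user and in the same way as the engine. For an already running process on Linux, /proc/<pid>/limits shows the limits actually in effect.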

    Does the issue only occur when all agents have to reconnect while the AWI is starting, or is it a general issue?

    Maybe try giving the AWI a different initial CP connect port, though I am not sure this will help. As far as I remember, on the initial connect to a CP there is a check to see which CP is least busy and should take the connection.

    Best Regards,
    Roman Embacher
    R.E. IT Services


  • 4.  RE: Connection to AE system could not be established

    Posted Oct 15, 2019 09:55 AM
    Hi Roman and Carsten,

    First, thank you for your help. We tried your suggestions, but there seems to be no problem with the ulimit or with the connections between the servers.


    We found that the CP log contains a lot of "Search trusted certificates in folder '/xxx/xxx/trusted'" messages.

    As long as these messages are being logged, the login to the AWI doesn't work. As soon as they are no longer logged, the login to the AWI is possible.
    It makes no difference whether we restart the AE or whether one CP takes over the tasks of another CP.

    After that we checked the CAPKI settings in the ucsrv.ini, but the parameter "trusted_cert_folder=" is commented out.

    Can you explain to us where these messages come from? Currently we have no idea.


    Regards
    Stephan

    ------------------------------
    Debeka
    ------------------------------



  • 5.  RE: Connection to AE system could not be established

    Posted Oct 15, 2019 10:14 AM
    Hi Stephan.

    You can possibly confirm if these messages directly relate to AWI login attempts by having an open terminal window with a "tail -F" on the CP logs, while you try the login. If you can indeed connect those two things, that might be helpful for further diagnosis.
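Where keeping a "tail -F" open is inconvenient, a similar watch can be sketched in Python. The function names and the log path are placeholders for illustration; the search string is the message quoted earlier in this thread. Unlike "tail -F", this simple version does not notice when the log file is rotated.

```python
import time

# Message text as quoted from the CP log in this thread.
NEEDLE = "Search trusted certificates in folder"


def matching_lines(lines, needle=NEEDLE):
    """Yield only those lines that contain the given message text."""
    for line in lines:
        if needle in line:
            yield line.rstrip("\n")


def follow(path, poll_seconds=0.5):
    """Yield lines appended to a file, starting at its current end."""
    with open(path, "r", errors="replace") as handle:
        handle.seek(0, 2)  # jump to the end, like tail -f
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


if __name__ == "__main__":
    # Placeholder path; point this at the actual CP log file.
    for hit in matching_lines(follow("/path/to/cp.log")):
        print(time.strftime("%H:%M:%S"), hit)
```

Attempting an AWI login while this runs shows whether the messages appear at the moment of the login or independently of it.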

    However, I checked the CP logs of my engine and I don't have these messages at all. I also went back and checked old, archived 12.2 logs and didn't find that message in there either. I have no clue what this message means or why it appears when the connection fails. Therefore, unless someone comes up with more insight in due time, I suggest you consider opening a support ticket with Broadcom about this.

    Sorry to have no more helpful news.

    Br,
    Carsten



  • 6.  RE: Connection to AE system could not be established

    Posted Oct 15, 2019 10:27 AM
    Hi Carsten,

    I think these messages are not connected to the login via the AWI. I think they're logged when an agent connects to the CP.
    We are using CAPKI, and my guess is that the messages are related to it in some way.

    Do you also use CAPKI?

    Regards
    Stephan

    ------------------------------
    Debeka
    ------------------------------



  • 7.  RE: Connection to AE system could not be established

    Posted Oct 15, 2019 10:42 AM
    Hi Stephan,

    No, we have yet to delve into CAPKI. At this time we are NOT using it.

    Br,
    Carsten


  • 8.  RE: Connection to AE system could not be established

    Posted Oct 15, 2019 02:23 PM
    Hi Stephan,

    To be honest, I did play around with CAPKI a bit, and I am not 100% sure that the trusted_cert_folder parameter really works as expected. I tried a few settings but did not get it to work. My idea was to create a private and a public key for each component and then put the public keys into the trusted folder, but I did not get this to work.

    As far as I understand, once you start to use CA PKI, all components that communicate with each other need to trust each other. So, for example, if you want to link the agent view in the AWI with the Service Managers and agents on the target system, you need to

    - correctly install CA PKI on each target server as well as on the central components (that is, run the command given in the documentation) and have keys in place.

    The only way I got this running was the following:
    I copied the certificate files from the Service Manager on the Automation Engine (they were generated automatically when I installed the Service Manager) and distributed these files to all servers where I have agents. I think I also put them on the engine server and the AWI server. All kinds of ini files have this section:

    [CAPKI]

    certificate=C:\Automic\Automation.Platform\AutomationEngine\bin\ucsrv_certificate.pem

    key=C:\Automic\Automation.Platform\AutomationEngine\bin\ucsrv_key.pem



    Then, in each ini file, I referred to these 2 certificate files, including in the ini files of the core components.
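To check many ini files for consistent [CAPKI] entries, something like the following Python sketch could be used. It assumes the ini files are simple enough for Python's configparser to read (sections, key=value lines, ';' comment lines), which may not hold for every Automic ini file. The function name `capki_settings` is made up for illustration; the keys are the ones mentioned in this thread.

```python
import configparser

# Sample content based on the [CAPKI] section quoted above, plus a
# commented-out trusted_cert_folder line as described earlier in the thread.
SAMPLE = r"""
[CAPKI]
certificate=C:\Automic\Automation.Platform\AutomationEngine\bin\ucsrv_certificate.pem
key=C:\Automic\Automation.Platform\AutomationEngine\bin\ucsrv_key.pem
;trusted_cert_folder=
"""


def capki_settings(ini_text):
    """Return the CAPKI-related settings found in the given ini text.

    Keys that are missing or commented out are reported as None.
    An empty dict means there is no [CAPKI] section at all.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_section("CAPKI"):
        return {}
    section = parser["CAPKI"]
    names = ("certificate", "key", "trusted_cert_folder")
    return {name: section.get(name) for name in names}


if __name__ == "__main__":
    for name, value in capki_settings(SAMPLE).items():
        print(f"{name}: {value}")
```

Running this over the ini text of each component makes it easy to spot a file that still points to different certificate or key files.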

    This was the only way I managed to start and stop the agents from the AWI. And I think this is where your messages come from.

    But to be honest, I'd find it cool if someone from product management could give some examples of how to set it up properly :)

    Best Regards,
    Roman Embacher



  • 9.  RE: Connection to AE system could not be established

    Broadcom Employee
    Posted Oct 16, 2019 09:56 AM
    Hi Stephan,
    The message
    U00003413 Socket call 'connect(xx.xx.xx.xx,8873)' returned error code '115'
    suggests that the AE is busy with Service Manager scans (every time an agent tries to log on), and this blocks the CPs.
    Please try setting SMGR_AUTO_SCAN to NO in UC_SYSTEM_SETTINGS.


    ------------------------------
    Automation Solution Architect
    ------------------------------



  • 10.  RE: Connection to AE system could not be established

    Broadcom Employee
    Posted Oct 16, 2019 09:57 AM
    Hi Stephan,
    It looks like the AE is doing Service Manager scans all the time:
    U00003413 Socket call 'connect(xx.xx.xx.xx,8873)' returned error code '115'.
    This blocks the CP from accepting AWI connections.
    Please try setting SMGR_AUTO_SCAN to NO in UC_SYSTEM_SETTINGS.

    ------------------------------
    Automation Solution Architect
    ------------------------------



  • 11.  RE: Connection to AE system could not be established
    Best Answer

    Posted Oct 23, 2019 09:55 AM
    Hi!

    Thanks to all of you for the help you offered!

    We found a workaround for this issue.

    After analyzing the CP logs, we figured out that it may have something to do with the installation of CAPKI. We have installed CAPKI for the AE, for the smgr of the AE and also for the smgrs of the agents.
    It seems that for every agent connection the CP searches for trusted certificates in the given folder. This takes a lot of time, during which the CP is completely busy (so no logins are possible either).

    As a workaround we turned off CAPKI for the AE and also for the smgr of the AE. After that, the warnings ("Search trusted certificates in folder '/uc4/green/ae/bin/trusted'") weren't logged any more. The agents now connect to the CP quickly, and a login is possible (after a "normal" reconnect time). This behavior was reproducible.

    By the way (and perhaps somebody from the support team reads this): we created a P2 ticket for this issue, and after more than a week we still haven't heard anything. That's really ridiculous!

    Regards
    Stephan

    ------------------------------
    Debeka
    ------------------------------



  • 12.  RE: Connection to AE system could not be established

    Posted Oct 24, 2019 04:01 AM
    Edited by Carsten Schmitz Oct 24, 2019 04:01 AM
    Hi Stephan,

    Thanks for letting us know! I shall proceed to not touch CAPKI, as I have done until now :)

    > That's really ridiculous!

    Yes :)

    Br,
    Carsten


  • 13.  RE: Connection to AE system could not be established

    Posted Oct 24, 2019 12:37 PM
    Hi Stephan,

    I'm interested in one detail. Are you able to control agents from the Administration perspective when CAPKI is not available, or do you need to use the Service Manager Dialog program? In my experience, CAPKI was a must for doing Service Manager scans and for starting / stopping agents directly from the AWI, so I set it up for all agents. It would be interesting to know if you got this working without CAPKI :)

    BR,
    Roman


  • 14.  RE: Connection to AE system could not be established

    Posted Oct 25, 2019 04:20 AM
    Hi Roman,

    It is possible to stop agents via the Administration perspective, but you can't start them again. Error message: "CAPKI Library not available"

    So for starting agents we have to use the smgr dialog.

    Regards

    Stephan

    ------------------------------
    Debeka
    ------------------------------



  • 15.  RE: Connection to AE system could not be established

    Posted May 06, 2020 04:40 PM
    Edited by Kenneth Hutchins May 06, 2020 04:44 PM
    I am fairly new to version 12 and to the Service Manager within Automic. I used UC4 V8 before and have been with my current employer for 5 months. They are currently running 12.1, and I am looking to upgrade to 12.3. I recently set up a 12.3 test environment and noticed that after upgrading agents from 12.1 to 12.3 I can no longer use the Service Manager from the Admin.

    The behavior is as Stephan describes it. I have to start the agents manually using the smgr dialog. I keep getting errors saying that CAPKI is not installed.

    How do I install CAPKI on agents?
    https://docs.automic.com/documentation/webhelp/english/AA/12.3/DOCU/12.3/Automic%20Automation%20Guides/help.htm#ServiceManager/CAPKI.htm?Highlight=CAPKI%20install

    This document is not clear to me. Where is the setup.exe for CAPKI?

    I did find this write-up online, which shows how to enable "SMGR_SUPPORT_LEGACY_SECURITY": https://knowledge.broadcom.com/external/article/136359/capki-error-when-starting-agents-via-awi.html

    This allows me to start/stop agents with older Service Managers. But I would like to look into using CAPKI.

    Until recently, Broadcom support has been pretty good, but when I ask for help with 12.3 I get no responses. So thanks in advance for all these posts. I learn so much from you guys!

    Cheers. 

    ------------------------------
    DevOps Engineer
    ULLICO
    ------------------------------



  • 16.  RE: Connection to AE system could not be established

    Posted May 07, 2020 02:26 AM
    Hi,

    You can find the CAPKI.exe in the complete download file, in the folder \Tools\CA.PKI,
    or
    you can download the standalone CAPKI installer zip under component downloads:

    https://downloads.automic.com/downloads/advanced_mode?selected_tab=&lifecycle_entity_id=1525261028345&component_id=1525261028769&major_version_id=1573976609478%2C1528116396836&search=&no-default-values=on&show-files=on&show-patch-descriptions=on

    ------------------------------
    Thx & rgds
    Christian
    ------------------------------



  • 17.  RE: Connection to AE system could not be established

    Posted May 07, 2020 07:23 PM
    Thanks, that worked.

    ------------------------------
    DevOps Engineer
    ULLICO
    ------------------------------



  • 18.  RE: Connection to AE system could not be established

    Posted May 07, 2020 04:51 AM
    Hi,

    What do you mean by

    > not able to use service manager anymore from the Admin.

    Do you mean the Agent list in the Admin view of the AWI? For what it's worth, we have been using 12.3.x engines and agents without CAPKI for quite some time without any issues, but we always use the smgr dialog to stop and start them.

    Best,


  • 19.  RE: Connection to AE system could not be established

    Posted May 07, 2020 07:25 PM
    Yeah, I meant from the AWI. Once I have installed CAPKI on all of our agents, I will disable "SMGR_SUPPORT_LEGACY_SECURITY".

    ------------------------------
    DevOps Engineer
    ULLICO
    ------------------------------