Are both the server and the agents Windows?
The only thing I can really think of, since the telnet test works, is to look at any and all software that might interfere. Look for patterns in anything that runs with admin permissions or installs kernel modules, such as antivirus, security suites, or lower-level network drivers such as VMware drivers. Is there anything those machines have that the others don't?
If all those agents are now in the same network segment, for example, and that's a pattern in itself, maybe talk to the network people as well and see whether they have a security appliance or anything else in there that could break up established connections.
Failing that, you'd need to debug this either by enabling TCP=9 traces in the agents (Automic management tab, agent properties) or with tcpdump/Wireshark. Or eventually go the ticket route with Automic support.
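Before reaching for a full packet capture, a rough way to see whether something on the path cuts idle connections is to open a plain TCP connection and watch how it ends. The Python sketch below is only an illustration: `hold_connection` is a made-up helper, not an Automic tool, and it speaks no agent protocol, so the server side may well close the idle socket on its own. The result is only meaningful when compared between an affected server and one that works.

```python
import socket
import time


def hold_connection(host, port, max_seconds=600.0, connect_timeout=5.0):
    """Open a TCP connection, stay idle, and report how it ends.

    Returns (outcome, seconds_elapsed), where outcome is "still open",
    "closed by peer", or "error: <text>" (a reset or abort lands here).
    """
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except OSError as exc:
        return f"error: {exc}", time.monotonic() - start
    with sock:
        while True:
            remaining = max_seconds - (time.monotonic() - start)
            if remaining <= 0:
                return "still open", time.monotonic() - start
            # Wake up regularly so the overall time limit is respected.
            sock.settimeout(min(remaining, 5.0))
            try:
                data = sock.recv(4096)
            except socket.timeout:
                continue  # nothing received yet, connection still alive
            except OSError as exc:
                return f"error: {exc}", time.monotonic() - start
            if not data:
                return "closed by peer", time.monotonic() - start
            # Any bytes the peer sends are ignored; only the ending matters.
```

Calling it with the AE server's host name and CP port from an affected agent machine, and again from a healthy one, with `max_seconds` a bit above five minutes, would show whether only the affected machines lose the connection early.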
Best regards,
Carsten
P.S. It's a bank holiday here, so I won't be responding further until at least Monday.
Original Message:
Sent: 10-02-2019 12:30 PM
From: Jared Kessans
Subject: 10053 - An established connection was aborted...
I am able to telnet from the agent server to the AE server using our defined CP ports. I am able to telnet from our AE server to the agent server using the Service Manager port.
As for the agent data connections, 2305 in our case, we don't have that open except between servers for file transfers.
These agents are up and those that have jobs do run, but they just reconnect every five minutes with the error.
Original Message:
Sent: 10-02-2019 11:12 AM
From: Carsten Schmitz
Subject: 10053 - An established connection was aborted...
This message is a true classic.
It is usually caused by a client firewall or something else sending TCP RST packets and thus terminating your connections. "Excluding a directory" probably means all kinds of things for virus scanning and anti-malware mechanisms, but I doubt that this on its own lets the process communicate through the firewall. Try the connection with "telnet.exe <hostname> <port>"; this is the true test. The port is usually 2300 for agent data connections, 8871 for service manager control connections, or whatever port your server-side CP uses.
Beyond that, '10053' has also been seen when you run certain other client software. For instance, Automic has always maintained that they refuse any support if you're running any McAfee Anti Virus component. They insist you uninstall and replace it with something else (though I've shown that disabling McAfee WebIntelligence sufficiently solves the actual issues).
But in most cases, you'll find that the telnet.exe test shows it's usually a clear-cut firewall or network issue.
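Where telnet.exe isn't installed, the same connect test can be approximated with a few lines of Python. This is a minimal sketch, not an Automic utility; `can_connect` is a hypothetical helper name, and it only checks that a TCP connection can be opened, exactly like the telnet test.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return (True, None) if a TCP connection opens, else (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures.
        return False, str(exc)


# Example call (host name is a placeholder; ports as mentioned above):
# for port in (2300, 8871):
#     print(port, can_connect("agent-host.example", port))
```

A refused or timed-out connection here points at a firewall or network issue, just as a failed telnet would. A successful connect only proves the port is reachable, not that the connection survives afterwards.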
Hth,
Original Message:
Sent: 10-02-2019 10:13 AM
From: Jared Kessans
Subject: 10053 - An established connection was aborted...
We have some agents in Singapore that were working fine while the servers were on a different domain. They decided to move these servers onto the same domain as our AE, and since the migration we have been getting the error below every five minutes on the agents when they reconnect. We have thousands of agents with no issue; only these 10 or so servers that were changed are affected.
I have researched online and searched the community, but found nothing that seems to point to a cause. I saw mention of Windows firewall and virus protection, but the agent directory is excluded.
Does anyone have additional ideas?
20191002/215946.614 - U02000042 Connection aborted. Error code '10053', error description: 'An established connection was aborted by the software in your host machine.'.
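To confirm that the aborts really arrive on a fixed five-minute rhythm (which would hint at a timer somewhere, such as an idle timeout), the timestamps can be pulled out of the agent log. The Python sketch below assumes every log line starts with the same `YYYYMMDD/HHMMSS.mmm - Unnnnnnnn` prefix as the line above; the function names are made up for illustration.

```python
import re
from datetime import datetime

# Timestamp and message number, as in the log line quoted above.
LINE_RE = re.compile(r"^(\d{8}/\d{6}\.\d{3}) - (U\d{8})\b")


def abort_times(lines, msg_id="U02000042"):
    """Collect the timestamps of all lines carrying the given message number."""
    times = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == msg_id:
            times.append(datetime.strptime(m.group(1), "%Y%m%d/%H%M%S.%f"))
    return times


def intervals_seconds(times):
    """Seconds between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Feeding the lines of an agent log file into `abort_times` and then `intervals_seconds` gives the gaps between aborts; values clustering tightly around 300 seconds would support the fixed-timer theory.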