I need some help: many of my agents are showing a status of srvc_down or no_service.
I tried troubleshooting by following https://knowledge.broadcom.com/external/article/87537/remote-agent-fails-to-start-receive-awe5.html
awcomm is definitely not able to view the log, since the RMI server shows the agent status as srvc_down or no_service.
I checked $AW_HOME/data/net_conn.dat and confirmed that the information is correct.
The agentservice and watchworx processes have been running on the agent server for more than 22 days (confirmed with awstat and by grepping for the processes), but the rmiserver has shown the agent service status as SRVC_DOWN for the past 3 days.
I can see the agentservice running in Windows Task Manager too (I have both Unix and Windows agent servers).
However, I couldn't find any result when I ran the netstat command. The result only appears after I bounce the agentservice using stopso/startso.
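To make the netstat observation easier to reproduce across many agents, the same check could be scripted from the rmiserver side. This is only a rough sketch in Python, not anything shipped with the product: it assumes the agent listens on port 10010 (the port in the error details below), and the host names and the helpers `port_is_open` and `check_agents` are made up for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_agents(agents, timeout: float = 3.0):
    """Map each (host, port) pair to True/False TCP reachability."""
    return {
        (host, port): port_is_open(host, port, timeout)
        for host, port in agents
    }


if __name__ == "__main__":
    # Hypothetical agent list; replace with the hosts and ports you use.
    agents = [("agent-host-1", 10010), ("agent-host-2", 10010)]
    for (host, port), ok in check_agents(agents).items():
        print(f"{host}:{port} {'reachable' if ok else 'NOT reachable'}")
```

A successful connect only shows that something is listening on that port. It does not prove the agentservice is healthy, so it is a way to narrow down which agents have lost their listener, not a diagnosis.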
Below is one of many errors from multiple agents:
ErrorMsg: AwE-5103 network socket error (8/8/20 1:49 AM)
Details: 39385660[SSL_DH_anon_WITH_RC4_128_MD5: Socket[addr=XXXXXXX,port=10010,localport=45153]]
java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(Unknown Source)
at java.io.ObjectInputStream.readStreamHeader(Unknown Source)
at java.io.ObjectInputStream.<init>(Unknown Source)
at com.appworx.shared.code.server.B.C(RequestSocket.java:115)
at com.appworx.server.data.SocketManager$1.run(SocketManager.java:370)
My concern is to get a permanent fix or find the root cause, as this has impacted many jobs: the frequent downtime leaves many agentservice instances unable to process jobs. I can't afford to keep logging in to restart the services. There are about 42 active agents on 1 rmiserver, and I have 6 rmiservers.