AutoSys Workload Automation

autoping is not successful. Problem with agent

  • 1.  autoping is not successful. Problem with agent

    Posted 12-11-2019 09:45 AM
    Hi!

    I'm installing a new CA Workload Automation AutoSys environment. My system is Red Hat Enterprise Linux Server release 7.7 (Maipo). I have 5 servers (high availability mode): primary scheduler, shadow scheduler, tie-breaker and two Oracle event servers. I opened all necessary ports (1521, 1721, 5250, 7163, 7500, 7507, 7520, 9000). Installation of CCS and WAAE finished without errors. CA/WorkloadAutomationAE/autouser.P03/autosys.bash.* was added to bashrc, so all global variables are set.

    Now I cannot start a test job or autoping the localhost machine. As I understand it, the WAAE agent is not communicating with the application server.
    unifstat shows that all CCS and WAAE services are running. chk_auto_up -r 111 shows that the primary, shadow and tie-breaker schedulers are connected to their application servers.
    I tried to autoping the localhost (scheduler) machine. I also defined the scheduler machine on the event server with JIL and tried to autoping it from the shadow and tie-breaker schedulers, but the result is the same:

    autoping -m localhost
    CAUAJM_I_50023 AutoPinging Machine [localhost]
    CAUAJM_E_10229 Communication attempt with the CA WAAE Agent has failed! [p-autosys-app10:7520]
    CAUAJM_E_50281 AutoPing from the Scheduler WAS NOT SUCCESSFUL.
    CAUAJM_E_10229 Communication attempt with the CA WAAE Agent has failed! [p-autosys-app10:7520]
    CAUAJM_E_50283 AutoPing from the Application Server WAS NOT SUCCESSFUL.
    CAUAJM_E_50026 ERROR: AutoPing WAS NOT SUCCESSFUL.

    The second problem is with stopping and starting the agent. I run:
    unicntrl stop waae_agent-WA_AGENT
    Stopping waae_agent-WA_AGENT (via systemctl): [ OK ]
    Executed waae_agent-WA_AGENT stop................................OK
    unifstat
    WAAE Application Server (P03) 3733 running
    WAAE Scheduler (P03) 4297 running
    WAAE Agent (WA_AGENT) 30399 running

    unicntrl stop ALL
    Stopping waae_agent-WA_AGENT (via systemctl): [ OK ]
    Executed waae_agent-WA_AGENT stop................................OK
    unifstat
    WAAE Application Server (P03) - not active
    WAAE Scheduler (P03) - not active
    WAAE Agent (WA_AGENT) 30399 running

    unicntrl start ALL
    Starting waae_agent-WA_AGENT (via systemctl): Job for waae_agent-WA_AGENT.service failed because the control process exited with error code. See "systemctl status waae_agent-WA_AGENT.service" and "journalctl -xe" for details.
    [FAILED]
    Executed waae_agent-WA_AGENT start...............................FAIL 1
    Skipping CA-WAAE

    systemctl status waae_agent-WA_AGENT.service
    CA Workload Automation System Agent...
    WAAE Agent (WA_AGENT) Agent service is starting...
    waae_agent-WA_AGENT[11260]: Agent is already running with pid 30399
    waae_agent-WA_AGENT[11260]: Unable to start Agent service
    waae_agent-WA_AGENT[11260]: [FAILED]
    waae_agent-WA_AGENT.service: control process exited, code=exited status=1
    systemd[1]: Failed to start LSB: CA Workload Automation System Agent.
    systemd[1]: Unit waae_agent-WA_AGENT.service entered failed state.
    systemd[1]: waae_agent-WA_AGENT.service failed.

    Please help me understand why autoping to localhost fails and why the agent service does not stop and start correctly.


  • 2.  RE: autoping is not successful. Problem with agent

    Posted 12-12-2019 01:23 AM
    The agent does talk to the app server, but not very often; primarily to respond to autopings.

    It appears that localhost is p-autosys-app10.  If so, what happens when you do:

    nc -vz p-autosys-app10 7520
    nc -vz p-autosys-app10 7507
    ps -ef | grep cybAgent
    netstat -an | grep -e 7520 -e 7507

    If p-autosys-app10 is not the localhost, please clarify what server is the autosys primary scheduler.

    From the primary:
    nc -vz p-autosys-app10 7520

    From p-autosys-app10:
    nc -vz "primary" 7507
    nc -vz "primary" 7500
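
    If nc is not installed on a server, the same reachability checks can be sketched with bash's /dev/tcp redirection. This is only an illustration, not an AutoSys tool: the function names check_port and report_ports are made up for this example, and it assumes bash and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# check_port HOST PORT [TIMEOUT_SECONDS]   (hypothetical helper name)
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# Uses bash's /dev/tcp redirection, so nc is not required.
check_port() {
    local host=$1 port=$2 wait=${3:-3}
    # Inside the child shell, $0 is the host and $1 is the port.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# report_ports HOST PORT...   (hypothetical helper name)
# Prints one "host:port open" or "host:port unreachable" line per port.
report_ports() {
    local host=$1 port
    shift
    for port in "$@"; do
        if check_port "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port unreachable"
        fi
    done
}
```

    For example, report_ports p-autosys-app10 7520 7507 run on the primary covers the same ports as the nc commands above. "unreachable" only says the connection could not be opened within the timeout; it does not distinguish a firewall from a process that is not listening.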


  • 3.  RE: autoping is not successful. Problem with agent

    Posted 12-13-2019 11:02 AM
    Hi,

    The first thing I would check is that you have a machine defined as p-autosys-app10, and how it is defined, with autorep -q -m p-autosys-app10.
    This assumes, as Hank mentions, that it is indeed the hostname of the machine.

    Check that your agentparm.txt file uses port 7520 and that the agent_name entry in the machine definition matches the agentname in agentparm.txt.
    If that looks good, I would then go to the receiver.log to see whether the agent is actually getting the request. If you don't see any messages, it's a network or configuration problem. If you do see error messages, look for a bad padding message, which indicates that the encryption key doesn't match between the two sides. A bad target name type message means that the agent name doesn't match.

    If the messages look normal, then go to the transmitter log and look for errors such as connection refused, unknown hostname or timeouts.

    That should give you the answer to the problem.
    Mike
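
    As a rough sketch of the agentparm.txt check, the helper below prints the value of one key from a key=value file. The function name agentparm_get is made up for this example, and the key names agentname and communication.inputport are my assumption of how the agent name and listening port appear in agentparm.txt; confirm them against your own file.

```shell
# agentparm_get FILE KEY   (hypothetical helper name)
# Prints the value of the last uncommented KEY=value line in FILE.
# Returns non-zero if the key is not present.
agentparm_get() {
    awk -v key="$2" '
        /^[ \t]*#/          { next }   # skip comment lines
        index($0, "=") == 0 { next }   # skip lines without "="
        {
            k = substr($0, 1, index($0, "=") - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                val = v
                found = 1
            }
        }
        END { if (found) print val; else exit 1 }
    ' "$1"
}
```

    For example, agentparm_get agentparm.txt communication.inputport should print 7520 if the agent listens on the port shown in the autoping error, and agentparm_get agentparm.txt agentname can be compared by eye with the agent_name in the autorep output.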


  • 4.  RE: autoping is not successful. Problem with agent
    Best Answer

    Posted 12-23-2019 04:20 AM
    Thanks everyone for the answers!
    I went through all the advice; it helped me understand the behavior of the system in more detail.
    I checked the $AUTOUSER/config.$AUTOSERV file from the old version of AutoSys and saw the parameter "UseEncryption=1", which means default encryption. In my new installation I had "UseEncryption=2", which means encryption with a user-specified key. Changing this parameter to "1" didn't help, so I reinstalled the application without checking the box "Encrypt Data Using a User-specified Key" (previously I thought I had to enter the 32-bit agent key in this box; that was my mistake). Maybe I could have solved the problem with the as_secure tool, but I don't really need "UseEncryption" to differ from the default.
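
    For anyone comparing an old and a new installation the same way, here is a minimal sketch that reports the UseEncryption setting of a config file. The function name describe_use_encryption is made up for this example, and it only knows the two values mentioned above (1 and 2); any other value is reported without interpretation.

```shell
# describe_use_encryption FILE   (hypothetical helper name)
# Prints the UseEncryption setting found in FILE, labelled with the
# meanings given in this thread (1 = default, 2 = user-specified key).
describe_use_encryption() {
    local value
    value=$(grep -E '^[[:space:]]*UseEncryption[[:space:]]*=' "$1" \
            | tail -n 1 | cut -d= -f2 | tr -d '[:space:]')
    case "$value" in
        1)  echo "UseEncryption=1: default encryption" ;;
        2)  echo "UseEncryption=2: user-specified key" ;;
        "") echo "UseEncryption not set in $1"; return 1 ;;
        *)  echo "UseEncryption=$value: not described in this thread" ;;
    esac
}
```

    Usage would be something like describe_use_encryption "$AUTOUSER/config.$AUTOSERV", run once against the old config file and once against the new one.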