CA Client Automation

PC couldnt check scalability server

  • 1.  PC couldnt check scalability server

    Posted Mar 04, 2013 07:56 AM
    Hi all,

    I have found one PC with the following problem:

    caf setserveraddress ******.domain.com or IP
    Caf currently registers with the scalability server at XXXXXXXXXX.
    Connecting to YYYYYYYYYY to ask if a server is present...
    The remote machine was not contactable. The peer messaging service may not be running or network errors were encountered.


    The strange thing is that this server is online and works for all other PCs. Additionally, the PC has its firewall turned off and is on the network. Uninstalling did not help, registering with another server did not help, and checking and removing the certificates did not help either. Could you please tell me what else could be wrong? I have never seen this before. Thanks.


  • 2.  RE: PC couldnt check scalability server

    Broadcom Employee
    Posted Mar 05, 2013 07:21 AM
    The first thing to check is basic communication.

    Run the following checks in order:
    1. caf ping
    2. camping
    3. ping

    Check in both directions for each test.

    If caf ping fails but camping works, then there is an issue in caf.
    If camping fails but ping works, there is an issue in CAM.
    If all 3 fail, then there is an issue with basic networking.
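
    The three rules above can be written down as a small decision function. This is only a sketch in Python: the function name and the returned labels are made up for illustration, and the case where caf ping succeeds is not covered by the rules, so the value returned for it is an assumption.

```python
def diagnose(caf_ping_ok: bool, camping_ok: bool, ping_ok: bool) -> str:
    """Map the results of the three checks to the layer that likely has the issue."""
    if caf_ping_ok:
        # Assumption: the rules say nothing about this case, so no layer is blamed.
        return "no issue indicated"
    if camping_ok:
        # caf ping fails but camping works.
        return "caf"
    if ping_ok:
        # camping fails but ping works.
        return "CAM"
    # All three checks fail.
    return "basic networking"
```

    For example, diagnose(False, True, True) returns "caf". Since each test should be checked in both directions, the function would be called once per direction.
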
    regards
    Rich


  • 3.  RE: PC couldnt check scalability server

    Posted Mar 05, 2013 08:13 AM

    Thank you for your response:

    1. I get

    [1] 9443 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.
    [2] 9021 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.
    [3] 9324 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.
    [4] 9014 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.


    2.
    1: reply from xxxxxxxx, rtt 4ms
    2: reply from xxxxxxxx, rtt 0ms
    3: reply from xxxxxxxx, rtt 2ms
    4: reply from xxxxxxxx, rtt 7ms
    camping: Trying yyyyyyyy ...


    dc1cmss2v.dc.hella.com: camping done, statistics:-
    Sent 4, completed 4, packet size 64 bytes.
    Round-trip (ms) min/ave/max = 0.0/3.3/7.0.
    Timed out 0 (0%) (detected: late discard return 0, late completion 0)

    3. Works fine....


    So the question is: why does the first check (caf ping) show such a huge response time and fail? Thank you.


  • 4.  RE: PC couldnt check scalability server

    Broadcom Employee
    Posted Mar 05, 2013 11:30 AM
    OK, let's check whether there is a limitation on packet size:

    • camping -s 1024 <specific client>
    • camping -s 2048 <specific client>
    • camping -s 4096 <specific client>
    • camping -s 8192 <specific client>

    Run through these. If one fails, then we need to configure the CAM packet size to always be lower.
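
    Running through the sizes can be scripted. The following is a minimal Python sketch, not a verified tool: it assumes that camping is on the PATH and that a run counts as successful when the "Sent N, completed N" statistics line (as printed in the camping output earlier in this thread) shows every packet completed. The function names are hypothetical.

```python
import re
import subprocess
from typing import Callable, Iterable, Optional

# Matches the statistics line, e.g. "Sent 4, completed 4, packet size 64 bytes."
STATS_RE = re.compile(r"Sent (\d+), completed (\d+)")


def camping_succeeded(output: str) -> bool:
    """True if the statistics line reports every sent packet as completed."""
    match = STATS_RE.search(output)
    if match is None:
        return False
    sent, completed = int(match.group(1)), int(match.group(2))
    return sent > 0 and sent == completed


def run_camping(size: int, host: str) -> bool:
    """Run `camping -s <size> <host>` and judge it by its printed statistics."""
    result = subprocess.run(
        ["camping", "-s", str(size), host], capture_output=True, text=True
    )
    return camping_succeeded(result.stdout)


def largest_working_size(
    host: str,
    sizes: Iterable[int] = (1024, 2048, 4096, 8192),
    probe: Callable[[int, str], bool] = run_camping,
) -> Optional[int]:
    """Try the sizes in ascending order and stop at the first failure.

    Returns the largest size that worked before a failure, or None if even
    the smallest size failed.
    """
    best = None
    for size in sorted(sizes):
        if not probe(size, host):
            break
        best = size
    return best
```

    The probe argument exists so the logic can be tried without a CAM installation, for example largest_working_size("ServerA", probe=lambda size, host: size <= 1401) returns 1024.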


  • 5.  RE: PC couldnt check scalability server

    Posted Mar 06, 2013 05:19 AM
    I had something similar between the ES and one DM.
    I have no idea what exactly the reason was.

    After switching CAM from UDP to TCP between the two servers, it runs fine.


  • 6.  RE: PC couldnt check scalability server
    Best Answer

    Posted Mar 11, 2013 03:43 AM
    OK, I checked everything I could find, and there is only one way to solve it: you must change the packet size as Richard wrote above. That resolves this symptom.

    Test these to find a packet size that works:
    camping -s 8192 <Scalability Server Name>
    camping -s 4096 <Scalability Server Name>
    camping -s 2048 <Scalability Server Name>
    camping -s 1024 <Scalability Server Name>

    After that, set the following directly on the client:
    fragment_size = <Packet Size Value>

    and restart CAM.
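
    As a rough illustration of that last step, the Python sketch below only builds the command lines for a given size; it does not run them. It assumes the sequence quoted in a later reply in this thread (caf stop, camclose, cam start, camconfig, camsave, caf start) is a valid way to apply the setting, and it leaves out that reply's csamconfigedit and udp_port lines. The function name is hypothetical and the commands have not been verified here.

```python
from typing import List


def fragment_size_commands(size: int) -> List[str]:
    """Build, in order, the command lines for applying a CAM fragment size."""
    if size <= 0:
        raise ValueError("fragment size must be a positive number of bytes")
    return [
        "caf stop",
        "camclose",
        "cam start",
        # Command lines copied from a later reply in this thread, with the size filled in.
        rf'"%cai_msq%\bin\camconfig" config fragment_size={size}',
        r'"%cai_msq%\bin\camsave" persist',
        "caf start",
    ]
```

    Printing the result of fragment_size_commands(1024) line by line gives a list that can be reviewed before anything is run on the client.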

    Thank you


  • 7.  RE: PC couldnt check scalability server

    Posted Mar 14, 2013 09:23 AM
    Hi all,

    I have the same problem too.
    Ping works fine from both sides.
    Camping works fine from both sides.
    Caf ping works fine from server to client but doesn't work from client to server.

    C:\WINDOWS\system32>caf ping atm-itcm

    Pinging caf on atm-itcm...

    [1] 28425 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.
    [2] 27714 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.
    [3] 27207 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.
    [4] 27661 ms: The remote machine was not contactable. The peer messaging service
    may not be running or network errors were encountered.

    Is this a network issue?

    Thanks


  • 8.  RE: PC couldnt check scalability server

    Posted Mar 14, 2013 09:26 AM
    Please try changing the fragment size; it could be a network problem.


  • 9.  RE: PC couldnt check scalability server

    Posted Mar 14, 2013 10:42 AM
    Hi,

    In some cases it won't work.

    Fragment size applies to the UDP protocol, but the SSA service tends to use TCP because of the initial negotiation in client-server communication.

    I did the following:

    ------------------------------ START-----------------------------------
    csamconfigedit port=4105 delete

    caf stop
    camclose
    cam start
    "%cai_msq%\bin\camconfig" config fragment_size=1024
    "%cai_msq%\bin\camconfig" config udp_port=4104
    "%cai_msq%\bin\camsave" persist

    caf start
    ------------------------------ END -----------------------------------
    That way I prioritize UDP over port 4104, which can use a packet size of 1024 bytes for transport.

    Anyway, depending on the packet size that can pass over the network when probing with camping -s <packet size in bytes> <Scalability Server Name>, you can increase or decrease the value in the fragment_size line.

    regards,


  • 10.  RE: PC couldnt check scalability server

    Broadcom Employee
    Posted Mar 26, 2013 05:57 AM
    I strongly recommend you don't disable SSA in this way.

    CAM by default uses UDP for all communication and thus will not use SSA. Only if you have configured CAM to use TCP will it do so and then use SSA. If you disable SSA on a Manager but have agents configured for TCP then this will break the communication.

    regards
    Rich


  • 11.  RE: PC couldnt check scalability server

    Broadcom Employee
    Posted Mar 26, 2013 06:58 PM
    Hi,

    To add to Richard's post it is not recommended to disable SSA for 4105 TCP connections.

    Instead, it is recommended to configure SSA to listen to 4105 connections using:
    csamconfigedit port=4105 EnablePmux=true PMUXLegacyPortListen=True PmuxLegacyPortBindAddress=0.0.0.0 PmuxLegacyConnect=True PmuxLegacyConnectMaximumDelay=5000


    This is an application-specific, or rather port-specific, command. It tells the CSAMPMUX server that any application listening on port 4105 (i.e. CAM here) will now use the port multiplexer.
    The "PMUXLegacyPortListen=True" option makes sure that the application not only accepts PMUX connections but also listens on port 4105.

    Regards!!


  • 12.  RE: PC couldnt check scalability server

    Posted Mar 27, 2013 04:02 AM
    Usually the problems were due to packet size. The strange thing is that in some domains most PCs have no problems, while a few have problems due to packet size.


  • 13.  RE: PC couldnt check scalability server

    Broadcom Employee
    Posted Apr 04, 2013 06:11 AM

    Regarding lennyr's post above: configuring SSA to listen for 4105 connections with csamconfigedit would only be required if you need a connection between 12.x components and 11.x components via TCP.


  • 14.  RE: PC couldnt check scalability server

    Posted Apr 05, 2013 06:54 AM
    We have CAM problems between some clients.

    This happens only over WAN connections; LAN is fine.

    In all cases the limit is a UDP packet size of 1401 bytes. Above that, problems can occur (more on this below):
    camping -s 1401 -r 1 ServerA
    camping: Trying <ip> ...
    
    1:  reply from ServerA, rtt 234ms
    
    ServerA: camping done, statistics:-
    Sent 1, completed 1, packet size 1401 bytes.
    Timed out 0 (0%) (detected: late discard return 0, late completion 0)
    
    ****
    
    camping -s 1402 -r 1 ServerA
    camping: Trying <ip> ...
    
    1:  no reply (timed out)
    
    ServerA: camping done, statistics:-
    Sent 1, completed 0, packet size 1402 bytes.
    Timed out 1 (100%) (detected: late discard return 0, late completion 0)
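
    An exact boundary like 1401/1402 can be located with a bisection over packet sizes instead of trying every value. The Python sketch below is illustrative only: the function name is hypothetical, the probe is any callable that reports whether a given size got a reply (for instance a wrapper around camping -s <size> -r 1 <server>), and it assumes that once one size fails, all larger sizes fail too.

```python
from typing import Callable


def find_size_limit(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe(low) is True and that failures are monotonic: if a size
    fails, every larger size fails as well.
    """
    if not probe(low):
        raise ValueError("the lower bound itself fails")
    while low < high:
        # Round up so the loop always makes progress when probe(mid) is True.
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # mid works, the limit is mid or above
        else:
            high = mid - 1   # mid fails, the limit is below mid
    return low
```

    With a stand-in probe such as lambda size: size <= 1401, find_size_limit(probe, 64, 8192) returns 1401 after about 13 probes.
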

    The even stranger thing is that the behavior is completely different for each client.
    Example (just to illustrate, there is no "rule"):
    Machines A1 and A2 are in China, B1 and B2 are in Europe.
    From A1, "camping -s 1402 B1" works, but "camping -s 1402 B2" does not.
    From A2, "camping -s 1402 B1" does not work, but "camping -s 1402 B2" does.

    From the CAM trace I can see that the larger packet is received normally, but the receiver cannot get its answer back:
    11:33:03.711  1  fd 904 (UDP_SOCKET), events 01, revents 01
    11:33:03.711  1(0) active(queued messages)/14 inactive queues
    11:33:03.711  event_handler() UDP IN 1
    11:33:03.711  get_message() called
    11:33:03.711  get_message(), message from IP-A1:4104
    11:33:03.711  new_msg( length 4167 ) called
    11:33:03.711  new_msg() returns
    11:33:03.711  ack_message( ACK ) called for IP-A1:4104
    11:33:03.711  ack_message() returns
    11:33:03.711  router( from IP-A1:4104 )
    11:33:03.711  router() entry: Seq 69031465, ER, from A1/CAI002588-00000, to B2/, len 4096, data >CA_ping<       1><, created 39503, life 8, notify: yes, flags: 89, src IP-A1, dst IP-B2
    11:33:03.711  harmless request
    11:33:03.711  swap_addr() called B2/->A1/CAI002588-00000
    11:33:03.711  create_msg_id() called
    11:33:03.711  create_msg_id() returns, seq 5404 now 1365154383
    11:33:03.727  swap_addr() returns
    11:33:03.727  router() echo request
    11:33:03.727  router() dispatch: Seq 5404, MM, from B2/, to A1/CAI002588-00000, len 4096, data >CA_ping<       1><, created 39503, life 8, notify: no, flags: 91, src IP-B2, dst IP-A1
    11:33:03.727  network_msg(A1/IP-A1) called
    11:33:03.727  enqueue_msg() called (B2/->A1/CAI002588-00000)
    11:33:03.727  enqueue_msg() returns
    11:33:03.727  start_poll( IP-A1:4104, index 0 ) called
    11:33:03.727  network_msg(IP-A1:4104) returns
    11:33:03.727  router() returns
    11:33:03.727  get_message() returns
    11:33:03.727  send_message(IP-A1:4104) called
    11:33:03.727  send_message(): Seq 5404, MM, from B2/, to A1/CAI002588-00000, len 4096, data >CA_ping<       1><, created 39503, life 8, notify: no, flags: 91, src IP-B2, dst IP-A1
    11:33:03.727  set_next_udp_queue() called
    11:33:03.727  set_next_udp_queue() returns (no ready queues)
    11:33:03.727  send_message() returns
    11:33:05.727  resend(IP-A1:4104) called
    11:33:05.727  timer: re-sending message sequence 5404
    11:33:05.727  start_poll( IP-A1:4104, index 0 ) called
    11:33:05.727  resend() returns
    11:33:05.727  send_message(IP-A1:4104) called
    11:33:05.727  send_message(): Seq 5404, MM, from B2/, to A1/CAI002588-00000, len 4096, data >CA_ping<       1><, created 39503, life 8, notify: no, flags: 91, src IP-B2, dst IP-A1
    11:33:05.727  set_next_udp_queue() called
    11:33:05.727  set_next_udp_queue() returns (no ready queues)
    11:33:05.727  send_message() returns
    11:33:07.727  resend(IP-A1:4104) called
    11:33:07.727  timer: re-sending message sequence 5404
    11:33:07.727  start_poll( IP-A1:4104, index 0 ) called
    11:33:07.727  resend() returns
    11:33:07.727  send_message(IP-A1:4104) called
    11:33:07.727  send_message(): Seq 5404, MM, from B2/, to A1/CAI002588-00000, len 4096, data >CA_ping<       1><, created 39503, life 6, notify: no, flags: 91, src IP-B2, dst IP-A1
    11:33:07.727  set_next_udp_queue() called
    11:33:07.727  set_next_udp_queue() returns (no ready queues)
    11:33:07.727  send_message() returns
    11:33:09.727  resend(IP-A1:4104) called
    11:33:09.727  timer: re-sending message sequence 5404
    11:33:09.727  start_poll( IP-A1:4104, index 0 ) called
    11:33:09.727  resend() returns
    11:33:09.727  send_message(IP-A1:4104) called
    11:33:09.727  send_message(): Seq 5404, MM, from B2/, to A1/CAI002588-00000, len 4096, data >CA_ping<       1><, created 39503, life 4, notify: no, flags: 91, src IP-B2, dst IP-A1
    11:33:09.727  set_next_udp_queue() called
    11:33:09.727  set_next_udp_queue() returns (no ready queues)
    11:33:09.727  send_message() returns
    11:33:11.727  resend(IP-A1:4104) called
    11:33:11.727  timer: re-sending message sequence 5404
    11:33:11.727  start_poll( IP-A1:4104, index 0 ) called
    11:33:11.727  resend() returns
    11:33:11.727  send_message(IP-A1:4104) called
    11:33:11.727  send_message(): Seq 5404, MM, from B2/, to A1/CAI002588-00000, len 4096, data >CA_ping<       1><, created 39503, life 2, notify: no, flags: 91, src IP-B2, dst IP-A1
    11:33:11.727  set_next_udp_queue() called
    11:33:11.727  set_next_udp_queue() returns (no ready queues)
    11:33:11.727  send_message() returns
    11:33:12.727  timer: discarding message sequence 5404
    11:33:12.727  start_poll( IP-A1:4104, index 0 ) called
    11:33:12.727  bounce() called
    11:33:12.727  discarding message (reason: network error) ... 
    11:33:12.727  bounce(): Seq 5404, MM, from B2/, to A1/CAI002588-00000, len 4096, data >CA_ping<       1><, created 39503, life 0, notify: no, flags: 91, src IP-B2, dst IP-A1
    11:33:12.727  bounce() returns (no notification requested) (B2/->A1/CAI002588-00000)

    The network trace on the clients did not give any clue...
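
    In the trace above, the reply (sequence 5404) is sent at 11:33:03.727, re-sent every two seconds, and discarded at 11:33:12.727 with "reason: network error". For longer traces, a small script can pull out that pattern. The Python sketch below assumes the trace line format shown in this post; the function name and the shape of the result are made up for illustration.

```python
import re
from typing import Dict, Iterable

TIMESTAMP = r"(\d{2}:\d{2}:\d{2}\.\d{3})"
RESEND_RE = re.compile(TIMESTAMP + r"\s+timer: re-sending message sequence (\d+)")
DISCARD_RE = re.compile(TIMESTAMP + r"\s+timer: discarding message sequence (\d+)")


def summarize_resends(trace_lines: Iterable[str]) -> Dict[str, dict]:
    """Collect, per message sequence, the resend times and the discard time."""
    summary: Dict[str, dict] = {}
    for line in trace_lines:
        line = line.strip()
        resend = RESEND_RE.match(line)
        if resend:
            entry = summary.setdefault(
                resend.group(2), {"resends": [], "discarded": None}
            )
            entry["resends"].append(resend.group(1))
            continue
        discard = DISCARD_RE.match(line)
        if discard:
            entry = summary.setdefault(
                discard.group(2), {"resends": [], "discarded": None}
            )
            entry["discarded"] = discard.group(1)
    return summary
```

    Fed the lines of the trace in this post, the summary for sequence "5404" holds four resend timestamps and the discard time 11:33:12.727.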

    It is quite frustrating...


  • 15.  CAM communication problems

    Posted Apr 05, 2013 08:09 AM
    Update:

    According to the CAM troubleshooting guide, there are two workarounds:
    - Set the UDP fragment_size to 1400
    - Set the communication between the two affected servers to TCP

    Since two of the affected servers are DMs (which may affect moving a client from one DM to another), I plan to set the communication between the DMs/ES to TCP.

    Other clients normally never communicate outside the domain (except for RC, but even if caf ping does not work, RC does...), so I think no change is needed there.
    If necessary, I can try the UDP fragment size.

    I suppose we'll never find out the reason of the problem - I am 100% sure that we did not have it earlier...


  • 16.  RE: CAM communication problems

    Broadcom Employee
    Posted Apr 10, 2013 06:54 AM
    The cause will have been networking

    If fragment size was an issue then there is an MTU restriction along the path that failed
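
    A quick back-of-the-envelope check of the MTU explanation, written as a Python sketch. The IPv4 header (20 bytes without options) and UDP header (8 bytes) sizes are standard. The CAM per-message overhead of 71 bytes is an assumption read off the trace earlier in this thread ("new_msg( length 4167 )" for a message with "len 4096"); it may not be constant, and the names below are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
UDP_HEADER_BYTES = 8

# Assumption: overhead taken from the trace, length 4167 versus len 4096.
CAM_OVERHEAD_BYTES = 4167 - 4096


def max_udp_payload(mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


def largest_unfragmented_camping_size(mtu: int) -> int:
    """Largest camping -s size that avoids IP fragmentation, under the assumption above."""
    return max_udp_payload(mtu) - CAM_OVERHEAD_BYTES
```

    Under that assumption, largest_unfragmented_camping_size(1500) gives 1401, which would be consistent with the 1401/1402 boundary reported above: anything larger needs IP fragmentation, and fragments may be dropped somewhere along the failing path.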

    regards
    Rich