DX Unified Infrastructure Management

  • 1.  Require to reboot the Primary hub machine

    Posted Dec 09, 2019 04:02 AM
    Hello,
    Two of my clients have to reboot the machine that runs their primary hub (v9.0.2) every few days, because they can no longer connect to the hub. The hub log shows these errors:

    ...
    Dec 6 07:04:11:510 [2912] hub: hubup - send failed for /Domain/Hub (xx.x.x.xxx:48002 communication error)
    Dec 6 07:09:35:189 [2892] hub: nimPostMessage: sockConnect failed
    Dec 6 07:09:35:189 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:09:35:189 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:09:35:189 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:09:35:189 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:14:13:790 [2912] hub: ihubrequest: SessionConnect failed: 10.3.11.222/48002
    Dec 6 07:14:13:790 [2912] hub: hubup - send failed for /Domain/Hub (xx.x.x.xxx:48002 communication error)
    Dec 6 07:14:45:807 [2892] hub: nimPostMessage: sockConnect failed
    Dec 6 07:14:45:807 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:14:45:807 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:14:45:807 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:14:45:807 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:15:22:212 [32632] hub: (nim_ldap_get_connection): LDAP server spec 'LDAP.XXX.XXXX' failed (secure=2)
    Dec 6 07:19:55:784 [2892] hub: nimPostMessage: sockConnect failed
    Dec 6 07:19:55:784 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:19:55:784 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:19:55:784 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:19:55:784 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
    Dec 6 07:24:15:995 [2912] hub: ihubrequest: SessionConnect failed: xx.x.x.xxx/48002
    ...
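
    To see how often each error code shows up in an excerpt like the one above, the log can be tallied with a short script. This is only a sketch: the line layout is inferred from the excerpt, and `summarize_hub_log` is a hypothetical helper, not part of UIM.

```python
import re
from collections import Counter

# Line layout inferred from the excerpt in this post, e.g.:
#   Dec 6 07:09:35:189 [2892] hub: nimSession - failed to connect session to A.B.C.D:48001, error code 10055
LINE_RE = re.compile(
    r"^\s*(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hms>\d{2}:\d{2}:\d{2}):\d+\s+"
    r"\[(?P<tid>\d+)\]\s+hub:\s+(?P<msg>.*)$"
)
CODE_RE = re.compile(r"error code (\d+)")


def summarize_hub_log(lines):
    """Return (count per error code, first timestamp at which each code appears)."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip "..." and anything that is not a hub log line
        code = CODE_RE.search(m.group("msg"))
        if not code:
            continue  # hub line without an "error code N" suffix
        key = int(code.group(1))
        counts[key] += 1
        stamp = f"{m.group('month')} {m.group('day')} {m.group('hms')}"
        first_seen.setdefault(key, stamp)
    return counts, first_seen
```

    Calling `summarize_hub_log(open("hub.log"))` (the path is an assumption) returns the counts and the first time each code was logged, which helps to relate the start of the failures to the uptime since the last reboot.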

    Has anyone else experienced this?

    Thanks
    PhN


  • 2.  RE: Require to reboot the Primary hub machine

    Posted Dec 09, 2019 05:54 AM

    Hi PhN,

    That could mean many things. The errors point to a connectivity issue; is this the primary hub's log or a remote hub's? Please provide more information about the environment (robot versions, hub versions, etc.). For compatibility, apply the latest hotfixes (HF) if you have not done so already. I assume you have confirmed it is not an actual network connectivity issue (firewall ports, tunnel configuration, etc.)?

    This thread is useful for checking whether this is a known bug in UIM 9.x: https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=796895

    A few more articles that might be worth checking:

    https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=772685

    https://community.broadcom.com/communities/community-home/digestviewer/viewthread?MID=757168


    Regards,




  • 3.  RE: Require to reboot the Primary hub machine

    Posted Dec 09, 2019 07:01 AM
    Hello,

    The error log is an extract from the hub.log file itself. I'm running UIM 9.0.2 with hub/robot 7.97.
    I'm sure this is not a firewall or SSL tunnel issue, because even when running IM locally on the primary hub server, you can't connect to the hub!

    So the client has to restart the machine to get everything back to normal. It looks like the UIM application has opened too many TCP connections, since error code 10055 in the logs is a Windows Sockets error code:

    WSAENOBUFS (10055): No buffer space available.
    An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.

    https://support.microsoft.com/en-us/help/196271/when-you-try-to-connect-from-tcp-ports-greater-than-5000-you-receive-t
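
    One way to check the too-many-connections theory before the next reboot is to tally the output of `netstat -ano` by owning PID and TCP state. The following is a minimal sketch, assuming the English-language column layout of Windows `netstat`; `tally_tcp` and `run_netstat` are hypothetical helper names.

```python
import subprocess
from collections import Counter


def tally_tcp(netstat_text):
    """Count TCP rows of `netstat -ano` output by (PID, state)."""
    tally = Counter()
    for row in netstat_text.splitlines():
        fields = row.split()
        # TCP rows have five fields: proto, local address, remote address, state, PID.
        # UDP rows have no state column and header rows do not start with "TCP".
        if len(fields) == 5 and fields[0] == "TCP" and fields[4].isdigit():
            state, pid = fields[3], int(fields[4])
            tally[(pid, state)] += 1
    return tally


def run_netstat():
    """Run `netstat -ano` on the local machine and return its text output."""
    result = subprocess.run(
        ["netstat", "-ano"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

    Printing `tally_tcp(run_netstat()).most_common(10)` on the hub server lists the largest (PID, state) groups. A steadily growing count for the hub's PID, or thousands of TIME_WAIT entries, would be consistent with socket or ephemeral port exhaustion; it does not prove that this is the cause.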

     

     




  • 4.  RE: Require to reboot the Primary hub machine
    Best Answer

    Broadcom Employee
    Posted Dec 09, 2019 09:31 AM
    Please go to the hotfix page and install the latest versions of the 7.97 robot and hub:
    https://techdocs.broadcom.com/us/product-content//recommended-reading/technical-document-index/ca-unified-infrastructure-management-hotfix-index.html?r=2
    ftp://UIMuser:CnIa24uJ@ftp.ca.com/UIM_Probe_Hotfixes/robot_update-7.97HF7.zip
    ftp://UIMuser:CnIa24uJ@ftp.ca.com/UIM_Probe_Hotfixes/hub_7.97HF6.zip

    ------------------------------
    Gene Howard
    Principal Support Engineer
    Broadcom
    ------------------------------



  • 5.  RE: Require to reboot the Primary hub machine

    Posted Dec 16, 2019 07:26 AM
    Hello,

    I've applied these hotfixes. On the site where tunnels are configured between three hubs, one of the hubs can no longer be accessed, because its tunnel to the tunnel server can't be established after the hotfixes are installed!

    On another site, with two hubs and no tunnel configured, the "hub" process starts leaking memory after a few hours (on both machines) and grows to more than 8 GB.

    So I had to downgrade the hub/robot to the previous version on both sites.
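
    To document memory growth like this for a support case, the output of `tasklist /FI "IMAGENAME eq hub.exe" /FO CSV /NH` can be sampled at intervals and parsed. This is a sketch under assumptions: the image name `hub.exe` is a guess, and `mem_usage_kb` is a hypothetical helper.

```python
import csv
import io
import re


def mem_usage_kb(tasklist_csv):
    """Map PID -> memory usage in KB from `tasklist /FO CSV /NH` output."""
    usage = {}
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Expected columns: image name, PID, session name, session number, memory usage.
        if len(row) != 5 or not row[1].isdigit():
            continue  # e.g. the "INFO: No tasks are running..." message
        # Drop thousands separators and the unit suffix: "8,412,345 K" -> "8412345".
        digits = re.sub(r"\D", "", row[4])
        if digits:
            usage[int(row[1])] = int(digits)
    return usage
```

    Logging the result every few minutes together with a timestamp gives a growth curve per PID, which is more useful to support than a single figure.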




  • 6.  RE: Require to reboot the Primary hub machine

    Broadcom Employee
    Posted Dec 16, 2019 10:36 AM
    You will probably need to engage support for further help.
    When using these hotfixes, I suggest updating all hubs to the same versions of the controller and hub.
    There have been several cases of compatibility issues with older hubs and controllers.

    ------------------------------
    Gene Howard
    Principal Support Engineer
    Broadcom
    ------------------------------