DX NetOps

DAdaemon shuts down regularly - no heartbeat to DR after 60min

  • 1.  DAdaemon shuts down regularly - no heartbeat to DR after 60min

    Posted 11-02-2015 02:16 AM

    Dear community,

     

    I'd like to ask you for a little help here, as the cooperation with CA Support is very slow...

     

    We have a plain installation of CA PM 2.5.0 on Red Hat 6.7 servers (2 environments, staging and production). The problem occurs on both environments. No data is being collected yet. What happens is that the dadaemon service shuts down regularly, 1 hr 5 min and a couple of seconds after starting, with the following reason: The primary data repository host 'host1' is no longer available, and there are no available secondary hosts. Current Host Status: {host1=DOWN, host2=DOWN, host3=DOWN}

     

    There seems to be nothing wrong with the Vertica cluster: it is running all the time, and there seem to be no network issues between the nodes.

     

    From inspecting the vertica.log file I found the following:

     

    After starting the dadaemon service, a new session for DA_HEARTBEATs is initiated and it starts sending heartbeats every 10 s. After exactly 60 min the session is closed, no new session is created, and no new heartbeats come in. 300 s later the dadaemon shuts down, which is expected behavior if the default 5 min timeout for DR is used.
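
    The timing described above can be checked with a small sketch. This is only an illustration of the arithmetic, not anything taken from the product: the constant names, the function name and the start time are hypothetical, and the values are simply the ones observed in vertica.log (10 s heartbeat, session closed after 60 min, 300 s DR timeout).

```python
from datetime import datetime, timedelta

# Values observed in the post; the names are hypothetical, not product settings.
HEARTBEAT_INTERVAL = timedelta(seconds=10)  # heartbeat period seen in vertica.log
SESSION_LIFETIME = timedelta(minutes=60)    # heartbeat session is closed after this
DR_TIMEOUT = timedelta(seconds=300)         # default 5 min DR timeout


def expected_shutdown(start: datetime) -> datetime:
    """Return when dadaemon would stop if heartbeats end with the session."""
    last_heartbeat = start + SESSION_LIFETIME
    return last_heartbeat + DR_TIMEOUT


start = datetime(2015, 11, 2, 0, 0, 0)  # hypothetical service start time
uptime = expected_shutdown(start) - start
heartbeats = SESSION_LIFETIME // HEARTBEAT_INTERVAL

print(uptime)      # 1:05:00
print(heartbeats)  # 360
```

    The result, 1 hr 5 min, matches the observed uptime, which suggests the shutdown itself is just the timeout firing and the real question is why the heartbeat session is closed after 60 min and not reopened.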

     

    Has anybody come across such an issue? Am I missing something really basic here? One more thing: we don't have time synchronization in place yet, so the clocks on the DA server and the DR nodes differ by 2-3 s.

     

    Thanks

     

    Mike



  • 2.  Re: DAdaemon shuts down regularly - no heartbeat to DR after 60min
    Best Answer

    Posted 11-05-2015 01:30 PM

    Hi Mike:

     

    Well, time sync is required. From the guides:

     

    "Time synchronization using NTP is required. Start the NTP daemon on Linux if it is not running."

     

    ... so I would definitely get that straightened out.

     

    What is the ticket number? The support engineer may need to have you put Vertica tools in place to try to get some data from the Vertica cluster.

     

    Joe



  • 3.  Re: DAdaemon shuts down regularly - no heartbeat to DR after 60min

    Posted 11-08-2015 03:31 AM

    Hi Joe,

     

    First of all, thanks for the reply.

     

    Let me add that in the meantime we managed to get NTP synchronization working, so time is synchronized now. However, this made no difference to the behavior of dadaemon.

     

    The ticket number associated with this issue is 00219701, opened under the customer's site ID. Unfortunately, we're going in circles with the support engineer so far.

     

    One more piece of info: I've tried upgrading CA PM to 2.6.0, but again, no change regarding this issue.

     

    Mike



  • 4.  Re: DAdaemon shuts down regularly - no heartbeat to DR after 60min

    Posted 11-12-2015 08:47 AM

    I have updated the support ticket with some next steps for the engineer to validate with you. Really, it is too much to post here, as he will need to collect data.