Automic Workload Automation

  • 1.  The mqsrv table is not properly maintained(?)

    Posted Nov 18, 2019 03:06 PM
    Edited by Pete Wirfs Nov 18, 2019 03:06 PM
    This weekend some of our non-prod servers were patched and rebooted.  Over the course of the weekend the DEV AE server was also rebooted.  This morning we could not get two of the non-prod agents to connect to our DEV AE.  I rebooted the AE server and that didn't fix it.   I contacted support and they had me delete the related agent rows out of the mqsrv table.  Then everything worked.

    This seems pretty bad to me, that the product seems to retain an "I'm connected" row to an agent even through a reboot cycle.

    Have others encountered this?  How do you mitigate the issue?

    We are running our AEs on V12.3.0, Windows, SQL Server.
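
    For anyone who finds this later, here is a minimal sketch of the kind of cleanup support had me do: deleting the rows that belong to specific agents. It runs against an in-memory SQLite stand-in, not a real AE database. The table name `mqsrv_demo`, its columns, the agent names, and the helper `delete_agent_rows` are all hypothetical, since the real mqsrv layout is not shown in this thread. Treat it as an illustration of the idea, not something to run against production.

```python
import sqlite3

# Stand-in for the AE database (the real system in this thread is SQL Server).
conn = sqlite3.connect(":memory:")

# Hypothetical table and columns: the real mqsrv layout is not shown here.
conn.execute(
    "CREATE TABLE mqsrv_demo (id INTEGER PRIMARY KEY, agent_name TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO mqsrv_demo (agent_name, message) VALUES (?, ?)",
    [("AGENT_A", "connected"), ("AGENT_B", "connected"), ("AGENT_C", "connected")],
)


def delete_agent_rows(connection, agent_names):
    """Delete all rows for the given agents and return how many were removed."""
    if not agent_names:
        return 0
    # One "?" placeholder per agent name keeps the query parameterized.
    placeholders = ", ".join("?" for _ in agent_names)
    with connection:  # commits on success, rolls back on error
        cursor = connection.execute(
            f"DELETE FROM mqsrv_demo WHERE agent_name IN ({placeholders})",
            list(agent_names),
        )
    return cursor.rowcount


removed = delete_agent_rows(conn, ["AGENT_A", "AGENT_B"])
remaining = [
    row[0]
    for row in conn.execute("SELECT agent_name FROM mqsrv_demo ORDER BY agent_name")
]
print(removed, remaining)
```

    Returning the number of deleted rows makes it easy to check that only the expected agent entries were touched before reconnecting the agents.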

    ------------------------------
    Pete
    ------------------------------


  • 2.  RE: The mqsrv table is not properly maintained(?)

    Posted Nov 20, 2019 03:44 AM
    Hi Pete,
    I might be mistaken, but if I remember correctly, a cold start will clean up the MQ* tables.

    ------------------------------
    Best regards,
    Thierry

    Banque de Luxembourg
    ------------------------------



  • 3.  RE: The mqsrv table is not properly maintained(?)

    Posted Nov 21, 2019 08:18 AM
    Like Thierry, I would suggest starting in "Cold" mode when you reboot an AE server. It can take a little or a lot more time, depending on your configuration, but it saves a lot of time by avoiding issues like the one you had.


    Some people may disagree, but as with Windows, sometimes restarting from scratch is the best no-brainer solution... =-)

    Alain


  • 4.  RE: The mqsrv table is not properly maintained(?)

    Posted Nov 25, 2019 01:09 AM
    A cold start means a complete outage of the entire environment. This is not necessarily feasible for every environment at any given time...
    So: Thanks @Pete Wirfs for posting the workaround.

    In my eyes, that's a bug in Automic. Unfortunately, it will be difficult to get a permanent solution, because you can't reliably reproduce the error... :-(

    ------------------------------
    Automation Evangelist
    Fiducia & GAD IT AG
    ---
    Mitglied des deutschsprachigen Automic-Anwendervereins FOKUS e.V.
    Member of the German speaking Automic user association FOKUS e.V.
    ------------------------------



  • 5.  RE: The mqsrv table is not properly maintained(?)
    Best Answer

    Posted Nov 25, 2019 03:57 AM

    Using "Cold" when rebooting an AE server is a safe way to ensure that everything that could have changed since the last boot is accounted for. You don't actually need a "Cold" start just to stop and start a single CP or WP. You only have to do the cold start on the first WP that starts; once it is up, all other WPs can use a "Normal" start.

    By the way, did you stop the AE properly before shutting down the server itself? This type of error usually comes from an unexpected stop of the AE that leaves stale information in the DB tables, such as "Agent connected", "AE Server is active", "PWP is this WP", etc. These are all sorts of annoying settings that are no longer true when you restart the AE server functions, but they are cleaned up in a "Cold" start.
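
    The start-mode rule from the first paragraph (cold start only for the first WP after a server reboot, "Normal" for the rest) can be sketched roughly as below. The function `plan_start_modes` and the process names are made up for illustration. How the mode is actually set for each process is not covered here.

```python
from typing import Dict, List


def plan_start_modes(work_processes: List[str], after_reboot: bool) -> Dict[str, str]:
    """Map each WP name (hypothetical names) to the start mode it should use."""
    modes: Dict[str, str] = {}
    for index, name in enumerate(work_processes):
        # Only the first WP to start after a server reboot gets "Cold";
        # every other WP, and any routine restart, uses "Normal".
        modes[name] = "Cold" if after_reboot and index == 0 else "Normal"
    return modes


plan = plan_start_modes(["WP1", "WP2", "WP3"], after_reboot=True)
print(plan)
```

    The list order stands for the order in which the processes are started, so whichever WP comes first is the one that does the cleanup.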

    Alain