This post only concerns AE users on UNIX. Windows AE operators may safely skip it and live a happy, fulfilling life nonetheless. For the UNIX folks, however, this may be vital:
Oh, $LD_LIBRARY_PATH, old friend. We meet again, and once again the old saying holds true: $LD_LIBRARY_PATH problems are the B.E.S.T. problems (B.E.S.T. = "Bugs Extraordinarily ***tty to Trace"):
Bonus Complication: ZDU
We recently had a fun ZDU that generated funky errors ("no worker processes of the new version active") and even corrupt database entries, leading to zombified AE processes. As a side effect, it also prompted various opinions about the correct way of doing a ZDU, and highlighted, in my opinion, that the documentation needs a serious overhaul. It is partly filled with ambiguity and made-up terms that have no unambiguous definition ("Distributed Installation", anyone?), and it leads even Automic personnel to communicate things like "you need to start with two distinct installations of the same binaries", which is NOT required.
I'd like to page ainda02 on this.
But above all, it doesn't contain the one clear bit of information that I will tell you today, which came out of a nearly three-hour WebEx debugging session with an Automic developer:
I learned during the debugging session that Automation Engine implicitly looks for shared objects in the directory it was started in. No, that's not necessarily the directory the binaries are in; it's the CWD, the directory you "cd" into before you start the binary (though that should be the same directory the binaries are in). But $LD_LIBRARY_PATH takes priority over the CWD.
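To make that lookup order concrete, here is a minimal Python sketch of it as I understood it from the session: directories in $LD_LIBRARY_PATH first, the CWD last. This is only a model of the described behaviour, not how the dynamic loader or the Automation Engine is actually implemented, and the function name `resolve_shared_object` is made up for illustration.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_shared_object(name: str, ld_library_path: str, cwd: str) -> Optional[Path]:
    """Return the first existing file called `name`.

    Simplified model of the order described in the post: every directory
    listed in LD_LIBRARY_PATH is tried first, the start directory (CWD) last.
    Empty entries in LD_LIBRARY_PATH are skipped in this simplified model.
    """
    search_dirs = [d for d in ld_library_path.split(os.pathsep) if d]
    search_dirs.append(cwd)
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

If both an old and a new installation directory contain a ucsj.so, and $LD_LIBRARY_PATH points at the old one while you start from the new one, this model hands you the old file. That is exactly the trap.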
Automic shared objects, like ucsj.so, may differ between the old and the new version when you do a ZDU, and one set of processes may not work with the shared objects of the other version. Worse, if your $LD_LIBRARY_PATH includes the utilities' shared objects, these may be yet another version that doesn't even match the AE. If you get a mismatch here, very exciting and mysterious problems will happen!
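If you want to check for this before a ZDU, something like the following Python sketch could help. It lists every *.so in a binaries directory that would be shadowed by a same-named file found earlier via $LD_LIBRARY_PATH. The function name `find_shadowed_libraries` and the idea of comparing by file name only are my own assumptions; it does not look at versions, and it is not an Automic tool.

```python
import os
from pathlib import Path
from typing import List, Tuple


def find_shadowed_libraries(bin_dir: str, ld_library_path: str) -> List[Tuple[str, Path]]:
    """List (library name, shadowing file) pairs for a binaries directory.

    A library counts as shadowed when a directory in LD_LIBRARY_PATH other
    than bin_dir contains a file of the same name and is listed before
    bin_dir itself (or bin_dir is not listed at all).
    """
    bin_path = Path(bin_dir).resolve()
    shadowed = []
    for name in sorted(p.name for p in bin_path.glob("*.so")):
        for entry in ld_library_path.split(os.pathsep):
            if not entry:
                continue  # empty entries are ignored in this sketch
            directory = Path(entry).resolve()
            if directory == bin_path:
                break  # the local copy is found first, nothing shadows it
            candidate = directory / name
            if candidate.is_file():
                shadowed.append((name, candidate))
                break
    return shadowed
```

An empty result for both the old and the new binaries directory is what you want to see. Anything else deserves a closer look before you start processes of the new version.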
If you would like to see proper error messages instead of mysterious problems, please sign my petition.
Bonus Complication: systemd
Why did we have an $LD_LIBRARY_PATH in the first place? Because it seemed right (when you want to start things properly without silly cd'ing), and at some point we did what we believed was the right thing and made a centralized baseline config for all Automic things, which included that $LD_LIBRARY_PATH. This was used for Service Manager, which passed it on to the server processes, and it was wired into the environment file that we fed to systemd. Because as much as I dislike systemd myself, we're using the current RedHat init system, not the legacy one that will go away. The latter is incidentally the only one Automic still bases all its advice and procedures upon.
So here's the update to any previous systemd information on how the setup should be:
You need systemd > version 227 for this, or a backported feature. If in doubt, consult the man page. If your systemd is < 227, you probably need to launch a shell script which cd's to the directory, then starts smgr. This approach will probably break many things related to systemd, and I highly recommend you try to avoid this, and instead get a systemd > 227.
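If you're not sure which systemd you have, `systemctl --version` prints it on the first line (for example "systemd 239 ..."). Here is a small Python sketch that parses that line and compares it against 227, using the strict "greater than" from above. The helper names are hypothetical, and the threshold is simply the number stated in this post.

```python
import re
import subprocess
from typing import Optional


def parse_systemd_version(output: str) -> Optional[int]:
    """Extract the version number from the first line of `systemctl --version`."""
    match = re.match(r"systemd\s+(\d+)", output)
    return int(match.group(1)) if match else None


def is_newer_than(output: str, threshold: int = 227) -> bool:
    """True if the parsed version is strictly greater than `threshold`."""
    version = parse_systemd_version(output)
    return version is not None and version > threshold


def installed_systemd_output() -> Optional[str]:
    """Return the output of `systemctl --version`, or None if it cannot be run."""
    try:
        result = subprocess.run(
            ["systemctl", "--version"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return result.stdout
```

Calling `is_newer_than(installed_systemd_output() or "")` on the AE host then gives a plain yes or no. Distributions that backport features will still need the man page check mentioned above.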
Now someone please remind me again: why am I writing Automic documentation!?