One of our customers recently had an issue on the Data Repository where the vertica.log file grew too large, filled the disk and crashed the Vertica DB. We recovered the DB fine, but when we looked into it we found that vertica.log filled up because log rotation is not working. Has anyone else seen this and found a fix?
For those not familiar with this, there is an admintools command (detailed in the CAIM install doc) which is used to set it up. The command creates a config file (/opt/vertica/config/logrotate/drdata) to be used by the Linux logrotate command to rotate (and delete old) Vertica log files, including vertica.log. By default logrotate is configured to run daily using cron.
The issue I'm seeing is that the Vertica log files are not rotated when logrotate is run via cron with the vertica config file in place. There is an error in the /var/log/messages file saying "... logrotate: ALERT exited abnormally with " and the Vertica log files do not get rotated.
If I remove the Vertica config file that message does not appear.
If I run the logrotate command manually (same command as in cron, with the Vertica config file in place) then the Vertica log files are rotated and there is no error message. This seems to imply the Vertica config file is set up correctly.
Has anyone else seen this?
Can people confirm that their log rotation is working on the DR? You should see multiple vertica.log files in the /CACatalog/<dbname>/<nodename>/ directory if it is working.
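For anyone who wants a quick way to check, here is a minimal sketch that counts rotated copies of vertica.log in a catalog directory. The function name is hypothetical, and the file pattern assumes rotated files keep the vertica.log prefix with a suffix added (for example a number or .gz), which may differ on your system.

```shell
# Count rotated copies of vertica.log in a catalog directory.
# Usage: count_rotated_logs /CACatalog/<dbname>/<nodename>
count_rotated_logs() {
    dir=$1
    # 'vertica.log?*' matches vertica.log plus at least one more character,
    # so the live vertica.log itself is not counted.
    find "$dir" -maxdepth 1 -type f -name 'vertica.log?*' | wc -l | tr -d ' '
}
```

A result of 0 after logrotate has had a few days to run would suggest rotation is not happening.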
For info this is CAIM version 2.3.3.
Are you planning to upgrade to a newer CA Performance Management release? It would come with newer versions of Vertica if you go to 2.5 or the pending 2.6 release due out in the next couple of weeks.
I found this info in an internal Wiki site that might apply, though it was written back in 2012 so is a bit old. At least some of the things it recommends checking might lead to finding something out of place which could be causing this.
Note: There is an issue with the log rotate tool that Vertica provides which causes the configuration file to have incorrect entries, and these entries prevent the logs from being rotated. This will be resolved in a future release from Vertica 5.0.4.
During the installation, a file named vertica is created in the /etc/logrotate.d/ directory
1. Confirm that there is a file named vertica in the /etc/logrotate.d/ directory.
The file /etc/logrotate.d/vertica should contain the following line:
At database creation, a file is created in the /opt/vertica/config/logrotate/ directory with the same name as the database.
2. Confirm that there is a file with the same name as the database in the /opt/vertica/config/logrotate/ directory.
3. Run logrotate with -dv to check for any errors in the files.
sudo logrotate -dv /etc/logrotate.d/vertica
To capture the output:
sudo logrotate -dv /etc/logrotate.d/vertica 2> /tmp/logrotate.out
Here is an example of the output:
reading config file /etc/logrotate.d/vertica
reading config file testdb
reading config info for /data/dbs/testdb/v_testdb_node0001_catalog/vertica.log
reading config info for /data/dbs/testdb/dbLog
rotating pattern: /data/dbs/testdb/v_testdb_node0001_catalog/vertica.log weekly (52 rotations)
empty log files are rotated, only log files >= 10485760 bytes are rotated, old logs are removed
considering log /data/dbs/testdb/v_testdb_node0001_catalog/vertica.log
log does not need rotating
not running postrotate script, since no logs were rotated
4. To force the logs to rotate, you can run logrotate with the -f option. If vertica.log rotates, this confirms that logrotate is set up correctly by Vertica.
sudo logrotate -vf /etc/logrotate.d/vertica
5. Other things to check if logrotate is still not working. As the root user:
Confirm that logrotate is being run daily; it should be in the /etc/cron.daily directory: ls -l /etc/cron.daily
Check to make sure the /etc/crontab contains a reference to cron.daily
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
Check the log file /var/log/cron for any errors
6. Once log rotation is working properly, you should begin to see gzipped files in the Vertica catalog directory. For example, assuming the following is the customer's catalog directory:
You will start to see gzipped files with names like the following once rotation occurs:
and so forth
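The file and cron checks from the steps above could be bundled into one small script. This is only a sketch: the function name and the optional root prefix are hypothetical, and the paths are the ones named in the wiki text. On newer Red Hat releases cron.daily may be driven by anacron rather than /etc/crontab, so a miss on the last check is not necessarily a fault.

```shell
# Report which of the checklist items are in place.
# Usage: check_logrotate_setup <dbname> [root-prefix]
check_logrotate_setup() {
    db=$1
    root=${2:-}      # optional path prefix; leave empty on a real host
    status=0
    for f in \
        "$root/etc/logrotate.d/vertica" \
        "$root/opt/vertica/config/logrotate/$db" \
        "$root/etc/cron.daily/logrotate"
    do
        if [ -f "$f" ]; then
            echo "ok:      $f"
        else
            echo "MISSING: $f"
            status=1
        fi
    done
    # Step 5: /etc/crontab should reference cron.daily.
    if grep -q 'cron\.daily' "$root/etc/crontab" 2>/dev/null; then
        echo "ok:      $root/etc/crontab references cron.daily"
    else
        echo "MISSING: cron.daily entry in $root/etc/crontab"
        status=1
    fi
    return $status
}
```

Running it as root with the database name, for example check_logrotate_setup drdata, prints one line per item and returns non-zero if anything is missing.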
Let me know if that reveals anything unusual that might be causing this problem.
Thanks for the response, but I'm afraid it hasn't helped. As I said, logrotate works fine when it's run manually. The problem is with logrotate running via cron. It is configured to run via cron, and in the /var/log/cron file there are entries to say it is starting and finished at approximately 3.30am. Those entries usually have the exact same timestamp.
Additionally there is an entry with the same timestamp (3.30am) in the /var/log/messages file saying "... logrotate: ALERT exited abnormally with "
This comes from the script /etc/cron.daily/logrotate which gets run daily by cron. The contents of this are as follows ...
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
So it appears the logrotate command when run under cron returns the exit value 1 which is an error of some sort. Any ideas how I can find out what that error relates to?
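One thing I could try is to capture what logrotate prints when cron runs it, on the assumption that the cron script currently discards that output. A rough sketch of a helper for that follows. The function name and the log path in the example are hypothetical choices of mine, not anything from the product docs.

```shell
# Run a command, keep its combined stdout/stderr in a log file, and append
# the exit status so a failing cron run leaves evidence behind.
# Usage: run_and_capture <logfile> <command> [args...]
run_and_capture() {
    logfile=$1
    shift
    if "$@" >"$logfile" 2>&1; then
        rc=0
    else
        rc=$?
    fi
    echo "exit status: $rc" >>"$logfile"
    return $rc
}

# Example for a temporary debugging edit of /etc/cron.daily/logrotate:
#   run_and_capture /tmp/logrotate.cron.out /usr/sbin/logrotate -v /etc/logrotate.conf
```

After the next cron run, the captured file should show which log or directory logrotate complained about.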
I think I've found something on the Red Hat website which seems to describe the situation I'm seeing. There is also a post on stackoverflow describing something similar.
I've implemented what they suggested and will update this post in the morning.
Interesting find. Curious if that turns out to be the root cause as it sure looks like it will.
We do recommend you disable SELinux for the CAPM, DA and DC system installs. While it is not stated explicitly, the wording below from the Performance Manager product docs Wiki asks that it be disabled, or run in permissive mode with exclusions configured. For example, from the DA install Wiki docs:
Verify that Security Enhanced Linux (SELinux) is disabled on the computer where you are going to install Data Aggregator. By default, some Linux distributions have this feature enabled, which does not allow Data Aggregator to function properly. Disable SELinux or create a policy to exclude Data Aggregator processes from SELinux restrictions.
The Wiki install info for CAPM and DC systems has a similar statement.
When installing the HP Vertica DB software on the DR host, the dr_validate.sh script should disable SELinux as well, so we don't specifically call it out to be disabled in the DR install Wiki documentation.
HP Vertica also only supports SELinux in permissive mode:
In the end, if your environment doesn't specifically require selinux be enabled, it would be best to disable it entirely on the systems involved to ensure no future issues are encountered.
Well, partial success with this last night. The log was rotated, however the Vertica DB was down this morning and I'm not sure if that is related.
I had started to think that SELinux being enabled might have been an issue so thanks for the confirmation. I have now disabled SELinux and have set logrotate to run every hour (using /etc/cron.hourly). I've rebooted the system and restarted the DB and will let you know how it goes.
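For anyone checking their own hosts, this is roughly how the SELinux setting can be confirmed. It is a sketch only: the function name is hypothetical, and it assumes the standard /etc/selinux/config file with a SELINUX= line, which is the setting applied at the next boot.

```shell
# Print the SELINUX= mode configured in an SELinux config file
# (default /etc/selinux/config). This is the mode applied at next boot.
configured_selinux_mode() {
    cfg=${1:-/etc/selinux/config}
    # Use the last uncommented SELINUX= line; SELINUXTYPE= does not match.
    sed -n 's/^[[:space:]]*SELINUX=\([a-z]*\).*/\1/p' "$cfg" | tail -n 1
}

# Current runtime mode (Enforcing, Permissive or Disabled), where available:
#   getenforce
# Boot-time setting, for comparison:
#   configured_selinux_mode
```

If getenforce still reports Enforcing after the config file says disabled, the system has most likely not been rebooted since the change.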
When it comes to Vertica or other software on Linux systems, I've rarely seen any good come from leaving SELinux enabled in any mode.
If it's not required in the environment and can be fully disabled, that's always the best way to go.
Hopefully this will clear this up for good at this point.
I would also recommend upgrading to the latest release as soon as you are able, for the many improvements and enhancements made since the older 2.3.3 release came out.
Disabling SELinux seems to have fixed the problem. I've tested a few times using cron.hourly (and editing /var/lib/logrotate.status) and I now have a couple of rotations. I'll leave it to run overnight to confirm but it looks like that has resolved it.
Thanks for your help with this.
Excellent news. Glad to hear it's resolved with a simple and small change.