Sunday, July 26, 2015

syslog-ng script for compressing and rotating logs

If you run a syslog-ng setup, you know that logs build up until you run out of disk space.  Sooner or later you need to compress the old ones, and eventually get rid of them.

Why not use logrotate?  Well, logrotate doesn't use overall disk utilization as a metric; it rotates based on time (or on the size of an individual log file).  So if you want to keep logs until a certain percentage of the disk is used (to maximize how far back your logs go), then logrotate is not your tool.
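To make "disk space as a metric" concrete, here is a minimal sketch of reading a filesystem's utilization and comparing it to a threshold.  The function names disk_util_pct and over_threshold are made up for illustration, and the sketch assumes a df that supports the POSIX -P output format and a mount path without spaces in its device name.

```shell
#!/bin/bash

# Print the integer use% of the filesystem that holds the given path.
# With -P, df prints one header line and one data line; field 5 is "NN%".
disk_util_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed when utilization is at or above a threshold.
# Usage: over_threshold <path> <percent>
over_threshold() {
    local pct
    pct=$(disk_util_pct "$1") || return 2
    [ "$pct" -ge "$2" ]
}
```

For example, "over_threshold /main/fs/dr 85 && echo cleanup needed" would only print when that filesystem is at least 85% full.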

Below is a script that retains logs for as long as possible: it deletes the oldest logs only once disk utilization crosses a threshold, and it compresses the previous day's logs.

#!/bin/bash
################################################################################
#
#   Syslog cleanup script
#
# This script does some cleanup for our syslog system.
#
# Why not use logrotate?
# Well, this script makes decisions based on disk utilization, not just the
# age of log files.  We only delete log files once disk utilization reaches
# DISK_UTIL_THRESH percent.  This ensures we keep as many logs as possible.
# We delete the 10 oldest logs at a time, until we get back under the threshold.
#
# It also gzips the previous day's .log files.
#
################################################################################


LOG_PARENT_DIR=/path/to/syslog/repo
NAV_BY_DATE_DIR=/path/to/syslog/navbydate
MAIN_LOG_FILE_DIR=$LOG_PARENT_DIR/*
DISK_UTIL_THRESH=85
LOG_DIR=/path/to/syslog/log
LOG_FILE=$LOG_DIR/syslog_cleanup.log
MOUNT_POINT=/main/fs/dr
DATE=$(date +"%F %T")
YESTERDAY_YEAR=$(date +%Y --date="yesterday")
YESTERDAY_MONTH=$(date +%m --date="yesterday")
YESTERDAY_DAY=$(date +%d --date="yesterday")
YESTERDAY_DATEPATH=$YESTERDAY_YEAR/$YESTERDAY_MONTH/$YESTERDAY_DAY
YESTERDAY_ALL_FILES=$LOG_PARENT_DIR/*/$YESTERDAY_DATEPATH/*

echo "$DATE := ===============================================================" >> $LOG_FILE
echo "$DATE := Beginning cleanup script." >> $LOG_FILE

# Get the disk utilization (in percent) of the filesystem at $MOUNT_POINT.
# df -P guarantees one data line per filesystem, with use% in field 5.
disk_util=$(df -P "$MOUNT_POINT" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')

echo "$DATE := Disk utilization is $disk_util" >> $LOG_FILE

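# While disk utilization is at or above the threshold
while [ "$disk_util" -ge "$DISK_UTIL_THRESH" ]
do
    DATE=$(date +"%F %T")
    echo "$DATE := Disk utilization is $disk_util" >> $LOG_FILE

    # Find the 10 oldest files (sorted by modification time)
    oldest=$(find $MAIN_LOG_FILE_DIR -type f -printf '%T+ %p\n' | sort | head -n 10 | cut -d ' ' -f2-)

    # Stop if there is nothing left to delete, so the loop cannot spin forever
    if [ -z "$oldest" ]
    then
        echo "$DATE := No log files left to delete." >> $LOG_FILE
        break
    fi

    # Delete them; -d '\n' keeps paths containing spaces intact
    printf '%s\n' "$oldest" | xargs -d '\n' rm -f

    # Re-check disk utilization inside the loop so that it can terminate
    disk_util=$(df -P "$MOUNT_POINT" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
done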

# This will gzip yesterday's .log files.
DATE=$(date +"%F %T")
echo "$DATE := Size of yesterday's files before compression:" >> $LOG_FILE
du -sh $YESTERDAY_ALL_FILES >> $LOG_FILE
echo "$DATE := Gzipping log files in $MAIN_LOG_FILE_DIR." >> $LOG_FILE
echo "$DATE := Gzipping the following log files:" >> $LOG_FILE
find $YESTERDAY_ALL_FILES -type f -name "*.log" -exec gzip {} \; -exec echo "Gzipped " {} \; >> $LOG_FILE
echo "$DATE := Size of yesterday's files after compression:" >> $LOG_FILE
du -sh $YESTERDAY_ALL_FILES >> $LOG_FILE

# Find and remove all empty directories.  -delete processes children before
# parents, so nested empty directories are removed in a single pass.
DATE=$(date +"%F %T")
echo "$DATE := Finding and removing all empty directories." >> $LOG_FILE
find $MAIN_LOG_FILE_DIR -type d -empty -delete

# -xtype l matches symbolic links whose target no longer exists
echo "$DATE := Finding and removing all broken symbolic links." >> $LOG_FILE
find $NAV_BY_DATE_DIR/* -xtype l -delete

DATE=$(date +"%F %T")
echo "$DATE := Ending cleanup script." >> $LOG_FILE
echo "$DATE := ===============================================================" >> $LOG_FILE
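
Before letting a cleanup script delete anything, it can help to preview which files it would pick.  Here is a rough sketch of a dry-run helper that lists the N oldest files under a directory.  The name oldest_files is made up for illustration and is not part of the script above; the sketch assumes GNU find, sort, head, and cut with NUL-delimiter (-z) support, which keeps paths with spaces or newlines intact.

```shell
#!/bin/bash

# Hypothetical helper: print the N oldest regular files under a directory,
# oldest first, as NUL-terminated paths.
# Usage: oldest_files <dir> <count>
oldest_files() {
    local dir=$1 count=$2
    # %T@ is the modification time in seconds since the epoch, so a plain
    # numeric sort orders the records from oldest to newest.
    find "$dir" -type f -printf '%T@ %p\0' \
        | sort -z -n \
        | head -z -n "$count" \
        | cut -z -d ' ' -f2-
}
```

To preview, run something like "oldest_files /path/to/syslog/repo 10 | tr '\0' '\n'".  Once the list looks right, the same output could be piped into "xargs -0 rm -f" instead.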