Rotating Docker Container Logs To Comply With Retention Policies
Docker's default configuration doesn't perform log rotation. For busy, long running containers, this can lead to the filesystem filling with old, uncompressed logging data (as well as making accidental docker logs $container invocations quite painful).
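Before changing anything, it's worth confirming which logging driver is actually in use (json-file is the default on most installs; the container name below is a placeholder):

```shell
# Which logging driver does the daemon use by default?
docker info --format '{{.LoggingDriver}}'

# Which driver is a specific container using?
# (my-container is illustrative - substitute one of your own)
docker inspect --format '{{.HostConfig.LogConfig.Type}}' my-container
```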
It is possible to configure docker to rotate logs by editing daemon.json, but the rotation threshold options are fairly limited:
- max-size: the size at which to rotate
- max-file: the maximum number of rotated files to keep
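For example, a /etc/docker/daemon.json enabling size based rotation might look like the following (the thresholds are illustrative - adjust to taste; note that the option values must be quoted strings, and that changes only apply to containers created after the daemon is restarted):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5"
  }
}
```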
Whilst these options do help to reduce filesystem usage, being purely size based they fail to support a number of extremely common log rotation use-cases:
- Log rotation at a specific time based interval (e.g. daily log rotation)
- Maximum retention periods (to comply with GDPR retention policies etc)
Unfortunately, json-file isn't the only logging driver to suffer from this limitation: the local driver has the same restrictions. It looks like there's an implicit decision that anyone who wants to follow common rotation practices should just forward logs on to syslog, journald or some other logging infrastructure (such as logstash). In practice, there are a variety of use-cases where this may be undesirable.
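If forwarding does suit your environment though, the driver can be selected per container; the example below (with an assumed collector address) sends a container's output to remote syslog. One side effect worth knowing about: docker logs won't work for containers using the syslog driver.

```shell
# Run a container whose logs are forwarded to a remote syslog collector
# (logs.example.com is an assumption - substitute your own endpoint)
docker run -d \
  --log-driver syslog \
  --log-opt syslog-address=udp://logs.example.com:514 \
  --name syslog-demo \
  nginx
```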
However, as json-file simply writes loglines into a logfile on disk, it's trivial to build a script to implement the rotation that we need.
This documentation details how to set up interval based log rotation for docker containers.
The basic command
The meat and bones of our rotation solution is a loop which:
- Lists running containers
- Uses docker inspect to identify where their logs are
- Copies the log to a predefined destination
- Truncates the original
- Compresses the copy
- Removes any rotated logs older than n days
We can achieve this with the following:
# Where do we want to archive logs to?
LOGDIR="/var/log/docker"
# Date to use in rotated filenames
DATESTR=`date +'%Y%m%d-%H%M'`
# ensure the logdir exists
mkdir -p "$LOGDIR"
for container in `docker ps --format '{{.Names}}'`
do
    logpath=`docker inspect --format='{{.LogPath}}' "$container"`
    logdest="${LOGDIR}/${container}-${DATESTR}.json.log"

    # Copy the logfile
    cp "$logpath" "$logdest"

    # Truncate the original
    truncate -s 0 "$logpath"

    # Compress the copy
    gzip -f "$logdest"
done
# tidy out logs older than 90 days
find "$LOGDIR" -name '*gz' -mtime +90 -exec rm {} \;
We copy and truncate rather than moving the logfile because docker keeps hold of its original file-handle (meaning it won't write into a replacement logfile unless the container is restarted).
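The distinction is easy to demonstrate with a scratch file: truncating preserves the inode (so an open file handle still points at the live file), whereas a mv would leave the writer appending to the renamed copy:

```shell
# Show that copy-then-truncate leaves the original inode (and therefore
# any open file handles) intact
f=$(mktemp)
echo "a logline" > "$f"

inode_before=$(stat -c %i "$f")
cp "$f" "${f}.rotated"   # take a copy of the contents
truncate -s 0 "$f"       # then empty the original in place
inode_after=$(stat -c %i "$f")

[ "$inode_before" = "$inode_after" ] && echo "inode unchanged - the writer never notices"
rm -f "$f" "${f}.rotated"
```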
Collecting Statistics
We could just put the above into a shell script, add it to a crontab and call it job done. But I generally think it's better to collect statistics at the same time: it means log rotation activities can be graphed, making it easier to spot when something unexpected happens.
My preference is to collect stats and write them into InfluxDB, which we can achieve with the following script:
#!/bin/bash
#
# From https://www.bentasker.co.uk/posts/documentation/linux/periodically-rotating-docker-container-logs.html
#
# Where do we want to archive logs to?
LOGDIR=${LOGDIR:-"/var/log/docker"}
# Set this to "" to disable stat submission
INFLUX_HOST=${INFLUX_HOST:-"http://127.0.0.1:8086"}
INFLUX_USER=${INFLUX_USER:-""}
INFLUX_PASS=${INFLUX_PASS:-""}
INFLUX_DB=${INFLUX_DB:-"telegraf"}
INFLUX_LOG_TAG=${INFLUX_LOG_TAG:-"docker"}
# Containers to rotate logs for
#
# If specifying manually, space separate them
CONTAINERS=${CONTAINERS:-""}
function writeStats(){
    # Stat submission disabled
    if [[ "$INFLUX_HOST" == "" ]]
    then
        return
    fi

    # Build the point
    POINT="log_rotate,host=$HOSTNAME,logs=$INFLUX_LOG_TAG total_t=${TOTAL_TIME}i,purge_t=${PURGE_TIME}i,purged_files=${PURGE_COUNT}i,rotate_t=${ROTATE_TIME}i,rotate_count=${X}i,skipped_files=${SKIPPED}i,rotated_lines=${LINECOUNT}i $NOW"

    # Placeholder header, overridden if credentials are configured
    auth="X-Foo: bar"
    if [[ ! "$INFLUX_USER" == "" ]]
    then
        auth="Authorization: Basic `echo -n "$INFLUX_USER:$INFLUX_PASS" | base64`"
    fi

    curl -X POST "${INFLUX_HOST}/write?db=${INFLUX_DB}&precision=s" \
    -H "$auth" \
    -d "$POINT"
}
START=`date +'%s'`
DATESTR=`date +'%Y%m%d-%H%M'`
# ensure the logdestination exists
mkdir -p "$LOGDIR"
# Initialise some counters
SKIPPED=0
X=0
LINECOUNT=0
# This could have been included in the definition above
# but apparently doing so breaks syntax highlighting on my site
# will have to fix that...
if [[ "$CONTAINERS" == "" ]]
then
    CONTAINERS=`docker ps --format '{{.Names}}'`
fi
for container in $CONTAINERS
do
    logpath=`docker inspect --format='{{.LogPath}}' "$container"`

    if [ ! -f "$logpath" ]
    then
        SKIPPED=$(( $SKIPPED + 1 ))
        continue
    fi

    logdest="${LOGDIR}/${container}-${DATESTR}.json.log"

    # Copy the logfile
    cp "$logpath" "$logdest"

    # Truncate the original
    truncate -s 0 "$logpath"

    # Add an informative logline
    echo "{\"log\" : \"`date +'%Y/%m/%d %H:%M:%S'` [info] Log rotated. See $LOGDIR for older logs\\n\", \"stream\":\"stdout\",\"time\":\"`date +'%Y-%m-%dT%H:%M:%SZ'`\"}" >> "$logpath"

    # Increment the line counter
    LINECOUNT=$(( $LINECOUNT + `wc -l < "$logdest"` ))

    # Compress the copy
    gzip -f "$logdest"

    # Increment the counter
    X=$(( $X + 1 ))
done
ROTATE_END=`date +'%s'`
# tidy out old logs
PURGE_COUNT=`find "$LOGDIR" -name '*gz' -mtime +90 -print | wc -l`
find "$LOGDIR" -name '*gz' -mtime +90 -exec rm {} \;
PURGE_END=`date +'%s'`
# Calculate some stats
NOW=`date +'%s'`
TOTAL_TIME=$(( $PURGE_END - $START ))
PURGE_TIME=$(( $PURGE_END - $ROTATE_END ))
ROTATE_TIME=$(( $ROTATE_END - $START ))
# Write to InfluxDB (if enabled)
writeStats
# Write to stdout
cat << EOM
Docker log rotation completed.
Files Rotated: $X
Files Skipped: $SKIPPED
Lines rotated: $LINECOUNT
Old logs purged: $PURGE_COUNT
Total time: ${TOTAL_TIME}s
EOM
This performs the rotation, but also records some additional statistics:
- total_t: total time spent processing (seconds)
- purge_t: time spent purging old logs (seconds)
- purged_files: number of files purged
- rotate_t: time spent rotating (seconds)
- rotate_count: number of files rotated
- skipped_files: number of containers skipped (because no logfile was found)
- rotated_lines: how many loglines were in this rotation
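For reference, the line protocol point that writeStats submits looks something like this (the tag and field values are illustrative):

```
log_rotate,host=docker-host1,logs=docker total_t=4i,purge_t=1i,purged_files=2i,rotate_t=3i,rotate_count=12i,skipped_files=0i,rotated_lines=18423i 1653868800
```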
With the statistics safely stored in InfluxDB, we can trivially create a dashboard using Flux queries like:
from(bucket: "telegraf/autogen")
|> range(start: -7d)
|> filter(fn: (r) => r._measurement == "log_rotate")
|> filter(fn: (r) => r.host == v.host)
|> filter(fn: (r) => r._field == "rotate_count")
|> group(columns: ["logs"])
|> aggregateWindow(every: 1d, fn: sum)
Scheduling Rotation
Once we've got a script we're happy with, it's simply a case of saving it on the server (I called it docker_logs_rotate.sh) and scheduling it in cron. The following will have the job run once daily at midnight:
echo "0 0 * * * root INFLUX_HOST='https://myinfluxdbhost:8086' /path/to/docker_logs_rotate.sh" | sudo tee /etc/cron.d/docker_logs_rotate
Reading Docker logs directly
Each of the lines within docker's log is a JSON encapsulated object:
{"log" : "2022/05/30 00:00:03 [info] Log rotated. See /var/log/docker for older logs\n", "stream":"stdout","time":"2022-05-30T00:00:03Z"}
{"log":"206.189.120.26 - - [30/May/2022:00:00:05 +0000] \"GET /categories/i2p.xml HTTP/1.1\" 304 0 \"-\" \"feedparser/6.0.8 +https://github.com/kurtmckee/feedparser/\"\n","stream":"stdout","time":"2022-05-30T00:00:05.884810401Z"}
This isn't particularly convenient if you're trying to review loglines - especially those which are full of escaped quotes etc.
However, they can be converted back to a more human readable form by doing
cat "$PATH_TO_LOG" | jq -r '.log'
In the example above, this'll print
2022/05/30 00:00:03 [info] Log rotated. See /var/log/docker for older logs
206.189.120.26 - - [30/May/2022:00:00:05 +0000] "GET /categories/i2p.xml HTTP/1.1" 304 0 "-" "feedparser/6.0.8 +https://github.com/kurtmckee/feedparser/"
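jq can also pull in the other fields; for example, to prefix each logline with Docker's own receive timestamp and the stream it arrived on (rtrimstr strips the trailing newline that json-file stores inside each entry):

```shell
# Print each entry as: <time> [<stream>] <log text>
jq -r '"\(.time) [\(.stream)] \(.log | rtrimstr("\n"))"' "$PATH_TO_LOG"
```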
Conclusion
Docker's default approach to logging isn't particularly ops friendly: logs aren't rotated by default and, even when rotation is enabled, the default logging driver only supports size based thresholds (which is problematic for any operator who has to observe time based retention periods).
However, implementing proper rotation of container log files is simply a case of creating a small script to copy, truncate and compress them on a schedule.