tailer (tail ‐f)

Fabian Stäber edited this page Jul 28, 2016 · 10 revisions

One of the basic components of grok_exporter is a file tailer, which is something like tail -f, except that it can also deal with logrotate. It took a while to get this right, so I would like to share my findings here.

Logrotate

Logrotate is a tool that automatically archives old log data so that log files don't grow indefinitely. Depending on the configuration, logrotate can handle the files in several ways. The following shell commands simulate two possible logrotate configurations:

Move the old file and create a new one.

mv logfile logfile.1 && echo > logfile
echo 'next log line' >> logfile

Copy the old file and truncate the original. This has the effect that the original file is never deleted, which is good if programs keep the logfile open while logging.

cp logfile logfile.1 && :> logfile
echo 'next log line' >> logfile

Option 1: Polling

One option to implement a logrotate-aware file tailer is to continuously poll for new loglines. This is what filebeat does. Filebeat's polling interval is configurable with the backoff and max_backoff configuration options.

However, if a line is written and the logfile is rotated before the next poll, the line will be lost. Therefore, grok_exporter does not implement Option 1.

Option 2: Fsnotify

All major operating systems provide some way for programs to subscribe to file system events. Using file system events, we can avoid unnecessary polling, and we can be sure that we don't miss anything if logrotate runs immediately after a line has been logged.

fsnotify is a Go library providing a unified event API across the most common operating systems (Linux, BSD/macOS, Windows). This is what mtail uses.

However, it turns out that the sequence of events provided by fsnotify is not independent of the operating system when it comes to corner cases:

  • When logrotate does something like mv logfile logfile.1 && echo 'next log line' > logfile, the WRITE event may be lost on BSD operating systems (this is due to a race condition in the way kqueue is used to simulate recursive directory watches).
  • When logrotate truncates a file instead of removing it, the truncation results in different fsnotify events on different operating systems.
  • Some underlying implementations keep the logfile open when monitoring the file. However, we cannot use the open file handles, because other underlying implementations don't keep open files. As a result, we open the watched logfile twice on some operating systems (like BSD), which works, but is not nice.

After a lot of debugging, we concluded that interpreting the fsnotify events correctly for each operating system is about as hard as going with Option 3. Therefore, grok_exporter does not use fsnotify.

BTW: mtail seems to have its focus on Linux, in which case it doesn't really have these problems.

Option 3: Operating System Specific File System Watchers

While debugging the behaviour of fsnotify, we found out that the underlying system calls are actually not as hard to use as they sound, especially as we only want to watch a single file. So the current implementation has one file tailer for each operating system.

grok_exporter implements Option 3.

Summary

Implementing tail -f is a lot harder than it seems, and if you want to do it correctly, you need to implement it in an operating system specific way.