s6
Software
skarnet.org
The s6-log program
s6-log is a reliable logging program with automated log rotation, similar to
daemontools' multilog,
with full POSIX regular expression support.
Interface
s6-log [ -d notif ] [ -q | -v ] [ -b ] [ -p ] [ -l linelimit ] [ -- ] logging script
s6-log reads and compiles logging script to an internal
form. Then it reads its standard input, line by line, and performs actions
on it, following the script it is given. It does its best to ensure there
is never any log loss. It exits cleanly when stdin closes or when
it receives a SIGTERM or a SIGHUP.
Options
- -d notif :
readiness notification.
With this option, s6-log writes a newline character to file descriptor
notif when it is ready, i.e. when it has successfully
parsed its logging script and initialized all its necessary resources
such as the logdirs defined in the script, and is now listening to stdin
in order to process log lines.
- -b : blocking mode. With this option, s6-log stops
reading its standard input while it has unflushed buffers. This ensures that
every log line has been fully processed before reading the next one; this is also
multilog's behaviour.
By default, s6-log keeps reading from stdin even if its buffers still
contain data. -b is safer, but may slow down your service; the default
is faster, but may lead to unbound memory use if you have a lot of output to
write to a slow file system.
- -p : protect against SIGTERM. Do not exit on receipt of a
SIGTERM; only exit on a SIGHUP or when reading EOF on stdin. This is useful
for a logger that you really do not want to lose even if automated
administration (e.g. the downing of a supervision tree) would kill it.
- -q | -v : quiet | verbose. Decreases | increases s6-log's
verbosity, i.e. which messages are sent to stderr. The default verbosity is 1.
Currently supported verbosity levels:
- 0: only write alerts and fatal errors
- 1: write alerts, warnings and fatal errors
- -l linelimit : if a log line is longer than
linelimit bytes, split it by inserting a newline after the
linelimitth byte. After the split, the remainder of the line will
also be processed, so it will go through the selection process, timestamping,
etc. linelimit cannot be less than 48, unless it is 0 (which means
infinite). The default is 8192 bytes. Setting linelimit to 0 ensures
that lines will never be split; this may cause important memory consumption
by s6-log if it is fed extremely long lines, so use with caution.
Logdirs
A logdir (logging directory) is a place where logs are
stored. s6-log can be scripted to write into one or more logdirs.
A logdir may contain the following files:
- lock: this file is locked by s6-log at start,
to make sure only one instance is running at the same time.
- current: the file where selected log lines are appended.
If current has the executable-by-user flag, it means that no
s6-log process is currently writing to it and the previous
s6-log process managed to cleanly finalize it. If it does not,
either an s6-log process is writing to it or the previous one
has been interrupted without finalizing it.
- state: last processor's output, see below.
- previous: a rotation is happening in that logdir.
- processed, newstate: a rotation is happening
in that logdir and the processor script is running.
- timestamped files: those files are @timestamp.s or
@timestamp.u and are old log files that have been processed
and rotated (if they're ending in .s) or that were the current
file when s6-log got interrupted (if they're ending in .u), in which
case they have not been processed.
Rotation
In a logdir, selected lines are appended to the current file.
When current becomes too big, a rotation happens.
The current file will be possibly processed, then
it will become an archived log file named
@timestamp.s, where timestamp, a
TAI64N
timestamp, is the absolute time of the rotation. If there are too
many archived log files in the logdir, the older ones are then
deleted. Logging then resumes, to a brand new current
file.
You can use this mechanism to ensure that your logs never fill up
the available disk space, for instance: something that neither
syslogd, nor syslog-ng, nor rsyslog offers.
Processors
A processor script can be set for every logdir. When a rotation
occurs, current (which has then been renamed previous) is
fed to processor's stdin, and processor's stdout is saved
and archived. processor can also read the state file
on its fd 4; what it writes to its fd 5 will be saved as the next
state file, for the next rotation. A processor script runs with
the logdir as its working directory.
Processors should not background themselves: s6-log considers the
processing done when its processor direct child dies.
Processors should exit 0 on success and nonzero on failure; if a
processor fails, s6-log will try it again after some
cooldown time.
Processors make s6-log Turing-complete by allowing you to use any
external program to handle log files that are going to be archived.
Logging script syntax
When starting up, s6-log reads its arguments one by one; this
argument sequence, or directive sequence, forms a
logging script which tells s6-log what to log, where, and how.
Every directive can be a selection directive, a control
directive or an action directive. A valid logging script always
contains at least one action directive; every action directive can be
preceded by zero or more selection or control directives. s6-log will exit 100
if the script is invalid. If it can process the script but the last directive
is not an action directive, s6-log will emit a warning.
Selection directives
These directives tell s6-log whether to select or deselect lines it reads
from stdin; actions will only happen on selected lines. By default, every
line is selected.
- +regexp: select yet-unselected lines that match regexp, which must be a
POSIX
Extended Regular Expression.
- -regexp: deselect yet-selected lines that match regexp, which must be a
POSIX Extended Regular Expression.
- f: select exactly lines that have not yet been acted
upon (i.e. that were always deselected when the script encountered an action
directive).
Control directives
These directives tune s6-log's behaviour for the next actions.
- nnumber: next logdirs will contain up to
number archived log files. If there are more, the oldest archived
log files will be deleted, only the latest number will be kept.
By default, number is 10.
- sfilesize: next rotations will occur when
current log files approach filesize bytes. By default,
filesize is 99999; it cannot be set lower than 4096 or
higher than 268435455.
- Stotalsize: next logdirs will contain up
to totalsize bytes of archived (and maybe processed) log files. If
archived log files take more space than that, the older ones are deleted
until the total size fits. A totalsize of zero means no such limit;
use n0 instead if you don't want any archived log files. By
default, totalsize is 0 (unlimited).
- ltolerance: next rotations will be triggered
when the size of current goes higher than filesize minus
tolerance. tolerance cannot be more than half of
filesize. By default, tolerance is 2000.
- rcooldown: if an error occurs during operations
on the next logdirs, retry every cooldown milliseconds. By default,
cooldown is 2000; it's strongly discouraged to set it to a value
under 50.
- Ealertsize: only the first alertsize
bytes of the selected lines will be used in the next alerts. An
alertsize of 0 means no limit. By default, alertsize is
200.
- ^statussize: only the first statussize
bytes of the selected lines will be used in the next status file updates.
If a line is shorter than statussize bytes, the status file will be
padded with newlines so it is always statussize bytes long. 0 means
an unpadded, unlimited status file. By default, statussize is 1001.
- pprefix: sets prefix as a prefix to be
printed on every output line. For instance, a pfoobar: directive means
that the next action directives should prepend every line with foobar:
(plus a space) before outputting it. Note that a prefix is always printed
after the timestamps, if any. To remove a prefix for the next action
directives, use p.
- !processor: registers
execlineb -Pc processor as a processor for the next logdirs;
execlineb must be found in s6-log's PATH. This directive is only
supported if s6 has been built with execline support; otherwise, it yields a
syntax error at starting time.
If processor is empty, no processor will be set for the next logdirs.
By default, no processor is set.
- ?processor: registers
/bin/sh -c processor as a processor for the next logdirs.
It is just like the ! directive, except that the processor string
is interpreted by /bin/sh, not execlineb. It is useful
for people who want to build s6 without execline support.
- t: the logged line will be prepended with a
TAI64N
timestamp (and a space) before being processed by the next action
directive. Giving the t directive several times before an action
directive has no effect.
- T: the selected line will be prepended with a
ISO 8601
timestamp for combined date and time representing local time
according to the system's timezone, with a space (not a 'T')
between the date and the time and two spaces after the time, before being
processed by the next action directive. Giving the T directive
several times before an action directive has no effect.
Note that unlike the other control directives, the t and
T directives are not sticky: their effect will
disappear after the next action directive, and they need to be
reapplied if necessary. If both a t and a T
directives are given before an action directive, the TAI64N timestamp
will always appear before the ISO 8601 timestamp.
Action directives
These directives determine what s6-log actually does with the
selected lines.
- 2: alert. s6-log will print "s6-log: alert: ",
possibly prepended with a timestamp, followed by the first
alertsize bytes of the line, to its standard error.
- 1: forward to stdout. s6-log will print the selected
line to its stdout. If any error occurs, e.g. if stdout was a pipe and the
reading end closed, this directive will be ignored for the rest of
s6-log's lifetime.
- =statusfile: status. s6-log will atomically
update the statusfile file with the first statussize
bytes of the line, and pad it with newlines. s6-log must have the right
to write to statusfile and to statusfile's directory.
- dir (must start with '/' or '.'): logdir.
s6-log will log the line into the logdir dir. s6-log must have
the right to write to dir.
Signals
- SIGTERM: If s6-log has been run with the -p option, does nothing.
Without this option, SIGTERM instructs s6-log to stop reading stdin after the
next newline and exit after logging the last line.
- SIGALRM: triggers a rotation on every logdir s6-log is monitoring,
as if the current file in those logdirs had reached the size
limit.
Examples
s6-log -b n20 s1000000 t /var/log/services/stuff
Logs all of stdin, prepending every line with a TAI64N timestamp, into the
/var/log/services/stuff logdir, with a maximum archive of
20 log files of 1 MB each; makes sure every line has been written
before reading the next one.
s6-log n30 E500 - +fatal: 2 - +^STAT =/var/log/foobard/status f s10000000 S15000000 T !"gzip -nq9" /var/log/foobard
- Sends alerts to stderr with the 500 first bytes of lines containing "fatal:".
- Maintains the /var/log/foobard/status file at 1001 bytes,
updating it when it finds a log line starting with "STAT".
- Logs all other lines to logdir /var/log/foobard, prepending
them with an ISO 8601 timestamp. When current
reaches at least 9998 kB (i.e. 10 MB filesise minus 2kB tolerance), pipe it
through gzip -nq9 and save the result into a timestamped archive file, with
a maximum of 30 such files or a total of 15 MB of compressed archive files.
Why use execlineb to interpret the "processor" string?
Because it is exactly what
execlineb
is for.
- Directly executing processor is not flexible enough. We want
to be able to run a complete command line, with an executable name and its
arguments.
- We could interpret the processor string via /bin/sh.
This is what multilog
does. However, /bin/sh, despite being the traditional Unix interpreter,
is overpowered for this. We don't need a complete shell script interpreter:
most processor commands will be very simple, with only two or three
words, and we only need a way to turn a string into an argv, i.e. a
command line.
- execlineb
was designed just for this: to turn simple strings into command lines.
It is a very fast and lightweight script launcher, that does not do any heavy
startup initialization like /bin/sh does. It happens to be the perfect
tool for the job.
- To be perfectly honest: I also did this on purpose so people have a
reason to use the
execline language. But
seriously, it really is the perfect tool for the job.
Why have another logging mechanism?
Because the syslog mechanism and all its implementations (save one) suck.
I'm not being judgmental; I'm just stating the obvious.
The syslog design is flawed from the start
When
asked why he started rsyslog, Rainer Gerhards came up with a lot of
hand-waving and not a single word about technical points. There is a
reason for that: rsyslog is forked from sysklogd! So, no matter how
many bells and whistles are added to it, it still suffers from the same
basic flaws.
The problem with syslogd does not come from such or such implementation.
The problem comes from syslog's design in the first place.
- syslog makes you send all your logs to the same place.
The logs from a zillion processes are read by a single syslogd server.
The server checks log lines against system-wide regular expressions
to decide where to write them. This raises the following issues:
- Unless the client explicitly mentions its name in every log
line, there is no way for log readers to know what process generated a
given line. Some syslogd implementations can log the pid of the client;
big deal.
- Log lines from every client have to run through the same regular
expression matching. This requires huge regular expression sets, and an
obvious performance impact, to do anything meaningful. And as a matter
of fact, standard syslogd configurations don't do anything meaningful:
they separate the logs into a few streams such as /var/log/messages,
/var/log/auth.log, /var/log/daemon.log or
/var/log/syslog with very vague semantics. All of syslogd's
line processing power remains unused, because making real use of it would
be too complex.
- syslogd logs to files. This is wrong, because files grow
and disks fill up. Sure, there are utilities such as logrotate
to perform cleaning up, but as long as logging and log rotation are
kept separate, there is a race condition: a log file can grow and fill
up a disk before a rotation occurs. I am all for separating tasks that
can be separated, but there is no choice here: logging and log
rotation management must be done by the same tool.
Only a few non-mainstream implementations of syslogd do this, including the
Busybox one - and that is a
feature added by the Busybox developers who are aware of the problem
but want to maintain compatibility with the historical syslogd.
Neither syslogd (-ng or not) nor rsyslogd manages its log files: that's a
flaw that no amount of external tools is going to fix.
- syslogd is a complex process that runs as root. We all know what
complex processes running as root mean: bugs turning into security holes.
- syslog requires a syslogd service, and fails otherwise. A syslogd
service may not be present, it may fail... or it may want to log stuff.
Who's going to take care of syslogd's error messages?
syslog is slow, it's unsafe, and it's incomplete. The only reason people
use it is because it's historical, it exists, and there hasn't been any
serious alternative yet, except maybe
multilog, which
s6-log improves upon.
A not-so-modest proposal: the logging chain
Unix distributions already do this to some extent, but it's at best
unclear where the logs go for any given program.
- Every program, without exception, should send its logs (be it
error messages, warning messages, or anything) to its standard
error descriptor, i.e. fd 2. This is why it's open for.
- When process 1 starts, the logging chain is rooted to the
machine console: anything process 1 sends to its stderr
appears, without modification, on the machine console, which should
at any time remain the last resort place where logs are sent.
- Process 1 should spawn and supervise a catch-all logging
mechanism that handles logs from every service that does not
take care of its own logging. Error messages from this logging
mechanism naturally go to the machine console.
- Process 1's own error messages can go to the machine console,
or dirty tricks can be used so they
go to the catch-all logging mechanism.
- Services that are spawned by process 1 should come with their
own logger service; the supervision mechanism offered by
s6-svscan makes it easy. Error messages
from the loggers themselves naturally go to the catch-all
mechanism.
- User login mechanisms such as getty, xdm or
sshd are services: they should be started with their own
loggers. Of course, when a user gets a terminal and a shell, the
shell's stderr should be redirected to the terminal: interactive
programs break the automatic logging chain and delegate responsibility
to the user.
- A syslogd service may exist, to catch logs sent via
syslog() by legacy programs. But it is a normal service, and logs
caught by this syslogd service are not part of the logging chain.
It is probably overkill to provide the syslogd service with its own
logger; error messages from syslogd can default to the catch-all logger.
The s6 package, including the ucspilogd program,
provides enough tools to easily implement
a complete syslogd system, for a small fraction of the resource needs and
the complexity of native syslogd implementations.
So, given a program, where are its logs sent ?
- Logs sent via syslog() will be handled by the syslogd service as
usual. Smart administrators will make sure that those ones are as few as
possible. The rest of this analysis is about logs sent to stderr.
- If the program is descended from a user's interactive program,
its logs are sent to the user's terminal or the user's choice redirection
target.
- If the program is descended from a logged service, its logs are
naturally sent to the service's logger.
- Else the logs are sent to the catch-all logger.
- Only the catch-all logger's error messages, the kernel's fatal
error messages, and maybe process 1's error messages, are sent to the
system console.
What does s6-log have to do with all this?
In a logging chain situation, every service must have
its own logger. To avoid syslogd's design mistakes, one logger process
per service must be run. s6-log fits that role. Using s6-log as
your one-stop logger offers the following benefits:
- Every instance of s6-log can run as a different user, so it's
easy to give different access rights to different logs. It is also
more secure not to have any logger process running as root.
- s6-log consumes very little memory per instance (unless it
accumulates unflushed log lines, which you can avoid with the
-b option). So, launching a lot of s6-log processes does
not waste resources.
- s6-log is vastly configurable via logging scripts; every instance
is as powerful as a traditional syslogd.
- s6-log can log to a RAM filesystem and thus is suitable as a
catch-all logger. Clever tricks like Upstart's logd or daemontools'
readproctitle
are just that: tricks. s6-log gives a unified interface to all of
your system's loggers.
You're wrong about being as powerful as
syslogd: s6-log does not do remote logging.
You mean you want to send, live, every log line
over the network via UDP ? You can't be serious.
Do yourself a favor and use s6-log to write log lines to a logdir,
with a processor script that sends files-being-archived to the
network, possibly after compressing them. More reliability, less
log lines lost, less network traffic, better engineering. If you
have no disk to even write the current files to, write
to a small RAM filesystem.
If you have to log stuff live via the network, you
do not need any local logging software. You don't even need syslogd.
Just filter your stderr via some grep that selects lines for
you, then sends them to a network socket. A trivial shell script, or
execline
script, can do that for you.
Do not insist on using syslogd. It does nothing magical, and nothing
that can't be done in a simpler way using simpler tools.