Service directories
A service directory is a directory containing all the information
related to a service, i.e. a long-running process maintained and
supervised by s6-supervise.
(Strictly speaking, a service is not always equivalent to a
long-running process. Things like Ethernet interfaces fit the definition
of services one may want to supervise; however, s6 does not
provide service supervision: it provides process supervision,
and it is impractical to use the s6 architecture as is to supervise
services that are not equivalent to one long-running process. Nevertheless,
we keep using the terms service and service directory
for historical and compatibility reasons.)
Contents
A service directory foo may contain the following elements:
- An executable file named run. It can be any executable
file (such as a binary file or a link to any other executable file),
but most of the time it will be a script, called run script.
This file is the most important one in your service directory: it
contains the commands that will set up and run your foo service;
a complete example is given after this list.
It is forked and executed by s6-supervise
every time the service must be started, i.e. normally when
s6-supervise starts, and whenever
the service goes down when it is supposed to be up. A run script
should normally:
- adjust redirections for stdin, stdout and stderr. When a run
script starts, it inherits its standard file descriptors from
s6-supervise, which itself inherits them from
s6-svscan. stdin is normally /dev/null.
If s6-svscan was launched by another init system, stdout and stderr likely
point to that init system's default log (or /dev/null in the case
of sysvinit). If s6-svscan is running as pid 1 with the help of software like
s6-linux-init, then its
stdout and stderr point to a catch-all logger, which catches and
logs any output of the supervision tree that has not been caught by a
dedicated logger. If the defaults provided by your installation are not
suitable for your run script, then your run script should perform the proper
redirections before executing into the final daemon. For instance, dedicated
logging mechanisms, such as the log subdirectory (see below) or the
s6-rc pipeline feature, pipe your
run script's stdout to the logging service, but chances are you want
to log stderr as well, so the run script should make sure that its
stderr goes into the log pipe. This
is achieved by fdmove
-c 2 1 in execline,
and exec 2>&1 in shell.
- adjust the environment for your foo daemon. Normally the run script
inherits its environment from s6-supervise,
which normally inherits its environment from s6-svscan,
which normally inherits a minimal environment from the boot scripts.
Service-specific environment variables should be set in the run script.
- adjust other parameters for the foo daemon, such as its
uid and gid. Normally the supervision tree, i.e.
s6-svscan and the various
s6-supervise processes, is run as root, so
run scripts are also run as root; however, for security purposes, services
should not run as root if they don't need to. You can use the
s6-setuidgid utility in foo/run
to lose privileges before executing into foo's long-lived
process; or the s6-envuidgid utility if
your long-lived process needs root privileges at start time but can drop
them afterwards.
- execute into the long-lived process that is to be supervised by
s6-supervise, i.e. the real foo
daemon. That process must not "background itself": being run by a supervision
tree already makes it a "background" task.
- An optional executable file named finish. Like run,
it can be any executable file. This finish script, if present,
is executed every time the run script dies (an example is given
after this list). Generally, its main
purpose is to clean up non-volatile data such as the filesystem after the supervised
process has been killed. If the foo service is supposed to be up,
foo/run is restarted after foo/finish dies.
- By default, a finish script must do its work and exit in less than
5 seconds; if it takes more than that, it is killed. (The point is that the run
script, not the finish script, should be running; the finish script should really
be short-lived.) The maximum duration of a finish execution can be
configured via the timeout-finish file, see below.
- The finish script is
executed with two arguments: the exit code from the run script (resp. 256 if the
run script was killed by a signal), and an undefined number (resp. the number of
the signal that killed the run script).
- If the finish script exits 125, then s6-supervise
interprets this as a permanent failure for the service, and does not restart it,
as if an s6-svc -O command had been sent.
- A directory named supervise. It is automatically created by
s6-supervise if it does not exist. This is where
s6-supervise stores its information. The directory
must be writable.
- An optional, empty, regular file named down. If such a file exists,
the default state of the service is considered down, not up: s6-supervise will not
automatically start it until it receives a s6-svc -u command. If no
down file exists, the default state of the service is up.
- An optional regular file named nosetsid.
If this file exists and starts with the word setpgrp, s6-supervise will run the service
in a new process group (the run script will be a process group leader), but not in a new session.
If this file exists and does not start with setpgrp,
s6-supervise will start the service in the same session and process group as itself.
If no nosetsid file exists, the service has its own process group and is started
as a session leader - which is the default and should normally not be changed. Using the
nosetsid file is a hack; it should only be used in testing environments for
job control convenience, and probably never outside that use case.
- An optional regular file named notification-fd. If such a file
exists, it means that the service supports
readiness notification. The file must only
contain an unsigned integer, which is the number of the file descriptor that
the service writes its readiness notification to. (For instance, it should
be 1 if the daemon is s6-ipcserverd run with the
-1 option.)
When a service is started, or restarted, by s6-supervise, if this file
exists and contains a valid descriptor number, s6-supervise will wait for the
notification from the service and broadcast readiness, i.e. any
s6-svwait -U,
s6-svlisten1 -U or
s6-svlisten -U processes will be
triggered. (The configuration sketch after this list shows how to set
this file and the other optional control files.)
- An optional regular file named timeout-kill. If such a file
exists, it must only contain an unsigned integer t. If t
is nonzero, then on receipt of a s6-svc -d command,
which sends a SIGTERM (by default, see down-signal below) and a
SIGCONT to the service, a timeout of t
milliseconds is set; and if the service is still not dead after t
milliseconds, then it is sent a SIGKILL. If timeout-kill does not
exist, or contains 0 or an invalid value, then the service is never
forcibly killed (unless, of course, a s6-svc -k
command is sent).
- An optional regular file named timeout-finish. If such a file
exists, it must only contain an unsigned integer, which is the number of
milliseconds after which the ./finish script, if it exists, will
be killed with a SIGKILL. The default is 5000: finish scripts are killed
if they're still alive after 5 seconds. A value of 0 allows finish scripts
to run forever.
- An optional regular file named max-death-tally. If such a file
exists, it must only contain an unsigned integer, which is the maximum number of
service death events that s6-supervise will keep track of. If the service dies
more than this number of times, the oldest events will be forgotten. Tracking
death events is useful, for instance, when throttling service restarts. The
value cannot be greater than 4096. If the file does not exist, a default of 100
is used.
- An optional regular file named down-signal. If such a file
exists, it must only contain the name or number of a signal, followed by a
newline. This signal will be used to kill the supervised process when a
s6-svc -d or s6-svc -r
command is used. If the file does not exist, SIGTERM will be used by default.
- A fifodir named event. It is automatically
created by s6-supervise if it does not exist.
foo/event
is the rendez-vous point for listeners, where s6-supervise
will send notifications when the service goes up or down.
- An optional service directory named log. If it exists and foo
is in a scandir, and s6-svscan
runs on that scandir, then two services are monitored: foo and
foo/log. A pipe is opened and maintained between foo and
foo/log, i.e. everything that foo/run
writes to its stdout will appear on foo/log/run's stdin. The foo
service is said to be logged; the foo/log service is called
foo's logger (an example logger script is given after this list).
A logger service cannot be logged: if
foo/log/log exists, nothing special happens.
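For illustration, here is a minimal run script sketch tying the above
steps together. It is written in shell; in execline, the exec 2>&1
redirection would be fdmove -c 2 1, as noted above. The daemon name
mydaemon, its -f (stay in the foreground) option, its configuration
variable and the mydaemond account are all hypothetical placeholders,
not part of s6:

  #!/bin/sh
  # Make stderr go to the same place as stdout, so a dedicated logger
  # (see the log subdirectory above) catches both streams.
  exec 2>&1
  # Service-specific environment.
  MYDAEMON_CONF=/etc/mydaemon.conf
  export MYDAEMON_CONF
  # Drop root privileges, then replace this shell with the daemon,
  # running in the foreground: it must not background itself.
  exec s6-setuidgid mydaemond mydaemon -f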
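A finish script sketch, under the same assumptions; the socket path and
the special exit code 100 are made up for the example:

  #!/bin/sh
  # $1 is the exit code of ./run, or 256 if ./run was killed by a signal;
  # $2 is then the number of that signal (otherwise undefined).
  rm -f /run/mydaemon.sock    # hypothetical leftover to clean up
  if [ "$1" = 100 ] ; then
    # Declare a permanent failure: s6-supervise will not restart foo,
    # as if an s6-svc -O command had been sent.
    exit 125
  fi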
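The optional control files are all single-value plain files and can be
set up by hand; the following sketch uses illustrative values only:

  echo 3 > foo/notification-fd      # daemon writes readiness on fd 3
  echo 10000 > foo/timeout-kill     # SIGKILL 10 seconds after s6-svc -d
  echo 10000 > foo/timeout-finish   # give ./finish 10 seconds, not 5
  echo SIGHUP > foo/down-signal     # s6-svc -d sends SIGHUP, not SIGTERM
  touch foo/down                    # default state is down, not up

With notification-fd in place, and assuming the scan directory is
/run/service, an s6-svc -u /run/service/foo followed by
s6-svwait -U /run/service/foo blocks until the service reports
readiness, not merely until it is up.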
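And a logger sketch for foo/log/run; the loguser account and the
/var/log/foo logdir (which must exist and be writable by loguser) are
assumptions of the example:

  #!/bin/sh
  # stdin is the reading end of the pipe from foo/run.
  # n20 s1000000: rotate at 1 MB, keeping 20 archived files;
  # t: prepend a TAI64N timestamp to each line.
  exec s6-setuidgid loguser s6-log n20 s1000000 t /var/log/foo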
Stability
With the evolution of s6, it is possible that
s6-supervise will use more and more configuration
files in the service directory. The
notification-fd and timeout-finish files, for
instance, appeared in 2015; users who previously had files
with the same name had to change them. There is no guarantee that
s6-supervise will not use additional
names in the service directory in the same fashion in the future.
There is, however, a guarantee that
s6-supervise will never touch
subdirectories named data or env. So if you
need to store user information in the service directory with
the guarantee that it will never be mistaken for a configuration
file, no matter the version of s6, you should store that information in
the data or env subdirectories of the service
directory.
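For instance, a run script can safely source a helper script stored
under data - a hypothetical helpers.sh here - since a run script's
working directory is the service directory itself:

  . ./data/helpers.sh    # never mistaken for s6-supervise configuration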
Where should I store my service directories?
Service directories describe the way services are launched. Once they are
designed, they have little reason to change on a given machine. They can
theoretically reside on a read-only filesystem - for instance, the root
filesystem, to avoid problems with mounting failures.
However, two subdirectories - namely supervise and event -
of every service directory need to be writable. So it has to be a bit more
complex. Here are a few possibilities.
- The laziest option: you're not using s6-svscan
as process 1, you're only using it to start a collection of services, and
your booting process is already handled by another init system. Then you can
just store your service directories and your scan
directory on some read-write filesystem such as /var; and you
tell your init system to launch (and, if possible, maintain) s6-svscan on
the scan directory after that filesystem is mounted.
- The almost-as-lazy option: just have the service directories on the
root filesystem. Then your service directory collection is for instance in
/etc/services and you have a /service
scan directory containing symlinks to that
collection; a sketch is given at the end of this page.
This is the easy setup, not requiring an external init system
to mount your filesystems - however, it requires your root filesystem to be
read-write, which is unacceptable if you are concerned with reliability - if
you are, for instance, designing an embedded platform.
- Some people like to have
their service directories in a read-only filesystem, with supervise
symlinks pointing to various places in writable filesystems. This setup looks
a bit complex to me: it requires careful handling of the writable
filesystems, with not much room for error if the directory structure does not
match the symlinks (which are then dangling). But it works.
- Service directories are usually small; most daemons store their
information elsewhere. Even a complete set of service directories often
amounts to less than a megabyte of data - sometimes much less. Knowing this,
it makes sense to have an image of your service directories in the
(possibly read-only) root filesystem, and copy it all
to a scan directory located on a RAM filesystem that is mounted at boot time.
This is the setup I recommend, and the one used by the
s6-rc service manager; a minimal
boot-time sketch is given at the end of this page.
It has several advantages:
- Your service directories reside on the root filesystem and are not
modified during the lifetime of the system. If your root filesystem is
read-only and you have a working set of service directories, you have the
guarantee that a reboot will set your system in a working state.
- Every boot system requires an early writable filesystem, and many
create it in RAM. You can take advantage of this to copy your service
directories early and run s6-svscan early.
- No dangling symlinks or potential problems with unmounted
filesystems: this setup is robust. A simple /bin/cp -a or
tar -x is all it takes to get a working service infrastructure.
- You can make temporary modifications to your service directories
without affecting the main ones, safely stored on the disk. Conversely,
every boot ensures clean service directories - including freshly created
supervise and event subdirectories. No stale files can
make your system unstable.
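As a sketch of the almost-as-lazy option above, assuming the collection
lives in /etc/services and the scan directory is /service:

  mkdir -p /service
  ln -s /etc/services/foo /service/foo    # one symlink per service
  s6-svscan /service                      # run from your boot scripts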
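And a sketch of the recommended copy-to-RAM setup, assuming the service
directory image is in /etc/services and /run is an already-mounted RAM
filesystem:

  #!/bin/sh
  # Early boot: copy the read-only image to a writable scan directory,
  # then start the supervision tree on it.
  mkdir -p /run/service
  cp -a /etc/services/. /run/service/
  exec s6-svscan /run/service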