The s6-supervise program
s6-supervise monitors a long-lived process (or service), making sure it
stays alive, sending notifications to registered processes when it dies, and
providing an interface to control its state. s6-supervise is designed to be the
last non-leaf branch of a supervision tree, the supervised process
being a leaf.
Interface
s6-supervise servicedir
- s6-supervise switches to the servicedir
service directory.
- It exits 100 if another s6-supervise process is already monitoring this service.
- If the ./event fifodir does not exist,
s6-supervise creates it and allows subscriptions to it from processes having the same
effective group id as the s6-supervise process.
If it already exists, it uses it as is, without modifying the subscription rights.
- It sends an 's' event to ./event.
- If the default service state is up, s6-supervise spawns ./run.
- s6-supervise sends a 'u' event to ./event whenever it
successfully spawns ./run.
- When ./run dies, s6-supervise sends a 'd' event to ./event.
- When ./run dies, s6-supervise spawns ./finish if it exists.
./finish will have ./run's exit code as first argument, or 256 if
./run was signaled; it will have the number of the signal that killed ./run
as second argument, or an undefined number if ./run was not signaled.
- ./finish must exit in less than 5 seconds. If it takes more than that,
s6-supervise kills it with a SIGKILL.
- When ./finish dies, s6-supervise restarts ./run unless it has been
told not to.
- There is a minimum 1-second delay between two ./run spawns, to avoid busylooping
if ./run exits too quickly.
- When killed or asked to exit, it waits for the service to go down one last time, then
sends an 'x' event to ./event before exiting 0.
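The two arguments passed to ./finish can be handled along the lines of the following minimal sketch. It assumes a POSIX shell; describe_death is a hypothetical helper name chosen for illustration, not something provided by s6.

```shell
#!/bin/sh
# Sketch of a ./finish script. s6-supervise passes:
#   $1 = ./run's exit code, or 256 if ./run was killed by a signal
#   $2 = number of the killing signal (undefined if ./run was not signaled)

# describe_death is a hypothetical helper, not part of s6.
describe_death() {
  if [ "$1" -eq 256 ]; then
    echo "run was killed by signal $2"
  else
    echo "run exited with code $1"
  fi
}

# Only report when both arguments are present, as they are when
# s6-supervise spawns ./finish. The message goes to stderr.
if [ "$#" -ge 2 ]; then
  describe_death "$1" "$2" >&2
fi
```

Since ./finish has to exit in less than 5 seconds, a script like this should stay short and avoid anything that may block.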
Options
s6-supervise does not support options, because it is normally not run
manually via a command line; it is usually launched by its own
supervisor, s6-svscan.
However, the behaviour of an instance of s6-supervise can be tuned via
various configuration files in the service directory. These files, and
what they do, are listed on the
service directory documentation page.
Readiness notification support
If the service directory contains a valid
notification-fd file when the service is started, or restarted,
s6-supervise creates and listens to an additional pipe from the service
for readiness notification. When the
notification occurs, s6-supervise creates a ./supervise/ready
file containing the absolute time when readiness occurred, then sends
a 'U' event to ./event. The ./supervise/ready
file is deleted on service death.
If the service is logged, i.e. if the service directory has a
log subdirectory that is also a service directory, and the
s6-supervise process has been launched by
s6-svscan, then by default
the service's stdout goes into the logging pipe. If you set
notification-fd to 1, the logging pipe will be overwritten
by the notification pipe, which is probably not what you want. Instead,
if your daemon writes a notification message to its stdout, you should
set notification-fd to (for instance) 3, and redirect outputs
in your run script. For instance, to redirect stderr to the logger and
stdout to a notification-fd set to 3, you would start your
daemon as fdmove -c 2 1 fdmove 1 3 prog... (in execline), or
exec 2>&1 1>&3 3<&- prog... (in shell).
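The effect of these redirections can be checked outside of s6 with a small sketch. Here fake_daemon is a hypothetical stand-in for prog..., and run_with_notification is an illustrative name; both are assumptions. The sketch also assumes that fd 3 is already open when the function is called, as it would be with notification-fd set to 3.

```shell
#!/bin/sh
# fake_daemon is a hypothetical stand-in for "prog...": it logs to stderr
# and announces readiness with one line on stdout.
fake_daemon() {
  echo "starting up" >&2
  echo "ready"
}

# Same redirections as in the shell run script above:
#   2>&1  stderr joins the current stdout (the logging pipe)
#   1>&3  stdout is then pointed at fd 3 (the notification pipe)
#   3<&-  the now redundant fd 3 is closed
run_with_notification() {
  fake_daemon 2>&1 1>&3 3<&-
}
```

Calling run_with_notification 3>somefile should leave the "ready" line in somefile while "starting up" still appears on stdout. The order of the redirections matters: stderr has to be duplicated from stdout before stdout is moved to fd 3.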
Signals
s6-supervise reacts to the following signals:
- SIGTERM: bring down the service and exit, as if an
s6-svc -xd command had been received
- SIGHUP: exit as soon as the service stops, as if an
s6-svc -x command had been received
- SIGQUIT: close stdin, stdout and stderr and exit as soon as
the service stops, as if an
s6-svc -X command had been received
Usage notes
- s6-supervise is a long-lived process. It normally runs forever, from the system's
boot scripts, until shutdown time; it should not be killed or told to exit. If you have
no use for a service, just turn it off; the s6-supervise process does not hurt.
- Even in boot scripts, s6-supervise should normally not be run directly. It's
better to have a collection of service directories in a
single scan directory, and just run
s6-svscan on that scan directory. s6-svscan will spawn
the necessary s6-supervise processes, and will also take care of logged services.
- You can use s6-svc to send commands to the s6-supervise
process; mostly to change the service state and send signals to the monitored
process.
- You can use s6-svok to check whether s6-supervise
is successfully running.
- You can use s6-svstat to check the status of a
service.
- s6-supervise maintains internal information inside the ./supervise
subdirectory of servicedir. servicedir itself can be read-only,
but both servicedir/supervise and servicedir/event
need to be read-write.
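The tools mentioned in these notes can be combined as in the following sketch. It assumes that s6-svok exits 0 when an s6-supervise process is running on the given service directory and nonzero otherwise, and that s6-svc -u asks for the service to be brought up. bring_up is a hypothetical helper name, not an s6 command.

```shell
#!/bin/sh
# bring_up is a hypothetical helper, not part of s6.
# $1 is the path to a service directory.
bring_up() {
  if s6-svok "$1"; then
    # A supervisor is listening: ask it to bring the service up.
    s6-svc -u "$1"
  else
    echo "no s6-supervise running on $1" >&2
    return 1
  fi
}
```

After a successful bring_up, s6-svstat on the same service directory can be used to inspect the resulting state.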
Implementation notes
- s6-supervise tries its best to stay alive and running despite possible
system call failures. It will write to its standard error every time it encounters a
problem. However, unlike s6-svscan, it will not go out
of its way to stay alive; if it encounters an unsolvable situation, it will just
die.
- Unlike other "supervise" implementations, s6-supervise is a fully asynchronous
state machine. That means that it can read and process commands at any time, even
when the machine is in trouble (full process table, for instance).
- s6-supervise does not use malloc(). That means it will never leak
memory. However, s6-supervise uses opendir(), and most opendir()
implementations internally use heap memory - so unfortunately, it's impossible to
guarantee that s6-supervise does not use heap memory at all.
- s6-supervise has been carefully designed so that every instance maintains as little
data as possible and therefore uses a very small
amount of non-sharable memory. It is not a problem to have several
dozen s6-supervise processes, even on constrained systems: resource consumption
will be negligible.