The s6-supervise program

s6-supervise monitors a long-lived process (or service), making sure it stays alive, sending notifications to registered processes when it dies, and providing an interface to control its state. s6-supervise is designed to be the last non-leaf branch of a supervision tree, the supervised process being a leaf.

Interface

     s6-supervise servicedir

s6-supervise switches to the servicedir service directory.
It exits 100 if another s6-supervise process is already monitoring this service.
If the ./event fifodir does not exist, s6-supervise creates it and allows subscriptions to it from processes having the same effective group id as the s6-supervise process. If it already exists, it uses it as is, without modifying the subscription rights.
It sends an 's' event to ./event.
If the default service state is up, s6-supervise spawns ./run.
s6-supervise sends a 'u' event to ./event whenever it successfully spawns ./run.
When ./run dies, s6-supervise sends a 'd' event to ./event.
When ./run dies, s6-supervise spawns ./finish if it exists. ./finish will have ./run's exit code as first argument, or 256 if ./run was signaled; it will have the number of the signal that killed ./run as second argument, or an undefined number if ./run was not signaled.
./finish must exit in less than 5 seconds. If it takes more than that, s6-supervise kills it with a SIGKILL.
When ./finish dies, s6-supervise restarts ./run unless it has been told not to.
There is a minimum 1-second delay between two ./run spawns, to avoid busylooping if ./run exits too quickly.
When killed or asked to exit, s6-supervise waits for the service to go down one last time, then sends an 'x' event to ./event before exiting 0.
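
As an illustration of the ./finish argument convention described above, here is a minimal sketch of logic a ./finish script might contain. The function name describe_run_exit and the message wording are hypothetical and not part of s6.

```shell
#!/bin/sh
# Sketch only: interpret the two arguments s6-supervise gives to ./finish.
#   $1: ./run's exit code, or 256 if ./run was killed by a signal
#   $2: the signal number (only meaningful when $1 is 256)
describe_run_exit() {
  if [ "$1" -eq 256 ]; then
    echo "run was killed by signal $2"
  else
    echo "run exited with code $1"
  fi
}
```

A ./finish script could call this as describe_run_exit "$1" "$2" >&2. Whatever it does, it has to complete within the 5-second limit.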

Options

s6-supervise does not support options, because it is normally not run manually via a command line; it is usually launched by its own supervisor, s6-svscan. However, the behaviour of an instance of s6-supervise can be tuned via various configuration files in the service directory. These files, and what they do, are listed on the service directory documentation page.

Readiness notification support

If the service directory contains a valid notification-fd file when the service is started, or restarted, s6-supervise creates and listens to an additional pipe from the service for readiness notification. When the notification occurs, s6-supervise creates a ./supervise/ready file containing the absolute time when readiness occurred, then sends a 'U' event to ./event. The ./supervise/ready file is deleted on service death.

If the service is logged, i.e. if the service directory has a log subdirectory that is also a service directory, and the s6-supervise process has been launched by s6-svscan, then by default the service's stdout goes into the logging pipe. If you set notification-fd to 1, the logging pipe will be overwritten by the notification pipe, which is probably not what you want. Instead, if your daemon writes a notification message to its stdout, you should set notification-fd to (for instance) 3, and redirect outputs in your run script. For instance, to redirect stderr to the logger and stdout to a notification-fd set to 3, you would start your daemon as fdmove -c 2 1 fdmove 1 3 prog... (in execline), or exec 2>&1 1>&3 3<&- prog... (in shell).
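
The shell redirection above can be tried outside of s6 with a small sketch. The names fake_daemon and run_with_notification are hypothetical: fake_daemon stands in for a real daemon that logs on stderr and prints a readiness line on stdout, and the caller's fd 3 plays the role of the notification pipe.

```shell
#!/bin/sh
# Hypothetical stand-in for a daemon: logs on stderr, then announces
# readiness on stdout.
fake_daemon() {
  echo "starting up" >&2   # log output
  echo "ready"             # readiness line
}

# Apply the redirections from the text inside a subshell:
#   2>&1  stderr goes where stdout currently points (the logger)
#   1>&3  stdout goes to the notification fd
#   3<&-  the now-redundant fd 3 is closed
run_with_notification() {
  ( exec 2>&1 1>&3 3<&-; fake_daemon )
}
```

Calling run_with_notification 3>ready.txt >log.txt should leave the readiness line in ready.txt and the log line in log.txt. In a real run script, the redirections would sit on the exec line that starts the daemon, and the notification-fd file would contain 3.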

Signals

s6-supervise reacts to the following signals:

SIGTERM: bring down the service and exit, as if an s6-svc -xd command had been received
SIGHUP: exit as soon as the service stops, as if an s6-svc -x command had been received
SIGQUIT: close stdin, stdout and stderr and exit as soon as the service stops, as if an s6-svc -X command had been received

Usage notes

s6-supervise is a long-lived process. It normally runs forever, from the system's boot scripts, until shutdown time; it should not be killed or told to exit. If you have no use for a service, just turn it off; the s6-supervise process does not hurt.
Even in boot scripts, s6-supervise should normally not be run directly. It's better to have a collection of service directories in a single scan directory, and just run s6-svscan on that scan directory. s6-svscan will spawn the necessary s6-supervise processes, and will also take care of logged services.
You can use s6-svc to send commands to the s6-supervise process; mostly to change the service state and send signals to the monitored process.
You can use s6-svok to check whether s6-supervise is successfully running.
You can use s6-svstat to check the status of a service.
s6-supervise maintains internal information inside the ./supervise subdirectory of servicedir. servicedir itself can be read-only, but both servicedir/supervise and servicedir/event need to be read-write.
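
The read-write requirement on servicedir/supervise and servicedir/event can be checked before starting a supervisor. The following is a minimal sketch; check_servicedir is a hypothetical helper, not an s6 tool, and it assumes that a missing subdirectory has to be created inside servicedir, which then must itself be writable.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the supervise and event entries of the
# given service directory are usable read-write by the current user.
check_servicedir() {
  dir=$1
  for sub in supervise event; do
    if [ -e "$dir/$sub" ]; then
      # Already present (possibly a symlink to writable storage):
      # it must be a writable directory.
      [ -d "$dir/$sub" ] && [ -w "$dir/$sub" ] || return 1
    else
      # Absent: assumed to be created later, so servicedir must be writable.
      [ -w "$dir" ] || return 1
    fi
  done
  return 0
}
```

For example, check_servicedir /path/to/servicedir || echo "not usable" >&2, where the path is a placeholder. Note that a process running as root passes the writability tests even on directories whose permission bits deny writing.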

Implementation notes

s6-supervise tries its best to stay alive and running despite possible system call failures. It will write to its standard error every time it encounters a problem. However, unlike s6-svscan, it will not go out of its way to stay alive; if it encounters an unsolvable situation, it will just die.
Unlike other "supervise" implementations, s6-supervise is a fully asynchronous state machine. That means that it can read and process commands at any time, even when the machine is in trouble (full process table, for instance).
s6-supervise does not use malloc(). That means it will never leak memory. However, s6-supervise uses opendir(), and most opendir() implementations internally use heap memory - so unfortunately, it's impossible to guarantee that s6-supervise does not use heap memory at all.
s6-supervise has been carefully designed so that every instance maintains as little data as possible and therefore uses a very small amount of non-sharable memory. It is not a problem to have several dozen s6-supervise processes, even on constrained systems: resource consumption will be negligible.