The s6-linux-init-maker program
s6-linux-init-maker reads configuration options on
the command line, and outputs a directory to place in the
root filesystem. That directory contains a script suitable
as an init program, as well as support file hierarchies to
get a complete
s6
infrastructure running when the system is booted on that
script.
s6-linux-init-maker only writes scripts. At boot time, these
scripts will call commands provided by other skarnet.org packages
such as
execline or
s6. It is the
responsibility of the administrator to make sure that all the
dependencies are properly installed at boot time, and that the
correct options have been given to s6-linux-init-maker so that
the programs are found on the root filesystem of the
machine; otherwise the scripts will crash.
Interface and usage
s6-linux-init-maker \
[ -c basedir ] \
[ -l tmpfsdir ] \
[ -b execline_bindir ] \
[ -u log_uid -g log_gid | -U ] \
[ -G early_getty ] \
[ -2 stage2 ] \
[ -r ] \
[ -Z stage2_finish ] \
[ -3 stage3 ] \
[ -p initial_path ] \
[ -m initial_umask ] \
[ -t timestamp_style ] \
[ -d dev_style ] \
[ -s env_store ] \
[ -e initial_envvar ] ... \
[ -n ] \
dir
- s6-linux-init-maker should be run as root.
- s6-linux-init-maker parses options on its command line.
- It writes data into a directory dir, which must not
exist beforehand.
- It exits 0 if everything went well, 100 if a user error occurred,
and 111 if a problem occurred during the creation of the directory
or its contents.
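The exit codes listed above can be checked from a wrapper script. The following is a minimal sketch: the meanings of 0, 100 and 111 come from the list above, while the function name explain_maker_status and the message wording are only illustrative.

```shell
#!/bin/sh
# Translate an s6-linux-init-maker exit status into a short message.
# explain_maker_status is a hypothetical helper, not part of the package.
explain_maker_status() {
  case "$1" in
    0)   echo "success" ;;
    100) echo "user error: check the command-line options" ;;
    111) echo "system error: could not create the directory or its contents" ;;
    *)   echo "unexpected exit status: $1" ;;
  esac
}

# Example use, as root (./init-dir is a hypothetical output directory
# that must not exist beforehand):
#   s6-linux-init-maker ./init-dir
#   explain_maker_status "$?"
```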
dir should then be copied by the administrator to the place
declared as basedir. Be careful: it contains fifos, files with
precise uid/gid permissions, and files with non-standard access rights,
so be sure to copy it verbatim. The
s6-hiercopy
tool can do it, as well as the GNU or busybox cp -a or mv commands.
The basedir/init script
is then suitable as a "stage 1" init program, i.e. the first program
run by the kernel. The administrator should make a symbolic link
from /sbin/init to basedir/init; the
machine will then be ready to boot.
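The installation step described above could be scripted roughly as follows. This is a sketch, assuming a cp that supports -a (GNU or busybox); the function name install_init_dir and its three arguments are hypothetical.

```shell
#!/bin/sh
# install_init_dir SRC BASEDIR INITLINK
# Copies the generated directory SRC to BASEDIR verbatim, then makes
# INITLINK a symbolic link to BASEDIR/init.
# BASEDIR should be absolute, otherwise the symlink target will not
# resolve from INITLINK's directory.
install_init_dir() {
  src=$1 basedir=$2 initlink=$3
  # cp -a would copy *into* an existing directory, so refuse that case.
  if [ -e "$basedir" ]; then
    echo "install_init_dir: $basedir already exists" >&2
    return 1
  fi
  # -a preserves ownership, access rights and special files such as fifos.
  cp -a "$src" "$basedir" || return 1
  # -f replaces an existing INITLINK: be careful when this is /sbin/init.
  ln -sf "$basedir/init" "$initlink"
}

# Example use, as root, with the default basedir:
#   install_init_dir ./init-dir /etc/s6-linux-init /sbin/init
```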
Boot sequence
When the kernel boots, it runs the basedir/init script,
also known as stage 1. This is what happens:
- stage 1 is an
execline script, so
the first process run by the kernel is the
execlineb
program launcher.
- stage 1 mounts a
tmpfs
filesystem on tmpfsdir.
- stage 1 copies basedir/run-image verbatim to
tmpfsdir.
- stage 1 empties its environment, then reads a global set of environment variables from the
basedir/env
environment directory.
- stage 1 forks a child that will block until
s6-svscan is running.
- stage 1 executes, as process 1, into
s6-svscan,
with tmpfsdir/service as a
scan directory.
- This scan directory already contains at least one service, which is the
catch-all logger: error messages from the supervision tree, and
from services that do not have a dedicated logger, are handled by a
special s6-log
instance and made available in tmpfsdir/uncaught-logs
instead of clogging the system console.
- If the -G option has been given to s6-linux-init-maker, the
scan directory will also contain a service for an early getty.
- s6-svscan starts all the services defined in the scan directory,
and unblocks the child forked by stage 1.
- This child executes into stage2.
stage2 is the responsibility of the administrator - it will
not be written automatically!
It should
contain all the necessary initialization sequence to bring up a proper
system. When stage2 is executed, the machine state is as follows:
- stage2's working directory is / and its stdin
is /dev/null. Its
stdout and stderr both point either to /dev/console or to the pipe
to the catch-all logger, depending on the -r option.
- The system has a valid device directory mounted on /dev.
- Depending on the kernel boot command line, the root filesystem
may be in read-only mode.
- There is a tmpfs available for root only in tmpfsdir.
- s6-svscan
is running as process 1. At any time, it is possible to make it supervise a long-lived
process by linking the appropriate
service directory
into tmpfsdir/service, then running the command
s6-svscanctl -a tmpfsdir/service. Services without a
dedicated logger will send their output to the catch-all logger.
- A getty service may already be available. The point of this early
getty is essentially to make it easier to debug if stage2 fails.
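Adding a supervised service at run time, as described in the list above, amounts to two commands. The sketch below wraps them in a hypothetical add_service function; the SVSCANCTL variable is an assumption introduced here so that the function can be tried without a running s6-svscan.

```shell
#!/bin/sh
# add_service SERVICEDIR SCANDIR
# Links SERVICEDIR into SCANDIR, then asks s6-svscan to rescan SCANDIR.
# SERVICEDIR should be an absolute path so that the symlink resolves.
add_service() {
  servicedir=$1 scandir=$2
  # The link takes the name of the service directory itself.
  ln -s "$servicedir" "$scandir/$(basename "$servicedir")" || return 1
  # SVSCANCTL is a hypothetical override (e.g. SVSCANCTL=true for a
  # dry run); by default the real s6-svscanctl is invoked.
  ${SVSCANCTL:-s6-svscanctl} -a "$scandir"
}

# Example use, with the default tmpfsdir of /run and a hypothetical
# service directory:
#   add_service /etc/services/sshd /run/service
```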
There is nothing else. In particular, no filesystem has been
mounted yet, including /proc and /sys; and no one-time
initialization
has been performed. The point of stage 1 is only to make it
possible to run stage2 with a logging infrastructure and a
supervision infrastructure already available, and all the
real machine and service initialization should happen in stage2.
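Since nothing is mounted when stage2 starts, its first steps are usually mounts. The following is only a sketch of what the beginning of a stage2 script might look like, not the example shipped with the package. The run helper and the DRY_RUN variable are hypothetical: they print the commands instead of executing them unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Hypothetical start of a stage2 script (for instance /etc/rc.init).
# With DRY_RUN unset or set to 1, commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

stage2_sketch() {
  # /proc and /sys are not mounted yet when stage2 starts.
  run mount -t proc proc /proc
  run mount -t sysfs sys /sys
  # The root filesystem may still be read-only at this point.
  run mount -o remount,rw /
  # Mount the remaining filesystems listed in /etc/fstab.
  run mount -a
}

# A real script would call stage2_sketch with DRY_RUN=0, then go on to
# start services, for instance through a service manager.
```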
Shutdown sequence
- A shutdown is performed when the administrator runs one of the
s6-halt,
s6-poweroff or
s6-reboot commands.
- Those commands send a signal to the
s6-svscan
process running as pid 1; this signal is caught and s6-svscan runs the
corresponding "signal handler" script that has been placed by
s6-linux-init-maker into the
basedir/run-image/service/.s6-svscan directory (and that
has been copied at boot time to tmpfsdir/service/.s6-svscan).
- That script first spawns the stage2_finish script, which
must have been written by the administrator. The purpose of
stage2_finish is to perform the high-level shutdown sequence
while the supervision tree is still alive. Typically, when using a
service manager, stage2_finish would tell the service manager
to bring all services down. When using
s6-rc, a typical
stage2_finish script just contains s6-rc -da change.
More generally speaking, stage2_finish should undo what
stage2 has done at boot time.
- The "signal handler" script then tells s6-svscan to exit via an
appropriate s6-svscanctl
command: s6-svscan then executes into the stage3 script, which, like
stage2 and stage2_finish, is the responsibility of the
administrator. When stage3 runs, the machine is in the following
state:
- The supervision tree has been torn down: it is not operational
anymore. (So, commands such as
s6-rc, which
require a live supervision tree, will not work.)
- stage3 runs as process 1. Doing so makes it easier to recover
after killing all processes by kill -9 -1 or
s6-nuke.
- Its working directory is / and its stdin is /dev/null.
- Its stdout and stderr are both /dev/console.
- Depending on the exact configuration and what the administrator has
written in stage2_finish, there may or may not be
long-running services that remain alive. The catch-all logger and its
supervisor will always be alive; this is not a problem because they
do not hold any file descriptor to a filesystem that would need to be
unmounted.
- The last command that stage3 executes should be
s6-$1 -f, $1 being the first argument that has been
given to it. This command will instantly execute the hard system halt,
poweroff or reboot that the administrator originally requested.
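The last step of stage3 can be sketched as follows. The sketch assumes that the first argument is one of halt, poweroff or reboot, which is inferred from the names of the s6-halt, s6-poweroff and s6-reboot commands. The final_command function is a hypothetical helper that only prints the command, so that the logic can be checked without stopping a machine.

```shell
#!/bin/sh
# final_command ACTION
# Prints the hard halt, poweroff or reboot command that stage3 should
# end with, or fails with status 100 if ACTION is not recognized.
final_command() {
  case "$1" in
    halt|poweroff|reboot)
      echo "s6-$1 -f" ;;
    *)
      echo "final_command: unknown action '$1'" >&2
      return 100 ;;
  esac
}

# A stage3 script (for instance /etc/rc.shutdown) would typically kill
# the remaining processes, unmount filesystems, and then end with:
#   sync
#   exec "s6-$1" -f
```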
The examples/ subdirectory of the s6-linux-init package
contains an example of /etc/rc.init, /etc/rc.tini
and /etc/rc.shutdown scripts, suitable for
stage2, stage2_finish and stage3
respectively. Those scripts can practically be used as is if the machine
is managed by the s6-rc
service manager.
s6-linux-init-maker options
- -c basedir : at boot time, stage 1,
which should be accessible as basedir/init,
will read its read-only data from basedir. After running
s6-linux-init-maker, the administrator should make sure to copy the
created directory dir to basedir. basedir
must be absolute. Default is
/etc/s6-linux-init.
- -l tmpfsdir : at boot time, a tmpfs will
be mounted on tmpfsdir. The directory should already exist in
the root filesystem, and be empty. tmpfsdir must be absolute. Default is
/run.
- -b execline_bindir : init is run by the kernel
without a PATH, and since it is a script, it is necessary to tell it where
to find the
execlineb
launcher and the first few early commands before PATH can be set.
execline_bindir is the location where the execline binaries can be
found. It must be absolute. Default is
/bin.
- -u log_uid : the catch-all
logger will run with the uid log_uid. Default is 0.
- -g log_gid : the catch-all
logger will run with the gid log_gid. Default is 0.
- -U : the correct log_uid and
log_gid values for the catch-all logger will be read from the
UID and GID environment variables that have been passed to
s6-linux-init-maker. This allows for invocations such as
s6-envuidgid nobody s6-linux-init-maker -U ... so that
the catch-all logger runs as the nobody user. Be aware that
this option is only safe when the user database on the machine
where s6-linux-init-maker is run is the same as on the machine
that will boot, else the catch-all logger may run with an unexpected uid
and gid.
- -G early_getty : if this option
is set, s6-linux-init-maker will define a service that will run
very early, before stage2 is executed. This early service
should be a getty, to allow logins even if stage2 fails.
early_getty should be a simple command line: for instance,
"/sbin/getty 38400 tty1". By default, no early service
is defined.
- -2 stage2 : stage2 is
the location of the stage 2 script that will be run when the
system has an operational supervision tree. It must be absolute. Default is
/etc/rc.init.
- -r : redirect. By default, stage2 is
run with stdout and stderr pointing to /dev/console, so that
users can see what init scripts print. However, it may conflict
with an early getty, or be undesirable for other reasons. The
-r option redirects stage2's stdout and stderr
to the catch-all logger, so the output will be made available
in the tmpfsdir/uncaught-logs directory.
- -Z stage2_finish :
stage2_finish is the location of the script that will be
run when s6-svscan receives a signal that tells it to stop the
machine, before it executes into stage3. It must be
absolute. Default is /etc/rc.tini.
Note that this script is run with its stdout and stderr
redirected to the tmpfsdir/uncaught-logs logging
directory, so its output will not appear on the system's console.
- -3 stage3 : stage3 is
the location of the stage 3 script that will be run at the end of
the machine lifetime, when s6-svscan is told to terminate.
It must be absolute. Default is
/etc/rc.shutdown.
- -p initial_path : the value to
set the PATH environment variable to, for all the starting processes.
This will be done as early as possible in stage 1. It is
absolutely necessary for
execline,
s6,
s6-portable-utils and
s6-linux-utils
binaries to be accessible via initial_path, else the machine
will not boot. Default is
/usr/bin:/bin.
- -m initial_umask : the value of
the initial file umask for all the starting processes, in octal.
Default is
022.
- -t timestamp_style : how
logs are timestamped by the catch-all logger. 0 means no
timestamp, 1 means
external TAI64N format,
2 means
ISO 8601 format,
and 3 means both. Default is
1.
- -d dev_style : how /dev is
handled on this system. 0 means a static /dev, 1 means
devtmpfs but not automounted by the kernel at boot time, and 2 means
devtmpfs automounted by the kernel at boot time. Default is
2.
- -s env_store : stage 1 init sometimes
inherits a few environment variables from the kernel. It empties its
environment before spawning stage2 and executing into s6-svscan, in
order to prevent those "kernel" environment variables from leaking
into the whole process tree. However, sometimes those variables are
needed at a later time; in that case, giving the -s option
to s6-linux-init-maker makes stage 1 init dump the "kernel" environment
variables into the env_store directory, via the
s6-dumpenv
program, before erasing them. env_store should obviously be
a writable directory, so it should be located under tmpfsdir!
If this option is not given (which is the default), the environment
inherited from the kernel isn't saved anywhere.
- -e initial_envvar : this option
can be repeated. For every initial_envvar, s6-linux-init-maker
will adjust the global environment directory in dir/env.
initial_envvar must either be of the form VAR,
to make sure that VAR does not appear in the global
environment, or of the form VAR=VALUE, to add an
environment variable VAR with the value VALUE.
The TZ variable, for instance, is a good candidate to be set in
the global environment.
- -n : tells s6-linux-init-maker that the init script
is going to run in a container, as pid 1 in a non-root namespace.
This modifies the .s6-svscan/finish, .s6-svscan/SIGHUP
and .s6-svscan/SIGINT scripts slightly, in order to provide
adequate functionality when the containerized system is asked to
shut down. Do not add this option if the init script is going to run
in the root pid namespace.
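The two forms accepted by the -e option map naturally onto an environment directory. The sketch below assumes the s6-envdir convention, where a file named VAR holds the value and an empty file means that VAR is removed from the environment. How s6-linux-init-maker itself writes dir/env is not described here, so the apply_envvar function is purely illustrative.

```shell
#!/bin/sh
# apply_envvar ENVDIR SPEC
# SPEC is either VAR=VALUE (set VAR to VALUE) or VAR (make sure VAR is
# absent from the environment). Hypothetical helper for illustration.
apply_envvar() {
  envdir=$1 spec=$2
  case "$spec" in
    *=*)
      # Everything before the first '=' is the name, the rest the value.
      printf '%s\n' "${spec#*=}" > "$envdir/${spec%%=*}" ;;
    *)
      # An empty file asks s6-envdir to unset the variable.
      : > "$envdir/$spec" ;;
  esac
}

# Example use, on a hypothetical environment directory ./env:
#   apply_envvar ./env TZ=Europe/Paris
#   apply_envvar ./env LANG
```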
Notes
The difficult parts of
running
s6-svscan as process 1 are:
- The fact that the supervision tree requires writable directories,
so in order to accommodate read-only root filesystems, there needs to
be a tmpfs mounted before s6-svscan is run.
- The catch-22 coming from the need to redirect the supervision
tree's output away from /dev/console (which is fine for a
first process invocation but impractical for log management of a
whole process tree) and into a logger that is itself managed by the
supervision tree it's reading data from.
The main benefit of s6-linux-init-maker is that it automates those
parts. This means that it has been designed for real hardware
where the above issues apply.
If you are building an init system for a
virtual machine, a container, or anything similar that does not
have the /dev/console issue or the read-only rootfs issue,
you will probably not reap much benefit from using s6-linux-init-maker:
you could probably invoke
s6-svscan
directly as your process 1, or build a script by hand, which
would result in a simpler init with fewer dependencies.
Nevertheless, if you prefer using s6-linux-init-maker, it
supports this case via the -n option.