
<html>
  <head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="Content-Language" content="en" />
    <title>s6: the s6-svscan program</title>
    <meta name="Description" content="s6: the s6-svscan program" />
    <meta name="Keywords" content="s6 command s6-svscan scandir supervision supervise svscan monitoring collection" />
    <!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
  </head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">www.skarnet.org</a>
</p>

<h1> The s6-svscan program </h1>

<p>
s6-svscan starts and monitors a collection of <a href="s6-supervise.html">s6-supervise</a>
processes, each of these processes monitoring a single service. It is designed to be either
the root or a branch of a <em>supervision tree</em>.
</p>

<h2> Interface </h2>

<pre>
     s6-svscan [ -S | -s ] [ -d <em>notif</em> ] [ -c <em>max</em> ] [ -t <em>rescan</em> ] [ <em>scandir</em> ]
</pre>

<ul>
 <li> If given a <em>scandir</em> argument, s6-svscan switches to it. Otherwise, it uses
its current directory as its <a href="scandir.html">scan directory</a>. </li>
 <li> It exits 100 if another s6-svscan process is already monitoring this
<a href="scandir.html">scan directory</a>. </li>
 <li> If the <tt>./.s6-svscan</tt> control directory does not exist,
s6-svscan creates it. However, it is recommended to already have a <tt>.s6-svscan</tt>
subdirectory in your scan directory, because s6-svscan may try to execute into the
<tt>.s6-svscan/crash</tt> or <tt>.s6-svscan/finish</tt> files at some point, so those
files should exist and be executable. </li>
 <li> From this point on, s6-svscan never dies. It tries its best to keep
control of what's happening. In case of a major system call failure, which means
that the kernel or hardware is broken in some fashion, it executes into the
<tt>.s6-svscan/crash</tt> program. (But if that execution fails, s6-svscan exits
111.) </li>
 <li> s6-svscan performs an initial <em>scan</em> of its scan directory. </li>
 <li> s6-svscan then occasionally runs <em>scans</em> or <em>reaps</em>
(see below). </li>
 <li> s6-svscan runs until it is told to stop via <a href="s6-svscanctl.html">
s6-svscanctl</a>, or a signal.
Then it executes into the <tt>.s6-svscan/finish</tt> program. The program is
given an argument that depends on the s6-svscanctl options that were used. </li>
 <li> If that execution fails, s6-svscan falls back on a <tt>.s6-svscan/crash</tt>
execution. </li>
</ul>
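
<p>
 Since s6-svscan may execute into <tt>.s6-svscan/crash</tt> or
<tt>.s6-svscan/finish</tt>, it can help to check a scan directory before
starting it. The following is a minimal sketch of such a check, not part of s6;
the function name <tt>missing_control_programs</tt> is hypothetical.
</p>

```python
import os

# The two programs s6-svscan may execute into, per the list above.
CONTROL_PROGRAMS = ("crash", "finish")


def missing_control_programs(scandir):
    """Return the control programs that are absent or not executable.

    Hypothetical helper: it only inspects scandir/.s6-svscan and never
    creates or modifies anything.
    """
    missing = []
    for name in CONTROL_PROGRAMS:
        path = os.path.join(scandir, ".s6-svscan", name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing
```

<p>
 An empty list means both files exist and are executable by the calling user.
</p>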

<h2> Options </h2>

<ul>
 <li> <tt>-S&nbsp;</tt>&nbsp;: do not divert signals. This is the default for now;
it may change in a future version of s6. </li>
 <li> <tt>-s&nbsp;</tt>&nbsp;: divert signals - see below. </li>
 <li> <tt>-d&nbsp;<em>notif</em></tt>&nbsp;: notify readiness on file descriptor
<em>notif</em>. When s6-svscan is ready to accept commands from
<a href="s6-svscanctl.html">s6-svscanctl</a>, it will write a newline to <em>notif</em>.
<em>notif</em> cannot be less than 3. By default, no notification is sent. Please
note that using this option signals <em>shallow readiness</em>: s6-svscan being
"ready" only means that it is ready to accept commands. It <em>does not mean</em>
that all the services it launches at start are themselves ready, or even started, or
even that the relevant <a href="s6-supervise.html">s6-supervise</a> processes have
been started. If you need to test for <em>deep readiness</em>, meaning that all the
services in the supervision tree have been started and are ready, you cannot rely
on this option. </li>
 <li> <tt>-c&nbsp;<em>max</em></tt>&nbsp;: maintain services for up to <em>max</em>
service directories. Default is 500. Lower limit is 2. There is no upper limit, but:
 <ul>
  <li> The higher <em>max</em> is, the more stack memory s6-svscan will use,
approximately 50 bytes per service. </li>
  <li> s6-svscan uses 2 file descriptors per logged service. </li>
 </ul>
 It is the admin's responsibility to make sure that s6-svscan has enough available
descriptors to function properly and does not exceed its stack limit. The default
of 500 is safe and provides enough room for every reasonable system. </li>
 <li> <tt>-t&nbsp;<em>rescan</em></tt>&nbsp;: perform a scan every <em>rescan</em>
milliseconds. If <em>rescan</em> is 0 (the default), automatic scans are never performed after
the first one, and s6-svscan will only detect new services when told to via an
<a href="s6-svscanctl.html">s6-svscanctl -a</a> command.
Setting <em>rescan</em> to a positive value under 500 is <em>strongly</em> discouraged. </li>
</ul>
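
<p>
 The <tt>-d</tt> readiness notification described above can be consumed by a
parent process roughly as follows. This is a sketch under assumptions: the
helper names <tt>wait_for_newline</tt> and <tt>start_svscan</tt> are
hypothetical, <tt>s6-svscan</tt> is assumed to be in the PATH, and the standard
descriptors 0, 1 and 2 are assumed to be open so that the pipe descriptor is at
least 3. It only detects shallow readiness.
</p>

```python
import os
import select
import subprocess


def wait_for_newline(read_fd, timeout):
    """Return True once a newline arrives on read_fd.

    Returns False on timeout, or if the writer closes without notifying.
    """
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return False
    return b"\n" in os.read(read_fd, 1)


def start_svscan(scandir, timeout=10.0):
    """Spawn s6-svscan with -d and wait for its readiness newline.

    Hypothetical wrapper: returns (process, ready).
    """
    read_fd, write_fd = os.pipe()
    try:
        # pass_fds keeps write_fd at the same number in the child,
        # so that number can be given to -d.
        proc = subprocess.Popen(
            ["s6-svscan", "-d", str(write_fd), scandir],
            pass_fds=(write_fd,),
        )
    except OSError:
        os.close(read_fd)
        raise
    finally:
        # The parent must not keep the write end, or EOF is never seen.
        os.close(write_fd)
    ready = wait_for_newline(read_fd, timeout)
    os.close(read_fd)
    return proc, ready
```

<p>
 A <tt>ready</tt> value of True only means that s6-svscan accepts commands; it
says nothing about the services themselves.
</p>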

<h2> Signals </h2>

<p>
 s6-svscan always reacts to the following signals:
</p>

<ul>
 <li> SIGCHLD&nbsp;: triggers the reaper. </li>
 <li> SIGALRM&nbsp;: triggers the scanner. </li>
 <li> SIGABRT&nbsp;: acts as if an <tt>s6-svscanctl -b</tt> command had been received. </li>
</ul>

<p>
 By default, it also reacts to the following signals:
</p>

<ul>
 <li> SIGTERM&nbsp;: acts as if an <tt>s6-svscanctl -t</tt> command had been received. </li>
 <li> SIGHUP&nbsp;: acts as if an <tt>s6-svscanctl -h</tt> command had been received. </li>
 <li> SIGQUIT&nbsp;: acts as if an <tt>s6-svscanctl -q</tt> command had been received. </li>
 <li> SIGINT&nbsp;: acts as if an <tt>s6-svscanctl -6</tt> command had been received. </li>
</ul>

<p>
 But if the <tt>-s</tt> option was given, then instead of those default actions,
s6-svscan uses configurable handlers: it forks and executes a program every time
it receives one of the following signals.
</p>

<ul>
 <li> SIGTERM&nbsp;: fork and execute <tt>.s6-svscan/SIGTERM</tt> </li>
 <li> SIGHUP&nbsp;: fork and execute <tt>.s6-svscan/SIGHUP</tt> </li>
 <li> SIGQUIT&nbsp;: fork and execute <tt>.s6-svscan/SIGQUIT</tt> </li>
 <li> SIGINT&nbsp;: fork and execute <tt>.s6-svscan/SIGINT</tt> </li>
 <li> SIGUSR1&nbsp;: fork and execute <tt>.s6-svscan/SIGUSR1</tt> </li>
 <li> SIGUSR2&nbsp;: fork and execute <tt>.s6-svscan/SIGUSR2</tt> </li>
</ul>

<p>
 If an action cannot be taken (the relevant file doesn't exist, or isn't
executable, or any kind of error happens), s6-svscan prints a warning
message to its standard error but does nothing else with the signal.
</p>
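
<p>
 The diverted-signal behaviour can be modelled as a lookup followed by a spawn
that is allowed to fail. The sketch below is an illustration only, not the s6
implementation; <tt>handler_path</tt> and <tt>run_handler</tt> are hypothetical
names, and running the handler with the scan directory as its working directory
is an assumption.
</p>

```python
import os
import subprocess
import sys

# Signals that are diverted to handler programs when -s is given.
DIVERTED = ("SIGTERM", "SIGHUP", "SIGQUIT", "SIGINT", "SIGUSR1", "SIGUSR2")


def handler_path(scandir, signame):
    """Path of the handler program for a diverted signal, or None."""
    if signame not in DIVERTED:
        return None
    return os.path.join(scandir, ".s6-svscan", signame)


def run_handler(scandir, signame):
    """Spawn the handler for signame without waiting for it.

    On any failure (missing file, not executable, ...), print a warning
    to stderr and do nothing else, mirroring the behaviour described above.
    """
    scandir = os.path.abspath(scandir)
    path = handler_path(scandir, signame)
    if path is None:
        return None
    try:
        # Assumption: the handler runs with the scan directory as its cwd.
        return subprocess.Popen([path], cwd=scandir)
    except OSError as err:
        print(f"warning: unable to run {path}: {err}", file=sys.stderr)
        return None
```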

<p>
 The <tt>-s</tt> mechanism is useful, for instance, when s6-svscan is running as
process 1 and needs to trap signals such as SIGINT (sent on some systems by
a Ctrl-Alt-Del press) in order to perform some specific work instead of
executing into <tt>.s6-svscan/finish</tt> on the spot.
</p>

<p>
 s6-svscan will not exit its loop on its own when it receives a signal such as
SIGINT and the <tt>-s</tt> option has been given. To make it exit its loop,
invoke an <a href="s6-svscanctl.html">s6-svscanctl</a> command from the signal
handling script. For instance, a <tt>.s6-svscan/SIGINT</tt> script could look
like this:
</p>

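<pre>  #!/command/execlineb -P
  # Run the site-specific shutdown procedure and wait for it to finish.
  foreground { shutdown-the-services }
  # Then tell s6-svscan (scan directory: the current directory) to exit its loop.
  s6-svscanctl -i .
</pre>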

<h2> The reaper </h2>

<p>
 Upon receipt of a SIGCHLD, or an <a href="s6-svscanctl.html">s6-svscanctl -z</a>
command, s6-svscan runs a <em>reaper</em> routine.
</p>

<p>
The reaper acknowledges (via some
<a href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/wait.html">wait()</a>
function), without blocking, every terminated child of s6-svscan, even ones it does not
know it has. This is especially important when <a href="s6-svscan-1.html">s6-svscan is
run as process 1</a>.
</p>

<p>
 If the dead child is an <a href="s6-supervise.html">s6-supervise</a> process watched
by s6-svscan, and the last scan flagged that process as active, then it is restarted
one second later.
</p>
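
<p>
 A non-blocking reap loop of this kind can be sketched with
<tt>waitpid()</tt> and <tt>WNOHANG</tt>. This is a simplified model rather than
the s6 code: <tt>reap_children</tt> and <tt>restart_candidates</tt> are
hypothetical names, and the <tt>supervised</tt> table (pid to name and active
flag) is an assumed data structure.
</p>

```python
import os


def reap_children():
    """Acknowledge every terminated child without blocking.

    Returns a dict mapping each reaped pid to its raw wait status,
    including children the caller did not know it had.
    """
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children at all
        if pid == 0:
            break  # children exist, but none has terminated yet
        reaped[pid] = status
    return reaped


def restart_candidates(reaped, supervised):
    """Names of dead supervisors that the last scan flagged as active.

    supervised maps pid to a (name, active) pair; unknown pids and
    inactive entries are ignored.
    """
    return [
        supervised[pid][0]
        for pid in reaped
        if pid in supervised and supervised[pid][1]
    ]
```

<p>
 The one-second restart delay mentioned above is left out of the sketch; a real
loop would schedule the respawn rather than perform it immediately.
</p>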

<h2> The scanner </h2>

<p>
 Every <em>rescan</em> milliseconds, or upon receipt of a SIGALRM or an
<a href="s6-svscanctl.html">s6-svscanctl -a</a> command, s6-svscan runs a
<em>scanner</em> routine.
</p>

<p>
 The scanner scans the current directory for subdirectories (or symbolic links
to directories), which must be <a href="servicedir.html">service directories</a>.
It skips names starting with a dot. It will not create services for more than
<em>max</em> subdirectories.
</p>

<p>
 For every new subdirectory <em>dir</em> it finds, the scanner spawns an
<a href="s6-supervise.html">s6-supervise</a> process on it. If
<em>dir</em><tt>/log</tt> exists, it spawns one s6-supervise process on
<em>dir</em> and another on <em>dir</em><tt>/log</tt>, and maintains a
never-closing pipe from the service's stdout to the logger's stdin.
This is <em>starting the service</em>, with or without a corresponding
logger.
Every service the scanner finds is flagged as "active".
</p>

<p>
 The scanner remembers the services it found. If a service has been
started in an earlier scan, but the current scan can't find the corresponding
directory, the service is then flagged as inactive. No command is sent
to stop inactive s6-supervise processes (unless the administrator
uses <a href="s6-svscanctl.html">s6-svscanctl -n</a>), but inactive
s6-supervise processes will not be restarted if they die.
</p>
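
<p>
 The bookkeeping done by the scanner can be modelled as follows. This is a
sketch under assumptions, not the s6 implementation: the <tt>scan</tt> function
and the <tt>known</tt> table are hypothetical, entries are visited in sorted
order for determinism, each service counts once against <em>max</em>, and no
process is actually spawned.
</p>

```python
import os


def scan(scandir, known, max_services=500):
    """Update the known-services table from the contents of scandir.

    known maps a service name to {"logged": bool, "active": bool}.
    Names starting with a dot and non-directories are skipped;
    symbolic links to directories are followed.
    """
    found = set()
    for name in sorted(os.listdir(scandir)):
        if name.startswith("."):
            continue
        path = os.path.join(scandir, name)
        if not os.path.isdir(path):
            continue
        if name not in known:
            if len(known) >= max_services:
                continue  # table is full: do not create a new service
            # A real scanner would spawn s6-supervise here (and a second
            # one on the log subdirectory, connected by a pipe).
            known[name] = {"logged": os.path.isdir(os.path.join(path, "log"))}
        known[name]["active"] = True
        found.add(name)
    # Services seen in an earlier scan but missing now become inactive.
    for name in known:
        if name not in found:
            known[name]["active"] = False
    return known
```

<p>
 Inactive entries stay in the table: nothing stops their supervisor, it just
would not be restarted if it died.
</p>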

<h2> Notes </h2>

<ul>
 <li> s6-svscan is designed to run until the machine is shut down. It is
also designed as a suitable candidate for
<a href="s6-svscan-1.html">process 1</a>. So, it goes out of its way to
stay alive, even in dire situations. When it encounters a fatal situation,
something it really cannot handle, it executes into <tt>.s6-svscan/crash</tt>
instead of dying; when it is told to exit, it executes into
<tt>.s6-svscan/finish</tt>. Administrators should make sure to design
appropriate <tt>crash</tt> and <tt>finish</tt> routines. </li>
 <li> s6-svscan is a fully asynchronous state machine. It will read and
process commands at any time, even when the computer is in trouble. </li>
 <li> s6-svscan <em>does not use malloc()</em>. That means it will <em>never leak
memory</em>. <small>However, s6-svscan uses opendir(), and most opendir()
implementations internally use heap memory - so unfortunately, it's impossible
to guarantee that s6-svscan does not use heap memory at all.</small> </li>
 <li> When run with the <tt>-t0</tt> option, s6-svscan <em>never polls</em>,
it only wakes up on notifications, just like s6-supervise. The s6 supervision
tree can be used in energy-critical environments. </li>
 <li> The supervision tree (i.e. the tree of processes made of s6-svscan and
all its scions) is not supposed to have a controlling terminal; s6-svscan
generally is either process 1 or a child of process 1, not something that is
launched from a terminal. If you run s6-svscan from an interactive shell, be
warned that typing ^C in the controlling terminal (which sends a SIGINT to
all processes in the foreground process group in the terminal) will terminate
the supervision tree, but not the supervised processes - so, the supervised
processes will keep running as orphans. This is by design: supervised
processes should be as resilient as possible, even when their supervisors
die. However, if you want to launch s6-svscan from an interactive shell and
need your services to die with the supervision tree when you ^C it, you can
obtain this behaviour by creating <tt>./nosetsid</tt> files in every
<a href="servicedir.html">service directory</a>. </li>
</ul>
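
<p>
 Creating the <tt>nosetsid</tt> files mentioned in the last note can be
scripted. The helper below is a hypothetical sketch: <tt>add_nosetsid</tt> is
not an s6 tool, and it only handles the top-level service directories of a scan
directory, so logger directories under <tt>log</tt> would need the same
treatment separately.
</p>

```python
import os


def add_nosetsid(scandir):
    """Create an empty nosetsid file in each top-level service directory.

    Skips names starting with a dot and anything that is not a directory.
    Returns the names of the directories where a file was created.
    """
    created = []
    for name in sorted(os.listdir(scandir)):
        path = os.path.join(scandir, name)
        if name.startswith(".") or not os.path.isdir(path):
            continue
        marker = os.path.join(path, "nosetsid")
        if not os.path.exists(marker):
            open(marker, "w").close()
            created.append(name)
    return created
```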

</body>
</html>