summaryrefslogtreecommitdiff
path: root/doc/s6-svscan.html
blob: ae815fc77bbea13e5fd6f5f83be450ebe374a29a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
<html>
  <head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="Content-Language" content="en" />
    <title>s6: the s6-svscan program</title>
    <meta name="Description" content="s6: the s6-svscan program" />
    <meta name="Keywords" content="s6 command s6-svscan scandir supervision supervise svscan monitoring collection" />
    <!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
  </head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">www.skarnet.org</a>
</p>

<h1> The s6-svscan program </h1>

<p>
s6-svscan starts and monitors a collection of <a href="s6-supervise.html">s6-supervise</a>
processes, each of these processes monitoring a single service. It is designed to be either
the root or a branch of a <em>supervision tree</em>.
</p>

<h2> Interface </h2>

<pre>
     s6-svscan [ -d <em>notif</em> ] [ -X <em>consoleholder</em> ] [ -c max | -C services_max ] [ -L name_max ] [ -t <em>rescan</em> ] [ <em>scandir</em> ]
</pre>

<ul>
 <li> If given a <em>scandir</em> argument, s6-svscan switches to it. Else it uses
its current directory as the <a href="scandir.html">scan directory</a>. </li>
 <li> It exits 100 if another s6-svscan process is already monitoring this
<a href="scandir.html">scan directory</a>. </li>
 <li> If the <tt>./.s6-svscan</tt> control directory does not exist,
s6-svscan creates it. However, it is recommended to already have a <tt>.s6-svscan</tt>
subdirectory in your scan directory, because it is used to configure s6-svscan
operation, see below.
 <li> From this point on, s6-svscan never dies. It tries its best to keep
control of what's happening. In case of a major system call failure, which means
that the kernel or hardware is broken in some fashion, it executes into the
<tt>.s6-svscan/crash</tt> program. (But if that execution fails, s6-svscan exits
111.) </li>
 <li> s6-svscan performs an initial <em>scan</em> of its scan directory. </li>
 <li> s6-svscan then occasionally runs <em>scans</em> or <em>reaps</em>,
see below. </li>
 <li> s6-svscan runs until it is told to stop via <a href="s6-svscanctl.html">
s6-svscanctl</a>, or an appropriate signal (see below).
Then it executes into the <tt>.s6-svscan/finish</tt> program, if present; if
not, it exits 0.
</ul>

<h2> Options </h2>

<ul>
 <li> <tt>-d&nbsp;<em>notif</em></tt>&nbsp;: notify readiness on file descriptor
<em>notif</em>. When s6-svscan is ready to accept commands from
<a href="s6-svscanctl.html">s6-svscanctl</a>, it will write a newline to <em>notif</em>.
<em>notif</em> cannot be lesser than 3. By default, no notification is sent. Please
note that using this option signals <em>shallow readiness</em>: s6-svscan being
"ready" only means that it is ready to accept commands. It <em>does not mean</em>
that all the services it launches at start are themselves ready, or even started, or
even that the relevant <a href="s6-supervise.html">s6-supervise</a> processes have
been started. If you need to test for <em>deep readiness</em>, meaning that all the
services in the supervision tree have been started and are ready, you cannot rely
on this option. </li>

 <li> <tt>-X&nbsp;<em>consoleholder</em></tt>&nbsp;: assume the output console is available
on descriptor <em>consoleholder</em>. If this option is given, and a <tt>s6-svscan-log</tt>
service exists, the <a href="s6-supervise.html">s6-supervise</a> process for that service
will be run with <em>consoleholder</em> as its standard error. This is mainly useful
for a setup done via <a href="//skarnet.org/software/s6-linux-init/">s6-linux-init</a>,
where all error messages go to the <tt>s6-svscan-log</tt> catch-all logger service by
default, except messages from this service itself, which fall back to <em>consoleholder</em>.
If you're not sure what to use this option for, or how, you don't need it. </li>

 <li> <tt>-C&nbsp;<em>services_max</em></tt>&nbsp;: maintain services for up to <em>services_max</em>
service directories, including loggers. Default is 1000. Lower limit is 4. Upper limit is 160000. If
you're increasing this value from the default, please note that:
 <ul>
  <li> The higher <em>max</em> is, the more stack memory s6-svscan will use,
up to 200 bytes per service, also depending on the value of <em>name_max</em>. </li>
  <li> s6-svscan uses 2 file descriptors per logged service. </li>
 </ul>
 It is the admin's responsibility to make sure that s6-svscan has enough available
descriptors to function properly and does not exceed its stack limit. The default
of 1000 is safe and provides enough room for every reasonable system. </li>

 <li> <tt>-c&nbsp;<em>max</em></tt>&nbsp;: a deprecated way of setting <em>services_max</em>.
If the <tt>-c</tt> option is given, the value of <em>max</em> is doubled, and the result
is used as <em>services_max</em>. The reason for the change is that previous versions
of s6-svscan handled services+loggers as a single entity; but this version of s6-svscan
handles services and loggers in the same way, so with the default values it's now possible
to handle e.g. 600 unlogged services, whereas previously you were limited to 500 because
s6-svscan was reserving room for the loggers. </li>

 <li> <tt>-L&nbsp;<em>name_max</em></tt>&nbsp;: the maximum length of a name in the
scan directory. Names longer than <em>name_max</em> won't be taken into account.
Default is 251. It cannot be set lower than 11 or higher than 1019. </li>

 <li> <tt>-t&nbsp;<em>rescan</em></tt>&nbsp;: perform a scan every <em>rescan</em>
milliseconds. If <em>rescan</em> is 0 (the default), automatic scans are never performed after
the first one and s6-svscan will only detect new services when told to via a
<a href="s6-svscanctl.html">s6-svscanctl -a</a> command. Use of this option is
discouraged; it should only be given to emulate the behaviour of other supervision
suites. (<tt>-t5000</tt> for daemontools' svscan, <tt>-t14000</tt> for runit's
runsvdir.) </li>
</ul>

<h2> Signals </h2>

<p>
 s6-svscan has special handling for the following signals:
</p>

<ul>
 <li> SIGCHLD </li>
 <li> SIGALRM </li>
 <li> SIGABRT </li>
 <li> SIGHUP </li>
 <li> SIGINT </li>
 <li> SIGTERM </li>
 <li> SIGQUIT </li>
 <li> SIGUSR1 </li>
 <li> SIGUSR2 </li>
 <li> SIGPWR (on systems that support it)</li>
 <li> SIGWINCH (on systems that support it)</li>
</ul>

<p>
 Signals that are not in the above list are not caught by s6-svscan and will
have the system's default effect.
</p>

<p>
 The behaviour for the first three signals in the list is always fixed:
</p>

<ul>
 <li> SIGCHLD&nbsp;: trigger the reaper. </li>
 <li> SIGALRM&nbsp;: trigger the scanner. </li>
 <li> SIGABRT&nbsp;: immediately exec into <tt>.s6-svscan/finish</tt> (or exit 0 if that script does not exist), without waiting for any processes to die </li>
</ul>

<p>
 The behaviour for the rest of the list is configurable: on receipt of a
<tt>SIG<em>FOO</em></tt>,
s6-svscan will try to run an executable <tt>.s6-svscan/SIG<em>FOO</em></tt> file. For
instance, a <tt>.s6-svscan/SIGTERM</tt> executable script will be run on receipt of
a SIGTERM. If the file cannot be found, or cannot be executed for any reason, the
default behaviour for the signal will be applied. Default behaviours are:
</p>

<ul>
 <li> SIGHUP&nbsp;: rescan and prune the supervision tree, i.e. make sure that new
service directories visible from the scan directory have a
<a href="s6-supervise.html">s6-supervise</a> process running on them, and instruct
<a href="s6-supervise.html">s6-supervise</a> processes running on service directories
that have become invisible from the scan directory to stop their service and exit.
This behaviour can also be achieved via the
<tt>s6-svscanctl -an <em>scandir</em> </tt> command. This is the closest that s6-svscan
can get to the traditional "reload your configuration" behaviour. </li>
 <li> SIGINT&nbsp;: same as SIGTERM below. </li>
 <li> SIGTERM&nbsp;: Instruct all the <a href="s6-supervise.html">s6-supervise</a> processes
to stop their service and exit; wait for the whole supervision tree to die, without
losing any logs; then exec into <tt>.s6-svscan/finish</tt> or exit 0. This behaviour
can also be achieved via the <tt>s6-svscanctl -t <em>scandir</em></tt> command. </li>
 <li> SIGQUIT&nbsp;: same as SIGTERM above, except that if a service is logged
(i.e. there is a <em>foo</em> service <em>and</em> a <em>foo</em>/log service)
then the logging service will also be killed, instead of being allowed to exit
naturally after its producer has flushed its output and died. This can solve
problems with badly written logging programs, but it can also cause loss of logs
since the logger may die before the producer has finished flushing everything. The
behaviour can also be achieved via the <tt>s6-svscanctl -q <em>scandir</em></tt> command;
you should only use this if SIGTERM/<tt>-t</tt> fails to properly tear down the
supervision tree. </li>
 <li> Others: no effect if an appropriate executable file in <tt>.s6-svscan/</tt>
cannot be run. </li>
</ul>

<h2> The reaper </h2>

<p>
 Upon receipt of a SIGCHLD, or an <a href="s6-svscanctl.html">s6-svscanctl -z</a>
command, s6-svscan runs a <em>reaper</em> routine.
</p>

<p>
The reaper acknowledges (via some
<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/wait.html">wait()</a>
function), without blocking, every terminated child of s6-svscan, even ones it does not
know it has. This is especially important when <a href="s6-svscan-1.html">s6-svscan is
run as process 1</a>.
</p>

<p>
 If the dead child is an <a href="s6-supervise.html">s6-supervise</a> process watched
by s6-svscan, and the last scan flagged that process as active, then it is restarted
one second later.
</p>

<h2> The scanner </h2>

<p>
 Upon receipt of a SIGALRM or a
<a href="s6-svscanctl.html">s6-svscanctl -a</a> command, s6-svscan runs a
<em>scanner</em> routine. (It also runs it every <em>rescan</em> milliseconds
if the <tt>-t</tt> option has been given.)
</p>

<p>
 The scanner scans the current directory for subdirectories (or symbolic links
to directories), which must be <a href="servicedir.html">service directories</a>.
It skips names starting with dots. It will not create services for more than
<em>max</em> subdirectories.
</p>

<p>
 For every new subdirectory <em>dir</em> it finds, the scanner spawns a
<a href="s6-supervise.html">s6-supervise</a> process on it. If
<em>dir</em><tt>/log</tt> exists, it spawns an s6-supervise process on
both <em>dir</em> and <em>dir</em><tt>/log</tt>, and maintains a
never-closing pipe from the service's stdout to the logger's stdin.
This is <em>starting the service</em>, with or without a corresponding
logger.
Every service the scanner finds is flagged as "active".
</p>

<p>
 The scanner remembers the services it found. If a service has been
started in an earlier scan, but the current scan can't find the corresponding
directory, the service is then flagged as inactive. No command is sent
to stop inactive s6-supervise processes (unless the administrator
uses <a href="s6-svscanctl.html">s6-svscanctl -n</a> or a SIGHUP), but
inactive s6-supervise processes will not be restarted if they die.
</p>

<h2> Notes </h2>

<ul>
 <li> s6-svscan is designed to run until the machine is shut down. It is
also designed as a suitable candidate for
<a href="s6-svscan-1.html">process 1</a>. So, it goes out of its way to
stay alive, even in dire situations. When it encounters a fatal situation,
something it really cannot handle, it executes into <tt>.s6-svscan/crash</tt>
instead of dying; when it is told to exit, it executes into
<tt>.s6-svscan/finish</tt>. Administrators should make sure to design
appropriate <tt>crash</tt> and <tt>finish</tt> routines. </li>
 <li> s6-svscan is a fully asynchronous state machine. It will read and
process commands at any time, even when the computer is in trouble. </li>
 <li> s6-svscan <em>does not use malloc()</em>. That means it will <em>never leak
memory</em>. <small>However, s6-svscan uses opendir(), and most opendir()
implementations internally use heap memory - so unfortunately, it's impossible
to guarantee that s6-svscan does not use heap memory at all.</small> </li>
 <li> Unless run with a nonzero <tt>-t</tt> option, which is only a legacy
feature used to emulate other supervision suites such as daemontools or runit,
s6-svscan <em>never polls</em>; it only wakes up on notifications.
The s6 supervision tree can be used in energy-critical environments. </li>
</ul>

</body>
</html>