-rw-r--r-- | doc/index.html | 8
-rw-r--r-- | doc/quickstart.html | 12
-rw-r--r-- | doc/tipideed.html | 177
-rw-r--r-- | src/tipideed/cgi.c | 36
-rw-r--r-- | src/tipideed/tipideed.c | 2
5 files changed, 206 insertions, 29 deletions
diff --git a/doc/index.html b/doc/index.html index bbd868c..4a2b9b7 100644 --- a/doc/index.html +++ b/doc/index.html @@ -107,18 +107,20 @@ Only keeping the last three syllables makes it easier. <li> A POSIX-compliant system with a standard C development environment </li> <li> GNU make, version 3.81 or later </li> <li> <a href="//skarnet.org/software/skalibs/">skalibs</a> version -2.13.2.0 or later. It's a build-time requirement. It's also a run-time +2.14.0.0 or later. It's a build-time requirement. It's also a run-time requirement if you link against the shared version of the skalibs library. </li> <li> Recommended at run-time: <a href="//skarnet.org/software/s6-networking/">s6-networking</a> version -2.5.1.3 or later. It's not a strict requirement, but tipidee relies +2.5.1.4 or later. It's not a strict requirement, but tipidee relies on a super-server to listen to the network and provide connection information via environment variables. It also defers to tools such as <a href="//skarnet.org/software/s6-networking/s6-tcpserver-access.html">s6-tcpserver-access</a> to provide access control and connection fine-tuning. And if you want to run an HTTPS server, you'll need something like <a href="//skarnet.org/software/s6-networking/s6-tlsserver.html">s6-tlsserver</a> -to manage the TLS transport layer. </li> +to manage the TLS transport layer. So, installing +<a href="//skarnet.org/software/s6-networking/">s6-networking</a> will make +your life easier in many ways. </li> </ul> <h3> Licensing </h3> diff --git a/doc/quickstart.html b/doc/quickstart.html index 8a393fb..a3e8519 100644 --- a/doc/quickstart.html +++ b/doc/quickstart.html @@ -52,9 +52,9 @@ If you want to serve HTTP on port 80 and HTTPS on port 443, then you'll need two services. Or four if you want to serve on both IPv4 and IPv6 adresses. </li> <li> Start these processes in the <tt>/home/www</tt> directory, the base for all the domains you're serving. 
</li> - <li> Assuming you want to run the server as user <tt>www</tt>, -the basic command line for an HTTP service is: -<tt>s6-envuidgid www s6-tcpserver -U example.com 80 s6-tcpserver-access -v0 -- tipideed</tt>. + <li> Assuming you want to run the server as user <tt>www</tt>, and your +local IP address is ${ip}, the basic command line for an HTTP service is: +<tt>s6-envuidgid www s6-tcpserver -U -- ${ip} 80 s6-tcpserver-access -- tipideed</tt>. <ul> <li> <a href="//skarnet.org/software/s6/s6-envuidgid.html">s6-envuidgid</a> puts the uid and gid of user <tt>www</tt> into the environment, for <tt>s6-tcpserver</tt> @@ -63,9 +63,9 @@ to drop root privileges to. </li> binds to the address and port given, drops privileges, and listens; it accepts connections and spawns a new process for each one. </li> <li> <a href="//skarnet.org/software/s6-networking/s6-tcpserver-access.html">s6-tcpserver-access</a> -performs DNS requests to fill environment variables that tipidee needs. Its main -purpose is to perform access control, but we're not using it for that here: -chances are your web server is public access and doesn't need to be IP-restricted. </li> +performs DNS requests to fill environment variables that tipidee needs. (The main +purpose of this program is to perform access control, but we're not using it for that here: +chances are your web server is public access and doesn't need to be IP-restricted.) </li> <li> <a href="tipideed.html">tipideed</a> is the tipidee daemon, and will handle HTTP requests until the client closes the connection or tipideed itself needs to close it. </li> diff --git a/doc/tipideed.html b/doc/tipideed.html index ce8f5e5..b6b39f6 100644 --- a/doc/tipideed.html +++ b/doc/tipideed.html @@ -42,8 +42,43 @@ occurs that makes it nonsensical to keep the connection open. </li> current working directory, one subdirectory for every domain it hosts. 
</li> </ul> +<h2> Common usage </h2> + +<p> + tipideed is intended to be run under a TCP super-server such as +<a href="//skarnet.org/software/s6-networking/s6-tcpserver.html">s6-tcpserver</a>, +for plain text HTTP, or +<a href="//skarnet.org/software/s6-networking/s6-tlsserver.html">s6-tlsserver</a>, +for HTTPS. It delegates to the super-server the job of binding and listening to +the socket, accepting connections, spawning a separate process to handle a +given connection, and potentially establishing a TLS tunnel with the client for +secure communication. +</p> + <p> - TODO: write this page. + As such, a command line for tipideed, running as user <tt>www</tt>, listening +on address <tt>${ip}</tt>, would typically look like this, for HTTP: +</p> + +<pre> + s6-envuidgid www s6-tcpserver -U -- ${ip} 80 s6-tcpserver-access -- tipideed +</pre> + +<p> + or, for HTTPS: +</p> + +<pre> + s6-envuidgid www env KEYFILE=/path/to/private/key CERTFILE=/path/to/certificate s6-tlsserver -U -- ${ip} 443 tipideed +</pre> + +<p> + Most users will want to run these command lines as <em>services</em>, i.e. daemons +run in the background when the machine starts. The <tt>examples/</tt> subdirectory +of the tipidee package provides service templates to help you run tipideed under +<a href="https://wiki.gentoo.org/wiki/OpenRC">OpenRC</a>, +<a href="//skarnet.org/software/s6/">s6</a> and +<a href="//skarnet.org/software/s6-rc/">s6-rc</a>. </p> <h2> Exit codes </h2> @@ -51,10 +86,16 @@ current working directory, one subdirectory for every domain it hosts. </li> <dl> <dt> 0 </dt> <dd> clean exit. The client closed the connection after a stream of HTTP exchanges. </dd> + <dt> 1 </dt> <dd> Illicit client behaviour. tipideed exited because it could +not serve the client in good faith. </dd> + <dt> 2 </dt> <dd> Illicit CGI script behaviour. tipideed exited because the invoked +CGI script made it impossible to continue. 
Before exiting, tipideed likely has +sent a 502 (Bad Gateway) response to the client. </dd> <dt> 100 </dt> <dd> bad usage. tipideed has been run in an incorrect way: bad command line options, or missing environment variables, etc. </dd> <dt> 101 </dt> <dd> cannot happen. This signals a bug in tipideed, and comes with an -error message asking you to report the bug. Please do so. </dd> +error message asking you to report the bug. Please do so, on the +<a href="//skarnet.org/lists/#skaware">skaware mailing-list</a>. </dd> <dt> 111 </dt> <dd> system call failed. If this happens while serving a request, tipideed likely has sent a 500 (Internal Server Error) response to the client before exiting. </dd> @@ -62,15 +103,143 @@ client before exiting. </dd> <h2> Environment variables </h2> +<h3> Reading - mandatory </h3> + +<p> + tipideed expects the following variables in its environment, and will exit +with an error message if they are undefined. When tipideed is run under +<a href="//skarnet.org/s6-networking/s6-tcpserver.html">s6-tcpserver</a> +(with <a href="//skarnet.org/s6-networking/s6-tcpserver-access.html">s6-tcpserver-access</a> or +<a href="//skarnet.org/s6-networking/s6-tlsserver.html">s6-tlsserver</a>, +these variables are automatically set by the super-server. This is the way +tipidee gets its network information without having to perform network +operations itself. +</p> + +<dl> + <dt> PROTO </dt> + <dd> The network protocol, normally <tt>TCP</tt>. </dd> + + <dt> TCPLOCALIP </dt> + <dd> The IP address the server is bound to. It will be passed as <tt>SERVER_ADDR</tt> +to CGI scripts. </dd> + + <dt> TCPLOCALPORT </dt> + <dd> The port the server is bound to. It will be passed as <tt>SERVER_PORT</tt> +to CGI scripts. </dd> + + <dt> TCPLOCALHOST </dt> + <dd> The domain name associated to the local IP address. It will be +passed as <tt>SERVER_NAME</tt> to CGI scripts. </dd> + + <dt> TCPREMOTEIP </dt> + <dd> The IP address of the client. 
It will be passed as <tt>REMOTE_ADDR</tt> +to CGI scripts. </dd> + + <dt> TCPREMOTEPORT </dt> + <dd> The port of the client socket. It will be passed as <tt>REMOTE_PORT</tt> +to CGI scripts. </dd> +</dl> + +<h3> Reading - optional </h3> + +<p> + tipideed can function without these variables, but if they're present, it +uses them to get more information. +</p> + +<dl> + <dt> TCPREMOTEHOST </dt> + <dd> The domain name associated to the IP address of the client. It will +be passed as <tt>REMOTE_HOST</tt> to CGI scripts; if absent, the value of +<tt>TCPREMOTEIP</tt> will be used instead. </dd> + + <dt> TCPREMOTEINFO </dt> + <dd> The name provided by an IDENT server running on the client, if any. +This is obsolete and not expected to be present; but if present, it will +be passed as <tt>REMOTE_IDENT</tt> to CGI scripts. </dd> + + <dt> SSL_PROTOCOL </dt> + <dd> The version of the TLS protocol used to cipher communications between +the client and the server. If present, tipideed will assume that the client +connection is secure, and will pass <tt>HTTPS=on</tt> to CGI scripts; +otherwise, it will assume it is running plaintext HTTP. </dd> +</dl> + +<h3> Writing </h3> + +<p> + When spawning a CGI or NPH script, tipideed clears all the previous variables, +so the passed environment is as close as possible to the environment of the +super-server; and it adds all the variables that are required by the +<a href="https://datatracker.ietf.org/doc/html/rfc3875#section-4.1">CGI 1.1 +specification</a>. It does not add PATH_TRANSLATED, which CGI scripts should +not rely on. +</p> + <h2> Options </h2> +<dl> + <dt> -v <em>verbosity</em> </dt> + <dd> The level of log verbosity. 
This is the same as the <tt>global verbosity</tt> +setting in the <a href="tipidee.conf.html">configuration file</a>; an explicit +command line option overrides any setting present in the configuration file.</dd> + + <dt> -f <em>file</em> </dt> + <dd> </dd> +</dl> + <h2> Detailed operation </h2> +<h2> Performance considerations </h2> + +<p> + On systems that implement +<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn.html">posix_spawn()</a>, +the <a href="//skarnet.org/software/s6-networking/s6-tcpserver.html">s6-tcpserver</a> +super-server (and the +<a href="//skarnet.org/software/s6-networking/s6-tlsserver.html">s6-tlsserver</a> one +as well, since both use the same underlying program) uses it instead of +<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html">fork()</a>, +and that partly alleviates the performance penalty usually associated with servers +that spawn one process per connection. +</p> + +<p> + One of tipidee's stated goals is to explore what kind of performance is achievable for +a fully compliant Web server within the limits of that model. To that effect, tipideed +is meant to be <em>fast</em>. It should serve static files as fast as any server out +there, especially on Linux (or other systems supporting +<a href="https://man7.org/linux/man-pages/man2/splice.2.html">splice())</a> where it +uses zero-copy transfer. CGI performance should be limited by the performance of the +CGI script itself, never by tipideed. +</p> + +<p> + tipideed itself does not use +<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html">fork()</a> +if the system supports +<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn.html">posix_spawn()</a> +— with one exception, that you will not hit, and if you do, fork() will not +be the bottleneck. (Can you guess which case it is, without looking at the code?) 
+tipideed does not parse its configuration file itself, delegating the task to the +offline <a href="tipidee-config.html">tipidee-config</a> program and directly mapping +a binary file instead. To parse a client request, it uses a deterministic finite +automaton, only reading the request once, and only backtracking in pathological cases. +This should streamline request processing as much as possible. +</p> + +<p> + If you have benchmarks, results of comparative testing of tipideed against +other Web servers, please share them on the +<a href="//skarnet.org/lists/#skaware">skaware mailing-list</a>. +</p> + <h2> Notes </h2> <ul> - <li> <tt>tipideed</tt> is pronounced <em>tipi-deed</em>. You can also say -<em>tipi-dee-dee</em>, but only if you're the type of person who says + <li> <tt>tipideed</tt> is pronounced <em>tipi-deed</em>. You can say +<em>tipi-dee-dee</em>, but only if you're the type of person who also says <em>PC computer</em>, <em>NIC card</em> or <em>ATM machine</em>. </li> </ul> diff --git a/src/tipideed/cgi.c b/src/tipideed/cgi.c index 40300c7..1ce0b1d 100644 --- a/src/tipideed/cgi.c +++ b/src/tipideed/cgi.c @@ -111,10 +111,16 @@ static inline int do_nph (tipidee_rql const *rql, char const *const *argv, char case -1 : die500sys(rql, 111, "fork") ; case 0 : { +#define NAME "tipideed (nph helper for pid " tain deadline ; char buf[4096] ; buffer b = BUFFER_INIT(&buffer_write, p[1], buf, 4096) ; - PROG = "tipidee (nph helper child)" ; + size_t m = sizeof(NAME) - 1 ; + char progstr[sizeof(NAME) + PID_FMT] ; + memcpy(progstr, NAME, m) ; + m += pid_fmt(progstr + m, getppid()) ; + progstr[m++] = ')' ; progstr[m++] = 0 ; + PROG = progstr ; tain_add_g(&deadline, &g.cgitto) ; close(p[0]) ; if (ndelay_on(p[1]) == -1) strerr_diefu1sys(111, "set fd nonblocking") ; @@ -205,8 +211,8 @@ static inline int run_cgi (tipidee_rql const *rql, char const *const *argv, char rstate = 1 ; break ; } - case 400 : die502x(rql, 1, "invalid headers", " from cgi ", argv[0]) ; - case 
413 : die502x(rql, 1, hdr->n >= TIPIDEE_HEADERS_MAX ? "Too many headers" : "Too much header data", " from cgi ", argv[0]) ; + case 400 : die502x(rql, 2, "invalid headers", " from cgi ", argv[0]) ; + case 413 : die502x(rql, 2, hdr->n >= TIPIDEE_HEADERS_MAX ? "Too many headers" : "Too much header data", " from cgi ", argv[0]) ; case 500 : die500x(rql, 101, "can't happen: ", "avltreen_insert failed", " in do_cgi") ; default : die500x(rql, 101, "can't happen: ", "unknown tipidee_headers_parse return code", " in do_cgi") ; } @@ -217,7 +223,7 @@ static inline int run_cgi (tipidee_rql const *rql, char const *const *argv, char if (!slurpn(x[0].fd, sa, g.maxcgibody)) { if (error_isagain(errno)) break ; - else if (errno == ENOBUFS) die502x(rql, 1, "Too fat body", " from cgi ", argv[0]) ; + else if (errno == ENOBUFS) die502x(rql, 2, "Too fat body", " from cgi ", argv[0]) ; else die500sys(rql, 111, "read body", " from cgi ", argv[0]) ; } close(x[0].fd) ; @@ -243,7 +249,7 @@ static inline int local_redirect (tipidee_rql *rql, char const *loc, char *uribu memcpy(hosttmp, rql->uri.host, hostlen + 1) ; n = tipidee_uri_parse(uribuf, URI_BUFSIZE, loc, &rql->uri) ; if (!n || n + hostlen + 1 > URI_BUFSIZE) - die502x(rql, 1, "cgi ", cginame, " returned an invalid ", "Location", " value", " for local redirection") ; + die502x(rql, 2, "cgi ", cginame, " returned an invalid ", "Location", " value", " for local redirection") ; memcpy(uribuf + n, hosttmp, hostlen + 1) ; rql->uri.host = uribuf + n ; rql->uri.port = port ; @@ -297,23 +303,23 @@ static inline int process_cgi_output (tipidee_rql *rql, tipidee_headers const *h { size_t m = uint_scan(x, &status) ; if (!m || (x[m] && x[m] != ' ')) - die502x(rql, 1, "cgi ", cginame, " returned an invalid ", "Status", " header") ; + die502x(rql, 2, "cgi ", cginame, " returned an invalid ", "Status", " header") ; reason_phrase = x[m] ? 
x + m + 1 : "" ; if (status >= 300 && status < 399 && !location) - die502x(rql, 1, "cgi ", cginame, " returned a 3xx status code without a ", "Location", " header") ; + die502x(rql, 2, "cgi ", cginame, " returned a 3xx status code without a ", "Location", " header") ; if (status < 100 || status > 999) - die502x(rql, 1, "cgi ", cginame, " returned an invalid ", "Status", " value") ; + die502x(rql, 2, "cgi ", cginame, " returned an invalid ", "Status", " value") ; } if (location) { - if (!location[0]) die502x(rql, 1, "cgi ", cginame, " returned an invalid ", "Location", " header") ; + if (!location[0]) die502x(rql, 2, "cgi ", cginame, " returned an invalid ", "Location", " header") ; if (location[0] == '/' && location[1] != '/') return local_redirect(rql, location, uribuf, cginame) ; if (rbodylen) { if (!status) - die502x(rql, 1, "cgi ", cginame, " didn't output a ", "Status", " header", " for a client redirect response with document") ; + die502x(rql, 2, "cgi ", cginame, " didn't output a ", "Status", " header", " for a client redirect response with document") ; if (status < 300 || status > 399) - die502x(rql, 1, "cgi ", cginame, " returned an invalid ", "Status", " value", " for a client redirect response with document") ; + die502x(rql, 2, "cgi ", cginame, " returned an invalid ", "Status", " value", " for a client redirect response with document") ; } else { @@ -322,7 +328,7 @@ static inline int process_cgi_output (tipidee_rql *rql, tipidee_headers const *h char const *key = hdr->buf + hdr->list[i].left ; if (!strcasecmp(key, "Location") || !strcasecmp(key, "Status")) continue ; if (str_start(key, "X-CGI-")) continue ; - die502x(rql, 1, "cgi ", cginame, " returned extra headers", " for a client redirect response without document") ; + die502x(rql, 2, "cgi ", cginame, " returned extra headers", " for a client redirect response without document") ; } if (!status) { @@ -335,16 +341,16 @@ static inline int process_cgi_output (tipidee_rql *rql, tipidee_headers const 
*h { if (!status) status = 200 ; if (!tipidee_headers_search(hdr, "Content-Type")) - die502x(rql, 1, "cgi ", cginame, " didn't output a ", "Content-Type", " header") ; + die502x(rql, 2, "cgi ", cginame, " didn't output a ", "Content-Type", " header") ; } x = tipidee_headers_search(hdr, "Content-Length") ; if (x) { size_t cln ; if (!size0_scan(x, &cln)) - die502x(rql, 1, "cgi ", cginame, " returned an invalid ", "Content-Length", " header") ; + die502x(rql, 2, "cgi ", cginame, " returned an invalid ", "Content-Length", " header") ; if (cln != rbodylen) - die502x(rql, 1, "cgi ", cginame, " returned a mismatching ", "Content-Length", " header") ; + die502x(rql, 2, "cgi ", cginame, " returned a mismatching ", "Content-Length", " header") ; } tipidee_response_status(buffer_1, rql, status, reason_phrase) ; diff --git a/src/tipideed/tipideed.c b/src/tipideed/tipideed.c index 0cc512c..8c1e16e 100644 --- a/src/tipideed/tipideed.c +++ b/src/tipideed/tipideed.c @@ -513,7 +513,7 @@ int main (int argc, char const *const *argv, char const *const *envp) while (serve(&rql, docroot, hostlen + 1 + g.localportlen, uribuf, &hdr, bodysa.s, bodysa.len)) if (localredirs++ >= MAX_LOCALREDIRS) - die502x(&rql, 1, "too many local redirections - possible loop involving path ", rql.uri.path) ; + die502x(&rql, 2, "too many local redirections - possible loop involving path ", rql.uri.path) ; } } log_and_exit(0) ; |
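
The common thread of this change is the exit code: every `die502x()` call that blames a CGI script now exits 2 instead of 1, and doc/tipideed.html documents the full list of codes. As a reading aid, here is a minimal sketch of how a supervisor or log-processing tool might interpret those codes. It is not part of the commit or of tipidee: the enum, the function names and the short description strings are hypothetical, and only the numeric values and their meanings are taken from the documentation hunk above.

```c
#include <stddef.h>

/* Exit codes as documented in the doc/tipideed.html hunk of this diff.
   The identifier names are hypothetical, chosen for illustration only. */
enum tipideed_exit
{
  TIPIDEED_EXIT_OK = 0,        /* client closed the connection cleanly */
  TIPIDEED_EXIT_CLIENT = 1,    /* illicit client behaviour */
  TIPIDEED_EXIT_CGI = 2,       /* illicit CGI script behaviour (new in this diff) */
  TIPIDEED_EXIT_USAGE = 100,   /* bad usage */
  TIPIDEED_EXIT_BUG = 101,     /* "cannot happen": a bug in tipideed */
  TIPIDEED_EXIT_SYSCALL = 111  /* system call failed */
} ;

/* Returns a short description for a documented exit code,
   or NULL for any code the documentation does not list. */
static char const *tipideed_exit_meaning (int code)
{
  switch (code)
  {
    case TIPIDEED_EXIT_OK : return "clean exit" ;
    case TIPIDEED_EXIT_CLIENT : return "illicit client behaviour" ;
    case TIPIDEED_EXIT_CGI : return "illicit CGI script behaviour" ;
    case TIPIDEED_EXIT_USAGE : return "bad usage" ;
    case TIPIDEED_EXIT_BUG : return "cannot happen" ;
    case TIPIDEED_EXIT_SYSCALL : return "system call failed" ;
    default : return NULL ;
  }
}

/* Nonzero if the exit code points at the CGI script rather than
   at the client or at tipideed itself. */
static int tipideed_exit_is_cgi_fault (int code)
{
  return code == TIPIDEED_EXIT_CGI ;
}
```

With the old behaviour, a misbehaving CGI script and a misbehaving client both produced exit code 1, so a helper like `tipideed_exit_is_cgi_fault()` could not have told them apart; separating the two is what the 1 to 2 renumbering in cgi.c and tipideed.c makes possible.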