The execline language design and grammar
execline principles
Here are some basic Unix facts:
- Unix programs are started with the execve()
system call, which takes 3 arguments: the command name (which
we won't discuss here because it's redundant in most cases),
the command line argv, which specifies the program name and its
arguments, and the environment envp.
- The argv structure makes it easy to read some
arguments at the beginning of argv, perform some action,
then execve() into the rest of argv. For
instance, the nice command works that way:
nice -10 echo blah
will read nice and -10 from the argv, change the process's nice value,
then exec into the command echo blah. This is called chain loading by
some people, and Bernstein chaining by others.
- The purpose of the environment is to preserve some state across
execve() calls. This state is usually small: most programs
keep their information in the filesystem.
- A script is basically a text file whose meaning is a
sequence of actions, i.e. calls to Unix programs, with some control
over the execution flow. You need a program to interpret your script.
Traditionally, this program is /bin/sh: scripts are written
in the shell language.
- The shell reads and interprets the script command after command.
That means it must preserve a state, and stay in memory while the
script is running.
- Standard shells have lots of built-in features and commands, so
they are big. Spawning (i.e. fork()ing then exec()ing)
a shell script takes time, because the shell program itself must be
initialized. For simple programs like nice -10 echo blah,
a shell is overkill: we only need a way to make an argv
from the "nice -10 echo blah" string, and execve()
into that argv.
- Unix systems have a size limit for argv+envp,
but it is high. POSIX states that this limit must be at least
4 KB, and most simple scripts are smaller than that. Modern systems
have a much higher limit: for instance, it is 64 KB on FreeBSD-4.6,
and 128 KB on Linux.
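To make the chain loading idea from the list above concrete, here is a minimal Python sketch of a nice-like program. It is only an illustration, not how nice or any execline command is implemented: the names split_chain and chain_exec are hypothetical, and the old-style "-10" argument is assumed to mean an increment of 10.

```python
import os


def split_chain(argv):
    """Split ['prog', '-N', 'cmd', ...] into (increment, rest of argv).

    Hypothetical helper: only the old-style `-N` increment form is handled.
    """
    if len(argv) < 3 or not argv[1].startswith("-"):
        raise ValueError("usage: prog -INCREMENT command [args...]")
    increment = int(argv[1][1:])  # "-10" -> 10
    return increment, argv[2:]


def chain_exec(argv):
    """Read our own arguments, act on them, then exec into the rest of argv."""
    increment, rest = split_chain(argv)
    os.nice(increment)        # the action: adjust this process's nice value
    os.execvp(rest[0], rest)  # replace this process; this call does not return
```

Calling chain_exec(["prog", "-10", "echo", "blah"]) would lower the priority of the current process and then turn it into echo blah. No new process is created, and nothing of the chain loader remains in memory afterwards.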
Knowing that, and wanting lightweight and efficient scripts, I
wondered: "Why should the interpreter stay in memory while the script
is executing? Why not parse the script once and for all, put
it all into one argv, and just execute into that argv,
relying on external commands (which will be called from within the
script) to control the execution flow?"
execline was born.
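The idea can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not execline's launcher: shlex.split stands in for a real script lexer, the size check is only an approximation of how the kernel accounts for argv+envp, and the function names are hypothetical.

```python
import os
import shlex


def script_to_argv(script):
    """Parse the whole script once into a single flat argv.

    Assumption: shlex's shell-like splitting stands in for a real lexer.
    """
    return shlex.split(script, comments=True)


def fits_in_exec_limit(argv, env):
    """Roughly check that argv + envp fits under the system's ARG_MAX."""
    limit = os.sysconf("SC_ARG_MAX")
    if limit <= 0:  # limit unknown or indeterminate: do not reject
        return True
    size = sum(len(os.fsencode(arg)) + 1 for arg in argv)
    size += sum(len(os.fsencode(k)) + len(os.fsencode(v)) + 2
                for k, v in env.items())
    return size < limit


def launch(script):
    """Parse once, then exec: the launcher does not stay in memory."""
    argv = script_to_argv(script)
    if not argv or not fits_in_exec_limit(argv, os.environ):
        raise ValueError("empty script, or argv+envp too large")
    os.execvp(argv[0], argv)
```

With this sketch, launch("nice -10 echo blah") builds the four-element argv and executes into it. Any control over the execution flow then has to come from the commands named in that argv.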
Grammar of an execline script
An execline script can be parsed as follows:
<instruction> = <> | external options <arglist> <instruction> | builtin options <arglist> <blocklist> <instruction>
<arglist> = <> | arg <arglist>
<blocklist> = <> | <block> <blocklist>
<block> = { <arglist> } | { <instrlist> }
<instrlist> = <> | <instruction> <instrlist>
(This grammar is ambiguous, but much simpler to understand than the
unambiguous ones.)
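As a rough illustration of the block structure in this grammar, the following Python sketch groups a flat list of tokens into nested lists, one per { ... } block. It is a toy under stated assumptions: parse_blocks is a hypothetical name, tokens are assumed to be already split, and the sketch does not decide whether a block holds an <arglist> or an <instrlist>, which is exactly where the grammar is ambiguous.

```python
def parse_blocks(tokens):
    """Group a flat token list into nested lists, one per { ... } block."""

    def parse(pos, depth):
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            if tok == "{":
                # Recurse into the block; resume after its closing brace.
                block, pos = parse(pos + 1, depth + 1)
                items.append(block)
            elif tok == "}":
                if depth == 0:
                    raise ValueError("unmatched '}'")
                return items, pos + 1
            else:
                items.append(tok)
                pos += 1
        if depth > 0:
            raise ValueError("unterminated '{'")
        return items, pos

    tree, _ = parse(0, 0)
    return tree
```

For example, the tokens of foreground { echo a } echo b come back as the command name, one nested block, and the remaining instruction.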
execline features
execline commands can perform some transformations on
their argv, to emulate some aspects of a shell. Here are
descriptions of these features: