# lh-bootstrap: building a disk image with Linux, musl, and skarnet.org tools from scratch

Laurent Bercot
last modified: 2017-05-22


## License

`lh-bootstrap` is distributed under the terms of the
[GNU General Public License version 2](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html).


## Goal

`lh-bootstrap` builds a disk image for use with qemu or other VM emulators;
the files can also be copied to real hardware.

 The image contains a Linux kernel and a collection of small user-space
tools such as [busybox](http://busybox.net/), [dropbear](https://matt.ucc.asn.au/dropbear/dropbear.html)
and the [skarnet.org tools](http://skarnet.org/software/), all statically
linked against the [musl libc](http://musl-libc.org/). It includes
the minimal amount of necessary software and client configuration to get
a machine up, running (with [s6](http://skarnet.org/software/s6) as
process 1 and [s6-rc](http://skarnet.org/software/s6-rc) as service
manager) and connected to the Internet.

 The image is built from scratch: every package is compiled from source.
The toolchains and the minimal initial development environment for the
BUILD machine, however, are not provided. See below.


### Explicitly Not A Goal

`lh-bootstrap` **is not**:

- A distribution. It will not include any more software than is
strictly necessary to get a minimal usable image up and running.
Future versions of lh-bootstrap may include a "development" flavour,
which would also include a basic C/Unix development environment on
the image, but that's as far as it will go.

- Turnkey, polished, for end users. A lot of work has been put
into it so most build machines can run it out of the box, but the tasks
here are complex and involve lots of different packages from different
sources, which all evolve rapidly - so bitrot is to be expected, and
users should not be afraid to go tweak Makefiles to set the correct
versions of the packages.

- Lightweight. Unlike other skarnet.org tools, `lh-bootstrap` is a
heavy development package that needs significant resources to run.


## Terminology

You have installed this package on the BUILD machine.
You are making an image that will work on a TARGET machine.
The supported TARGETs include x86_64, i486, armv7, armv8 (aarch64).

The TARGET machine can also be referred to as the HOST machine.
This is GNU terminology: when you configure a package with a GNU
configure script, the --build option tells what machine you're
building the software on, and the --host option tells what machine
the software is going to run on.
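
As a small sketch of that convention, the helper below assembles the two
options for a cross build. The function name `cross_configure_args` and the
BUILD triplet `x86_64-pc-linux-gnu` are assumptions made for illustration;
`lh-bootstrap` itself may pass these options differently.

```sh
# Hypothetical helper: print the GNU configure options for a cross build.
#   $1 = triplet of the BUILD machine (where the compilation happens)
#   $2 = triplet of the HOST machine (where the result will run)
cross_configure_args() {
    printf '%s\n' "--build=$1 --host=$2"
}

# Example use inside a package source tree (not executed here):
#   ./configure $(cross_configure_args x86_64-pc-linux-gnu aarch64-linux-musl)
```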

We will use HOST and TARGET interchangeably. There is one case
where HOST and TARGET are not synonyms: when building a toolchain.
(In that case, HOST refers to the machine that the toolchain
being built will run on, and TARGET refers to the machine that
the toolchain will produce binaries for.)
Since we are not building a toolchain, HOST and TARGET are entirely
synonymous to us.

 HOST is generally a confusing term, because it is often
used to designate the native, real computer, a "host" as opposed
to a "guest" running in a virtual machine. But here, "host" is
not opposed to "guest", it's opposed to "build", and your native,
real computer is "build".


## Requirements

### Be root

You must be root on your BUILD machine. The build scripts will not
work properly if you are not root, *even if they do not write error
messages!*
Don't worry, most of the work is performed as a non-root user; but
root privileges are still needed for a few operations, so it is
necessary that you start the build script as root.

(It is still better to be root and lose privileges for the operations
that do not require them than to not be root and have to gain
privileges for some operations via sudo or other mechanisms.
Under Unix, it is best to avoid privilege gain whenever you can.)
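
Because the build can fail silently when started as a non-root user, a
pre-flight check can be useful. The following is a minimal sketch; the
function name `is_root` is hypothetical and is not part of `lh-bootstrap`.

```sh
# Hypothetical helper: succeed only if the given uid is 0.
# With no argument, the effective uid of the current user is checked.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# Example use before starting a build (not executed here):
#   is_root || { echo "must be run as root" >&2; exit 1; }
```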


### Build requirements

For the build to work, you need:

- A GNU/Linux or other Linux-based OS. Unfortunately, some Linux-specific
operations need to be performed on the BUILD machine (loopback
mounting, among others).

- A powerful BUILD machine. skarnet.org tools are small and efficient,
but building a complete system image from scratch requires significant
computing power.

- A native development environment for the BUILD machine. This means
a gcc toolchain running on your BUILD machine and producing code intended
to run on your BUILD machine. You should have this on any distribution,
and your compiler should just be called `gcc`. If you do not have this,
you can get a native toolchain [here](http://skarnet.org/toolchains/).

- An unrestricted Internet connection on the BUILD machine.

- The ability to loop-mount filesystems on the BUILD machine.

- A few necessary tools for the BUILD machine:
  + GNU `make`, version 3.81 or later
  + `bc`, Perl 5 (necessary for the Linux kernel compilation as well as syslinux)
  + `su`, `patch`, `sed`
  + `git`
  + a `tar` that supports .gz, .bz2 and .xz archives
  + a `wget` that supports HTTPS
  + `dd`, `chown`, `cpio`
  + `mkfs.ext4`, from e2fsprogs
  + `qemu-system-$TARGET` to boot your target machine

- A musl cross-development environment from the BUILD machine to the TARGET
machine. This means a gcc toolchain running on your BUILD machine and
producing code intended to run on your TARGET machine, linking the TARGET
binaries against the musl libc.
Even if you are building for the same TARGET as your BUILD machine
(example: you are building for x86_64 on an x86_64), **you cannot use
your stock distribution's native compiler for this!** Pick one of the
cross toolchains available [here](http://skarnet.org/toolchains/).

- A native musl development environment for the TARGET machine. This means a
gcc toolchain running on your TARGET machine and producing code intended
to run on your TARGET machine, linking the TARGET binaries against the musl
libc. Pick one of the native toolchains available
[here](http://skarnet.org/toolchains/).
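
The tool requirements above can be checked before a build with a short
script. This is a sketch only: the function name `missing_tools` is
hypothetical, and it only tests that each command is found in `PATH`, not
that its version or feature set (for example HTTPS support in `wget`) is
sufficient.

```sh
# Hypothetical helper: print the name of every argument that is not
# available as a command in PATH. Prints nothing if all are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use, run as root so that sbin directories are in PATH (not executed here):
#   missing_tools gcc make bc perl su patch sed git tar wget dd chown cpio mkfs.ext4
```

The `qemu-system-$TARGET` binary is left out of the example because its
exact name depends on the TARGET; add it to the argument list by hand.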


## Usage

### Configuring

Copy the `lh-config.dist` file to `lh-config`. This file is your own configuration
and should NOT be checked into git.
Edit the `lh-config` file to configure the system to be built.

It is important that the NORMALUSER variable be set to an existing
non-root user on your BUILD system. If you don't have one, use `nobody`.

You can set the OUTPUT variable to the name of the directory the
system will be built in. There must be *a lot* of available disk
space for the output, because that's where all the builds will
take place. By default, OUTPUT is `./output`, which
means the system will be built right where you are.

TRIPLE is the triplet representing your target.
It should be `x86_64-linux-musl` for x86_64,
`arm-linux-musleabihf` for ARM,
`aarch64-linux-musl` for arm64,
`i486-linux-musl` for i486, etc.
Only triplets that appear in the `sysdeps` subdirectory are supported.

CROSS_BASE is the path where your cross-toolchain is installed.
This means the toolchain from your BUILD to your TARGET, even if
BUILD and TARGET are the same.

HOST_HOST_BASE is the path where your native toolchain for the TARGET
is installed. TODO: rename this variable

COUNTRY_CODE, LOCAL_IP and ROUTER_IP are configuration variables
for your TARGET. COUNTRY_CODE is one of `uk`, `fr`, `rs`, `vn` or `cn`.
LOCAL_IP is the IP your guest will have; ROUTER_IP is the router
address your guest will use. (On Linux, you can get your router
(gateway) IP via `route -n`.) LOCAL_IP and ROUTER_IP should be on the
same class C (/24) network.
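
A quick way to check the class C constraint is to compare the first three
octets of both addresses. The sketch below assumes well-formed dotted-quad
IPv4 addresses; the function name `same_class_c` and the addresses in the
example are hypothetical.

```sh
# Hypothetical helper: succeed if two dotted-quad IPv4 addresses share
# their first three octets, i.e. sit on the same /24 network.
# "${1%.*}" strips the last ".octet" from the first argument.
same_class_c() {
    [ "${1%.*}" = "${2%.*}" ]
}

# Example use (not executed here):
#   same_class_c 192.168.1.10 192.168.1.1 || echo "LOCAL_IP and ROUTER_IP differ" >&2
```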

USE_DHCP should be true if you want your image to get its IP address
via a DHCP client (in which case LOCAL_IP and ROUTER_IP will be
ignored). It should be false if you want your image to have the
LOCAL_IP static IPv4 address.

ROOTFS_SIZE, SWAP_SIZE, RWFS_SIZE, USERFS_SIZE and EXTRA_SIZE are
the sizes of the partitions that will be created, in megabytes.
They are big by default, so the virtual disk can be used to build
any distribution. The disk files are sparse, so it doesn't matter
that they're big - but you should modify these variables
if you want a smaller image.
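
To make the variables above concrete, here is what an edited `lh-config`
might look like. This is an illustration, not a copy of `lh-config.dist`:
it assumes plain `NAME=value` assignments, and every path and address is
hypothetical. Check `lh-config.dist` for the actual syntax and defaults.

```sh
# Hypothetical lh-config values; every value below is an example.

# Existing non-root user on the BUILD machine
NORMALUSER=nobody
# Build directory; needs a lot of free disk space
OUTPUT=./output
# Must match an entry in the sysdeps subdirectory
TRIPLE=x86_64-linux-musl
# Hypothetical install path of the BUILD-to-TARGET cross toolchain
CROSS_BASE=/opt/cross/x86_64-linux-musl
# Hypothetical install path of the native toolchain for the TARGET
HOST_HOST_BASE=/opt/native/x86_64-linux-musl
COUNTRY_CODE=fr
# Static addressing: LOCAL_IP and ROUTER_IP are used because USE_DHCP is false
USE_DHCP=false
LOCAL_IP=192.168.1.10
ROUTER_IP=192.168.1.1
```

The partition size variables are left at their defaults in this sketch.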



### Building

You must be root to invoke `./make`. Most build commands will still
run unprivileged, as the user you specified in the NORMALUSER variable
in `lh-config`, but root privileges are needed for some steps in the
creation of the image: loopback mounting, for instance.

If you need a clean build, type `./make clean`. The output directory
will be erased, except for the downloaded sources. If you need to
also erase the downloaded sources, type `./make distclean`.

To start the build, type `./make`.
Not just `make`, but `./make`, i.e. the provided script. This script
sets a few important environment variables before calling the real
`make` with the same command line. You can give `./make` all the
options and arguments you would give `make`, for instance `-j6`.

The filesystems will be built under the `./output` directory, or
whatever directory you specified in the OUTPUT variable in `lh-config`.

Under this directory, once the build has completed:
- `initramfs`, `rootfs`, `rwfs` and `userfs` are the contents of the
respective filesystems of the target. You can use those to make tarballs,
for instance.
- `kernel` is the kernel binary, to be given to qemu.
- `initramfs.img.gz` is the compressed initramfs image, to be given to qemu.
- `disk-image.raw` is the complete raw disk image, suitable for qemu or to be
burned onto a real disk or SD card. By default it is huge, but it's a
sparse file, i.e. it's not really using all that space, only the parts
that have actually been written to (which is a small portion of the total
space).
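
To see how sparse a file really is, compare its apparent size with the
space actually allocated on disk. The sketch below uses `wc -c` for the
apparent size and `du -k` for the allocated size; the function name
`sparse_report` is hypothetical, and the allocated figure depends on the
filesystem.

```sh
# Hypothetical helper: report the apparent and the allocated size of a
# file, both in kibibytes. On a sparse file, allocated is much smaller.
sparse_report() {
    file=$1
    # Apparent size: byte length of the file, rounded up to whole KiB
    apparent_kb=$(( ($(wc -c < "$file") + 1023) / 1024 ))
    # Allocated size: blocks actually in use, as reported by du
    allocated_kb=$(du -k "$file" | cut -f1)
    echo "apparent=${apparent_kb}K allocated=${allocated_kb}K"
}

# Example use, from the output directory (not executed here):
#   sparse_report disk-image.raw
```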


### Running on backends

To boot the image you just created in qemu, run `./make qemu-boot`.
You can look at the `./qemu-boot` script to see exactly what it does.

You can also run `./make vmware-image` or `./make virtualbox-image` to create
a `disk-image.vmdk` file, which will be suitable as a main disk image
for VMware or VirtualBox. Running those emulators, however, is out of
scope for this document.