
larch – a do-it-yourself live Arch Linux system



Overview of the larch build system


Warning

Before we get started I should point out that larch must have root privileges in order to do its work, and that much of this work consists of deleting and overwriting files and even complete directories. If just one of these is wrong it might make quite a mess of your system. That is quite normal for an installer, but you will be using it on a system that is already configured and this is somewhat risky - if you set up your configuration wrong (or if you or I made some other mistake ...), you might find you have destroyed some important data and/or your system doesn't work any more. But that's life - Share and Enjoy!

Installation of the larch build system

The larch package may be installed in the normal Arch Linux way using pacman. This method will only work on an Arch system, of course. The larch repository can be found at berlios.

Alternatively, larch may be used without installing it. This should also work on non-Arch Linux systems (but see next paragraph). Download the latest larch package from the repository and unpack it somewhere convenient. To start larch you need to run the file 'opt/larch/run/larch'. If you have installed the package with pacman, 'opt/larch/run' should now be in your $PATH (at least after the next login), so just 'larch' should suffice to get it going.

If you are not running Arch Linux, you will need to download the 'pacman-allin' package from the larch repository and unpack it to the same location as the 'larch' package. This provides a pacman executable together with all the libraries it needs, the 'repo-add' script, and a 'mirrorlist' file.

Using the larch build system

Projects

A larch project simply gathers together a few configuration options concerning the build environment. Examples are the location of the installation, pacman options, such as package repository locations, which profile to build with, which boot loader to use, and so on. These are all options which don't directly affect the design of the live system to be built. Most of the remaining configuration options do concern the design and are specified in the profile.

All project information is kept in the simply structured configuration file '~/.config/larch/larch-config'.

Unlike a normal system installation, the one used by larch does not need a separate partition; it can be placed anywhere convenient. The default installation directory ('/home/larchbuild') should normally be acceptable (the building work is done in the '.larch' sub-directory), so long as there is enough free space on that partition. Note that lots of space is necessary, nearly 4GB for a 700MB CD, for example.
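As the space requirement is easy to overlook, it can be worth checking before starting a build. The following is only a minimal sketch, not part of larch: the function name is my own invention, and the 4GB figure is just the rough estimate quoted above for a 700MB CD.

```python
import shutil

# Rough figure from the text: nearly 4GB of working space for a 700MB CD.
REQUIRED_BYTES = 4 * 1024 ** 3


def has_enough_space(path, required_bytes=REQUIRED_BYTES):
    """Return True if the file-system holding 'path' has enough free space.

    'path' must already exist, e.g. the parent of the installation directory.
    """
    return shutil.disk_usage(path).free >= required_bytes
```

For the default setup one would call something like has_enough_space('/home') before creating '/home/larchbuild'.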

Profiles

A profile contains the information needed to build a particular flavour of Arch Linux - which packages to install, which locales, system configuration in '/etc/rc.conf', boot entries, and so on. There is also a folder 'rootoverlay', which holds any files that the design needs but which differ from those in a basic, fresh Arch Linux system.

All profile information is kept in the corresponding folder in the directory '~/.config/larch/working_dir/MyProfiles'.

System Installation overview

The larch live system will normally be built from a fresh (unmodified, unconfigured) Arch installation. The Installation stage creates this installation by downloading all the desired packages (if they are not already in the host's package cache) and installing them to the directory set by the project (the default is '/home/larchbuild'). The profile files 'addedpacks' and, to a lesser extent, 'baseveto' determine which packages will be installed. The 'base' group of packages will always be installed automatically, unless there are entries in 'baseveto' to veto certain of these (a veto has no effect if the vetoed package is required by one selected for installation). 'addedpacks' is simply a list of additional packages to install; their dependencies are installed along with them. Both files contain one package name per line; empty lines and lines starting with '#' are ignored.
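The format of 'addedpacks' and 'baseveto' is simple enough to read with a few lines of code. This is a sketch of a reader for that format, not the code larch itself uses; the function name is hypothetical, and stripping surrounding whitespace before testing for '#' is my own assumption.

```python
def parse_package_list(text):
    """Return the package names in a list file such as 'addedpacks'.

    One package name per line; empty lines and lines starting with '#'
    are ignored. Surrounding whitespace is stripped first (an assumption).
    """
    packages = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        packages.append(line)
    return packages
```

A file can then be read with parse_package_list(open(path).read()), where 'path' points to the list file in the profile folder.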

By default the package cache on the build host will be used, so that only packages which have not already been downloaded will be fetched from the chosen mirror. But it is possible to select another cache location. Note that this can also be on a remote machine, mounted using sshfs or NFS.

This raw installation will not normally be modified by larch, so it can be reused, or even carefully experimented with. However, note that any changes you make to this installation manually will not be reflected in the profile, so it might be difficult to reproduce what you have done at a later time, or to revert the changes. ***+ Would it be useful to incorporate changes made via the Tweaks page into the profile? Offer it as an option? -*** If you want to configure your system, the raw installation is probably not the best place for it; consider using the 'rootoverlay' directory in the profile instead. All files in this directory will 'overwrite' (well, 'overlay', actually) corresponding files in the base installation, by being placed in a higher layer in the aufs mount in the live system. This makes it very easy to collect your own customizations in a way that can be applied quickly to a future build.

The gui allows easy access to some of the main configuration files (e.g. '/etc/rc.conf' and '/etc/locale.gen').

Quick larchify overview

Once we have an Arch Linux installation and have specified our customizations in the profile, we can combine all this information into two squashfs archives and an initramfs (to allow the live system to boot). I call this step 'larchification'. The lowest layer in the aufs mount is the raw installation, which is compressed into a squashfs archive, 'system.sqf'. All the modifications specified in the profile, as well as some common to all larch systems, are in the archive 'mods.sqf', which 'overlays' the basic installation.

In order to aid experimentation with profile tweaks which do not affect the underlying installation, it is possible to repeat the 'larchify' step without rebuilding the 'system.sqf' archive, which saves quite a bit of time.

Building a bootable medium

When the 'larchification' has been completed, we can choose how to configure the device onto which our live system is to be installed. Most of the options on the 'Medium' page should be fairly self-explanatory. We have a choice of media (iso, for CD/DVD, or partitions in general, be they USB-sticks, hard disks or whatever). It is also possible to select a bootloader and tweak its configuration.

When the live system is installed to a partition (e.g. USB-stick) it is possible to choose how the boot partition will be recognized. So long as larch also installs a bootloader, it can also add the appropriate entries to the bootloader configuration file automatically. The options available are via UUID, partition label, partition name (e.g. '/dev/sdb1'), or by searching for a partition containing the file 'larch/larchboot'. See also 'Boot parameters'.

It is possible to repeat the installation onto various media, changing the configuration, without needing to rerun the 'larchification' step. The constituent larch files remain unchanged.

Building a boot CD for a USB-stick

Older computers may not be able to boot from USB devices, so the possibility of generating a small boot iso is provided. This can be burned to CD and used to boot your larch system on a USB-stick. As this function uses the system on the USB-stick, the stick needs to be plugged in (not mounted!) and selected in the 'Partition' entry.

Minimal build system requirements

larch has been designed to work without extensive demands on the build system. The main requirement is pyqt for the gui (I am not sure what the oldest working version is, but 4.4 should be safe). Although it has been developed under Arch Linux, larch should run on other GNU/Linux systems. By means of a sort of bootstrapping, the required software has been kept to a minimum - many of the build functions are carried out on the newly installed Arch system using chroot. For example, you do not need support for squashfs or aufs on the build system. But bash, mkfs.vfat, mkfs.ext2, blkid and sfdisk ***+ ... -*** are assumed to be available (on Arch that is packages 'bash', 'dosfstools', 'e2fsprogs' and 'util-linux-ng').

'pacman.conf' and 'mirrorlist'

The larch repository must be available for building the live system, i.e. it must be included in the 'pacman.conf' used for the installation process. To ease the incorporation of updates to the structure of the file 'pacman.conf' I am currently experimenting with a split version. The header part is extracted from the default pacman.conf (distributed with the 'larch' package) and the repository list is kept separately. Both parts can be edited separately, from the larch gui, and altered versions are then stored within the profile.

If there are 'Include' lines in the repository list in 'pacman.conf', the corresponding 'mirrorlist' file must be present and suitably configured. By default the host's '/etc/pacman.d/mirrorlist' will be used, but it is also possible to use an edited version, which is stored not within the profile but in larch's working directory.

A special feature of larch is the ease of using a local package mirror during installation. The supplied path need not even be a complete mirror. During development work I have indeed just placed the appropriate '*.db.tar.gz' files there and relied on all the packages being taken from the host's cache. (The script 'repos.sh' supplied with larch in the 'run' directory can build these db files from the current state of the pacman sync database on the host.)

The path entered for this build mirror can contain the variables "*platform*" and "*repo*", these being substituted automatically by larch.
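The substitution amounts to a plain text replacement. Here is a small sketch of the idea; it is not larch's own code, and the function name, the example path and the example values ('i686', 'core') are only assumptions chosen for illustration.

```python
def expand_mirror_path(template, platform, repo):
    """Replace the '*platform*' and '*repo*' variables in a mirror path."""
    return template.replace("*platform*", platform).replace("*repo*", repo)


# Hypothetical local mirror layout, one directory per repository and platform
example = expand_mirror_path("/srv/mirror/*repo*/os/*platform*", "i686", "core")
```

With the values shown, 'example' becomes '/srv/mirror/core/os/i686'.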

squashfs and aufs

The system to be larchified is made into a squashed file-system in the file system.sqf. This is mounted as the lower layer of a union (aufs) file-system. On top of that there is a writeable tmpfs layer so that the file-system as a whole appears writeable. An (initially empty, or rather non-existent) 'overlay' archive is copied into the writeable layer. As this top layer is compressed using lzop, the 'lzop' package must be installed in the target. Of course only the tmpfs top layer can actually be written to, and its contents disappear when the system reboots, so the writing is only temporary. This can be overcome to some extent by using the session-saving features described below.

There is also a second overlay archive, a squashed file system (mods.sqf), which appears as a middle layer in the union file-system. This initially contains all the modifications to the base system needed to convert that into a larch system as well as all the changes specified in the profile. Using the merge-overlay feature it is possible to rebuild this archive to incorporate subsequent changes to the system.

A third overlay archive will also appear if the merge-overlay feature is used. This is also a squashfs archive and lies between the bottom layer (system.sqf) and the modifications layer (mods.sqf). It contains only 'whiteout' files, to mask files in the base system which have later been deleted.
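To make the layering described above a bit more concrete, here is a toy model of how a file lookup passes down through the layers. It is a much simplified sketch and not how aufs is implemented: each layer is just a dictionary, a whiteout is represented by a marker object, and all names and file contents are invented for the example.

```python
WHITEOUT = object()  # marker standing in for a 'whiteout' entry


def lookup(path, layers):
    """Return the content visible at 'path', or None if no file is visible.

    'layers' is ordered from the top layer down; the first layer that
    knows about the path decides the result, and a whiteout hides any
    copy of the file in the layers below it.
    """
    for layer in layers:
        if path in layer:
            entry = layer[path]
            return None if entry is WHITEOUT else entry
    return None


system_sqf = {"/etc/motd": "base text", "/usr/bin/foo": "old program"}
whiteouts = {"/usr/bin/foo": WHITEOUT}   # only present after a merge-overlay
mods_sqf = {"/etc/motd": "larch text"}
tmpfs = {}                               # writeable top layer, empty at boot

layers = [tmpfs, mods_sqf, whiteouts, system_sqf]
```

Here lookup('/etc/motd', layers) finds the version in the modifications layer, while '/usr/bin/foo' is masked by the whiteout layer and so is not visible at all. Writing a file would simply add an entry to 'tmpfs', which is why such changes are lost at reboot.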

In order to boot into a system constructed in that manner, you need an initramfs which deals with mounting all the various bits in the right way before entering the normal boot sequence. In larch the initramfs is built using the standard Arch 'mkinitcpio' system, for which special 'hooks' (essentially code plug-ins) have been developed to manage the requirements of a larch live system.

unionfs as an alternative to aufs

***+ In principle, unionfs can be used instead of aufs, but as aufs is included in current Arch Linux kernels it is obviously the first choice. Indeed the current larch code probably won't work with unionfs, as it hasn't been tested - but the framework to support it is in place. -***

Custom packages

It is possible to include your own 'custom' packages in the installation. These might be ones you have compiled yourself, e.g. from the AUR, or modified versions (fixes or customizations) of standard packages. To do this you need to put your packages in a directory and run gen-repo on this directory (run it without arguments to get usage instructions). Then place an entry for this new repository in your 'pacman.conf' (use the button on the 'Installation' page to edit your pacman.conf repositories). If your packages replace some in the existing repositories, your custom repository needs to come before those repositories in 'pacman.conf'. Any packages you want installed now just need to be listed in 'addedpacks'.

It is not necessary to build a custom kernel for larch; the standard kernel ('kernel26') can be used. It includes aufs and squashfs modules, and is now included in the set of 'base' packages.

Generating the base package list

So that changes to the base package set are picked up automatically, the list is taken from the output of 'pacman -Sg base'. However, it is possible to filter out certain unwanted packages from the base set. In most cases the default setting (empty) will be satisfactory, but you can change it by editing 'baseveto' (one package per line), which is part of the profile. Be aware, however, that vetoed packages will be installed anyway if they are required by some other installed package.
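The effect of a veto, and the reason a vetoed package can still turn up, can be shown with a small model. This is only a sketch of the logic: in reality pacman resolves the dependencies, and the function name, package names and dependency table here are made up for the example.

```python
def resolve_install_set(base, veto, added, depends):
    """Return the set of packages that would end up installed.

    base    - package names in the 'base' group
    veto    - names listed in 'baseveto'
    added   - names listed in 'addedpacks'
    depends - mapping of package name to the names it requires
    """
    wanted = [p for p in base if p not in veto] + list(added)
    installed = set()
    stack = list(wanted)
    while stack:
        pkg = stack.pop()
        if pkg in installed:
            continue
        installed.add(pkg)
        # Dependencies are pulled in regardless of any veto
        stack.extend(depends.get(pkg, ()))
    return installed


result = resolve_install_set(
    base=["bash", "nano", "vi"],
    veto={"nano", "vi"},
    added=["editor-frontend"],                # hypothetical package
    depends={"editor-frontend": ["vi"]},      # hypothetical dependency
)
```

In this example 'nano' stays out, but 'vi' is installed in spite of the veto because the (invented) package 'editor-frontend' requires it.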

addedpacks

This group of packages is the main place for defining your system. Here you can enter all the applications you would like in your live CD/USB system (and subsequently installed to a hard disk partition, if that was your intention). Thanks to pacman you don't need to sort out dependencies; these should all be included automatically.

In order to support building a larch system, certain packages outside the Arch base group must be installed in the system to be 'larchified' (the larch installer includes the first group automatically):

squashfs-tools, lzop, larch-live,

aufs2, aufs2-util (unless using the untested unionfs these must be present, except that the aufs module might be provided with the kernel, in which case you only need aufs2-util),

syslinux (if using isolinux or syslinux),
cdrkit (for building an iso),
eject (to eject a CD at shutdown).

For the hard-disk installer: larchin, python, pygtk, parted, ntfsprogs and (optional, but recommended) gparted.

For this documentation, and for the capability of doing complete rebuilds: larch.

Making a live CD from an existing Arch installation

By setting the installation path to an existing Arch installation and skipping to the larchify page, a live medium can be made from that installation. The installation must already be mounted, including any sub-mounts (e.g. /home on another partition). The main mount must use the options 'exec,dev', because most of the building is done via a chroot to the installation. This approach to live system generation is probably not a good idea if the installation contains a lot of data - consider how big the result will be ...

Also the currently running (Arch only!) system can be larchified, by setting the installation path to '/'. This is, however, not recommended. Building from a running system can easily result in data corruption because the file-system might well change during the build process.

Note that some things in '/var' will not be included in the 'live' system: firstly the standard pacman package cache, '/var/cache/pacman/pkg', and also the log files (in '/var/log') and '/var/tmp' (temporary files, like '/tmp'). As some files in '/var/log' are required for certain aspects of logging to function, these are recreated in the initramfs.
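The exclusions can be thought of as a simple path filter. The following sketch is not taken from larch; the names are my own, and treating only the contents of these directories as excluded (keeping the empty directories themselves) is an assumption.

```python
# Directories whose contents are left out of the live system (from the text)
EXCLUDED_DIRS = ("/var/cache/pacman/pkg", "/var/log", "/var/tmp")


def is_excluded(path, excluded_dirs=EXCLUDED_DIRS):
    """Return True if 'path' lies inside one of the excluded directories."""
    return any(path.startswith(d + "/") for d in excluded_dirs)
```

So is_excluded('/var/log/messages.log') is true, while a path such as '/var/lib/pacman/local' is kept.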
