GNU Guix containers

Introduction

GNU Guix includes an excellent Linux container manager that compares favourably to other container systems, such as Docker.

In addition to the advantages that Guix offers as a deployment system, Guix containers share the same software repository as the host, i.e., Guix containers are extremely light-weight! This is possible because Guix software is immutable and versioned. And because it is Guix, every installation is reproducible, both as a build and as a binary.

See also the official GNU Guix documentation.

Running a container

Containers can be run as regular users, provided the kernel permits it (that is, unprivileged user namespaces are enabled).
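
Whether the kernel permits this can be checked up front. The following is a minimal sketch, assuming the kernel exposes the usual sysctl files under /proc; the function name and its optional root argument are hypothetical and only serve the illustration:

```shell
# Report whether unprivileged user namespaces look enabled.
# $1 is an optional alternative root for the /proc files (hypothetical,
# only useful for trying the function against fake files).
userns_enabled() {
  root="${1:-}"
  max_file="$root/proc/sys/user/max_user_namespaces"
  clone_file="$root/proc/sys/kernel/unprivileged_userns_clone"
  # A limit of 0 disables user namespaces altogether.
  if [ -r "$max_file" ] && [ "$(cat "$max_file")" -eq 0 ]; then
    return 1
  fi
  # Debian-style kernels add an extra switch; 0 means disabled for regular users.
  if [ -r "$clone_file" ] && [ "$(cat "$clone_file")" -eq 0 ]; then
    return 1
  fi
  # If neither file says otherwise we cannot tell, and assume it is enabled.
  return 0
}
```

For example, `userns_enabled && echo "containers should work"` gives a quick hint before trying the commands below.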

Usage

Give the names of the packages you want in the container, here emacs and coreutils (for ls etc.); a Guix container is empty by default:

guix environment --container --network --ad-hoc emacs coreutils

You can run a command once:

guix environment --ad-hoc --container coreutils -- df

This prints the mounted file systems, including the current working directory and the store profile:

Filesystem                  1K-blocks      Used Available Use% Mounted on
none                          3956820         0   3956820   0% /dev
udev                            10240         0     10240   0% /dev/tty
tmpfs                           65536         0     65536   0% /dev/shm
/dev/sda1                    38057472  19874684  16226540  56% /export2/izip
/dev/mapper/volume_group-vm 165008748 109556608  47047148  70% /gnu/store/ikkks8c56g56znb5jgl737wkq7w9847c-profile

Note that 'guix environment --ad-hoc --container' mounts your current working directory (here /export2/izip). If you start from an empty $HOME/tmp directory, that is what gets mounted. Any files you put there persist between container runs.

Note that you can point HOME to any path on startup from the shell:

guix environment --ad-hoc coreutils --container bash -- env HOME=$HOME/tmp/newhome/ bash

which allows you to run specific startup scripts and keep configurations between runs.

Browser

Run icecat, a browser, in a container with

    guix environment --container --network --share=/tmp/.X11-unix --ad-hoc icecat
    export DISPLAY=":0.0"
    icecat

You only need to install the package once.

Running Windows tools in Wine

Wine can also be run in a container:

    guix environment --container --network --share=/tmp/.X11-unix --ad-hoc wine
    export DISPLAY=":0.0"
    wine explorer

which is great. I used to have to use VirtualBox and such to run the occasional Windows tool. Now it runs in a container with access to the local file system.

To run the tool in one go and set the HOME dir:

guix environment --network --expose=/mnt/cdrom --share=/tmp/.X11-unix --container --ad-hoc wine vim bash coreutils -- env HOME=`pwd` DISPLAY=":0.0" wine explorer

Docker

Guix has its own containers using native Linux support, but you can also run Guix in Docker and distribute software that way. One interesting thing you can do is run 'guix pack', which creates a Docker image of a package with all its dependencies; see this description.

Providing a usable Docker container

Install the package in the main /gnu/store

For a paper we made a compilation of bioinformatics software and put it all in one GNU Guix package named book-evolutionary-genomics. I can install it using a local Guix checkout at commit cc14a90fd3ce34a371175de610f9befcb2dad52b:

env GUIX_PACKAGE_PATH=../guix-bioinformatics \
  ./pre-inst-env guix package -p ~/opt/book-evolutionary-genomics \
  --no-grafts -i book-evolutionary-genomics \
  --substitute-urls="http://guix.genenetwork.org https://berlin.guixsd.org https://mirror.hydra.gnu.org"

resulting in a totally reproducible package.

Try things in a Guix container

Now we want to isolate the tools in a container. To run them inside a Guix container you can proceed as before:

env GUIX_PACKAGE_PATH=../guix-bioinformatics/ \
  ./pre-inst-env guix environment --no-grafts --ad-hoc \
  --substitute-urls="http://guix.genenetwork.org https://berlin.guixsd.org https://mirror.hydra.gnu.org" \
  coreutils book-evolutionary-genomics vim screen \
  --container bash -- bash

This starts a bash shell in a clean container. For the book we have created some scripts in the profile, which can be found through the GUIX_ENVIRONMENT setting:

cd $GUIX_ENVIRONMENT/share/book-evolutionary-genomics

The bin directory is on the PATH already, but for some scripts you may want to create /usr/bin pointing to $GUIX_ENVIRONMENT/bin

mkdir /usr
ln -s $GUIX_ENVIRONMENT/bin /usr/bin

Note that /gnu/store is immutable and can therefore be shared with the main system. This makes GNU Guix containers really small and fast.

Docker image

With GNU Guix you can create a Docker image without actually installing Docker(!)

env GUIX_PACKAGE_PATH=../guix-bioinformatics/ \
  ./pre-inst-env guix pack -f docker --no-grafts \
  -S /usr/bin=/bin -S /etc/profile=/etc/profile \
  -S /book-evolutionary-genomics=/share/book-evolutionary-genomics \
  coreutils book-evolutionary-genomics bash vim

Note the -S switch, which creates symlinks into the profile; here it makes /usr/bin, /etc/profile and /book-evolutionary-genomics point at the matching paths inside the profile.
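
To make the effect of a -S spec concrete, here is a small emulation in plain shell. It is an illustration only, not Guix code: the function name is hypothetical, and treating a target of /bin the same as bin is an assumption based on both spellings appearing in this document.

```shell
# Emulate one -S SOURCE=TARGET spec: create SOURCE under $root as a
# symlink to TARGET inside $profile. Illustration only, not Guix code.
apply_symlink_spec() {
  root=$1
  profile=$2
  spec=$3
  source_path=${spec%%=*}   # part before the first '='
  target=${spec#*=}         # part after the first '='
  target=${target#/}        # assumption: treat /bin and bin alike
  mkdir -p "$root$(dirname "$source_path")"
  ln -s "$profile/$target" "$root$source_path"
}
```

Calling `apply_symlink_spec /tmp/rootfs /gnu/store/...-profile /usr/bin=/bin` would leave /tmp/rootfs/usr/bin pointing at the profile's bin directory, which is the layout the pack ends up with.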

Run Docker

This produces a file that can be loaded into Docker.

Docker is part of Guix too:

guix package -i docker containerd docker-cli -p ~/opt/docker
source ~/opt/docker/etc/profile

Start the dockerd as root and make sure permissions are set

groupadd docker
usermod -aG docker ${USER}
docker load --input /gnu/store/0p1ianjqqzbk1rr9rycaqcjdr2s13mcj-docker-pack.tar.gz
docker images
  REPOSITORY          TAG                                IMAGE ID            CREATED             SIZE
  profile             425c1ignnjixxzwdwdr5anywnq9mg50m   121f9cca6c55        47 years ago        1.43 GB

Now you should see the image id and you can run

docker run 121f9cca6c55 /usr/bin/ruby --version
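
Instead of copying the image id by hand, the image name can be captured when loading. This is a sketch that assumes docker load prints a line of the form "Loaded image: NAME:TAG", as recent Docker versions do for tagged images; the helper name is hypothetical:

```shell
# Print the image references announced by `docker load`, read from stdin.
loaded_images() {
  sed -n 's/^Loaded image: //p'
}
```

With that, `image=$(docker load --input <pack>.tar.gz | loaded_images)` followed by `docker run "$image" /usr/bin/ruby --version` avoids hard-coding the id.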

Find the profile

docker run 121f9cca6c55 /usr/bin/ls /usr/bin -l

Read the profile settings

docker run 121f9cca6c55 cat /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/etc/profile

But there is an easier way because we created the symlink earlier

docker run 121f9cca6c55 cat /etc/profile

Run bioruby

docker run 121f9cca6c55 bash -c "env GEM_PATH=/gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile//lib/ruby/gems/2.4.0 /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/share/book-evolutionary-genomics/src/bioruby/DNAtranslate.rb"

with input file

time docker run 121f9cca6c55 bash -c "env GEM_PATH=/gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile//lib/ruby/gems/2.4.0 /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/share/book-evolutionary-genomics/src/bioruby/DNAtranslate.rb /gnu/store/425c1ignnjixxzwdwdr5anywnq9mg50m-profile/share/book-evolutionary-genomics/test/data/test-dna.fa"

or the easy way since we created the links

time docker run 121f9cca6c55 \
  bash -c "source /etc/profile ; cd /book-evolutionary-genomics ; src/bioruby/DNAtranslate.rb test/data/test-dna.fa"

Building Docker Image of Conda with Guix

Build the conda Archive

To build the pack from guix, the following command was run:

./pre-inst-env guix pack -S /opt/gnu/bin=/bin conda

This builds an archive with `conda`. The archive will be named something like `/gnu/store/y2gylr1nz7qrj0p1xwfcg4n8pm0p4wgl-tarball-pack.tar.gz`.

The `./pre-inst-env` portion can be dropped if you have a newer version of guix that comes with conda in its list of packages. You can find out by running the following command:

guix package --search=conda

and looking through the list to see if there is a package named conda.
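
The lookup can also be scripted. A minimal sketch, assuming the search output uses the usual record layout with one "name: PACKAGE" line per hit; the helper name is hypothetical:

```shell
# Succeed if the search output on stdin lists a package named exactly $1.
has_package_named() {
  grep -q "^name: $1\$"
}
```

For example, `guix package --search=conda | has_package_named conda && echo "conda is packaged"`.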

Bootstrapping the Images

The next step was to bootstrap new images from a base image. The base image chosen was the ubuntu image. You can get it with:

docker pull ubuntu

The steps that follow will be somewhat similar, with each image building upon the image before it.

The files created here can be found in this repository.

The first image to be built contains only conda, initialised with a new environment called `default-env`. This was done by writing a Dockerfile with the following content:

FROM ubuntu:latest
# COPY can only read from the build context, so copy the pack into this directory first
COPY y2gylr1nz7qrj0p1xwfcg4n8pm0p4wgl-tarball-pack.tar.gz /tmp/conda-pack.tar.gz
RUN tar -xzf /tmp/conda-pack.tar.gz && rm -f /tmp/conda-pack.tar.gz
RUN /opt/gnu/bin/conda create --name default-env

This file was saved as `Dockerfile.conda` and then the image was built by running

docker build -t fredmanglis/guix-conda-plain:latest -f Dockerfile.conda .

Be careful not to miss the dot at the end of the command. This command creates a new image from the base image named in the FROM line of the Dockerfile (ubuntu:latest above) and tags the new image with the name fredmanglis/guix-conda-plain:latest.

This new image is then used to bootstrap the next, by first creating a file `Dockerfile.bioconda` and entering the following content into it:

FROM fredmanglis/guix-conda-plain:latest

RUN /opt/gnu/bin/conda config --add channels r
RUN /opt/gnu/bin/conda config --add channels defaults
RUN /opt/gnu/bin/conda config --add channels conda-forge
RUN /opt/gnu/bin/conda config --add channels bioconda

This file instructs docker to bootstrap the new image from the image named fredmanglis/guix-conda-plain:latest and then run the commands to add the channels required to access the bioconda packages.

The new image, with bioconda initialised, is then created by running

docker build -t fredmanglis/guix-bioconda:latest -f Dockerfile.bioconda .

Be careful not to miss the dot at the end of the command.

The next image to build contains the sambamba package from the bioconda channel. We start by defining the image in a file, `Dockerfile.sambamba` which contains:

FROM fredmanglis/guix-bioconda:latest
RUN /opt/gnu/bin/conda install --yes --name default-env sambamba

As can be seen, the package is installed in the environment `default-env` defined while bootstrapping the image with conda only. This new image is built with the command:

docker build -t fredmanglis/guix-sambamba:latest -f Dockerfile.sambamba .

Do not miss the dot at the end of the command.
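
Because each image is built FROM the previous one, the three builds have to run in this order. A sketch of a wrapper that runs them in sequence and stops at the first failure; the function name is hypothetical, while the image names and Dockerfiles are the ones used above:

```shell
# Build the images in dependency order; stop at the first failure.
build_chain() {
  while read -r image dockerfile; do
    # </dev/null keeps docker from reading the list below on stdin.
    docker build -t "fredmanglis/$image:latest" -f "$dockerfile" . </dev/null || return 1
  done <<'EOF'
guix-conda-plain Dockerfile.conda
guix-bioconda Dockerfile.bioconda
guix-sambamba Dockerfile.sambamba
EOF
}
```

Run `build_chain` from the directory that holds the three Dockerfiles.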

Publishing the Images

The images built in the processes above are all available at https://hub.docker.com/r/fredmanglis/

To publish them, docker’s push command was used, as follows:

docker push fredmanglis/guix-conda-plain:latest && \
docker push fredmanglis/guix-bioconda:latest  && \
docker push fredmanglis/guix-sambamba:latest

These are really three separate commands, chained so that each later command only runs if the ones before it succeeded. This ensures that the derived images are only uploaded after the images they are based on have been successfully uploaded.

Get the Images

To get any of the images, use a command of the form:

docker pull fredmanglis/<img-name>:<img-tag>

replacing <img-name> and <img-tag> with the actual image name and tag. For example, to get the image with bioconda already set up, do:

docker pull fredmanglis/guix-bioconda:latest

Run Installed Applications

To run the installed applications, we need to set up the path correctly. To do this, we make use of docker's --env-file option, in something similar to the following:

docker run --env-file=<file-with-env-vars> img-to-run:img-tag <command-to-run>

The <file-with-env-vars> can be found here.
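
As a rough idea of what such a file looks like, here is a sketch that writes a hypothetical one. Docker reads it as plain KEY=VALUE lines without shell expansion. Only /opt/gnu/bin comes from the pack command above; the rest of the PATH is an assumption, and the real file probably also lists the bin directory of the conda environment, whose location is not given here.

```shell
# Write a hypothetical env file to the path given as $1.
# Docker reads KEY=VALUE lines literally: no variable expansion, no quoting.
write_env_file() {
  cat > "$1" <<'EOF'
PATH=/opt/gnu/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
EOF
}
```

For example, `write_env_file environment_variables` creates a file that can be passed to --env-file.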

Now you can proceed to run a command, for example:

docker run --env-file=environment_variables --volume /tmp/sample:/data \
fredmanglis/guix-sambamba bash -c "sambamba view /data/test.bam"

The `--volume` option mounts a specific directory into the docker container that is created, so that the data is available to the running commands.

Common Workflow Language (CWL)

CWL can pull Docker images to run tools in containers, for example for ODGI. CWL is agnostic to how these containers are sourced.

For COVID-19 PubSeq, ODGI was required in a CWL module to build a graph and generate RDF. The CWL to build the graph is here. The quickest way to get an up-to-date working Docker container was to use GNU Guix. ODGI is currently maintained and packaged in an external guix-genomics repo by Erik Garrison. It is simply a matter of adding a channel or of using GUIX_PACKAGE_PATH. After a git clone of guix-genomics we build odgi in a profile:

env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix package -i odgi -p ~/opt/vgtools

and a quick test shows

tux01:~$ ~/opt/vgtools/bin/odgi
odgi: dynamic succinct variation graph tool, version #<procedure version ()>

usage: /home/pjotr/opt/vgtools/bin/odgi <command> [options]

main mapping and calling pipeline:
  -- build         build dynamic succinct variation graph
  -- stats         describe the graph and its path relationships
  -- sort          sort a variation graph
  -- view          projection of graphs into other formats
  -- kmers         process and dump the kmers of the graph
  -- unitig        emit the unitigs of the graph
  -- viz           visualize the graph
  -- paths         interrogation and manipulation of paths
  -- prune         prune the graph based on coverage or topological complexity
  -- unchop        merge unitigs into single nodes
  -- normalize     compact unitigs and simplify redundant furcations
  -- subset        extract subsets of the graph as defined by query criteria
  -- bin           bin path information across the graph
  -- matrix        graph topology in sparse matrix form
  -- chop          chop long nodes into short ones while preserving topology
  -- groom         resolve spurious inverting links
  -- layout        use SGD to make 2D layouts of the graph
  -- flatten       project the graph sequence and paths into FASTA and BED
  -- break         break cycles in the graph
  -- pathindex     create a path index for a given graph
  -- panpos        get the pangenome position for a given path and nucleotide position (1-based)
  -- server        start a HTTP server with a given index file to query a pangenome position
  -- version       get the git version of odgi
  -- test          run unit tests

For more commands, type `odgi help`.

Now we can try building a Guix container with

env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix environment -C --ad-hoc odgi
odgi

yes, that works too. Great, now we package a Docker image

env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix pack -f docker odgi

which created a container in /gnu/store/d68qyyvqchlgq3lzh3qgmlg9k42c9yas-docker-pack.tar.gz of size 30MB. Tiny!

After installing docker (part of GNU Guix) you can test

docker load --input d68qyyvqchlgq3lzh3qgmlg9k42c9yas-docker-pack.tar.gz
docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
odgi                latest              5351dc5d4fc8        50 years ago        102MB

docker run 5351dc5d4fc8 odgi
  odgi: dynamic succinct variation graph tool, version #<procedure version ()>
  etc.

It works! Then a request came in to add bash and coreutils. So I made a slightly larger one, also putting all binaries in the /bin path so that /bin/sh and /bin/odgi work:

env GUIX_PACKAGE_PATH=~/guix-genomics ~/.config/guix/current/bin/guix pack -f docker odgi bash coreutils binutils --substitute-urls="http://guix.genenetwork.org https://berlin.guixsd.org https://ci.guix.gnu.org https://mirror.hydra.gnu.org"  -S /bin=bin

It runs, for example

docker run 0dcb42977ec2 odgi
docker run 0dcb42977ec2 sh
docker run 0dcb42977ec2 /bin/sh
docker run 0dcb42977ec2 /bin/bash -c ls

Next we make it available for general use. I pushed it to IPFS for sharing.

GEMMA

To distribute GEMMA I made static versions of the binary. A container can be made instead with, for example

env GUIX_PACKAGE_PATH=~/guix-bioinformatics ~/.config/guix/current/bin/guix \
  pack -f docker gemma-gn2 -S /bin=bin

which created a container of size 51MB. Tiny!

After installing docker (part of GNU Guix) you can test

docker load --input d68qyyvqchlgq3lzh3qgmlg9k42c9yas-docker-pack.tar.gz
docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
gemma-gn2           latest              ed5bf7499691        50 years ago        189MB
docker run ed5bf7499691 gemma