Jérôme Petazzoni

Create lightweight Docker containers with Buildroot


Highlights of this article (TL;DR): we’ll show how to use buildroot to create a basic but fully functional container using less than 4 MB of disk space (uncompressed). Then we will apply the same technique to obtain a PostgreSQL image which fits in less than 20 MB (not including your databases, of course).

You can play with those containers at once if you want. Just run “docker run jpetazzo/pglite”, and within seconds, you will have a PostgreSQL server running on your machine!

I like containers, because they are lighter than virtual machines. This means that they will use less disk space, less memory, and ultimately be cheaper and faster than their heavier counterparts. They also boot much faster. Great.

But how “lightweight” is “lightweight”? I wanted to know. We all wanted to know. We already have a small image, docker-ut, using a statically compiled busybox (it’s built using this script). It uses about 7 MB of disk space, and is only good to run simple shell scripts; but it is fully functional, and perfect for Docker unit tests.

How can we build something even smaller? And how can we build something more useful (e.g., a PostgreSQL server), but with a ridiculously low footprint?

To build really small systems, you have to look at embedded systems. That’s where you find the experts in everything small-footprint and space-efficient. In the world of embedded systems, you sometimes have to cram a complete system, including Linux kernel, drivers, startup scripts, essential libraries, web and SSH servers, Wi-Fi access point management code, RADIUS server, OpenVPN client, and BitTorrent downloader, all in 4 MB of flash. Sounds like what we need, right?

There are many tools out there to build images for embedded systems. We decided to use buildroot. Quoting buildroot’s project page: “Buildroot is a set of Makefiles and patches that makes it easy to generate a complete embedded Linux system.” Let’s put it to the test!

The first step is to download and unpack buildroot:

curl http://buildroot.uclibc.org/downloads/buildroot-2013.05.tar.bz2 | tar jx

Buildroot itself is rather small, because it doesn’t include the source of all the things that it compiles. It will download those later. Now let’s dive in:

cd buildroot-2013.05/

The first thing is to tell buildroot what we want to build. If you have ever built your own kernel, this step will look familiar:

make menuconfig

For now, we will change just one thing: tell buildroot that we want to compile for a 64-bit target. Go to the “target architecture” menu, and select x86_64. Then exit (save along the way). Now brew a big pot of coffee, and fire up the build:

make
This will take a while (from 10 minutes to a couple of hours, depending on how beefy your machine is). The build takes so long because it first compiles a toolchain: instead of using your default compiler and libraries, buildroot will download and compile a preset version of gcc, download and compile uclibc (a small-footprint libc), and then use those to compile everything else. This sounds like a lot of extra work, but it brings two huge advantages:

  • if you want to build for a different architecture (e.g. the Raspberry Pi), it will work exactly the same way;
  • it abstracts your local compiler: your version of gcc/clang/other is irrelevant, since your image will be built by the versions fixed by buildroot anyway.
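For reference, the target-architecture choice made in menuconfig boils down to a single line in the .config file that buildroot generates at the top of the tree (BR2_x86_64 is buildroot’s Kconfig symbol for the x86_64 target):

```
BR2_x86_64=y
```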

At the end of the build, our minimalist container is ready! Let’s have a look:

cd output/images
ls -l

You should see a small, lean rootfs.tar file, containing the image to be imported into Docker. But it’s not quite ready yet. We need to fix a few things.

  • Docker sets the DNS configuration by bind-mounting over /etc/resolv.conf. This means that /etc/resolv.conf has to be a standard file. By default, buildroot makes it a symlink. We have to replace that symlink with a file (an empty file will do).
  • Likewise, Docker “injects” itself within containers by bind-mounting over /sbin/init. This means that /sbin/init should be a regular file as well. By default, buildroot makes it a symlink to busybox. We will change that, too.
  • Docker injects itself within containers, and (as I write this) it is dynamically linked. This means that it requires a couple of libraries to run correctly. We will need to add those libraries to the container.

(Note: Docker will eventually switch to static linkage, which means that the last step won’t be necessary anymore.)

We could unpack the tar file, make our changes, and repack it; but that would be boring. So instead, we will be fancy and update the tarball on the fly.

Let’s create an extra directory, and populate it with those “additions”:

mkdir extra extra/etc extra/sbin extra/lib extra/lib64
touch extra/etc/resolv.conf
touch extra/sbin/init
cp /lib/x86_64-linux-gnu/libpthread.so.0 /lib/x86_64-linux-gnu/libc.so.6 extra/lib
cp /lib64/ld-linux-x86-64.so.2 extra/lib64

The paths to the libraries might be different on your machine. If in doubt, you can run ldd $(which docker) to see which libraries are used by your local Docker install.
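For instance, running ldd against any dynamically linked binary prints this kind of dependency list (the path to your docker binary and the exact libraries will differ from system to system; /bin/sh is used here only so the example runs anywhere):

```shell
# List the shared libraries a dynamically linked binary depends on.
# Substitute the path to your docker binary; /bin/sh is just a stand-in.
ldd /bin/sh
```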

Then, create a new tarball including those extra files:

cp rootfs.tar fixup.tar
tar rvf fixup.tar -C extra .
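The “r” (append) mode of tar is what makes this work without unpacking anything. Here is a self-contained sketch of the same trick with throwaway files (all names are illustrative):

```shell
# Self-contained demo of appending a directory tree to an existing
# tarball with tar's append mode (file names are illustrative):
cd "$(mktemp -d)"
mkdir -p base extra/etc
touch base/original.txt extra/etc/resolv.conf
tar cf rootfs.tar -C base .     # the "image" we start from
cp rootfs.tar fixup.tar
tar rf fixup.tar -C extra .     # append the extra files in place
tar tf fixup.tar                # both trees are now in the tarball
```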

Last but not least, the “import” command will bring this image into Docker. We will name it “dietfs”:

docker import - dietfs < fixup.tar

We’re done! Let’s make sure that everything worked properly, by creating a new container with this image:

docker run -t -i dietfs /bin/sh

For what it’s worth, I put together a small fixup script on Gist, to automate those steps, so you can also execute it like this:

curl https://gist.github.com/jpetazzo/b932fb0c753e69c73d31/raw > fixup.sh
sh fixup.sh

The result is a rather small image, at less than 3.5 MB:

REPOSITORY           TAG        ID                CREATED         SIZE
jpetazzo/busybox     latest     0c0468ea37af      5 days ago      3.389 MB (virtual 3.389 MB)

Now, how do we build something more complex, like a PostgreSQL server?

Why PostgreSQL? Two reasons. One: it’s awesome. Two: I didn’t find a PostgreSQL package in buildroot, so it was an excellent opportunity to learn how to include something “from scratch”, as opposed to merely ticking a checkbox and recompiling away.

First, we want to create a directory for our new package. From buildroot’s top directory:

mkdir package/postgres

Then, we need to put a couple of files in that directory. For your convenience, I stored them on Gist:

curl https://gist.github.com/jpetazzo/5819538/raw/Config.in > package/postgres/Config.in
curl https://gist.github.com/jpetazzo/5819538/raw/postgres.mk > package/postgres/postgres.mk

Let’s have a look at those files now. First, Config.in: it is used by make menuconfig to display a checkbox for our new package (yay!), but also to define some build dependencies. In that case, we need IPV6 support.

config BR2_PACKAGE_POSTGRES
    bool "postgres"
    depends on BR2_INET_IPV6
    help
      PostgreSQL server

comment "postgres requires a toolchain with IPV6 support enabled"
    depends on !BR2_INET_IPV6

How does one know which dependencies to use? I confess that I first tried with no dependencies at all. The build failed, so I had a look at the error messages, saw that it complained about missing IPV6 headers, and fixed the issue by adding the required dependency.

The other file, postgres.mk, contains the actual build instructions:

# postgresql
POSTGRES_VERSION = 9.2.4
POSTGRES_SOURCE = postgresql-$(POSTGRES_VERSION).tar.bz2
POSTGRES_SITE = http://ftp.postgresql.org/pub/source/v$(POSTGRES_VERSION)/$(POSTGRES_SOURCE)
POSTGRES_DEPENDENCIES = readline zlib
POSTGRES_CONF_OPT = --with-system-tzdata=/usr/share/zoneinfo

$(eval $(autotools-package))

As you can see, it is pretty straightforward. The main thing is to define some variables to tell buildroot where it should fetch the PostgreSQL source code. We don’t have to provide actual build instructions, because PostgreSQL uses autotools. (“This project uses autotools” means that you typically compile it with “./configure && make && make install”; this probably rings a bell if you have ever compiled a significant project manually on any kind of UNIX system!)

The build instructions will actually be expanded from the last line. If you want more details about buildroot’s operation, have a look at buildroot’s autotools package tutorial.

We can see that postgres.mk also defines more dependencies: readline and zlib. So what’s the difference between the CONF_OPT, DEPENDENCIES, and the “depends” previously seen in Config.in?

  • CONF_OPT provides extra flags which will be passed to ./configure. In this case, the compilation was failing, telling me that I should specify the path to timezone data. I looked around and figured out the right flag.
  • DEPENDENCIES tells buildroot to compile extra libraries before taking care of our package. Guess what: when I tried to compile, it failed and complained about missing readline and zlib; so I added those dependencies and that’s it.
  • “depends” in Config.in is a toolchain dependency. It is not really a library; it merely tells buildroot “hey, when you compile uclibc, make sure to include IPV6 support, will you?”. It has a strong implication: when you change the configuration of the toolchain (C library or compiler), you have to recompile everything: the toolchain, and everything which was compiled with it. This obviously takes longer than recompiling a single package. It is done with the command make clean all.

Last but not least, we need to include our Config.in file in the top-level Config.in. The quick and dirty way is to do this (from buildroot top directory):

echo 'source "package/postgres/Config.in"' >> Config.in

Note: normally, we should do this in a neat submenu section within e.g. package/Config.in. But this way will save us some hassle navigating through the menus.

Alright, now run make menuconfig again; go to “Toolchain”, enable IPV6 support, go back to the main menu, and enable “postgres”. Now recompile everything with make clean all. This will take a while.

Just like before, we need to “fixup” the resulting image:

cd output/images
curl https://gist.github.com/jpetazzo/b932fb0c753e69c73d31/raw | sh

We now have a Docker image with PostgreSQL in it; but it is not enough. We still need to set up the image to start PostgreSQL automatically, and even before that, PostgreSQL will have to initialize its data directory (with initdb). We will use a Dockerfile and a custom script for that.

What’s a Dockerfile? A Dockerfile contains basic instructions telling Docker how to build an image. When you use Docker for the first time, you will probably use “docker run” and “docker commit” to create new images; but you should quickly move to Dockerfiles and “docker build” because it automates those operations and makes it easier to share “recipes” to build images.

Let’s start with the custom script. We want this script to run automatically within the container when it starts. Make a new empty directory, and create the following init file in it:

#!/bin/sh
set -e
mkdir /usr/share/zoneinfo /data
chown default /data
head -c 16 /dev/urandom | sha1sum | cut -c1-10 > /pwfile
echo "PG_PASSWORD=$(cat /pwfile)"
su default -s /usr/bin/initdb -- --pgdata=/data --pwfile=/pwfile --username=postgres --auth=trust >/dev/null
echo "host all all 0.0.0.0/0 md5" >> /data/pg_hba.conf
exec su default -s /usr/bin/postgres -- -D /data -c 'listen_addresses=*'

PostgreSQL will refuse to run as root, so we use the default user (conveniently provided by buildroot). We create /data to hold PostgreSQL’s data files, and assign it to the non-privileged user. We also generate a random password, save it to /pwfile, and display it (to make it easier to retrieve later). We can then run initdb to actually create the data files. Then, we extend pg_hba.conf to authorize connections from the network (by default, only local connections are allowed). The last step is to actually start the server.
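The password generation line can be tried in isolation: it grabs 16 random bytes, hashes them, and keeps the first 10 hex characters.

```shell
# Same pipeline as in the init script, run stand-alone:
PW=$(head -c 16 /dev/urandom | sha1sum | cut -c1-10)
echo "$PW"    # 10 lowercase hex characters, e.g. 4e68b1958c
```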

Make sure that the script is executable:

chmod +x init

Now, in the same directory, we will create the following Dockerfile, to actually inject the previous script in a new image:

FROM dietfs
ADD . /
EXPOSE 5432
CMD /init

The fixup.sh script has imported our image under the name “dietfs”, so our Dockerfile starts with FROM dietfs, telling Docker that we want to use that image as a base. Then, we add all the files in the current directory to the root of our image. This will also inject the Dockerfile itself, but we don’t care. We expose TCP port 5432, and finally tell Docker that by default, when a container is created from this image, it should run our /init script. You can read more about the Dockerfile syntax in Docker’s documentation.

The next step is to build the new image using our Dockerfile:

docker build -t pglite .

That’s it. You can now start a new PostgreSQL instance:

docker run pglite

The output will include the password, and then the first log messages from the server:

LOG:  database system was shut down at 2013-06-20 03:55:50 UTC
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

Weak Password Is Weak! Our password is random, but it only includes hexadecimal digits (i.e. [0-9a-f]). You can make it better by including base64 in the image, and using base64 instead of sha1sum. Alternatively, you can use longer passwords.
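A sketch of the stronger variant (it assumes the base64 utility is available in the image, which the dietfs build does not include by default):

```shell
# Draw more entropy and keep a mixed-case alphanumeric password;
# '=', '+' and '/' are stripped so the result is safe to paste anywhere.
PW=$(head -c 64 /dev/urandom | base64 | tr -d '=+/\n' | cut -c1-16)
echo "$PW"
```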

Take note of the password. It’s OK to hit “Ctrl-C” now: the container will still run in the background. Let’s check which port was allocated for our container. docker ps will show us all the containers currently running; but to make things even simpler, we will use docker ps -l, which only shows the latest container.

$ docker ps -l
ID              IMAGE           COMMAND             CREATED              STATUS              PORTS           SIZE
e21ba744ff09    pglite:latest   /bin/sh -c /init    About a minute ago   Up About a minute   49168->5432     23.53 MB (virtual 39.87 MB)

Alright, that’s port 49168. Does it really work? Let’s check for ourselves! You can try locally if you have a PostgreSQL client installed on your Docker machine; or from anywhere else (just replace “localhost” with the hostname or IP address of your Docker machine).

$ psql postgres --host localhost --port 49168 --username postgres
Password for user postgres: 4e68b1958c
psql (9.1.3, server 9.2.4)
WARNING: psql version 9.1, server version 9.2.
         Some psql features might not work.
Type "help" for help.

postgres=# \q

A small note about sizes: the image takes about 16 MB, but the data files take almost 24 MB. So the total footprint is really about 40 MB.

What if we want to automate the creation of our PostgreSQL container, to run our own PostgreSQL-as-a-Service platform? Easy, with just a tiny bit of shell trickery!

CONTAINERID=$(docker run -d pglite)
while ! docker logs $CONTAINERID 2>/dev/null | grep -q ^PG_PASSWORD= ; do sleep 1 ; done
eval $(docker logs $CONTAINERID 2>/dev/null)
PG_PORT=$(docker port $CONTAINERID 5432)
echo "A new PostgreSQL instance is listening on port $PG_PORT. The admin user is postgres, the admin password is $PG_PASSWORD."
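The eval line works because our init script prints a line of the form PG_PASSWORD=... to the container log. The same mechanism can be seen in isolation (the log content is simulated here, so no container is needed):

```shell
# Simulate the container log and harvest the password the way the
# provisioning snippet does (the log line format comes from our init script):
fake_logs() { echo "PG_PASSWORD=4e68b1958c"; }
eval "$(fake_logs | grep '^PG_PASSWORD=')"
echo "$PG_PASSWORD"    # prints 4e68b1958c
```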

That’s it! If you name your image “yourname/pglite” instead of just “pglite”, you will be able to “docker push” it to the Docker Public Registry, and to “docker pull” it from any other Docker host anywhere in the world. You are one PHP script away from setting up your own PostgreSQL-as-a-Service provider 🙂

Extra notes: if you run into weird issues with casing (e.g. the xt_CONNMARK.h issue mentioned in the comments), check whether you are building from a Vagrant/VirtualBox shared folder. If so, try again from a “local” volume (i.e. not a shared folder) and see if it works better! Thanks to Bryan Murphy for reporting this.

About Jérôme Petazzoni


Jérôme is a senior engineer at dotCloud, where he rotates between Ops, Support and Evangelist duties and has earned the nickname of “master Yoda”. In a previous life he built and operated large scale Xen hosting back when EC2 was just the name of a plane, supervised the deployment of fiber interconnects through the French subway, built a specialized GIS to visualize fiber infrastructure, specialized in commando deployments of large-scale computer systems in bandwidth-constrained environments such as conference centers, and various other feats of technical wizardry. He cares for the servers powering dotCloud, helps our users feel at home on the platform, and documents the many ways to use dotCloud in articles, tutorials and sample applications. He’s also an avid dotCloud power user who has deployed just about anything on dotCloud – look for one of his many custom services on our GitHub repository.

Connect with Jérôme on Twitter! @jpetazzo


9 thoughts on “Create lightweight Docker containers with Buildroot”

  1. “A small note about sizes: the image takes about 16 MB, but the data files take almost 24 MB. So the total footprint is really about 40 MB.”

    What are the “data files” on a fresh install of Postgres?

    • Jerome Petazzoni

The “data files” are essentially the empty databases created by Postgres, and the transaction log. Even though the databases are empty, they need some space (because they are not technically empty: they contain a skeleton for tables, indexes, etc.). Regarding the transaction log, it uses fixed-size segments, which are pre-allocated. It’s possible to change the size of the segments at compile time, but I thought it wasn’t worth the trouble. (Hey, it’s not as bad as MongoDB, which allocates 2 GB per database, “just in case” :D)

  2. Awesome article. I’ve googled for an answer to this problem, but haven’t found one, so thought I would ask you. When I follow the instructions you gave, I get this error during the “make”:

    make[1]: Entering directory `/vagrant/tmp/buildroot-2013.05/output/toolchain/linux-3.9.4'
      CHK     include/generated/uapi/linux/version.h
      UPD     include/generated/uapi/linux/version.h
      HOSTCC  scripts/basic/fixdep
      WRAP    arch/x86/include/generated/asm/clkdev.h
      SYSHDR  arch/x86/syscalls/../include/generated/uapi/asm/unistd_32.h
      HOSTCC  arch/x86/tools/relocs
      SYSHDR  arch/x86/syscalls/../include/generated/uapi/asm/unistd_64.h
      SYSHDR  arch/x86/syscalls/../include/generated/uapi/asm/unistd_x32.h
      SYSTBL  arch/x86/syscalls/../include/generated/asm/syscalls_32.h
      HOSTCC  scripts/unifdef
      INSTALL include/drm (15 files)
      INSTALL include/asm-generic (35 files)
      INSTALL include/linux/byteorder (2 files)
      INSTALL include/linux/caif (2 files)
      INSTALL include/linux/can (5 files)
      INSTALL include/linux/dvb (8 files)
      INSTALL include/mtd (5 files)
      INSTALL include/rdma (6 files)
      INSTALL include/linux/hdlc (1 file)
      INSTALL include/linux/hsi (1 file)
      INSTALL include/linux/isdn (1 file)
      INSTALL include/scsi/fc (4 files)
      INSTALL include/linux/mmc (1 file)
    /vagrant/tmp/buildroot-2013.05/output/toolchain/linux-3.9.4/scripts/Makefile.headersinst:50: *** Missing UAPI file /vagrant/tmp/buildroot-2013.05/output/toolchain/linux-3.9.4/include/uapi/linux/netfilter/xt_CONNMARK.h.  Stop.
    make[3]: *** [netfilter] Error 2
    make[2]: *** [linux] Error 2
    make[2]: *** Waiting for unfinished jobs....
      INSTALL include/scsi (3 files)
    make[1]: *** [headers_install] Error 2
    make[1]: Leaving directory `/vagrant/tmp/buildroot-2013.05/output/toolchain/linux-3.9.4'
    make: *** [/vagrant/tmp/buildroot-2013.05/output/toolchain/linux/.configured] Error 2

    This is using a Vagrant box (VirtualBox) with Ubuntu 13.04 installed.

  3. Problem solved. There was a case mismatch for many of the header files. Just had to edit the Kbuild file to change the expected case, and that solved it.

    • Strange… after solving the case-mismatch, another error occurs. Not sure how to diagnose this one:

        AR cr lib/uclibc_nonshared.a
        STRIP -x -R .note -R .comment lib/uclibc_nonshared.a
        AR cr libc/libc_so.a
        STRIP -x -R .note -R .comment libc/libc_so.a
        LD libuClibc-
      /vagrant/tmp/buildroot/output/host/usr/lib/gcc/x86_64-buildroot-linux-uclibc/4.7.3/../../../../x86_64-buildroot-linux-uclibc/bin/ld: libc/libc_so.a(jmp-unwind.oS): relocation R_X86_64_PC32 against undefined symbol `__GI___pthread_cleanup_upto' can not be used when makin
      g a shared object; recompile with -fPIC
      /vagrant/tmp/buildroot/output/host/usr/lib/gcc/x86_64-buildroot-linux-uclibc/4.7.3/../../../../x86_64-buildroot-linux-uclibc/bin/ld: final link failed: Bad value
      collect2: error: ld returned 1 exit status
      make[1]: *** [lib/libc.so] Error 1
      make[1]: Leaving directory `/vagrant/tmp/buildroot/output/build/uclibc-'
      make: *** [/vagrant/tmp/buildroot/output/build/uclibc-] Error 2
      • Jerome Petazzoni

        Very weird. Are you using the same version of buildroot?
        Did you enable anything special?

        • Strange… rebuilding the VM made everything work, but then it still occurred intermittently after that.

          Dietfs, without the Postgres packages, seems useful, so I pushed that to https://index.docker.io/u/greglearns/dietfs/ in case others want to use it.

          Jerome, thank you so much for writing this up. It was great getting your guidance on this.

  4. Larry Brigman

    In the fixup section of dietfs you are copying files from your host file system instead of the buildroot build output. The libraries might not be the same between the two.

  5. Hello,

    very good article. I’m interested in running docker on a number of embedded systems. These systems all run Linux, but not something I can or should modify at the Kernel level – so I want to understand how to *add* docker on top of that.

    can you give instructions for this, or at least a general explanation?

