Luc Tielen – 15 November 2018

Here at Kabisa, we are big fans of Docker. In the past we’ve given workshops on it and written about it in several earlier blog posts.

In this post I’m going to show you how you can keep your Docker images maintainable (and reduce your image size in the process) by making use of multi-stage builds.

Why multi-stage builds?

Docker is great for creating lightweight, reproducible containers in which a program can run (be it in CI or in production). These images often grow quite large over time though, which is far from ideal:

  1. Uploading and downloading large images takes a long time.
  2. Big images often contain unneeded programs and files that could be exploited by an attacker. Small images reduce the number of ways an attacker can compromise your system.
  3. Larger images are harder to maintain: it can be hard to know what is already installed in the image, and to keep all those installed dependencies up to date.

Multi-stage builds address all of these issues by splitting the build process of the image into several smaller stages and copying only what you really need into the final stage of the Dockerfile.

How it works

Dockerfiles traditionally have one FROM instruction as the first instruction, indicating which base image to start building from. It is however possible to use multiple FROM instructions in your build process, one for each stage, and to give each stage a name with AS. Later stages can then copy files over from earlier stages by passing --from=NAME_OF_STAGE to the COPY instruction. A (simple) example will make things clearer:

FROM library/alpine:3.7 AS STAGE1
# Do step 1, ..., step n for everything related to stage 1.
# Here we only create a few files and do nothing else to keep it simple.
RUN mkdir /tmp/dir && \
    touch /tmp/file_a && \
    touch /tmp/file_b && \
    touch /tmp/dir/file_c
FROM library/alpine:3.7 AS FINAL_STAGE
WORKDIR /tmp
# Example of copying a single file:
COPY --from=STAGE1 /tmp/file_a .
# Example of copying an entire directory:
COPY --from=STAGE1 /tmp/dir/ .
# Exec form starts the shell directly instead of wrapping it in "/bin/sh -c":
CMD ["/bin/sh"]

Now you can run the following shell commands to build the image and start a container from it:

$ docker build . -t multi-stage-example
$ docker run -it multi-stage-example
# Inside the container:
/tmp # ls -al

This gives us the following output:

total 8
drwxrwxrwt    1 root     root          4096 Nov  2 16:08 .
drwxr-xr-x    1 root     root          4096 Nov  2 16:09 ..
-rw-r--r--    1 root     root             0 Nov  2 16:07 file_a
-rw-r--r--    1 root     root             0 Nov  2 16:07 file_c

As you can see, it only copied over file_a (from the first COPY instruction) and file_c (from the directory that was copied by the second COPY instruction); file_b stays behind in the first stage. Now that we have multiple stages, it is also possible to stop the build at one specific stage:

docker build . -t multi-stage-example --target=STAGE1

This can be very handy when you are working on a Dockerfile and want to iterate on or debug one specific part of the build process quickly.
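
One detail in the ls output above is easy to miss: file_c ended up directly in /tmp rather than in /tmp/dir. When the source of a COPY is a directory, Docker copies the contents of that directory, not the directory itself. If you want to keep the directory, name it in the destination. A minimal sketch, reusing the stage names from the example above (the ./dir/ destination is chosen here purely for illustration):

```dockerfile
FROM library/alpine:3.7 AS STAGE1
RUN mkdir /tmp/dir && \
    touch /tmp/dir/file_c

FROM library/alpine:3.7 AS FINAL_STAGE
WORKDIR /tmp
# Naming the directory in the destination keeps it intact:
# this results in /tmp/dir/file_c instead of /tmp/file_c.
COPY --from=STAGE1 /tmp/dir/ ./dir/
CMD ["/bin/sh"]
```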

A more complex example can be found here. It uses a multi-stage build (one stage for building the backend, one for building the frontend and a final stage that groups everything together) to great effect: the resulting image is only 11.7 MB!
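
To give an idea of the shape of such a Dockerfile, here is a rough sketch of a backend/frontend/final layout. It is not the Dockerfile from the linked project: the languages (a Go backend, an npm-built frontend), the image tags, the directory names and the paths are all assumptions chosen for illustration.

```dockerfile
# Stage 1: compile the backend.
# Assumes ./backend is a Go module with its main package at the root.
FROM golang:1.21-alpine AS backend
WORKDIR /src
COPY backend/ .
# CGO_ENABLED=0 produces a statically linked binary.
RUN CGO_ENABLED=0 go build -o /out/server .

# Stage 2: bundle the frontend.
# Assumes ./frontend is an npm project whose "build" script writes to dist/.
FROM node:20-alpine AS frontend
WORKDIR /src
COPY frontend/ .
RUN npm install && npm run build

# Final stage: start from a small base image and copy in only the artifacts.
# The Go toolchain, Node.js and node_modules all stay behind in earlier stages.
FROM alpine:3.19
COPY --from=backend /out/server /usr/local/bin/server
COPY --from=frontend /src/dist/ /srv/www/
CMD ["/usr/local/bin/server"]
```

Only the last stage ends up in the tagged image, so its size is roughly that of the base image plus whatever the two COPY instructions bring in.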

Conclusion

Multi-stage Docker builds give us an easy way of building compact images by decomposing the build process into small, focused build steps. They also improve the maintainability and security of the resulting image by only including what is necessary in the final output.

If you have any questions after reading this blog post, feel free to reply in the comments.