Some time ago I decided to start a fun project, the Wheel of WebAssembly, to demonstrate how different programming languages can be compiled to WebAssembly and loaded independently on the same web page. It supports languages like C, AssemblyScript, C#, Java, Kotlin, Go, and Rust (I call them wheel parts). Initially, I manually configured the toolchain for each programming language on my machine and then compiled all the wheel parts in one go. This worked fine for a while, but over time it became complicated to handle updates for each toolchain. Furthermore, when someone else wanted to play around with the project, they had to go through the same pain of manually configuring their environment. This led me to start using Docker for building the WebAssembly output for each language.
One Dockerfile builds ’em all
Docker helped me set up all the prerequisites to build each language wheel part in an automated way. Using a configuration file, the Dockerfile, which is part of the project's git repository, anyone running Docker could build and run the project from scratch using these commands:
```shell
docker build . -t wasm-wheel
docker run -p 8080:8080 -t wasm-wheel:latest
```
My first Dockerfile strategy was the same: configure each toolchain on the same Docker image and build all wheel parts at the end.
```dockerfile
FROM ubuntu:xenial

# Configure Java toolchain
# Configure Go toolchain
# ...

RUN npm run build

EXPOSE 8080
CMD ["npm", "run", "serve"]
```
Although it worked fine, the generated Docker image was more than 2 GB in size. I suppose this is too much if you want to share your image on Docker Hub.
Docker multi-stage builds
Talking to friends and colleagues, I heard about the Docker multi-stage build pattern. Alex Ellis has written an insightful post about it, but the general idea is to create temporary Docker images that produce some kind of output, which you can later use in your final image. The temporary images are discarded, while your final image remains. This applies exactly to my scenario: I need to produce a .wasm file for each wheel part (each written in a different programming language) and then gather all .wasm files in the same folder to serve the web app using Node.js. As I am only interested in the final output, the .wasm file, I do not need to keep the toolchain configuration used to build each wheel part. My second Dockerfile looks like this:
```dockerfile
FROM openjdk:8-jdk AS wheel-part-java
# Build Java wheel part

FROM mono:5.14 AS wheel-part-csharp
# Build C# wheel part

# ...

# Build the web app with the tiniest image running Node.js
FROM node:8-alpine

# Build only the metadata and wasm loaders,
# the wasm is already built for us
RUN npm run build -- metadata loaders

# Copy the wasm from each temporary image to the final one
COPY --from=wheel-part-java output build/wasm
COPY --from=wheel-part-csharp output build/wasm
# ...

EXPOSE 8080
CMD ["npm", "run", "serve"]
```
On Docker Hub one can find official images for each toolchain, so my job is only to compile each wheel part's source to WebAssembly and expose it in an output folder. At the end I can copy all generated .wasm files into my web app and serve it. By applying this pattern, the final Docker image size decreased to 113MB.
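To make the pattern concrete, here is a rough sketch of what one such temporary stage could look like for a C wheel part. The emsdk image tag, folder names, and compiler flags are illustrative assumptions, not taken from the project:

```dockerfile
# Illustrative sketch of a single temporary stage (not the project's actual code).
# The official emscripten/emsdk image ships the C-to-WebAssembly toolchain.
FROM emscripten/emsdk:3.1.45 AS wheel-part-c
WORKDIR /src
COPY wheel-parts/c/ .
# Compile the source to a .wasm file in an output folder,
# ready to be picked up later via COPY --from=wheel-part-c.
RUN mkdir -p output && emcc main.c -O3 -o output/wheel-part-c.wasm
```

The stage is thrown away after the build; only whatever a later `COPY --from=wheel-part-c` pulls out of it survives into the final image.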
Separation of concerns
I like keeping related things together in the same folder. It gives me a good overview and makes it easy to update multiple files when necessary. With a single Dockerfile containing all build configurations, one has to remember to touch files in different folders (each wheel part has its own folder) whenever a build configuration changes. That is why I have split the global Dockerfile into separate Dockerfiles – one for each wheel part. I have then introduced a command to build the global Dockerfile by concatenating all individual Dockerfiles. The rest is the same – one just has to build the Docker image. You can see the final result on the official GitHub repository.
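The concatenation step can be sketched roughly like this. The folder layout (`parts/...`), file names, and fragment contents below are illustrative assumptions, not the project's actual structure:

```shell
# Hypothetical layout: each wheel part folder keeps its own Dockerfile fragment.
mkdir -p parts/java parts/csharp
printf 'FROM openjdk:8-jdk AS wheel-part-java\n# Build Java wheel part\n' \
  > parts/java/Dockerfile
printf 'FROM mono:5.14 AS wheel-part-csharp\n# Build C# wheel part\n' \
  > parts/csharp/Dockerfile
# The web app's final stage lives in its own fragment.
printf 'FROM node:8-alpine\nEXPOSE 8080\n' > Dockerfile.webapp

# Concatenate all fragments into the single multi-stage Dockerfile
# that `docker build` consumes.
cat parts/*/Dockerfile Dockerfile.webapp > Dockerfile
```

Each fragment stays next to the wheel part it builds, and the generated global Dockerfile never has to be edited by hand.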
Conclusion
Docker multi-stage builds provide a way to build artifacts for use in the final Docker image, and hence decrease the size of that final image.