Docker for Dummies

In my quest for the perfect tool for reproducible science, I thought that the silver bullet was to wrap your code in a neat library/package and make it available to the world. I was wrong. Docker is a much cooler and much more effective way to share your work with a broad audience. This post is a 101 introduction to Docker. I describe what Docker is and show one simple application with a script in Julia.

Why Docker?

Your script works locally, but not on your friend’s laptop because of dependency issues. Docker addresses dependency hell by giving you the opportunity to “ship” your application in a “container”. One can think of a “container” as some sort of lightweight virtual machine that bundles your code together with everything it needs to run. Some technical details can be found here and here. In a nutshell, if your application works on your local machine, Docker helps you put it inside a container. Once in a container, your application should run the same way on any machine that has Docker installed.

Application in Julia

The goal is to create a container with a simple script that calculates an approximation of π. Here I reuse the code from this post, in which I calculated an approximation of π using Monte Carlo simulation. I create a folder julia-app, which contains 3 files (see the GitHub repository with the 3 files here):

julia-app
  ├ app.jl
  ├ deps.jl
  └ Dockerfile

The file app.jl contains the application we want to containerize. The file deps.jl installs the libraries/packages that are used within app.jl. The file Dockerfile is a text document that contains the instructions to build the image from which containers are started. A simple Dockerfile like ours uses 4 types of instructions:

  • FROM: specifies the “base image” to build on. In our case, we want to run an application with Julia. Luckily, we can pull a base image with Julia pre-installed using FROM julia:<julia-version-you-want>
  • COPY: copies files from your machine into the image
  • RUN: executes command(s) at build time and stores the result as a new image layer. RUN is perfect for installing packages
  • CMD: specifies the default command to run when a container starts

Here is the Dockerfile for our application:

# choose a base image
FROM julia:1.0.3

# install julia dependencies
COPY deps.jl /usr/src/app/
RUN julia /usr/src/app/deps.jl

# copy files required for the app to run
COPY app.jl /usr/src/app/

# run the application
CMD ["julia", "/usr/src/app/app.jl"]
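
For readers who do not want to open the repository, here is a minimal sketch of what app.jl could look like. It is not the original file: the function name approximate_pi, the sample size and the seed are my own assumptions, and it only relies on Julia's standard library. The idea is to draw random points in the unit square and count the share that lands inside the quarter circle of radius 1, which approaches π/4.

```julia
# app.jl (hypothetical sketch, not the original file from the repository)
using Random

# Estimate π by drawing n points uniformly in the unit square and
# counting the share that falls inside the quarter circle of radius 1.
function approximate_pi(n::Integer)
    inside = 0
    for _ in 1:n
        x, y = rand(), rand()
        if x^2 + y^2 <= 1
            inside += 1
        end
    end
    return 4 * inside / n
end

Random.seed!(1234)  # assumed seed, only to make repeated runs comparable
println("Approximation of π: ", approximate_pi(1_000_000))
```

Because this sketch needs no external package, a matching deps.jl would have nothing to install. With real dependencies, deps.jl would typically load Pkg and call Pkg.add once for each package used in app.jl.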

To build the image, use the command docker build from the directory that contains the julia-app folder. You can give a name to your image using the --tag (-t) option:

docker build julia-app -t <your-tag>

To run a container from the image you have just built:

docker run --name julia-app <your-tag>

After a few seconds, you should see an approximation of π showing up in your terminal. Voilà, you have just created your first container. The next step is to push your image to Docker Hub. Here is a tutorial on how to do it.

Conclusion

From an academic perspective, Docker addresses dependency hell and helps make research reproducible. I will try to use it more often for sharing my own research.