If you write code for a living, you’ve probably heard of Docker by now. You might have played around with it, and if you’re lucky, you may even have had the chance to use it to deploy systems in production. But have you made the leap to contributing to the project?
There are many benefits to contributing to a popular open-source project like Docker:
- You earn recognition for improving a project used by many people.
- You get to collaborate with other amazingly smart people in the open-source community.
- You become a better programmer yourself through the process of understanding and improving an important system.
But getting started on a new codebase can be daunting. Docker has many, many lines of code. Fixing even the smallest issue can require reading through a lot of that code and understanding how the pieces all fit together.
But it’s also not as difficult as you might think. You can follow Docker’s Contributor guide to get a development environment set up. Then follow these 5 simple steps to dive into a new codebase (with interactive code snippets to guide you along the way). The skills you hone doing so will come in handy on every new project you encounter over the course of your programming life. So what are you waiting for? Here they are:
Step 1: Start at the main function
Start with what you know, as the old saying goes. If you’re like most Docker users, you probably mainly use the Docker CLI. So let’s start with the entry point into that program: the `main` function.
For the remainder of this post, we’ll use a site called Sourcegraph, which the Docker team uses to search and browse code on the web as you would in an intelligent IDE. To follow along, it may be easiest to open a second browser window to Sourcegraph and hop back and forth between that and this post.
We’re looking for the `main` function corresponding to the `docker` command, which is the one in the `docker/docker/docker.go` file. Clicking on that search result, we jump to its definition. Take some time to read through this function.
At the top of the `main` function, we see a lot of code related to setting up logging, reading command flags, and initializing defaults. At the bottom, we find a call to `client.NewDockerCli`, which seems to be responsible for creating the struct whose methods do all the actual work. Let’s issue a search query for `NewDockerCli`.
Step 2: Get to the core
In many applications and libraries, there are one or two key interfaces that describe the core functionality or essence. Let’s try to get there from where we are now.
Clicking on the `NewDockerCli` search result, we arrive at the definition of the function. Since what we’re interested in is the struct that the function returns, `DockerCli`, let’s click on the return type to jump to its definition.
Clicking on `DockerCli` brings us to its definition. Scrolling down through this file, we see its methods. `Cmd` looks noteworthy. It’s the only method with a docstring, and the docstring suggests that it’s the core method for executing each Docker command.
Step 3: Dive deep
Now that we’ve found `DockerCli`, the core “controller” of the Docker client, let’s dive into how one of the specific Docker commands works. Let’s zoom in on `docker build`.
Reading the implementation of `DockerCli.Cmd` shows that it calls `DockerCli.getMethod` to invoke the function corresponding to each Docker command. Jumping to `DockerCli.getMethod`, we see that this is accomplished by the dynamic invocation of a method whose name is the string `"Cmd"` prepended to the name of the Docker command. So in the case of `docker build`, we’re looking for `DockerCli.CmdBuild`. No such method is defined in this file, so let’s search for `CmdBuild`.
Indeed, the search results show there is a method `CmdBuild` on `DockerCli`, so let’s select the result to jump to its definition. The `DockerCli.CmdBuild` method body is rather long to inline in this blog post, but here it is for reference.
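If dynamic method lookup is new to you, the following sketch shows one way to do name-based dispatch with Go’s `reflect` package. It only illustrates the technique described above and is not copied from Docker: the `Cli` type and its toy `CmdBuild` and `CmdPs` handlers are hypothetical, and the real `getMethod` may differ in its details.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Cli is a hypothetical stand-in for DockerCli.
type Cli struct{}

// CmdBuild is a toy handler for the "build" command.
func (c *Cli) CmdBuild(args ...string) error {
	fmt.Println("building", strings.Join(args, " "))
	return nil
}

// CmdPs is a toy handler for the "ps" command.
func (c *Cli) CmdPs(args ...string) error {
	fmt.Println("listing containers")
	return nil
}

// getMethod builds the method name "Cmd" + capitalized command name and
// resolves it by reflection. It reports false if no such method exists.
func (c *Cli) getMethod(name string) (func(...string) error, bool) {
	if name == "" {
		return nil, false
	}
	methodName := "Cmd" + strings.ToUpper(name[:1]) + strings.ToLower(name[1:])
	method := reflect.ValueOf(c).MethodByName(methodName)
	if !method.IsValid() {
		return nil, false
	}
	fn, ok := method.Interface().(func(...string) error)
	return fn, ok
}

// Cmd looks up the handler for the first argument and calls it with the rest.
func (c *Cli) Cmd(args ...string) error {
	if len(args) == 0 {
		return fmt.Errorf("no command given")
	}
	method, ok := c.getMethod(args[0])
	if !ok {
		return fmt.Errorf("unknown command %q", args[0])
	}
	return method(args[1:]...)
}

func main() {
	cli := &Cli{}
	if err := cli.Cmd("build", "."); err != nil {
		fmt.Println("error:", err)
	}
	if err := cli.Cmd("frobnicate"); err != nil {
		fmt.Println("error:", err)
	}
}
```

One consequence of this design is worth remembering while you read code: a handler like `CmdBuild` is never called by name anywhere, so a plain text search for call sites will come up empty. Searching for the definition is the way in.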
There’s a lot going on here. At the top of the method, we see code dealing with a variety of input methods for the Dockerfile and configuration. Oftentimes, a good strategy for reading through a long method is to work backwards. Start at the bottom and look at what the method does at the very end. In many cases, that’s the meat of the method and everything before is just setup for completing that core action.
At the bottom of `CmdBuild`, we see a `POST` request made via `cli.stream`. Jumping through a few more definitions, we arrive at `DockerCli.clientRequest`, which constructs an HTTP request that contains the information you pass to Docker via `docker build`. So at the end of the day, all `docker build` does is issue a fancy `POST` request to the Docker daemon. You could try replicating its behavior with `curl` if you really wanted.
Now that we’ve understood a single Docker client command through and through, you might be interested in diving deeper still and finding where the daemon receives the request and following it all the way down to its interaction with Libcontainer and the kernel. That’s certainly a valid route, but we leave that for now as an exercise to the reader. Instead, let’s get a broader understanding of the key components of the client.
Step 4: Look at usage examples
One way of better understanding a piece of code is to look at examples of how it is used. Let’s go back to the `DockerCli.clientRequest` method. In the right-hand side panel on Sourcegraph, we can page through usage examples of this method. It turns out this method is used in multiple places, since most of the Docker client commands result in HTTP requests issued to the daemon.
In order to fully understand a piece of code, you need to understand both how it works and how it’s used. Jumping to definition lets us understand the former by walking “forward” along the graph of code, while looking at usage examples covers the latter by walking “backward”.
Try this out for a few more functions and methods to understand how they’re interconnected. If it’s helpful, draw a picture of how various components of the application interact with one another.
Step 5: Select an issue and start coding!
Now that you have a decent picture of the Docker codebase as a whole, take a look at the issue tracker to see what needs working on, and reach out to members of the Docker community with questions you aren’t able to answer yourself. Because you’ve taken the time to explore and understand the code, you’ll be better equipped to ask smart questions and know where specific issues fit into the broader picture.
And if you feel up for it, take notes along the way, document your experience, and write it up as a blog post like this one. The Docker team would love to hear about your experience diving into their code.
One of the things that most often prevents people from getting involved in a project is the daunting task of jumping into a large, foreign codebase. As programmers, we tend to assume that the hard work lies in writing code, but reading and understanding other people’s code is usually the critical first step. Recognizing that and approaching the task in a principled way, armed with good tools for doing so, will help you conquer the psychological barrier of diving into the code.
So make the leap and check out Docker’s source today. A vibrant open-source community and codebase await you!
Author Photo and Bio
Beyang Liu is a programmer who likes building products that help people be better creators. Before co-founding Sourcegraph, he worked on data analysis and visualization at Palantir and researched computer vision algorithms in Professor Daphne Koller’s lab at Stanford. You can find Beyang on Twitter, Sourcegraph, and GitHub.