Thea Lamkin

Start contributing to Docker in 5 easy steps

If you write code for a living, you’ve probably heard of Docker by now. You might have played around with it, and if you’re lucky, you may even have had the chance to use it to deploy systems in production. But have you made the leap to contributing to the project?

There are many benefits to contributing to a popular open-source project like Docker:

  • You earn recognition for improving a project used by many people.
  • You get to collaborate with other amazingly smart people in the open-source community.
  • You become a better programmer yourself through the process of understanding and improving an important system.

But getting started on a new codebase can be daunting. Docker has many, many lines of code. Fixing even the smallest issue can require reading through a lot of that code and understanding how the pieces all fit together.

But it’s also not as difficult as you might think. You can follow Docker’s Contributor guide to get a development environment set up. Then follow these 5 simple steps to dive into a new codebase (with interactive code snippets to guide you along the way). The skills you hone doing so will come in handy on every new project you encounter over the course of your programming life. So what are you waiting for? Here they are:

Step 1: Start at func main()

Start with what you know, as the old saying goes. If you’re like most Docker users, you probably mainly use the Docker CLI. So let’s start with the entry point into that program: the main function.

For the remainder of this post, we’ll use a site called Sourcegraph, which the Docker team uses to search and browse code on the web as you would in an intelligent IDE. To follow along, it may be easiest to open a second browser window to Sourcegraph and hop back and forth between that and this post.

On Sourcegraph, let’s search for func main() inside the Docker repository.

We’re looking for the main function corresponding to the docker command, which is the one in the docker/docker/docker.go file. Clicking on that search result, we jump to its definition. Take some time to read through this function.

At the top of the main function, we see a lot of code related to setting up logging, reading command flags, and initializing defaults. At the bottom, we find a call to client.NewDockerCli, which seems to be responsible for creating the struct whose methods do all the actual work. Let’s issue a search query for NewDockerCli.
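The overall shape described above (parse flags, build a client struct, hand the remaining arguments to it) can be sketched in a few lines of Go. This is only an illustration and not Docker’s actual code: the cli type, newCLI, run, parseAndRun, and the -debug flag are all hypothetical names chosen for this example.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// cli is a hypothetical stand-in for the struct returned by client.NewDockerCli.
type cli struct {
	debug bool
}

// newCLI plays the role of a constructor: it bundles the parsed options.
func newCLI(debug bool) *cli { return &cli{debug: debug} }

// run receives the arguments left over after flag parsing and
// reports which subcommand it would execute.
func (c *cli) run(args []string) (string, error) {
	if len(args) == 0 {
		return "", fmt.Errorf("no command given")
	}
	return fmt.Sprintf("running %q (debug=%t)", args[0], c.debug), nil
}

// parseAndRun is the testable body of main: read flags, build the cli, delegate.
func parseAndRun(argv []string) (string, error) {
	fs := flag.NewFlagSet("tool", flag.ContinueOnError)
	debug := fs.Bool("debug", false, "enable debug output")
	if err := fs.Parse(argv); err != nil {
		return "", err
	}
	return newCLI(*debug).run(fs.Args())
}

func main() {
	out, err := parseAndRun(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Keeping main this thin is a common pattern: almost everything interesting lives on the struct, which is why following the constructor is a good next move.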

Step 2: Get to the core

In many applications and libraries, there are one or two key interfaces that describe the core functionality or essence. Let’s try to get there from where we are now.

Clicking on the NewDockerCli search result, we arrive at the definition of the function. Since what we’re interested in is the struct that the function returns, DockerCli, let’s click on the return type to jump to its definition.

Clicking on DockerCli brings us to its definition. Scrolling down through this file, we see its methods: getMethod, Cmd, Subcmd, and LoadConfigFile. Cmd looks noteworthy. It’s the only method with a docstring, and the docstring suggests that it’s the core method for executing each Docker command.

Step 3: Dive deep

Now that we’ve found DockerCli, the core “controller” of the Docker client, let’s dive into how one of the specific Docker commands works. Let’s zoom in on docker build.

Reading the implementation of DockerCli.Cmd shows that it calls DockerCli.getMethod to invoke the function corresponding to each Docker command.

In DockerCli.getMethod, we see that this is accomplished by the dynamic invocation of a method whose name is the string "Cmd" prepended to the name of the Docker command. So in the case of docker build, we’re looking for DockerCli.CmdBuild. No such method is defined in this file, so let’s search for CmdBuild.

Indeed, the search results show there is a method CmdBuild on DockerCli, so let’s select the result to jump to its definition. The DockerCli.CmdBuild method body is rather long to inline in this blog post, but here it is for reference.

There’s a lot going on here. At the top of the method, we see code dealing with a variety of input methods for the Dockerfile and configuration. Oftentimes, a good strategy for reading through a long method is to work backwards. Start at the bottom and look at what the method does at the very end. In many cases, that’s the meat of the method and everything before is just setup for completing that core action.

At the bottom of CmdBuild, we see a POST request being made. Jumping through a few more definitions, we arrive at DockerCli.clientRequest, which constructs an HTTP request that contains the information you pass to Docker via docker build. So at the end of the day, all docker build does is issue a fancy POST request to the Docker daemon. You could try replicating its behavior with curl if you really wanted.

Now that we’ve understood a single Docker client command through and through, you might be interested in diving deeper still and finding where the daemon receives the request and following it all the way down to its interaction with Libcontainer and the kernel. That’s certainly a valid route, but we leave that for now as an exercise for the reader. Instead, let’s get a broader understanding of the key components of the client.

Step 4: Look at usage examples

One way of better understanding a piece of code is to look at examples of how that code is used. Let’s go back to the DockerCli.clientRequest method. In the right-hand side panel on Sourcegraph, we can page through usage examples of this method. It turns out this method is used in multiple places, since most of the Docker client commands result in HTTP requests issued to the daemon.

In order to fully understand a piece of code, you need to understand both how it works and how it’s used. Jumping to definition lets us understand the former by walking “forward” along the graph of code, while looking at usage examples covers the latter by walking “backward”.

Try this out for a few more functions and methods to understand how they’re interconnected. If it’s helpful, draw a picture of how various components of the application interact with one another.

Step 5: Select an issue and start coding!

Now that you have a decent picture of the Docker codebase as a whole, take a look at the issue tracker to see what needs working on, and reach out to members of the Docker community with questions you aren’t able to answer yourself. Because you’ve taken the time to explore and understand the code, you’ll be better equipped to ask smart questions and know where specific issues fit into the broader picture.

And if you feel up for it, take notes along the way, document your experience, and write it up as a blog post like this one. The Docker team would love to hear about your experience diving into their code.

Contributing effectively

One thing that often keeps people from getting involved in projects is the daunting prospect of jumping into a large, foreign codebase. We often assume, as programmers, that the hard work lies in writing code, but it’s reading and understanding other people’s code that is the critical first step. Recognizing that and approaching the task in a principled way, armed with good tools for doing so, will help you conquer the psychological barrier of diving into the code.

So make the leap and check out Docker’s source today. A vibrant open-source community and codebase await you!

Author Photo and Bio

Beyang Liu is a programmer who likes building products that help people be better creators. Before co-founding Sourcegraph, he worked on data analysis and visualization at Palantir and researched computer vision algorithms in Professor Daphne Koller’s lab at Stanford. You can find Beyang on Twitter, Sourcegraph, and GitHub.



