Building serverless apps with Docker

Every now and then, a wave of new technology threatens to make the previous generation obsolete. Lately there has been a lot of talk about a technique called “serverless” for writing apps. The idea is to deploy your application as a series of functions, which are called on-demand when they need to be run. You don’t need to worry about managing servers, and the functions scale as much as you need, because they are called on-demand and run on a cluster.

But serverless doesn’t mean there is no Docker – in fact, Docker is a great fit for serverless. You can use Docker to containerize these functions, then run them on-demand on a Swarm. Serverless is a technique for building distributed apps, and Docker is the perfect platform for building them on.

From servers to serverless

So how might we write applications like this? Let’s take as our example a voting application consisting of five services:

[Figure 1: the voting app’s architecture]

This consists of:

  • Two web frontends
  • A worker for processing votes in the background
  • A message queue for processing votes
  • A database

The background processing of votes is a very easy target for conversion to a serverless architecture. In the voting app, we can run a bit of code like this to run the background task:

import dockerrun
client = dockerrun.from_env()
# Fire off the vote-recording task as a one-off, detached container
client.run("bfirsh/serverless-record-vote-task", [voter_id, vote], detach=True)

The worker and message queue can be replaced with a Docker container that is run on-demand on a Swarm, automatically scaling to demand.
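As a sketch of what such on-demand dispatch boils down to, the snippet below builds the `docker run` invocation for a one-off task container and hands it to an injected runner. The runner is an assumption standing in for a real call to the Docker CLI or API (injected here so the logic can be exercised without a Docker daemon); the image name comes from the snippet above.

```python
# Sketch: dispatch a background task as a one-off container. The runner is
# injected so this can run without a Docker daemon; in production it would
# invoke `docker run` against a Swarm manager.

def build_run_command(image, args, detach=True):
    """Build the `docker run` invocation for a one-off task container."""
    cmd = ["docker", "run"]
    if detach:
        cmd.append("-d")  # return immediately; the cluster schedules the task
    cmd.append(image)
    cmd.extend(str(a) for a in args)
    return cmd

def record_vote(voter_id, vote, runner):
    """Fire-and-forget: run the vote-recording task in its own container."""
    runner(build_run_command("bfirsh/serverless-record-vote-task", [voter_id, vote]))

# Example with a stub runner that just records the command:
calls = []
record_vote("voter-42", "cats", calls.append)
print(calls[0])
# ['docker', 'run', '-d', 'bfirsh/serverless-record-vote-task', 'voter-42', 'cats']
```

Because a Swarm pools your machines into one virtual Engine, the same `docker run` transparently schedules the task somewhere on the cluster.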

We can even eliminate the web frontends, replacing them with Docker containers that each serve a single HTTP request, triggered by a lightweight HTTP server that spins up a container for every incoming request. The heavy lifting has now moved from the long-running HTTP server to Docker containers that run on-demand, so the frontends can automatically scale to handle load.
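A minimal sketch of such a per-request dispatcher might look like the following. The frontend image name and the `run_container` callable are assumptions (the latter stands in for a real Docker/Swarm client, and is pluggable so the dispatch logic is self-contained here):

```python
# Sketch: a lightweight dispatcher that answers each HTTP request by running
# a frontend container and relaying its output. run_container is an assumed
# hook standing in for a real Docker/Swarm client.

def dispatch(path, run_container):
    """Serve one request by running a container and returning its output."""
    # Hypothetical image that renders a single page and then exits:
    output = run_container("example/vote-frontend", ["--path", path])
    return "HTTP/1.1 200 OK\r\n\r\n" + output

# Stub runner so the sketch runs without a Docker daemon:
def fake_runner(image, args):
    return "<h1>Rendered %s by %s</h1>" % (args[1], image)

print(dispatch("/vote", fake_runner))
```

In a real deployment the stub would be replaced by a call that runs the container on the Swarm and captures its stdout, and the dispatcher itself would sit behind a listening socket.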

Our new architecture looks something like this:

[Figure 2: the serverless voting app architecture]

The red blocks are the continually running services and the green blocks are Docker containers that are run on-demand. This application has fewer long-running services that need managing, and by its very nature scales up automatically in response to demand (up to the size of your Swarm!).

So what can we do with this?

There are three useful techniques here which you can use in your apps:

  1. Run functions in your code as on-demand Docker containers
  2. Use a Swarm to run these on a cluster
  3. Run containers from containers, by passing a Docker API socket
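The third technique hinges on bind-mounting the daemon’s API socket into a container, so code inside it can launch sibling containers. Here is a sketch of the flags involved (the parent image name is hypothetical; note that access to the socket is effectively root on the host, so it must be scoped carefully):

```python
# Sketch: launch a "parent" container that can itself run containers, by
# bind-mounting the host's Docker API socket into it.

DOCKER_SOCK = "/var/run/docker.sock"

def build_parent_command(image):
    """Build `docker run` flags granting the container access to the host's
    Docker API. Anything inside can then `docker run` sibling containers --
    which also makes the socket root-equivalent on the host, so beware."""
    return [
        "docker", "run", "-d",
        "-v", "%s:%s" % (DOCKER_SOCK, DOCKER_SOCK),  # pass the API socket through
        image,
    ]

print(build_parent_command("example/parent-task"))
# ['docker', 'run', '-d', '-v', '/var/run/docker.sock:/var/run/docker.sock', 'example/parent-task']
```

Inside such a container, a normal Docker client pointed at that socket sees – and can schedule onto – the whole Swarm.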


The combination of these techniques opens up loads of possibilities for how you can architect your applications. Running background work is a great example of something that works well, but plenty of other things are possible too. For example:

  • Launching a container to serve user-facing HTTP requests is probably not practical due to the latency. However – you could write a load balancer which knew how to auto-scale its own web frontends by running containers on a Swarm.
  • A MongoDB container which could introspect the structure of a Swarm and launch the correct shards and replicas.


What’s next

We’ve got all these radically new tools and abstractions for building apps, and we’ve barely scratched the surface of what is possible with them. We’re still building applications as if we have servers that stick around for a long time, not for a future where we have Swarms that can run code on-demand anywhere in our infrastructure.

This hopefully gives you some ideas about what you can build, but we also need your help. We have all the fundamentals to start building these applications, but it’s still in its infancy – we need better tooling, libraries, example apps, documentation, and so on.

This GitHub repository collects links to tools, libraries, examples, and blog posts. Head over there if you want to learn more, and please contribute any links you have so we can start working together on this.

Get involved, and happy hacking!

11 Responses to “Building serverless apps with Docker”

  1. Wes

    It seems as though Docker Swarm doesn't support the idea of tasks/jobs (deliberately short-lived), but rather just "services". This functionality doesn't seem to exist out of the box, unless I'm missing something.

    Reply
    • Ben Firshman

      The current standalone Swarm does support this – you simply do "docker run" and it will schedule it on your cluster: https://docs.docker.com/swarm/overview/

Docker 1.12, which has Swarm mode built-in, now uses the concept of a "service", which is not quite the same, as you suggest. This works fine for running one-off jobs where you don't need to get the output (you create a service which doesn't automatically restart), but is a bit more complex if you need to get the results. Making this work better is a work in progress.

      Reply
  2. Dave

    The vote processing code you have above, where is it intended to run? On AWS Lambda or other "serverless" architecture? Wouldn't this need to "spin up" an image for every vote that's counted? This sounds neat, but I'm sceptical at this point.

    Reply
    • Ben Firshman

      On a Swarm, which pools your resources into a single, virtual Docker Engine for running containers: https://docs.docker.com/swarm/overview/

      The idea is to indeed spin up an image for each vote that is counted. It obviously depends on your particular workload, but typically this is fine because containers are so fast to spin up (a few hundred milliseconds).

      AWS Lambda does the same thing, with a few extra tricks to keep containers alive, which could be replicated with Swarm.

      Reply
  3. Thibault

    Note that providing the raw Docker API without access control to just any container is potentially a very bad idea. If that container gets compromised, you need to consider the host compromised. SELinux can't help you there.
    https://docs.docker.com/engine/security/security/

In general, in fact, the Docker API as it stands is too low-level for application-level usage; it should strictly be used for operations (i.e. what it was built for – even the dev workflows are a bit crazy at times). You'll always need finer-grained access control through a higher-level API (which hopefully will get "standardized" at some point), or you'll have to bring up on-demand Engines in sandboxed VMs (which largely defeats the point of containerization).

    What's presented here also has little to do with serverless in my eyes (which is more about the development approach/company practice of not managing servers in-house). Here we explicitly manage server-side resources. This can be used to provide services that in turn can be provided to shops that use a serverless approach of course, but that's namedropping buzzwords at this point.

I feel this is more about creating containerized processes on-demand, period – basically an inetd model for Docker. Leveraging Swarm for distributed scheduling at this level is great, but it doesn't seem practical before the issues above are addressed somehow. I do see some nice mentions in the to-do list of the https://github.com/bfirsh/serverless-docker repository, though:

    > "A proxy that scopes a Docker API so that containers can securely manage and run "child" containers."
    > "Helpers for injecting the Docker API socket into containers that are run."

    Good luck with those, I'd *really* love to play with these things.

    Reply
    • Ben Firshman

      Yes – you're right I should have pointed out the security issue with the Docker socket. That's currently the main blocker to this being practical in production and we're definitely looking for help to make it work better, as you noticed from the to-do list. 😉

      Another way of thinking about this that might make it sound a bit more like serverless: what if this was an auto-scaling Swarm hosted in the cloud?

      You could define your functions as Docker images, then they run in the cloud without you having to worry about infrastructure or scaling, like on Lambda.

      Another way of thinking about this could be as a way of doing on-premise serverless. Or serverless without being tied to a particular vendor.

      There's a lot of tooling missing to make this work well, but hopefully this helps you see the line of thinking I'm taking.

      Reply
  4. Vijay Bose

    Is dockerrun python library using "Docker Remote API"?

    Reply
  5. Joe G

    Hi,

    I just cloned the repo and did 'make'. Everything partially works. I am using Docker 1.12.x. I also have it configured in Swarm mode. I noticed a comment back on July 19th saying this was in progress. Is there a status update? If I need to get this working on a Swarm cluster, can you outline any known steps/problems/items to watch out for?

    thanks!

    Joe

    Reply
  6. Joe G

Actually, anyone should BEWARE. The GitHub example does not even run normally under Docker 1.12. One of the dependencies uses HostConfig, which is no longer supported in Docker 1.12.

    Please correct me if I am mistaken.

    Reply
  7. Han

    Hi,

How do you manage stopped containers (finished tasks), as creating many containers may use up the disk space?

Also, maybe related to the question above: starting a fresh container does take some time – how will this be addressed?

    Thanks a lot,

    Reply
  8. Alex Ellis

I was there for Ben's talk at DockerCon and have tried to continue the work (as a Docker Captain).

    This is where I am with things now – specifically using Docker Swarm Mode and 1.13 + features:

    http://blog.alexellis.io/functions-as-a-service/

    Reply
