Betty Junod

How To Dockerize Vendor Apps like Confluence

Docker Datacenter customer Shawn Bower of Cornell University recently shared how containerizing Confluence became the start of Cornell's Docker journey.

Through that project they were able to demonstrate a 10X savings in application maintenance, reduce the time to build a disaster recovery plan from days to 30 minutes, and improve the security profile of their Confluence deployment. This change allowed the Cloudification team that Shawn leads to start spending the majority of their time helping Cornellians use technology to be innovative.

Since the original blog was posted, there have been many requests for the practical details of how Cornell actually carried out this project.  In the post below, Shawn provides detailed instructions on how Confluence is containerized and how the Docker workflow is integrated with Puppet.


Written by Shawn Bower

As we started our journey to move Confluence to the cloud using Docker we were emboldened by a post from Atlassian. We use many of the Atlassian products and love how well integrated they are.  In this post I will walk you through the process we used to get Confluence into a container and running.

First we needed to craft a Dockerfile.  At Cornell we use image inheritance, which enables our automated patching and security scanning process.  We start with the canonical ubuntu image: https://hub.docker.com/_/ubuntu/ and then build on defaults used here at Cornell.  Our base image is available publicly on GitHub here: https://github.com/CU-CommunityApps/docker-base.

Let’s take a look at the Dockerfile.

FROM ubuntu:14.04

# File Author / Maintainer
MAINTAINER Shawn Bower <my email address>

# Install.
RUN \
 apt-get update && apt-get install --no-install-recommends -y \
   build-essential \
   curl \
   git \
   unzip \
   vim \
   wget \
   ruby \
   ruby-dev \
   clamav-daemon \
   openssh-client && \
 rm -rf /var/lib/apt/lists/*

RUN rm /etc/localtime
RUN ln -s /usr/share/zoneinfo/America/New_York /etc/localtime

#Clamav stuff
RUN freshclam -v && \
 mkdir /var/run/clamav && \
 chown clamav:clamav /var/run/clamav && \
 chmod 750 /var/run/clamav

COPY conf/clamd.conf /etc/clamav/clamd.conf

RUN echo "gem: --no-ri --no-rdoc" > ~/.gemrc && \
 gem install json_pure -v 1.8.1 && \
 gem install puppet -v 3.7.5 && \
 gem install librarian-puppet -v 2.1.0 && \
 gem install hiera-eyaml -v 2.1.0

# Set environment variables.
ENV HOME /root

# Define working directory.
WORKDIR /root

# Define default command.
CMD ["bash"]

At Cornell we use Puppet for configuration management so we bake that directly into our base image.  We do a few other things like setting the timezone and installing the ClamAV agent, as we have some applications that use it for virus scanning.  We have an automated project in Jenkins that pulls the latest ubuntu:14.04 image from Docker Hub and then builds this base image every weekend.  Once the base image is built we tag it with ‘latest’ and a time stamp tag, and automatically push it to our local Docker Trusted Registry.  This allows the brave to pull in patches continuously while allowing others to pin to a specific version until they are ready to migrate.  From that image we create a base Java image which installs Oracle’s JVM.
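A minimal sketch of what such a weekend job could look like is shown here. It only prints the pull, build, tag and push commands for one run. The registry host dtr.example.edu is a hypothetical placeholder, and using the date as the time stamp tag is an assumption about the tag format, not the actual Jenkins configuration.

```shell
#!/bin/sh
# Sketch of a weekly base-image job (dry run: prints commands only).
# REGISTRY is a hypothetical placeholder, not the real registry host.
REGISTRY="${REGISTRY:-dtr.example.edu}"
IMAGE="${IMAGE:-cs/base}"

# Print the commands for one build. Pipe the output to sh to execute them.
base_image_commands() {
  stamp="$1"   # time stamp tag, for example 20160501 (assumed format)
  repo="${REGISTRY}/${IMAGE}"
  echo "docker pull ubuntu:14.04"
  echo "docker build -t ${repo}:latest ."
  echo "docker tag ${repo}:latest ${repo}:${stamp}"
  echo "docker push ${repo}:latest"
  echo "docker push ${repo}:${stamp}"
}

# Dry run using today's date as the time stamp tag.
base_image_commands "$(date +%Y%m%d)"
```

With tags like these, a downstream Dockerfile that wants continuous patches would build FROM the ‘latest’ tag, while one that needs stability would build FROM a specific time stamp tag until its owners are ready to move.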

The Dockerfile is available here and explained below.

# Pull base image.
FROM DTR Repo path /cs/base

# Install Java.
RUN \
  apt-get update && \
  apt-get -y install software-properties-common && \
  add-apt-repository ppa:webupd8team/java -y && \
  apt-get update && \
  echo "oracle-java8-installer shared/accepted-oracle-license-v1-1 select true" | sudo debconf-set-selections && \
  apt-get install -y oracle-java8-installer && \
  apt-get install -y oracle-java8-set-default && \
  rm -rf /var/lib/apt/lists/*

# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle

# Define working directory.
WORKDIR /data

# Define default command.
CMD ["bash"]

The same automated patching process is followed for the Java image as for the base image.  The Java image is automatically built after the base image and tagged accordingly, so there is always a matching set of base and java8 images.  Now that we have our Java image we can layer on Confluence.  Our Confluence repository is private but the important bits of the Dockerfile are below.

FROM DTR Repo path for cs/java8

# Configuration variables.
ENV CONF_HOME     /var/local/atlassian/confluence
ENV CONF_INSTALL  /usr/local/atlassian/confluence
ENV CONF_VERSION  5.8.18

ARG environment=local

# Install Atlassian Confluence and helper tools and setup initial home
# directory structure.
RUN set -x \
    && apt-get update --quiet \
    && apt-get install --quiet --yes --no-install-recommends libtcnative-1 xmlstarlet \
    && apt-get clean \
    && mkdir -p                "${CONF_HOME}" \
    && chmod -R 700            "${CONF_HOME}" \
    && chown daemon:daemon     "${CONF_HOME}" \
    && mkdir -p                "${CONF_INSTALL}/conf" \
    && curl -Ls                "http://www.atlassian.com/software/confluence/downloads/binary/atlassian-confluence-${CONF_VERSION}.tar.gz" | tar -xz --directory "${CONF_INSTALL}" --strip-components=1 --no-same-owner \
    && chmod -R 700            "${CONF_INSTALL}/conf" \
    && chmod -R 700            "${CONF_INSTALL}/temp" \
    && chmod -R 700            "${CONF_INSTALL}/logs" \
    && chmod -R 700            "${CONF_INSTALL}/work" \
    && chown -R daemon:daemon  "${CONF_INSTALL}/conf" \
    && chown -R daemon:daemon  "${CONF_INSTALL}/temp" \
    && chown -R daemon:daemon  "${CONF_INSTALL}/logs" \
    && chown -R daemon:daemon  "${CONF_INSTALL}/work" \
    && echo -e                 "\nconfluence.home=$CONF_HOME" >> "${CONF_INSTALL}/confluence/WEB-INF/classes/confluence-init.properties" \
    && xmlstarlet              ed --inplace \
        --delete               "Server/@debug" \
        --delete               "Server/Service/Connector/@debug" \
        --delete               "Server/Service/Connector/@useURIValidationHack" \
        --delete               "Server/Service/Connector/@minProcessors" \
        --delete               "Server/Service/Connector/@maxProcessors" \
        --delete               "Server/Service/Engine/@debug" \
        --delete               "Server/Service/Engine/Host/@debug" \
        --delete               "Server/Service/Engine/Host/Context/@debug" \
                               "${CONF_INSTALL}/conf/server.xml"

# bust cache
ADD version /version

# RUN Puppet
WORKDIR /
COPY Puppetfile /
COPY keys/ /keys

RUN mkdir -p /root/.ssh/ && \
  cp /keys/id_rsa /root/.ssh/id_rsa && \
  chmod 400 /root/.ssh/id_rsa && \
  touch /root/.ssh/known_hosts && \
  ssh-keyscan github.com >> /root/.ssh/known_hosts && \
  librarian-puppet install && \
  puppet apply --modulepath=/modules --hiera_config=/modules/confluence/hiera.yaml \
  --environment=${environment} -e "class { 'confluence::app': }" && \
  rm -rf /modules && \
  rm -rf /Puppetfile* && \
  rm -rf /root/.ssh && \
  rm -rf /keys

USER daemon:daemon

# Expose default HTTP connector port.
EXPOSE 8080

VOLUME ["/usr/local/atlassian/confluence/logs"]

# Set the default working directory as the installation directory.
WORKDIR /usr/local/atlassian/confluence

# Run Atlassian Confluence as a foreground process by default.
CMD ["/usr/local/atlassian/confluence/bin/catalina.sh", "run"]

We bring down the install media from Atlassian, explode that into the install path and do a bit of cleanup on some of the XML configs.  We use the Docker build cache for that part of the process because it does not change often.  After the Confluence installation we bust the cache by adding a version file which changes each time the build runs in Jenkins.  This ensures that Puppet will run in the container and configure the environment.  Puppet lays down the environment-specific (dev, test, prod, etc.) configuration, and the environment is selected with a Docker build argument called ‘environment’.  This allows us to bake everything needed to run Confluence into the image so we can launch it on any machine with no extra configuration.  Whether to store the configuration in the image or outside is a contested subject for sure, but our decision was to store all configuration directly in the image. We believe this ensures the highest level of portability.
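The build step described above might be driven by something like the following sketch. The image name, the CONTEXT and ENVIRONMENT variables, and the use of Jenkins' BUILD_NUMBER as the content of the version file are all assumptions made for illustration. The only requirement taken from the description is that the file's content changes on every build and that the environment is passed as a build argument.

```shell
#!/bin/sh
# Sketch of a per-environment build step (prints the build command only).
# IMAGE is a hypothetical placeholder for the private repository path.
IMAGE="${IMAGE:-dtr.example.edu/cs/confluence}"
CONTEXT="${CONTEXT:-.}"

# Write a fresh version file into the build context. Because the content
# differs from the previous build, "ADD version /version" misses the cache
# and every later layer, including the Puppet run, is rebuilt.
write_version_file() {
  echo "$1" > "${CONTEXT}/version"
}

# Print the build command for one environment (dev, test, prod, ...).
build_command() {
  env_name="$1"
  echo "docker build --build-arg environment=${env_name} -t ${IMAGE}:${env_name} ${CONTEXT}"
}

# BUILD_NUMBER is set by Jenkins; fall back to the epoch time elsewhere.
write_version_file "${BUILD_NUMBER:-$(date +%s)}"
build_command "${ENVIRONMENT:-local}"
```

The layers above the ADD line (the apt packages and the Confluence download) still come from the cache, so only the Puppet run and the steps after it are repeated on each build.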

Here are some general rules we follow with Docker:

  • Use base images that are a part of the automated patching
  • Follow Dockerfile best practices
  • Keep the base infrastructure in a Dockerfile, and environment-specific information in Puppet
  • Build one process per container
  • Keep all components of the stack in one repository
  • If the stack has multiple components (e.g., Apache, Tomcat) they should live in the same repository
  • Use subdirectories for each component
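To make the last three rules concrete, here is a small sketch that walks a repository with one subdirectory per component and prints a build command for each subdirectory that contains a Dockerfile. The STACK image prefix and the REPO_ROOT variable are hypothetical names, and the layout (for example apache/ and tomcat/ side by side) is an assumption about how such a repository could be organized.

```shell
#!/bin/sh
# Sketch: one repository, one subdirectory per component, one image each.
# STACK is a hypothetical image name prefix.
STACK="${STACK:-dtr.example.edu/cs/mystack}"

# Print one build command per component subdirectory with a Dockerfile.
component_build_commands() {
  repo_root="$1"
  for dockerfile in "${repo_root}"/*/Dockerfile; do
    # Skip the literal pattern when no subdirectory has a Dockerfile.
    [ -f "${dockerfile}" ] || continue
    dir="$(dirname "${dockerfile}")"
    name="$(basename "${dir}")"
    echo "docker build -t ${STACK}-${name}:latest ${dir}"
  done
}

# Dry run against the current directory unless REPO_ROOT is set.
component_build_commands "${REPO_ROOT:-.}"
```

Each component gets its own image and therefore its own container, which keeps the one process per container rule intact while the whole stack is still versioned together.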

I hope you enjoyed this post and that it gets you containerizing some vendor apps. This is just the beginning: we recently moved a legacy ColdFusion app into Docker, and almost anything can probably be containerized!


One Response to “How To Dockerize Vendor Apps like Confluence”

  1. Martin Buchleitner

    Sorry, but you missed to VOLUME /opt/atlassian/confluence/data/ where all attachments are stored … would be nice to keep those data on a volume 😉
