a perfect fit for Docker principles - Packt Publishing

Once you’re confident that all of your builds are being quality-checked with a consistent CI process, the logical next step is to start looking at deploying every good build to your users. This goal is known as continuous delivery (CD).

This chapter covers

■ The Docker contract between dev and ops

■ Taking manual control of build availability across environments

■ Moving builds between environments over low-bandwidth connections

■ Centrally configuring all containers in an environment

■ Achieving zero-downtime deployment with Docker

Figure 7.1 A typical CD pipeline. The image produced by CI progresses through stage, pre-prod, and prod only as far as tests pass; a test failure stops it.

In this chapter we’ll refer to your CD pipeline: the process your build goes through after it comes out of your CI pipeline. The separation can sometimes be blurred, but think of the CD pipeline as starting when you have a final image that has passed your initial tests during the build process. Figure 7.1 demonstrates how the image might progress through a CD pipeline until it (hopefully) reaches production.

It’s worth repeating that last point: the image that comes out of CI should be final and unmodified throughout your CD process! Docker makes this easy to enforce with immutable images and encapsulation of state, so using Docker takes you one step down the CD road already.

7.1 Interacting with other teams during the CD pipeline

First we’re going to take a little step back and look at how Docker changes the relationship between development and operations.

Some of the biggest challenges of software development aren’t technical—splitting people up into teams based on their roles and expertise is a common practice, yet this can result in communication barriers and insularity. Having a successful CD pipeline requires involvement from the teams at all stages of the process, from development to testing to production, and having a single reference point for all teams can help ease this interaction by providing structure.

TECHNIQUE 62 The Docker contract—reducing friction

One of Docker’s aims is to allow easy expression of inputs and outputs as they relate to a container that contains a single application. This can provide clarity when working with other people—communication is a vital part of collaboration, and understanding how Docker can ease things by providing a single reference point can help you win over Docker unbelievers.

PROBLEM

You want cooperating teams’ deliverables to be clean and unambiguous, reducing friction in your delivery pipeline.

SOLUTION

Use the Docker contract to facilitate clean deliverables between teams.

DISCUSSION

As companies scale, they frequently find that the flat, lean organization they once had, in which key individuals “knew the whole system,” gives way to a more structured organization within which different teams have different responsibilities and competencies. We’ve seen this first-hand in the organizations we’ve worked at.

If technical investment isn’t made, friction can arise as growing teams deliver to each other. Complaints of increasing complexity, “throwing the release over the wall,” and buggy upgrades all become familiar. Cries of “Well, it works on our machine!” will increasingly be heard, to the frustration of all concerned. Figure 7.2 gives a simplified but representative view of this scenario.

The workflow in figure 7.2 has a number of problems that may well look familiar to you. They all boil down to the difficulties of managing state. The test team might test something on a machine that differs from what the operations team has set up. In theory, changes to all environments should be carefully documented, rolled back when problems are seen, and kept consistent. Unfortunately, the realities of commercial pressure and human behavior routinely conspire against this goal, and environmental drift sets in.

Existing solutions to this problem include VMs and RPMs. VMs can be used to reduce the surface area of environmental risk by delivering complete machine representations to other teams. The downside is that VMs are relatively monolithic entities that are difficult for teams to manipulate efficiently. At the other end, RPMs offer a standard way of packaging applications that helps define dependencies when rolling out software. This doesn’t eliminate configuration management issues, and rolling out RPMs created by fellow teams is far more error-prone than using RPMs that have been battle-tested across the internet.

Figure 7.2 Before: a typical software workflow. The dev team delivers a release to the test server VM, which was built some time ago and is in a non-reproducible state. The test team validates releases made to the test server VM. Once a release has passed testing, the ops team receives a release RPM from the dev team and deploys it to the live server VM, which was also built some time ago and is now in a non-reproducible state.

THE DOCKER CONTRACT

What Docker can do is give you a clean line of separation between teams, where the Docker image is both the borderline and the unit of exchange. We call this the Docker contract, and it’s illustrated in figure 7.3.

With Docker, the reference point for all teams becomes much cleaner. Rather than dealing with sprawling monolithic virtual (or real) machines in unreproducible states, all teams are talking about the same code, whether it’s on test, live, or development.

In addition, there’s a clean separation of data from code, which makes it easier to reason about whether problems are caused by variations in data or code.

Because Docker uses the remarkably stable Linux API as its environment, teams that deliver software have far more freedom to build software and services in whatever fashion they like, safe in the knowledge that it will run predictably in various environments. This doesn’t mean that you can ignore the context in which it runs, but it does reduce the risk of environmental differences causing issues.

Various operational efficiencies result from having this single reference touchpoint. Bug reproduction becomes much easier, as all teams are able to describe and reproduce issues from a known starting point. Upgrades become the responsibility of the team delivering the change. In short, state is managed by those making the change. All these benefits greatly reduce the communications overhead and allow teams to get on with their jobs. This reduced communications overhead can also help encourage moves towards a microservices architecture.

This is no merely theoretical benefit: we’ve seen this improvement first-hand in a company of over 500 developers, and it’s a frequent topic of discussion at Docker technical meetups.

Figure 7.3 After: the Docker contract. All three teams (dev, test, and ops) now refer to a single reference point: the versioned Docker image.

7.2 Facilitating deployment of Docker images

The first problem when trying to implement CD is moving the outputs of your build process to the appropriate location. If you’re able to use a single registry for all stages of your CD pipeline, it may seem like this problem has been solved. But it doesn’t cover a key aspect of CD.

One of the key ideas behind CD is build promotion. Build promotion means that each stage of a pipeline (user acceptance tests, integration tests, and performance tests) can trigger the next stage only if it has itself succeeded. With multiple registries you can enforce this by making a build available in the next registry only when it passes the current stage.
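As a rough illustration of build promotion, the sketch below runs a list of stages in order and stops at the first failure. The function name run_pipeline and the idea of modeling each stage as a shell function that receives the image name are assumptions made for this example, not part of any Docker tooling.

```shell
# run_pipeline IMAGE STAGE...
# Hypothetical helper: each STAGE is the name of a shell function (or command)
# that takes the image name and returns 0 on success.
run_pipeline() {
  local image=$1
  shift
  local stage
  for stage in "$@"; do
    if "$stage" "$image"; then
      # A real pipeline would publish the image to the next registry here.
      echo "PASS $stage: $image promoted"
    else
      echo "FAIL $stage: $image stops here"
      return 1
    fi
  done
}
```

Because the loop returns as soon as a stage fails, later stages never see an image that has not been promoted. The PASS branch is the natural place to push the image to the next registry.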

We’ll look at a few ways of moving your images between registries, and even at a way of sharing Docker objects without a registry.

TECHNIQUE 63 Manually mirroring registry images

The simplest image-mirroring scenario is when you have a machine with a high-bandwidth connection to both registries. This permits the use of normal Docker functionality to perform the image copy.

PROBLEM

You want to copy an image between two registries.

SOLUTION

Pull the image, retag it, and push.

DISCUSSION

If you have an image at test-registry.company.com and you want to move it to stage-registry.company.com, the process is simple:

$ IMAGE=mygroup/myimage:mytag
$ OLDREG=test-registry.company.com
$ NEWREG=stage-registry.company.com
$ docker pull $OLDREG/$IMAGE
[...]
$ docker tag $OLDREG/$IMAGE $NEWREG/$IMAGE
$ docker push $NEWREG/$IMAGE
$ docker rmi $OLDREG/$IMAGE
$ docker rmi $(docker images -q --filter dangling=true)

There are three important points to note about this process:

1 The new tag takes over the image name. Any older image with the same name on the machine (left there for layer-caching purposes) loses that name, so the new image can be tagged with it. Docker releases before 1.12 needed docker tag -f to force this; later releases dropped the flag and overwrite an existing tag by default.

2 All dangling images have been removed. Although layer caching is extremely useful for speeding up deployment, leaving unused image layers around can quickly use up disk space. In general, old layers are less likely to be used as time passes and they become more out-of-date.

3 You may need to log into your new registry with docker login.

The image is now available in the new registry for use in subsequent stages of your CD pipeline.
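The same pull, retag, and push sequence can be wrapped in a small function so that a pipeline stage can call it. This is a minimal sketch: the function name mirror_image and the DOCKER override variable are invented for this example, and docker tag is called without -f, which assumes a Docker release where tagging overwrites an existing tag.

```shell
# mirror_image IMAGE OLD_REGISTRY NEW_REGISTRY
# Hypothetical wrapper around the pull/tag/push/rmi sequence.
# Set DOCKER="echo docker" to print the commands instead of running them.
mirror_image() {
  local image=$1 oldreg=$2 newreg=$3
  local docker=${DOCKER:-docker}
  # $docker is left unquoted on purpose so "echo docker" splits into two words.
  $docker pull "$oldreg/$image" &&
    $docker tag "$oldreg/$image" "$newreg/$image" &&
    $docker push "$newreg/$image" &&
    $docker rmi "$oldreg/$image"
}
```

The && chain stops at the first failing command, so a failed pull never results in a push. Cleaning up dangling images is left out here and can still be done separately.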

TECHNIQUE 64 Delivering images over constrained connections

Even with layering, pushing and pulling Docker images can be a bandwidth-hungry process. In a world of free large-bandwidth connections, this wouldn’t be a problem, but sometimes reality forces us to deal with low-bandwidth connections or costly bandwidth metering between data centers. In this situation you need to find a more efficient way of transferring differences, or the CD ideal of being able to run your pipeline multiple times a day will remain out of reach.

The ideal solution is a tool that will reduce the average size of an image so it’s even smaller than classic compression methods can manage.

PROBLEM

You want to copy an image between two machines with a low-bandwidth connection between them.

SOLUTION

Export the image, use bup to split it, transfer the bup chunks, and import the recombined image on the other end.

DISCUSSION

We must first introduce a new tool, bup. It was created as a backup tool with extremely efficient deduplication—deduplication being the ability to recognize where data is used repeatedly and only store it once. Deduplication also happens to be extremely useful in other scenarios, like transferring multiple images with very similar contents.
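To make the idea of deduplication concrete, the following sketch counts how many chunks a set of files has in total and how many of them are distinct. It is a simplification and not how bup is implemented: it cuts files into fixed-size chunks, whereas bup chooses chunk boundaries with a rolling checksum so that inserted bytes don’t shift every later chunk. The function name count_unique_chunks is made up for illustration, and sha256sum (GNU coreutils) is assumed to be installed.

```shell
# count_unique_chunks SIZE FILE...
# Hypothetical helper: splits each FILE into SIZE-byte chunks and prints
# "<total chunks> <unique chunks>". Only unique chunks would need storing.
count_unique_chunks() {
  local size=$1
  shift
  local tmp
  tmp=$(mktemp -d) || return 1
  local i=0 f
  for f in "$@"; do
    i=$((i + 1))
    # A per-file prefix keeps chunks from different files apart.
    split -b "$size" "$f" "$tmp/f${i}."
  done
  local total unique
  total=$(find "$tmp" -type f | wc -l)
  # Hash every chunk and count the distinct hashes.
  unique=$(find "$tmp" -type f -exec sha256sum {} + | awk '{print $1}' | sort -u | wc -l)
  rm -rf "$tmp"
  echo "$((total)) $((unique))"
}
```

For two 4 KB files that differ in a single 1 KB block, calling it with a chunk size of 1024 prints 8 5: eight chunks in total, of which only five are distinct. The second file therefore adds just one new chunk.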

For this technique we’ve created an image called dbup (short for “docker bup”), which makes it easier to use bup to deduplicate images. You can find the code behind it at https://github.com/docker-in-practice/dbup.

As a demonstration, let’s see how much bandwidth we could save when upgrading from the ubuntu:14.04.1 image to ubuntu:14.04.2. Bear in mind that in practice you’d have a number of layers on top of each of these, which Docker would want to completely retransfer after a lower layer change. By contrast, this technique will recognize the significant similarities and give you much greater savings.

The first step is to pull both of those images so we can see how much is transferred over the network:

$ docker pull ubuntu:14.04.1 && docker pull ubuntu:14.04.2
[...]
$ docker history ubuntu:14.04.1
IMAGE         CREATED        CREATED BY                                      SIZE
5ba9dab47459  3 months ago   /bin/sh -c #(nop) CMD [/bin/bash]               0 B
51a9c7c1f8bb  3 months ago   /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB
5f92234dcf1e  3 months ago   /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB
27d47432a69b  3 months ago   /bin/sh -c #(nop) ADD file:62400a49cced0d7521   188.1 MB
511136ea3c5a  23 months ago                                                  0 B
$ docker history ubuntu:14.04.2
IMAGE         CREATED        CREATED BY                                      SIZE
07f8e8c5e660  2 weeks ago    /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B
37bea4ee0c81  2 weeks ago    /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB
a82efea989f9  2 weeks ago    /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB
e9e06b06e14c  2 weeks ago    /bin/sh -c #(nop) ADD file:f4d7b4b3402b5c53f2   188.1 MB
$ docker save ubuntu:14.04.1 | gzip | wc -c
65970990
$ docker save ubuntu:14.04.2 | gzip | wc -c
65978132

This demonstrates that the Ubuntu images share no layers, so we can use the whole image size as the amount that would be transferred when pushing the new image. Also note that the Docker registry uses gzip compression to transfer layers, so we’ve included that in our measurement (instead of taking the size from docker history). About 65 MB is being transferred in both the initial deployment and the subsequent deployment.

In order to get started, you’ll need two things—a directory to store the “pool” of data bup uses as internal storage, and the dockerinpractice/dbup image. You can then go ahead and add your image to the bup data pool:

$ mkdir bup_pool
$ alias dbup="docker run --rm \
    -v $(pwd)/bup_pool:/pool -v /var/run/docker.sock:/var/run/docker.sock \
    dockerinpractice/dbup"
$ dbup save ubuntu:14.04.1
Saving image!
Done!
$ du -sh bup_pool
74M     bup_pool
$ dbup save ubuntu:14.04.2
Saving image!
Done!
$ du -sh bup_pool
90M     bup_pool

Adding the second image to the bup data pool has only increased the size by about 15 MB. Assuming you synced the folder to another machine (possibly with rsync) after adding ubuntu:14.04.1, syncing the folder again will only transfer 15 MB (as opposed to the 65 MB before).

You then need to load the image at the other end:

$ dbup load ubuntu:14.04.1
Loading image!
Done!

The process for transferring between registries would look something like this:

1 docker pull on host1

2 dbup save on host1

3 rsync from host1 to host2

4 dbup load on host2

5 docker push on host2
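The five steps above can be sketched as two functions, one for each host. Several things here are assumptions rather than part of the original text: the names run, send_image, and receive_image, the DRY_RUN and POOL variables, the rsync destination format, and the retag before the final push (needed only if the target registry differs from the source). The sketch also assumes that dbup load accepts the same image reference that was given to dbup save. A shell function replaces the dbup alias because aliases are not expanded in non-interactive scripts by default.

```shell
# Set DRY_RUN=1 to print each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Directory holding the bup data pool (hypothetical default).
POOL=${POOL:-$PWD/bup_pool}

# Function form of the dbup alias.
dbup() {
  run docker run --rm \
    -v "$POOL:/pool" -v /var/run/docker.sock:/var/run/docker.sock \
    dockerinpractice/dbup "$@"
}

# Steps 1-3, run on host1.
# usage: send_image IMAGE_REF RSYNC_DEST   (for example user@host2:/srv/bup_pool)
send_image() {
  local image=$1 dest=$2
  run docker pull "$image" &&
    dbup save "$image" &&
    run rsync -a "$POOL/" "$dest/"
}

# Steps 4-5, run on host2.
# usage: receive_image SAVED_IMAGE_REF TARGET_IMAGE_REF
receive_image() {
  local saved=$1 target=$2
  dbup load "$saved" &&
    run docker tag "$saved" "$target" &&
    run docker push "$target"
}
```

Running either function with DRY_RUN=1 prints the docker and rsync commands it would execute, which is a cheap way to check the image references and paths before moving any data.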
