Capability For Continuous Deployment

Continuous deployment is very buzzword-y right now. I have some really strong opinions about deployment (just ask James Socol or Erik Kastner, who have heard me ranting about this). Here’s what I think, in a nutshell:

You should build the capability for continuous deployment even if you never intend to do continuous deployment. The machinery is more important than your deployment velocity.

Let me take a step back and talk about deployment maturity.

Immature deployment

At the immature end, a developer or team works on a system that has no staging environment. Code goes from the development environment straight to production. (I’m not even going to talk about the situation where the development environment is production. I know from asking conference audiences that these setups still exist.) I’m also assuming, in this era of github, that everybody is using version control.

(I want to point out that it’s the easy availability of services like github that has enabled even tiny, disorganized teams to use version control. VC is ubiquitous now, and that is a huge blessing.)

This sample scenario is very common: work on dev, make a commit, push it out to prod. Usually, no ops team is involved, or even exists. This can work really well in an early stage company or project, especially if you’re pre-launch.

This team likely has no automation, and a variable number of tests (0+). Even if they have tests, they may have no coverage numbers and no continuous integration.

When you hear book authors, conference speakers or tech bloggers talk about the wonders of continuous deployment, this scenario is not what they are describing.

The machinery of continuous deployment

Recently in the Mozilla webdev team, we’ve had a bunch of conversations about CD. When we talked about what was needed to do this, I had a revelation.

Although we were choosing not to do CD on my team, we already had every requirement in place:

  • Continuous integration with build-on-commit
  • Tests with good coverage, and a good feel for the holes in our coverage
  • A staging environment that reflects production – our stage environment is a scale model of production, with the same ratios between boxes
  • Managed configuration
  • Scripted deployment to a large number of machines (sketched below)
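
That last item is worth making concrete. “Scripted deployment” just means that pushing code is one repeatable command rather than a checklist a human works through. Here’s a minimal sketch in Python of what that can look like; the hostnames, paths, and restart command are hypothetical placeholders, not a description of our actual setup:

    #!/usr/bin/env python
    """Minimal deploy sketch: push one built package to every box.

    Hostnames, paths, and the restart command are hypothetical.
    """
    import subprocess
    import sys

    HOSTS = ["web1.example.com", "web2.example.com", "web3.example.com"]

    def deploy(package_path):
        for host in HOSTS:
            # Copy the same package to each box.
            subprocess.check_call(["scp", package_path, host + ":/tmp/app.tar.gz"])
            # Unpack it and restart the service. check_call raises on a
            # non-zero exit, so a failed host stops the deploy immediately.
            subprocess.check_call([
                "ssh", host,
                "tar -C /srv/app -xzf /tmp/app.tar.gz && sudo service app restart",
            ])
            print("deployed to %s" % host)

    if __name__ == "__main__":
        deploy(sys.argv[1])

The particular script doesn’t matter; what matters is that the whole push is one command you can run in a hurry at 3am.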

I realized then that the machinery for continuous deployment is different from the deployment velocity that you choose for your team. If we need to, we can make a commit and push it out inside of a couple of minutes, without breaking a sweat.

Why we don’t do continuous deployment on Socorro

We choose not to, except in urgent situations, for a few reasons:

  • We like to performance test our stuff, and we haven’t yet automated that
  • We like to have a human QA team test in addition to automated tests
  • We like to version our code and do “proper” releases because it’s open source and other people use our packages
  • A commit to one component of our system is often related to commits in other components, and those related changes make more sense shipped as a bundle

Our process looks like this:

  • The “dev” environment runs trunk, and the “stage” environment runs a release branch.
  • On commit, Jenkins builds packages and deploys them to the appropriate environment.
  • To deploy to prod, we run a script that pulls the specified package from Jenkins and pushes it out to our production environment (there’s a sketch of such a script after this list). We also tag the revision that was packaged, both for others to use and for future reference. Note that we push the same package to prod that we pushed to stage, and stage reflects production.
  • If we need to change configuration for a release, we change it first in Puppet on staging, and then update production the same way.
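
For flavor, here’s roughly what a pull-from-Jenkins deploy script like the one mentioned above might look like. The Jenkins URL, artifact name, hosts, and tag format here are all hypothetical placeholders, not Socorro’s actual setup:

    #!/usr/bin/env python
    """Sketch of a pull-from-CI production deploy.

    The Jenkins URL, artifact name, hosts, and tag format are hypothetical.
    """
    import subprocess
    import sys

    ARTIFACT_URL = ("http://jenkins.example.com/job/myapp/"
                    "lastSuccessfulBuild/artifact/myapp.tar.gz")
    PROD_HOSTS = ["prod1.example.com", "prod2.example.com"]

    def deploy_to_prod(version):
        # Pull the exact package that already ran on stage; curl -f fails
        # loudly instead of saving an HTML error page as the artifact.
        subprocess.check_call(["curl", "-fo", "myapp.tar.gz", ARTIFACT_URL])

        # Tag the revision that was packaged, for others and for future us.
        subprocess.check_call(["git", "tag", "v" + version])
        subprocess.check_call(["git", "push", "origin", "v" + version])

        # Push the same package to every production box and restart,
        # just like the per-host loop sketched earlier.
        for host in PROD_HOSTS:
            subprocess.check_call(["scp", "myapp.tar.gz", host + ":/tmp/"])
            subprocess.check_call([
                "ssh", host,
                "tar -C /srv/app -xzf /tmp/myapp.tar.gz && sudo service app restart",
            ])

    if __name__ == "__main__":
        deploy_to_prod(sys.argv[1])

Because stage runs the identical package first, by the time a script like this touches production there should be no surprises left in the package itself; only configuration can differ, and Puppet keeps that in sync.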

We do intend to increase our deployment velocity in the coming months, but for us that’s not about the machinery of deployment; it’s about increasing our delivery velocity.

Delivery velocity is a different problem, one I’m wrestling with right now. We have a small team, and the work we’re trying to do tends to come not in small chunks but in big ones: a new report, say, or a system-wide change to the way we aggregate data (the thing we’re working on at the moment). It’s not that changes sit in trunk, waiting for a release. It’s more that it takes us a while to get something to a deployable state. That is, deployment is not what blocks us from getting features to our users faster.

finally:

It’s the same old theme you’ve seen ten times before on this blog: everybody’s environment is different, and continuous deployment may not be for everybody. On the other hand, the machinery for continuous deployment can be critical to making your life easier. Automating all this stuff certainly helps me sleep at night much better than I used to when we did manual pushes.

(Incidentally, I’d like to thank Rob Helmer and Justin Dow for automating our world: you couldn’t find better engineers to work with.)

4 Comments

  1. Axel Hecht:

    I’m becoming less of a believer in coverage. No coverage is a great indicator of a lack of tests, but good coverage sadly says very little about whether your tests reflect what’s supposed to happen.

    Not that I have a good suggestion to replace coverage with.

  2. Erik Kastner:

    Great post!
    This is very much the point I was trying to make (in a very long-winded way) with my post on code as craft:
    http://codeascraft.etsy.com/2010/05/20/quantum-of-deployment/

    from the post:

    This isn’t a post about continuous deployment. Having a very simple deployment procedure is something you should do even if the thought of deploying your code 20 times a day scares you. Deployment can be a contentious subject with many stakeholders. Getting it simple and repeatable allows everyone to share a common vocabulary.

  3. laura:

    Axel, I think my main beef here is with people who have no idea what coverage they have - say it’s 5% - but have some tests. This makes them wildly overconfident about deployment. Also, 100% code coverage with unit tests doesn’t give you load, smoke, acceptance, system, or browser testing. The main thing, I think, is to know the limitations of your tests.

    Erik - thanks! The quantum of deployment is the inverse of deployment velocity, as I read it. I liked your post when I first read it, and probably even more so now.

  4. David Tobey:

    Thank you for the thought-provoking post. There can be some nasty psychological barriers to continuous deployment, or to deployment, period. Deployment can be where the rubber meets the road, where the developer’s “Platonic ideal” you mentioned in another post may well get punctured. So in addition to the mechanics, the human environment needs to be there as well: one where developers are encouraged not to treat their work so much as precious jewels but rather as tools to help real people in practical situations, and where the users or other stakeholders have some flexibility and forgiveness about new code as they begin to use it.
