This page describes some popular alternatives for deploying services and applications into virtual machines.


The Task

The task we are trying to perform is to set up a virtual machine that has all of the services and applications that you require installed and properly configured.

The Goals

  • We want the procedure to be repeatable.
  • We want the procedure to work for a range of different base images.
  • We want it to be simple. Users of virtuals should not need to be experts in system administration.

Approach 1: Using a pre-built image as a template

 NeCTAR (or OpenStack more generally) allows you to do the following:

  1. Create a new virtual instance from a standard base image, and manually install and configure all of the software you need.
  2. Take a snapshot of the virtual, convert the snapshot into a template image, and upload it to the OpenStack infrastructure.
  3. Create new instances from your template image.
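With the standard OpenStack command-line client, steps 2 and 3 look roughly like the following sketch (the instance, image and flavor names are illustrative, and step 1 remains manual):

```shell
# Assumes python-openstackclient is installed and your OS_*
# authentication variables are set (e.g. via an openrc file).

# Step 2: snapshot the manually configured instance into a template image.
openstack server image create my-configured-instance --name my-template

# Step 3: boot as many clones as you need from the template image.
openstack server create --image my-template --flavor m1.small clone-01
```

These commands only work against a live OpenStack endpoint, so treat them as a sketch of the workflow rather than a runnable script.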

There are some problems with this approach:

  • There is no easy way to push updates to the instances cloned from your template image.  Updates need to be applied by hand to each one, unless you set up separate infrastructure for doing this.
  • If you decide that you need to rebuild your template (e.g. on a newer operating system base):
    • you need to repeat the manual installation and configuration, and
    • there is no simple way to "rebase" your existing instances to the new template.

Turnkey Linux

Turnkey Linux is a site / group that specializes in producing pre-built Linux images for doing a variety of tasks.  If you want to go down this path, I suggest you check them out.  Caveat: I don't know if you can run an "off-the-shelf" Turnkey Linux image on NeCTAR.  There may well be OpenStack or NeCTAR specific configuration issues that need to be addressed.

Approach 2: Using a configuration tool

The second approach is to use a scripted configuration tool to automate the installation and configuration process.  The basic idea is that you should be able to go from a "vanilla" virtual image (selected from the NeCTAR dashboard) to one that is fully configured, simply by running some scripts or recipes.
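At its simplest, the "script" can be a first-boot bootstrap script supplied as "user data" when you launch the instance; standard OpenStack / NeCTAR images typically include cloud-init, which runs such a script as root on first boot. A hand-rolled sketch (Debian/Ubuntu flavoured; the packages and config are illustrative):

```shell
#!/bin/sh
# Illustrative first-boot bootstrap script.  cloud-init runs this as
# root when the instance first starts.
set -e

# Install the services you need ...
apt-get update
apt-get install -y apache2 postgresql

# ... drop in your own configuration files ...
# (fetch them from somewhere you control, e.g.)
#   wget -O /etc/apache2/sites-available/myapp.conf http://example.org/myapp.conf

# ... and make sure the services are enabled and running.
systemctl enable --now apache2
```

A plain shell script like this quickly becomes distro-specific (apt vs yum, service names, file locations), which is exactly the problem the configuration frameworks below try to solve.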

There are a variety of ways of implementing the scripting, but ideally you want a scheme / framework that works across a wide range of Linux distros with a minimum of tweaking.

The main disadvantage of this approach is that scripted configuration tends to have a significant learning curve.  But the flip-side is that if you can use scripts / recipes written by someone else, you often need to know less about the "stuff" that you are configuring than if you had to do it all by hand.

Opscode Chef

Opscode Chef is a configuration system in which the scripting is expressed using declarative recipes.  The basic model is that you select generic recipes to configure the software and services you need, and specify the system-specific details using attributes (or data bags).  The Chef framework and the recipes (ideally) look after variability across different distros; e.g. things such as how to install packages and how to configure services.

Chef can be used in "solo" mode for stand-alone deployment, or in "server" mode where the recipes, attributes, data bags and so on are stored on a server.
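As a concrete sketch, a minimal chef-solo setup is just a directory of cookbooks plus a small configuration file.  The "webserver" cookbook below is a made-up example; note how the recipe says *what* you want (a package installed, a service running) rather than *how* to do it on a particular distro:

```shell
# Lay out a minimal chef-solo repository (cookbook name is illustrative).
mkdir -p chef-repo/cookbooks/webserver/recipes

# solo.rb tells chef-solo where to find the cookbooks.
cat > chef-repo/solo.rb <<'EOF'
cookbook_path File.join(File.dirname(__FILE__), 'cookbooks')
EOF

# A declarative recipe: install a package and keep its service running.
cat > chef-repo/cookbooks/webserver/recipes/default.rb <<'EOF'
package 'apache2'

service 'apache2' do
  action [:enable, :start]
end
EOF

# With Chef installed, you would then apply the recipe with:
#   chef-solo -c chef-repo/solo.rb -o 'recipe[webserver]'
```

On a Red Hat style distro the same recipe would need 'httpd' rather than 'apache2', which is the kind of variability that community recipes handle with attributes.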

The main advantages of Chef are:

  • There are existing recipes for doing a wide range of tasks.
  • Platform / distro independence really does work, certainly across a wide range of Linux distros.
  • There are a lot of people using Chef, and it is not difficult to find help ... provided you know how to use Google.
  • You can put all of your Chef recipes and so on into source code control (Git is recommended).
  • Commercial support is available.

Chef does have some drawbacks though:

  • Some of the "community" recipes are not particularly good.  Common problems are recipes not working for some distros, and limited functionality / configurability.  Of course, you can always help to fix this by submitting patches ...
  • There can be conflicts between the way that the Chef recipes configure software and the way that different distros do it.  (For example, Apache Tomcat and Apache Httpd configurations / installations are organized differently across different distros, in ways that are hard for Chef recipes to deal with.  I've had problems with distro package upgrades breaking configurations created by Chef ...)


Other configuration tools

Chef is not the only tool in this space.  Other scripted configuration tools that you could consider include:

  • Puppet
  • Salt

Approach 3: File distribution

The basic idea is to create a master file tree (or trees) containing configuration files, executables and so on, and then arrange for the files to be distributed to the machines you are managing.  Tools for doing this kind of thing include 'rdist' and (I think) 'capistrano'.

This is a very "old-fashioned" approach, and it has problems including:

  • dealing with configuration files and binaries that need to be different on different machines,
  • dealing with heterogeneous operating system platforms, and
  • changing or upgrading operating systems.

Other approaches

There are two other approaches that I can think of:

  • You can ignore the issue of repeatability entirely and just do the installation by hand. In fact a lot of people do this. The problem is that if you do need to repeat the process down the track, you may have difficulty remembering exactly what it was you did the first time.
    If you take this approach, you would be well advised to keep comprehensive notes on how you set up your virtuals.
  • I have heard of people addressing this problem (in part) by building a collection of custom RPMs whose embedded scripts do the bulk of the configuration work.  This doesn't really scale, because you have to maintain a raft of different RPMs (or equivalent) for each platform variant.


Recommendation

Any recommendations are going to be somewhat subjective, but my current feeling is that Chef is the best match for this particular problem.