magpiebrain

The site of Sam Newman, a consultant at ThoughtWorks

Posts from the ‘DevOps’ category

I’ll be running my new talk “Designing For Rapid Release” at a couple of conferences in the first half of this year. First up is the delightfully named Crash & Burn in Stockholm, on the 2nd of March. Then in May I’ll be in Poznan, Poland, for GeeCon 2012.

This talk focuses on the kinds of constraints we should consider when evolving the architecture of our systems in order to enable rapid, frequent release. So much of the conversation about Continuous Delivery focuses on the design of build pipelines, or the nuts and bolts of CI and infrastructure automation. But often the biggest constraint on being able to incrementally roll out new features is the design of the system itself. I’ll be pulling together a series of patterns that will help you identify what to look for in your own systems when moving towards Continuous Delivery.


On my current client project, when it comes to managing the configuration of the various environments, I have separated things into two problem spaces – provisioning hosts, and configuring hosts. Part of the reason for this separation is that although we are targeting AWS, we need to be able to support alternative services in the future; I also consider the two types of task to be rather different, and to require different types of tools.

For provisioning hosts I am using Boto, the Python AWS API. For configuring the hosts once provisioned, I am using Puppet. I remain unconvinced as to the relative merits of Puppet Master or Chef Server (see my previous post on the subject) and so have decided to stick with running Puppet standalone so I can manage versioning how I would like. This leaves me with a challenge – how do I apply the Puppet configuration to the hosts once they have been provisioned with Boto? I also wanted to provide a relatively uniform command-line interface to the development team for other tasks, like running builds. Some people use cron-based polling for this, but I wanted a more direct form of control. I also wanted to avoid the need to run any additional infrastructure, so MCollective was never something I was particularly interested in.

After a brief review of my “Things I should look at later” list it looked like time to give Fabric a play.

Fabric is a Python-based tool/library which excels at creating command-line tools for machine management. Its bread and butter is script-based automation of machines via SSH – many people in fact use hand-rolled scripts on top of Fabric as an alternative to systems like Chef and Puppet. The documentation is very good, and I can heartily recommend the Fabric tutorial.

The workflow I wanted was simple. I wanted to be able to check out a specific version of code locally, then run one command to bring up a host and apply a given configuration set. My potentially naive solution to this problem is to simply tar up my Puppet scripts, upload them, and then run Puppet. Here is the basic script:

[python]
from fabric.api import task, local, put, run, settings

@task
def provision_box():
    # provision_using_boto() brings up an EC2 instance via Boto and
    # returns its public DNS name (see below)
    public_dns = provision_using_boto()

    # Bundle up the Puppet scripts, push them to the new host and apply them
    local("tar cfz /tmp/end-bundle.tgz path/to/puppet_scripts/*")
    with settings(host_string=public_dns, user="ec2-user", key_filename="path/to/private_key.pem"):
        run("sudo yum install -y puppet")
        put("/tmp/end-bundle.tgz", ".")
        run("tar xf end-bundle.tgz && sudo puppet apply --modulepath=/home/ec2-user/path/to/puppet_scripts/modules path/to/puppet_scripts/manifests/myscript.pp")
[/python]

The provision_using_boto() function is left as an exercise for the reader, but the Boto documentation should point you in the right direction. If you stick the above code in your fabfile.py, all you need to do is run fab provision_box to do the work. The yum install command is there to handle bootstrapping Puppet (as it is not on the AMIs we are using) – it will be a no-op if the target host already has it installed.
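
For completeness, here is a minimal sketch of what a provision_using_boto() might look like using Boto's EC2 API – the region, AMI ID, key pair and security group are placeholders, and credentials are assumed to come from your Boto config or environment:

[python]
import time
import boto.ec2

def provision_using_boto():
    # Placeholder region, AMI, key pair and security group - substitute your own
    conn = boto.ec2.connect_to_region("us-east-1")
    reservation = conn.run_instances("ami-xxxxxxxx",
                                     key_name="my-keypair",
                                     instance_type="t1.micro",
                                     security_groups=["default"])
    instance = reservation.instances[0]

    # Wait until the instance is running, then hand back its public DNS name
    while instance.state != "running":
        time.sleep(5)
        instance.update()
    return instance.public_dns_name
[/python]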

This example is much simplified compared to the actual scripts, as we have also implemented some logic to re-use EC2 instances to save time and money, as well as a simplistic role system to manage different classes of machines. I may write up those ideas in a future post.
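
To give a flavour of the role idea, here is a sketch using Fabric's built-in roledefs – the hostnames are made up, and this is not necessarily how we implemented it:

[python]
from fabric.api import env, roles, run, task

# Hypothetical hosts - in practice these would come from Boto lookups
env.roledefs = {
    "web": ["ec2-1-2-3-4.compute-1.amazonaws.com"],
    "db":  ["ec2-5-6-7-8.compute-1.amazonaws.com"],
}
env.user = "ec2-user"
env.key_filename = "path/to/private_key.pem"

@task
@roles("web")
def configure_web():
    # Runs once for every host in the 'web' role
    run("sudo puppet apply --modulepath=/home/ec2-user/puppet/modules /home/ec2-user/puppet/manifests/web.pp")
[/python]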


It seems I spoke too soon. Just one day after I thought I had tracked down the source of the trouble, yesterday evening brought another outage. The graph in CloudWatch was all too familiar, showing a huge uptick in CPU use. The box was again unresponsive and had to be restarted. Checking cpu_log for a likely culprit, the entries looked odd:

[plain light="true"]
2011-07-13 00:12:22 www-data 26096 21.4 0.9 160732 5972 ? D Jul12 5:30 /usr/sbin/apache2 -k start
2011-07-13 00:12:25 www-data 26096 21.4 0.9 160736 6040 ? R Jul12 5:30 /usr/sbin/apache2 -k start
2011-07-13 00:12:22 www-data 26096 21.3 0.9 160732 5972 ? D Jul12 5:30 /usr/sbin/apache2 -k start
2011-07-13 00:12:22 root 26179 24.0 0.0 4220 584 ? S 00:12 0:00 /bin/sh /home/ubuntu/tools/cpu_log
[/plain]

No entries from Postfix – good – but now other processes were having trouble. This was starting to point away from one rogue process gobbling CPU, towards high CPU use being a symptom of something else. What can cause very high CPU use? Among other things, swapping memory. A process chewing up all available memory could easily cause these kinds of symptoms. A quick scan through syslog showed me something I should have spotted earlier. If it wasn't the smoking gun, it was at least something pretty close:

[plain light="true"]
Jul 13 00:11:28 domU-12-31-39-01-F0-E5 kernel: [38837.985499] apache2 invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
[/plain]

This doesn't mean that Apache is to blame, just that it was the process the oom-killer tried to take out in order to free up memory. And just prior to the outage itself, in the Apache logs:

[plain light="true"]
[Tue Jul 12 23:48:24 2011] [error] (12)Cannot allocate memory: fork: Unable to fork new process
[/plain]

At this point my mind was already turning to the fact that I hadn’t done *any* tuning of Apache processes or PHP. After googling around for a bit, a few things looked wrong in my config. Here was the untuned default that Ubuntu gave me:

[plain light="true"]
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>
[/plain]

In general, MaxClients refers to the maximum number of simultaneous requests that will be served. On prefork Apache, like mine, MaxClients is also the maximum number of child processes that get spawned. A simple ps showed that even after a restart, each Apache process was consuming up to 35MB of memory. The host in question has 1GB of RAM – it was clear that even with nothing else running on the box, with that sort of memory footprint I would exhaust memory way before the MaxClients threshold was reached.

Even more worrying, MaxRequestsPerChild was set to zero, meaning that the child processes would never be restarted. If a memory leak was occurring inside a child Apache process, it could carry on eating memory until the box came crashing to its knees. After some quick maths (at around 35MB per child, 1GB of RAM is exhausted by fewer than 30 processes) I decided to reduce my MaxClients down to a more manageable 25, and also set MaxRequestsPerChild to 1000. My hope is that this may buy me some more time to try and track down where the memory use is occurring.
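
For reference, the tuned prefork block now looks something like this (the other directives are left at the Ubuntu defaults shown above):

[plain light="true"]
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 25
MaxRequestsPerChild 1000
</IfModule>
[/plain]

At roughly 35MB per child, 25 children top out at around 875MB, which at least fits inside the 1GB of RAM while leaving a little headroom for everything else.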

This has spurred me on to finally invest some time looking at nginx. This weekend may see me putting nginx in side by side, with a view to moving away from Apache – from all reports this may allow me to run my sites with a much lower footprint. But if the last couple of days has taught me anything, it's that I shouldn't be so quick to conclude that I've finally tracked this problem down. If I'm still having trouble at the weekend, I may well just clone the box and try to reproduce the problem with some performance tests.


I'm in the process of migrating the many sites I manage from Slicehost over to EC2 (which is where this blog is currently running). I hit a snag in the last day or two – my Montastic alerts told me that the sites I had already migrated were not responding. I tried – and failed – to SSH into the box. The CloudWatch graphs for the instance showed 100% CPU use, explaining why SSH was unresponsive. The problem was that I couldn't tell what was causing it. My only option was to restart the instance, which at least brought it back to life.

What I needed was something that would tell me what was causing the problem. After reaching out to The Hive Mind, Cosmin pointed me in the direction of some awk and ps foo. This little script gets a process listing, and writes out all those rows where the CPU is above 20%, prepended with the current timestamp:

[plain light="true"]
ps aux | gawk '{ if ( $3 > 20 ) { print strftime("%Y-%m-%d %H:%M:%S")" "$0 } }'
[/plain]

My box rarely goes above 5% CPU use, and I was worried about the CPU ramping up so quickly that I wouldn't get a sample, so this threshold seemed sensible. The magic is the if ( $3 > 20 ) – this only emits the line if the third column of input from ps aux (which is the CPU percentage) is above 20.

I put the one-liner in a script, then stuck the following entry into cron to ensure that the script gets run every minute. If everything is OK, there is no output. Otherwise, I'll get the full process listing. This wouldn't stop the box getting wedged again, but it would at least tell me what caused it.

[plain light="true"]
* * * * * root /home/ubuntu/tools/cpu_log >> /var/log/cpu_log
[/plain]

Lo and behold, several hours later the box got wedged once again. After a restart, the cpu_log showed this:

[plain light="true" wraplines="false"]
2011-07-11 17:55:42 postfix 6398 29.6 0.3 39428 2184 ? S 17:55 0:01 pickup -l -t fifo -u -c
2011-07-11 17:55:42 postfix 6398 29.6 0.3 39428 2180 ? S 17:55 0:01 pickup -l -t fifo -u -c
2011-07-11 17:55:42 postfix 6398 29.6 0.2 39428 1556 ? S 17:55 0:01 pickup -l -t fifo -u -c
2011-07-11 17:55:42 postfix 6398 24.6 0.2 39428 1368 ? S 17:55 0:01 pickup -l -t fifo -u -c
2011-07-11 18:16:43 root 6440 50.0 0.0 30860 344 ? R 18:16 0:01 pickup -l -t fifo -u -c
[/plain]

Matching what the CloudWatch graphs showed me, the CPU ramped up quite quickly before I lost all output (the CPU percentage is the value just after the PID – the 29.6 values here). But this time, we have a culprit – Postfix's pickup process. I had configured Postfix just a day or two back, so clearly something was amiss. Nonetheless, I can at least now disable Postfix while I spend some time diagnosing the problem.

Limiting CPU

Something else that turned up in my cries for help was cpulimit. This utility allows me to cap how much CPU a given process uses. If and when I re-enable Postfix, I'll almost certainly use it to avoid future outages while I iron out any kinks.
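
As a rough sketch (the process name and percentage here are purely illustrative), capping Postfix's pickup process might look like this – cpulimit sits in the background, pausing and resuming the process to keep it around the given limit:

[plain light="true"]
cpulimit -e pickup -l 30 &
[/plain]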


I've been playing around with Vagrant (which just uses VirtualBox under the hood), with a view to making it easy for developers to bring up production-like systems on their local workstations. I fashioned a simple Vagrantfile cribbed from the examples on the web. I was specifically interested in how fast it was to bring up VMs from scratch, to understand how viable it would be to incorporate this into a normal development workflow.

The system consists of 32-bit Lucid boxes, each with default settings. All tests were done on my aging MacBook Pro with 4GB of memory, and all times were captured using time. The test represents bringing up between one and four VMs from scratch and applying a simple Chef recipe to install Apache (you can see the Vagrantfile I used here). Note that simply suspending and restarting a VM is much faster – these times represent the worst case of firing the VMs up from scratch.

  • 1 VM – 2m15.543s
  • 2 VMs – 5m24.306s
  • 3 VMs – 10m53.824s
  • 4 VMs – 12m19.211s

I didn't take enough sample points to predict any obvious trends, other than more VMs = longer startup time. This also discounts any startup time your application may have. Nonetheless, bringing up a single VM from scratch appears to be fast enough to be usable, and even bringing up more VMs to represent more complex topologies (useful for local DR testing, for example) seems viable.

I'm less sure how useful this is when provisioning environments to run functional tests – adding a couple of minutes to a build that you want to keep somewhere south of ten minutes in total is huge, but keeping a pool of clean, already-provisioned Vagrant VMs could work around this issue (as could using better hardware). The idea of being able to bring up clean environments using the same production config as part of a CI build, and tear them down again afterwards, is compelling, but I need to give it more thought.

I've been playing around with both Chef and Vagrant recently, getting my head around the current state of the art in configuration management. A rather good demo of Chef by John Willis at the recent DevOpsDays Hamburg pushed me towards Chef over Puppet, but I'm still way too early in my experimentation to know if that is the right choice.

I may speak more later about my experiences with Vagrant, but this post is primarily concerning Chef, and specifically thoughts regarding repeatability.

Repeatability

Most of us, I hope, check our code in. Some of us even have continuous integration, and perhaps even a fully fledged deployment pipeline which creates packages of our code that have been validated as production ready. By checking in our code, we hope to bring about a situation whereby we can recreate a build of our software as it was at a previous point in time.

Typically, however, deploying these systems requires a bit more than simply running apt-get install or something similar. Machines need to be provisioned and dependencies configured, and this is where Chef and Puppet come in. Both systems allow you to write code that specifies the state you expect your nodes to be in for your systems to work. To my mind, it is therefore important that the version of the configuration code be kept in sync with your application version. Otherwise, when you deploy your software, you may find that the systems are not configured how you would like.

Isn’t It All About Checking In?

So, if we rely on checking our application code in to be able to reproduce a build, why not check our configuration code into the same place? On the face of it, this makes sense. The challenge here – at least as I understand the capabilities of Chef – is that much of the power of Chef comes from using Chef Server, which doesn't play nicely with this model.

Chef Server is a server which tells nodes what they are expected to be. It is the system that gathers information about your configured systems, allowing discovery via mechanisms like Knife, and it is also how you push configuration out to multiple machines. Whilst Chef Server itself is backed by version control, there doesn't seem to be an obvious way for an application node to say "I need version 123 of the Web Server recipe". That means that if I want to bring up an old version of a web node, it could reach out and end up getting a much newer version of a recipe, thereby not correctly recreating the previous state.

Now, using Chef Solo, I could check out my code and system configuration together as a piece, then load that onto the nodes I want, but I lose a lot by not being able to do discovery using Knife and similar tools, and I lose the tracking etc.

Perhaps there is another way…

Chef does have a concept of environments. With an environment, you are able to specify that a node associated with a specific environment should use a specific version of a recipe, for example:

name "dev"
description "The development environment"
cookbook_versions  "couchdb" => "11.0.0"
attributes "apache2" => { "listen_ports" => [ "80", "443" ] }

The problem here is that I think the ability to pin versions of my cookbooks is completely orthogonal to environments. Let's remember the key goal – I want to be able to reproduce a running system based on a specific version of code, and identify the right version of the configuration (recipes) to apply for that version of the code. Am I missing something?

In a previous post, I showed how we could use Clojure – and specifically Incanter – to process access logs to graph hits on our site. Now we're going to adapt that solution to show the number of unique users over time.

We’re going to change the previous solution to pull out the core dataset representing the raw data we’re interested in from the access log – records-from-access-log remains unchanged from before:

[clojure]
(defn access-log-to-dataset
[filename]
(col-names (to-dataset (records-from-access-log filename)) ["Date" "User"]))
[/clojure]

The raw dataset retrieved from this call looks like this:

Date User
11/Aug/2010:00:00:30 +0100 Bob
11/Aug/2010:00:00:31 +0100 Frank
11/Aug/2010:00:00:34 +0100 Frank

Now, we need to work out the number of unique users in a given time period. Like before, we’re going to use $rollup to group multiple records by minute, but we need to work out how to summarise the user column. To do this, we create a custom summarise function which calculates the number of unique users:

(defn num-unique-items
  [seq]
  (count (set seq)))

Then use that to modify the raw dataset and graph the resulting dataset:

(defn access-log-to-unique-user-dataset
  [access-log-dataset]
    ($rollup num-unique-items "User" "Date" 
      (col-names (conj-cols ($map #(round-ms-down-to-nearest-min (as-millis %)) "Date" access-log-dataset) ($ "User" access-log-dataset)) ["Date" "User"])))

(defn concurrent-users-graph
  [dataset]
  (time-series-plot :Date :User
                             :x-label "Date"
                             :y-label "User"
                             :title "Users Per Min"
                             :data (access-log-to-unique-user-dataset dataset)))


(def access-log-dataset
  (access-log-to-dataset "/path/to/access.log"))

(save (concurrent-users-graph access-log-dataset) "unique-users.png")

You can see the full source code listing here.

Continuing a recurring series of posts showing my limited understanding of Clojure, today we're using Clojure for log processing. This example is culled from some work I'm doing right now in the day job – we needed to extract usage information to better understand how the system is performing.

The Problem

We have an Apache-style access log showing hits on our site. We want to process this information to extract things like peak hits per minute, and perhaps eventually more detailed information like the nature of the request, response time and so on.

The log looks like this:

43.124.137.100 - username 05/Aug/2010:17:27:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.9) Gecko/2009040821 Firefox/3.0.9 (.NET CLR 3.5.30729)"

Extracting The Information

We want to use Incanter to help us process the data & graph it. Incanter likes its data as a sequence of sequences – so that’s what we’ll create.

First up – processing a single line. I TDD’d this solution, but have excluded the tests from the source listing for brevity.

user=> (use 'clojure.contrib.str-utils)
nil
user=> (use '[clojure.contrib.duck-streams :only (read-lines)])
nil

user=> (defn extract-records-from-line
  [line-from-access-log]
  (let [[_ ip username date] (re-find #"^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - (\w+) (.+? .+?) " line-from-access-log)]
    [date username]))
#'user/extract-records-from-line

user=> (defn as-dataseries
  [access-log-lines]
  (map extract-records-from-line access-log-lines))
#'user/as-dataseries

user=> (defn records-from-access-log
  [filename]
  (as-dataseries (read-lines filename)))
#'user/records-from-access-log

A few things to note. extract-records-from-line is matching more than strictly needed – I just wanted to show the use of destructuring for matching parts of the log line. I'm pulling in the username & date – the username is not strictly needed for what follows. Secondly, note the use of read-lines from clojure.contrib.duck-streams – rather than slurp, read-lines is lazy. We'll have to process the whole file at some point, but it's a good idea to use lazy functions where possible.

At this point, running records-from-access-log gives us our sequence of sequences – next up, pulling it into Incanter.

Getting The Data Into Incanter

We can check that our code is working properly by firing up Incanter. Given a sample log:

56.24.137.230 - fred 05/Aug/2010:17:27:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"
12.14.137.140 - norman 05/Aug/2010:17:27:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"
42.1.137.110 - bob 05/Aug/2010:17:28:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"
143.124.1.50 - clare 05/Aug/2010:17:29:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"

Let’s create a dataset from it, and view the resulting records:

user=> (use 'incanter.core)
nil
user=> (def access-log-dataset
(to-dataset (records-from-access-log "/path/to/example-access.log")))
#'user/access-log-dataset
user=> (view access-log-dataset)

The result of the view command:

Unfortunately, no column names – but that is easy to fix using col-names:

user=> (def access-log-dataset 
(col-names (to-dataset (records-from-access-log "/path/to/example-access.log")) ["Date" "User"]))
#'user/access-log-dataset
user=> (view access-log-dataset)

At this point you can see that it would be easy for us to pull in the URL, response code or other data rather than the username from the log – all we’d need to do is change extract-records-from-line and update the column names.

Graphing The Data

To graph the data, we need to get Incanter to register the date column as what it is – time. Currently it is in string format, so we need to fix that. Culling the basics from Jake McCray’s post, here is what I ended up with (note use of Joda-Time for date handling – you could use the standard SimpleDateFormat if you preferred):

user=> (import 'org.joda.time.format.DateTimeFormat)
nil

user=> (defn as-millis
  [date-as-str]
  (.getMillis (.parseDateTime (DateTimeFormat/forPattern "dd/MMM/yyyy:HH:mm:ss Z") date-as-str)))
#'user/as-millis

user=> (defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    (col-names (conj-cols ($map as-millis "Date" unmod-dataset) ($ "User" unmod-dataset)) ["Date Time In Ms" "User"])))
#'user/access-log-to-dataset

While the date parsing should be pretty straightforward to understand, there are a few interesting things going on with the Incanter code that we should dwell on briefly.

The $ function extracts a named column, whereas the $map function runs another function over the named column from the dataset, returning the modified column (pretty familiar if you’ve used map). conj-cols then takes these resulting sequences to create our final dataset.

We're not quite done yet though. We have our time-series records – each representing one hit on our webserver – but we don't actually have values to graph. We also need to work out how to group hits to the nearest minute. What we're going to do is replace our call to as-millis with one that rounds to the nearest minute. Then we're going to use Incanter to group those rows together, summing the hits it finds per minute. But before that, we need to tell Incanter that each row represents a hit, by adding a 'Hits' column. We're also going to ditch the user column, as it isn't going to help us here:

user=> (defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    (col-names (conj-cols ($map as-millis "Date" unmod-dataset) (repeat 1)) ["Date" "Hits"])))
#'user/access-log-to-dataset

Next, we need to create a new function to round our date & time to the nearest minute.

Update: The earlier version of this post screwed up, and the presented round-ms-down-to-nearest-min actually rounded to the nearest second. This is a corrected version:

(defn round-ms-down-to-nearest-min
  [millis]
  (* 60000 (quot millis 60000)))

If you wanted hits per second, here is the function:

(defn round-ms-down-to-nearest-sec
  [millis]
  (* 1000 (quot millis 1000)))

And one more tweak to access-log-to-dataset to use the new function:

(defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    (col-names (conj-cols ($map #(round-ms-down-to-nearest-min (as-millis %)) "Date" unmod-dataset) (repeat 1)) ["Date" "Hits"])))

Finally, we need to roll our data up, summing the hits per minute – all this done using $rollup:

(defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    ($rollup :sum "Hits" "Date" 
      (col-names (conj-cols ($map #(round-ms-down-to-nearest-min (as-millis %)) "Date" unmod-dataset) (repeat 1)) ["Date" "Hits"]))))

$rollup applies a summary function to a given column (in our case “Hits”), grouping rows by the values of another column (“Date” in our case). :sum here is a built-in Incanter summary function, but we could provide our own.

And the resulting dataset:

Now we have our dataset, let’s graph it:

user=> (defn hit-graph
  [dataset]
  (time-series-plot :Date :Hits
                             :x-label "Date"
                             :y-label "Hits"
                             :title "Hits"
                             :data dataset))

user=> (view (hit-graph (access-log-to-dataset "/path/to/example-access.log")))

This is deeply unexciting – what if we try a bigger dataset? Then we get things like this:

Conclusion

You can grab the final code here.

Incanter is much more than simply a way of graphing data. This (somewhat) brief example shows you how to get log data into an Incanter-friendly format – what you want to do with it then is up to you. I may well explore other aspects of Incanter in future posts.


As part of the day job at ThoughtWorks, I'm involved with training courses we run in partnership with Amazon, giving people an overview of the AWS service offerings. One of the things we cover that can often cause some confusion is the different types of images – specifically the difference between S3-hosted and EBS-backed images. I tend to give specific advice on which type I prefer, but first some detail is in order.

S3-hosted Images

Instances launched from S3-hosted images use the local disk for the root partition. Once spun up, if you shut the machine down, the state of the operating system is lost. The only way to persist data is to store it off-instance – for example on an attached EBS volume, on S3, or elsewhere. Aside from the lack of persistence of the OS state, these images are limited to 10GB in size – not a problem for most Linux distros I know of, but a non-starter for some Microsoft images.

There is also a delay in spinning these images up, due to the time taken to copy the image from S3 to the physical machine – the whole image has to be downloaded before the instance can start the boot process. In practice, popular images seem to be pretty well cached, so the delay may not be an issue for you. As with everything in AWS, the barrier to entry is so low that it is worth testing this yourself to understand whether it is going to be a deal breaker.

EBS-backed Images

An EBS-backed instance is an EC2 instance which uses an EBS volume as its root partition, rather than local disk. EBS-backed images can boot in seconds (as opposed to minutes for less well cached S3 images) as they only need to stream enough of the image to start the boot process. They can also be up to 1TB in size – in other words, the maximum size of any EBS volume. EBS volumes also have different performance characteristics from local drives, which may need to be taken into account depending on your particular use.

The key benefit is that EBS-backed instances can be ‘suspended’ – by stopping an EBS-backed instance, you can resume it later, with all EBS-backed state (e.g. the OS state) maintained. In this way it is more akin to a slice at Slicehost, or a real, physical machine.
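
To make that stop/resume behaviour concrete, here is a minimal sketch using boto (the region and instance ID are made up) – stopping an EBS-backed instance keeps its root volume, and starting it again brings the same OS state back:

[python]
import boto.ec2

conn = boto.ec2.connect_to_region("us-east-1")

# Only valid for EBS-backed instances - the root volume survives the stop
conn.stop_instances(instance_ids=["i-12345678"])

# Later: bring the same instance, and its OS state, back up
conn.start_instances(instance_ids=["i-12345678"])
[/python]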

There is one out-and-out downside when compared to S3-hosted images – your charges will be higher. You will be charged for IO, which could mount up depending on your particular usage patterns, and you are also charged for the allocated size of the EBS volume.

So which is best?

Aside from the increased cost, and the differing performance characteristics of block storage versus local disk, there is one key downside to EBS-backed instances: they encourage you to think of EC2 instances as persistent, stable entities which are around forever. In general, I believe you'll gain far more from embracing the ‘everything fails, deal with it’ ethos – that is, have your machines configure themselves afresh atop vanilla, gold S3-backed AMIs on boot (e.g. using EC2 user data to download and install your software). This means you cannot fall back into the old thought patterns of constantly patching and tweaking an OS, to the point where attempting to recreate it would be akin to an act of sysadmin archaeology.
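
As a sketch of that ‘configure afresh on boot’ idea (the AMI ID, key pair and bootstrap script here are all made up), you can pass a script as EC2 user data when launching an instance, and a stock AMI with cloud-init will run it on first boot:

[python]
import boto.ec2

# Hypothetical bootstrap script - fetch and apply your configuration on first boot
bootstrap = """#!/bin/bash
yum install -y puppet
# ...download your manifests and application packages, then apply them...
"""

conn = boto.ec2.connect_to_region("us-east-1")
conn.run_instances("ami-xxxxxxxx",
                   key_name="my-keypair",
                   instance_type="t1.micro",
                   user_data=bootstrap)
[/python]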

There will remain scenarios where EBS-backed instances make sense (and those of you using Windows Server 2008 have no choice), but I always recommend that people moving to AWS limit their use, and instead adapt their practices to work with S3-backed images – thereby also embracing more of the promise of the AWS stack as a whole.