magpiebrain

The site of Sam Newman, a consultant at ThoughtWorks

Posts from the ‘Development’ category

Some yak shaving while playing around with Riemann resulted in me creating my first leiningen plugin, lein-gentags. It uses etags based on instructions from Nurullah Akkaya’s original blogpost – perfect for improving navigation of Clojure code in Emacs. Feedback appreciated!


I’ll be running my new talk “Designing For Rapid Release” at a couple of conferences in the first half of this year. First up is the delightfully named Crash & Burn in Stockholm, on the 2nd of March. Then later in May I’ll be at Poznan in Poland for GeeCon 2012.

This talk focuses on the kinds of constraints we should consider when evolving the architecture of our systems in order to enable rapid, frequent release. So much of the conversation about Continuous Delivery focuses on the design of build pipelines, or the nuts and bolts of CI and infrastructure automation. But often the biggest constraint on being able to incrementally roll out new features is the design of the system itself. I’ll be pulling together a series of patterns that will help you identify what to look for in your own systems when moving towards Continuous Delivery.


On my current client project I have separated the management of environment configuration into two problem spaces – provisioning hosts, and configuring hosts. Part of the reason for this separation is that although we are targeting AWS, we need to leave room to support alternative services in the future; I also consider the two types of task to be rather different, and to require different types of tools.

For provisioning hosts I am using the Python AWS API Boto. For configuring the hosts once provisioned, I am using Puppet. I remain unconvinced as to the relative merits of PuppetMaster or Chef Server (see my previous post on the subject) and so have decided to stick with using PuppetSolo so I can manage versioning how I would like. This leaves me with a challenge – how do I apply the puppet configuration for the hosts once provisioned with Boto? I also wanted to provide a relatively uniform command-line interface to the development team for other tasks like running builds etc. Some people use cron-based polling for this, but I wanted a more direct form of control. I also wanted to avoid the need to run any additional infrastructure, so mcollective was never something I was particularly interested in.

After a brief review of my “Things I should look at later” list it looked like time to give Fabric a play.

Fabric is a Python-based tool/library which excels at creating command-line tools for machine management. Its bread and butter is script-based automation of machines via SSH – many people in fact use hand-rolled scripts on top of Fabric as an alternative to systems like Chef and Puppet. The documentation is very good, and I can heartily recommend the Fabric tutorial.

The workflow I wanted was simple. I wanted to be able to check out a specific version of code locally, then run one command to bring up a host and apply a given configuration set. My potentially naive solution to this problem is to simply tar up my Puppet scripts, upload them, and then run Puppet. Here is the basic script:

[python]
from fabric.api import task, local, settings, run, put

@task
def provision_box():
    public_dns = provision_using_boto()  # left as an exercise - see below

    local("tar cfz /tmp/end-bundle.tgz path/to/puppet_scripts/*")
    with settings(host_string=public_dns, user="ec2-user", key_filename="path/to/private_key.pem"):
        run("sudo yum install -y puppet")
        put("/tmp/end-bundle.tgz", ".")
        run("tar xf end-bundle.tgz && sudo puppet --modulepath=/home/ec2-user/path/to/puppet_scripts/modules path/to/puppet_scripts/manifests/myscript.pp")
[/python]

The provision_using_boto() function is an exercise left to the reader, but the documentation should point you in the right direction. If you stuck the above task in your fabfile.py, all you need to do is run fab provision_box to do the work. The first yum install command is there to handle bootstrapping of Puppet (as it is not on the AMIs we are using) – this will be a no-op if the target host already has it installed.

This example is much simpler than the actual scripts, as we have also implemented some logic to re-use EC2 instances to save time & money, along with a simplistic role system to manage different classes of machines. I may write up those ideas in a future post.


(Image: http://www.flickr.com/photos/bigduke6/258262809/)

I’ve been playing around with both Chef and Vagrant recently, getting my head around the current state of the art in configuration management. A rather good demo of Chef by John Willis at the recent DevOpsDays Hamburg pushed me towards Chef over Puppet, but I’m still way too early in my experimentation to know if that is the right choice.

I may speak more later about my experiences with Vagrant, but this post primarily concerns Chef, and specifically some thoughts on repeatability.

Repeatability

Most of us, I hope, check our code in. Some of us even have continuous integration, and perhaps even a fully fledged deployment pipeline which creates packages representing our code that have been validated as production ready. By checking in our code, we hope to bring about a situation whereby we can recreate a build of our software at a previous point in time.

Typically however, deploying these systems requires a bit more than simply running apt-get install or something similar. Machines need to be provisioned and dependencies configured, and this is where Chef and Puppet come in. Both systems allow you to write code that specifies the state you expect your nodes to be in for your systems to work. To my mind it is therefore important that the version of the configuration code is kept in sync with your application version. Otherwise, when you deploy your software, you may find that the systems are not configured how you would like.

Isn’t It All About Checking In?

So, if we rely on checking our application code in to be able to reproduce a build, why not check our configuration code into the same place? On the face of it, this makes sense. The challenge here – at least as I understand the capabilities of Chef – is that much of the power of Chef comes from using Chef Server, which doesn’t play nicely with this model.

Chef Server is a server which tells nodes what they are expected to be. It is the system that gathers information about your configured systems, allowing discovery via mechanisms like Knife, and it is also how you push configuration out to multiple machines. Whilst Chef Server itself is backed by version control, there doesn’t seem to be an obvious way for an application node to say “I need version 123 of the Web Server recipe”. That means that if I want to bring up an old version of a Web node, it could reach out and end up getting a much newer version of a recipe, thereby not correctly recreating the previous state.

Now, using Chef Solo, I could check out my code and system configuration together as a piece, then load that onto the nodes I want, but I lose a lot by not being able to do discovery using Knife and similar tools, and I lose the tracking etc.

Perhaps there is another way…

Chef does have a concept of environments. With an environment, you are able to specify that a node associated with a specific environment should use a specific version of a recipe, for example:

name "dev"
description "The development environment"
cookbook_versions  "couchdb" => "11.0.0"
attributes "apache2" => { "listen_ports" => [ "80", "443" ] }

The problem here is that I think the concept of being able to access versions of my cookbooks is completely orthogonal to environments. Let’s remember the key goal – I want to be able to reproduce a running system based on a specific version of code, and identify the right version of the configuration (recipes) to apply for that version of the code. Am I missing something?

In a previous post, I showed how we could use Clojure and specifically Incanter to process access logs to graph hits on our site. Now, we’re going to adapt our solution to show the number of unique users over time.

We’re going to change the previous solution to pull out the core dataset representing the raw data we’re interested in from the access log – records-from-access-log remains unchanged from before:

[clojure]
(defn access-log-to-dataset
[filename]
(col-names (to-dataset (records-from-access-log filename)) ["Date" "User"]))
[/clojure]

The raw dataset retrieved from this call looks like this:

Date User
11/Aug/2010:00:00:30 +0100 Bob
11/Aug/2010:00:00:31 +0100 Frank
11/Aug/2010:00:00:34 +0100 Frank

Now, we need to work out the number of unique users in a given time period. Like before, we’re going to use $rollup to group multiple records by minute, but we need to work out how to summarise the user column. To do this, we create a custom summarise function which calculates the number of unique users:

(defn num-unique-items
  [items]
  (count (set items)))
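
For example, given the users seen within a single minute of the sample data above:

(num-unique-items ["Bob" "Frank" "Frank"])
;; => 2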

Then use that to modify the raw dataset and graph the resulting dataset:

(defn access-log-to-unique-user-dataset
  [access-log-dataset]
    ($rollup num-unique-items "User" "Date" 
      (col-names (conj-cols ($map #(round-ms-down-to-nearest-min (as-millis %)) "Date" access-log-dataset) ($ "User" access-log-dataset)) ["Date" "Unique Users"])))

(defn concurrent-users-graph
  [dataset]
  (time-series-plot :Date :User
                             :x-label "Date"
                             :y-label "User"
                             :title "Users Per Min"
                             :data (access-log-to-unique-user-dataset dataset)))


(def access-log-dataset
  (access-log-to-dataset "/path/to/access.log"))

(save (concurrent-users-graph access-log-dataset) "unique-users.png")

You can see the full source code listing here.

Continuing a recurring series of posts showing my limited understanding of Clojure, today we’re using Clojure for log processing. This example is culled from some work I’m doing right now in the day job – we needed to extract usage information to better understand how the system is performing.

The Problem

We have an Apache-style access log showing hits on our site. We want to process this log to extract information like peak hits per minute, and perhaps eventually more detailed information like the nature of the request, response time and so on.

The log looks like this:

43.124.137.100 - username 05/Aug/2010:17:27:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.9) Gecko/2009040821 Firefox/3.0.9 (.NET CLR 3.5.30729)"

Extracting The Information

We want to use Incanter to help us process the data & graph it. Incanter likes its data as a sequence of sequences – so that’s what we’ll create.

First up – processing a single line. I TDD’d this solution, but have excluded the tests from the source listing for brevity.

user=> (use 'clojure.contrib.str-utils)
nil
user=> (use '[clojure.contrib.duck-streams :only (read-lines)])
nil

user=> (defn extract-records-from-line
  [line-from-access-log]
  (let [[_ ip username date] (re-find #"^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - (\w+) (.+? .+?) " line-from-access-log)]
    [date username]))
#'user/extract-records-from-line

user=> (defn as-dataseries
  [access-log-lines]
  (map extract-records-from-line access-log-lines))
#'user/as-dataseries

user=> (defn records-from-access-log
  [filename]
  (as-dataseries (read-lines filename)))
#'user/records-from-access-log

A few things to note. extract-records-from-line is matching more than strictly needed – I just wanted to indicate the use of destructuring for matching parts of the log line. I’m pulling in the username & date – the username is not strictly needed for what follows. Secondly, note the use of read-lines from clojure.contrib.duck-streams – rather than slurp, read-lines is lazy. We’ll have to process the whole file at some point, but it’s a good idea to use lazy functions where possible.

At this point, running records-from-access-log gives us our sequence of sequences – next up, pulling it into Incanter.

Getting The Data Into Incanter

We can check that our code is working properly by firing up Incanter. Given a sample log:

56.24.137.230 - fred 05/Aug/2010:17:27:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"
12.14.137.140 - norman 05/Aug/2010:17:27:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"
42.1.137.110 - bob 05/Aug/2010:17:28:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"
143.124.1.50 - clare 05/Aug/2010:17:29:24 +0100 "GET /some/url HTTP/1.1" 200 24 "http://some.refering.domain/" "SomeUserAgent"

Let’s create a dataset from it, and view the resulting records:

user=> (use 'incanter.core)
nil
user=> (def access-log-dataset 
(to-dataset (records-from-access-log "/path/to/example-access.log")))
#'user/access-log-dataset
user=> (view access-log-dataset)

The result of the view command:

Unfortunately, no column names – but that is easy to fix using col-names:

user=> (def access-log-dataset 
(col-names (to-dataset (records-from-access-log "/path/to/example-access.log")) ["Date" "User"]))
#'user/access-log-dataset
user=> (view access-log-dataset)

At this point you can see that it would be easy for us to pull in the URL, response code or other data rather than the username from the log – all we’d need to do is change extract-records-from-line and update the column names.
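
For example, a hypothetical variant (untested, the extra names are my own) that also captures the request line and response code might look like this – the col-names call would then need extra column names to match:

(defn extract-records-from-line
  [line-from-access-log]
  ;; also capture the quoted request line and the numeric response code
  (let [[_ ip username date request status]
        (re-find #"^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - (\w+) (.+? .+?) \"(.+?)\" (\d{3})"
                 line-from-access-log)]
    [date username request status]))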

Graphing The Data

To graph the data, we need to get Incanter to register the date column as what it is – time. Currently it is in string format, so we need to fix that. Culling the basics from Jake McCray’s post, here is what I ended up with (note use of Joda-Time for date handling – you could use the standard SimpleDateFormat if you preferred):

user=> (import 'org.joda.time.format.DateTimeFormat)
nil

user=> (defn as-millis
  [date-as-str]
  (.getMillis (.parseDateTime (DateTimeFormat/forPattern "dd/MMM/yyyy:HH:mm:ss Z") date-as-str)))
#'user/as-millis

user=> (defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    (col-names (conj-cols ($map as-millis "Date" unmod-dataset) ($ "User" unmod-dataset)) ["Date Time In Ms" "User"])))
#'user/access-log-to-dataset

While the date parsing should be pretty straightforward to understand, there are a few interesting things going on with the Incanter code that we should dwell on briefly.

The $ function extracts a named column, whereas the $map function runs another function over the named column from the dataset, returning the modified column (pretty familiar if you’ve used map). conj-cols then takes these resulting sequences to create our final dataset.
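
To make that concrete, run against the small four-row dataset we built earlier, pulling out a single column gives back a plain sequence, in the same order as the rows in the sample log – something like:

user=> ($ "User" access-log-dataset)
("fred" "norman" "bob" "clare")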

We’re not quite done yet though. We have our time-series records – each representing one hit on our webserver – but we don’t actually have values to graph. We also need to work out how we group hits to the nearest minute. What we’re going to do is wrap our as-millis call with a function that rounds down to the nearest minute. Then, we’re going to use Incanter to group those rows together – summing the hits it finds per minute. But before that, we need to tell Incanter that each row represents a hit, by adding a ‘Hits’ column. We’re also going to ditch the user column, as it isn’t going to help us here:

user=> (defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    (col-names (conj-cols ($map as-millis "Date" unmod-dataset) (repeat 1)) ["Date" "Hits"])))
#'user/access-log-to-dataset

Next, we need to create a new function to round our date & time to the nearest minute.

Update: The earlier version of this post screwed up, and the presented round-ms-down-to-nearest-min actually rounded to the nearest second. This is a corrected version:

(defn round-ms-down-to-nearest-min
  [millis]
  (* 60000 (quot millis 60000)))
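
As a quick sanity check, any timestamp inside the same minute rounds down to the same value:

(round-ms-down-to-nearest-min 60000)  ;; => 60000
(round-ms-down-to-nearest-min 119999) ;; => 60000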

If you wanted hits per second, here is the function:

(defn round-ms-down-to-nearest-sec
  [millis]
  (* 1000 (quot millis 1000)))

And one more tweak to access-log-to-dataset to use the new function:

(defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    (col-names (conj-cols ($map #(round-ms-down-to-nearest-min (as-millis %)) "Date" unmod-dataset) (repeat 1)) ["Date" "Hits"])))

Finally, we need to roll our data up, summing the hits per minute – all this done using $rollup:

(defn access-log-to-dataset
  [filename]
  (let [unmod-dataset (col-names (to-dataset (records-from-access-log filename)) ["Date" "User"])]
    ($rollup :sum "Hits" "Date" 
      (col-names (conj-cols ($map #(round-ms-down-to-nearest-min (as-millis %)) "Date" unmod-dataset) (repeat 1)) ["Date" "Hits"]))))

$rollup applies a summary function to a given column (in our case “Hits”), grouping rows by the values of another column (“Date” in our case). :sum here is a built-in Incanter summary function, but we could provide our own.

And the resulting dataset:

Now we have our dataset, let’s graph it:

user=> (defn hit-graph
  [dataset]
  (time-series-plot :Date :Hits
                             :x-label "Date"
                             :y-label "Hits"
                             :title "Hits"
                             :data dataset))

user=> (view (hit-graph (access-log-to-dataset "/path/to/example-access.log")))

This is deeply unexciting – what about if we try a bigger dataset? Then we get things like this:

Conclusion

You can grab the final code here.

Incanter is much more than simply a way of graphing data. This (somewhat) brief example shows you how to get log data into an Incanter-friendly format – what you want to do with it then is up to you. I may well explore other aspects of Incanter in further posts.


In the spirit of making my mistakes in public – something which I have a long history of on this blog – I thought I’d post up a solution I came up with for a relatively simple problem. I’m not unhappy with the solution – it works – but I can’t help thinking I’m missing something and there is a more elegant solution out there.

The Problem

For our input we have a set of time series records. Each record contains a list of name value pairs. For each record, the keys are not fixed – they may vary. Let’s imagine that we’re recording viewing figures for the major UK soap operas – some soaps are on every day, some are only on a few days a week.

In Clojure, we’ve got our data in the following form:

(("Monday" {:eastenders 6.5, :thearchers 2.3, :corrinationstreet 5.6})
 ("Tuesday" {:eastenders 6.8, :thearchers 1.4})
 ...)

We want to convert this into a single table of data, with the keys from the source data representing the columns, and each row representing a different timestamp, so that we can visualise the data with something like gnuplot, Incanter or just plain old Excel. So we want to get to something like this (yes, I know The Archers is on at the weekend too, but this is just an example):

Day        Eastenders   The Archers   Coronation Street
Monday     6.5          2.3           5.6
Tuesday    6.8          1.4
Wednesday               2.3           7
Thursday   6.7          2.8
Friday     9.8          2.1           7

The challenge here (such as it is) is that we want a sparse table, and that our code cannot know beforehand the total universe of soap names (what if a new soap launched?).

The Header Row

The first part of this problem as I saw it was to determine which soaps our records represented to create a header row. The solution I came up with was to stick the keys for all records into a set:

user=> (def soaps 
'(("Monday" {"eastenders" 6.5, "thearchers" 2.3, "corrinationstreet" 5.6})
("Tuesday" {"eastenders" 6.8, "thearchers" 1.4})))

#'user/soaps
user=> (defn as-columns [records]
(apply (partial conj #{}) (mapcat keys (map second records))))
#'user/as-columns

user=> (as-columns soaps)
#{"corrinationstreet" "eastenders" "thearchers"}

Assuming we want a CSV file to store our content, a function to create the header row becomes:

user=> (str "Date," (apply str (interpose "," (as-columns soaps))))
"Date,corrinationstreet,eastenders,thearchers"

The Data

Now we have a list of all possible keys (in this example, the names of the soaps), we can use this to extract data from the records for each row. Getting data from a map is straightforward – even handling the not-there case is simple enough:

user=> (def some-map 
{"eastenders" 6.5, "thearchers" 2.3, "corrinationstreet" 5.6})
#'user/some-map

user=> (get some-map "eastenders" "-")
6.5

user=> (get some-map "hollyoaks" "-")
-

Given a list of known columns, we can use a list comprehension to extract the data in a consistent order:

user=> (for [col #{"eastenders" "thearchers" "corrinationstreet"}] 
(get {"eastenders" 6.8, "thearchers" 1.4} col "-"))
("-" 6.8 1.4)

Pulling It All Together

Taking those various strands, we end up with the following solution:

(defn as-columns 
  [records]
  (apply (partial conj #{}) (mapcat keys (map second records))))

(defn header-row 
  [records]
  (str "Date," (apply str (interpose "," (as-columns records)))))

(defn values-for-record
  [columns values]
  (for [col columns] (get values col "-")))

(defn as-row
  [record columns]
  (let [day (first record)
        values (second record)]
   (str day ","
    (apply str (interpose "," (values-for-record columns values))))))

(defn as-data
  [records]
  (apply str 
    (interpose "\n" 
      (for [record records] 
          (as-row record (as-columns records))))))

(defn as-table
  [records]
    (apply str (header-row records) "\n" (as-data records)))

Things I like with the solution: it works.

Things I don’t like with the solution:

  • Duplicated call to as-columns, in both as-data and header-row
  • Still don’t think I’ve got the indentation right
  • Still worried I’m creating functions which are too large, or that are not readable
  • This solution was arrived at by hacking on the code in a REPL – not TDD. My Clojure skills are still lacking, so I have to embark on the occasional hack-a-thon to learn some things – this being one such exercise. I plan to re-implement this using TDD with what I’ve learnt to see what I end up with. It will be interesting to see if TDDing this allays my fears about function size.
  • Lots of (apply str (interpose… duplication going around – I should factor that out (one possible sketch follows this list)
  • Not sure if the list comprehension here is needed – have I missed something obvious?
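
For what it’s worth, here is one possible shape for that refactoring – a sketch only, untested, but it pulls the repeated (apply str (interpose ...)) pattern into a helper and computes the column set just once:

(defn- join-with
  [separator xs]
  (apply str (interpose separator xs)))

(defn as-table
  [records]
  (let [columns (as-columns records)]                ; compute the columns once
    (join-with "\n"
      (cons (str "Date," (join-with "," columns))    ; header row
            (for [[day values] records]              ; one CSV row per record
              (str day "," (join-with "," (values-for-record columns values))))))))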

Updated to reflect some feedback and one example of using commons-exec as an alternative to the plain old Runtime.exec

Second Update to reflect use of shell-out – thanks Scott!

Basic

Making use of clojure.contrib.duck-streams:

(ns utils
 (:use clojure.contrib.duck-streams))

(defn execute [command]
  (let [process (.exec (Runtime/getRuntime) command)]
    (if (= 0 (.waitFor  process))
        (read-lines (.getInputStream process))
        (read-lines (.getErrorStream process)))))

...
user=> (execute "ls")
("MyProject.iml" "lib" "out" "src" "test")

It could be improved obviously – for example catching some of the potential IOExceptions that can result and rethrowing them with additional information (such as the command being executed), or adding the ability to take a seq of program arguments.

Error & Argument Handling

This version adds some basic (and ugly) exception handling, and also handles spacing out arguments passed in (so passing "ls" "-la" gets processed into "ls -la"):

(defn execute
  "Executes a command-line program, returning stdout if a zero return code, else the
  error out. Takes a list of strings which represent the command & arguments"
  [& args]
  (try
    (let [process (.exec (Runtime/getRuntime) (reduce str (interleave args (iterate str " "))))]
      (if (= 0 (.waitFor  process))
          (read-lines (.getInputStream process))
          (read-lines (.getErrorStream process))))
    (catch java.io.IOException ioe
      (throw (new RuntimeException (str "Cannot run" args) ioe)))))

Using commons-exec

I had some problems with hanging processes, so knocked up a version using Apache’s commons-exec. This version has the added advantage of killing long-running processes, and I folded in Steve’s suggestion for a better way of splicing in the spaces in the command line args (see his comment). commons-exec is part of the special sauce inside Ant, so is a rock solid way of launching command-line processes (well, as rock solid as Java gets).

The use of the ByteArrayOutputStream is probably inefficient, and again, decent error handling is left as an exercise to the reader.

;; assumes commons-exec is on the classpath, with the relevant classes imported:
;; (import '(java.io ByteArrayOutputStream)
;;         '(org.apache.commons.exec CommandLine DefaultExecutor ExecuteWatchdog PumpStreamHandler))
(defn alternative-execute
  "Executes a command-line program, returning stdout if a zero return code, else the
  error out. Takes a list of strings which represent the command & arguments"
  [& args]
  (let [output-stream (new ByteArrayOutputStream)
        error-stream (new ByteArrayOutputStream)
        stream-handler (new PumpStreamHandler output-stream error-stream)
        executor (doto
                  (new DefaultExecutor)
                  (.setExitValue 0)
                  (.setStreamHandler stream-handler)
                  (.setWatchdog (new ExecuteWatchdog 20000)))]
     (if (= 0 (.execute executor (CommandLine/parse (apply str (interpose " " args)))))
       (.toString output-stream)
       (.toString error-stream))))
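
Usage is the same as before, although note that this version hands back the whole of stdout as a single string, rather than the seq of lines returned by the read-lines based versions:

user=> (alternative-execute "ls" "-la")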

Using clojure.contrib.shell-out

Many thanks to Scott for this. clojure.contrib supplies the very neat shell-out:

user=> (use 'clojure.contrib.shell-out)
nil
user=> (sh "ls" "-la")

I haven’t probed further to see if this deals with my hanging process problem, but it certainly doesn’t seem to have any support for killing processes on a timeout. If you’re worried about runaway tasks, the commons-exec version above might be the right choice for you.

I’ve been playing around with partially applied functions in Clojure, and have hit an interesting snag when dealing with Java interop. First, let’s examine what partial does in Clojure, by cribbing an example from Stuart Halloway’s Programming Clojure:

user=> (defn add [one two] (+ one two))
#'user/add
user=> (add 1 2)
3
user=> (def increment-by-two (partial add 2))
#'user/increment-by-two
user=> (increment-by-two 5)
7

What partial is doing is partially applying the function – in our case we have applied one of the two arguments our add implementation requires, and got back another function we can call later to pass in the second argument. This example is obviously rather trivial, but partially applied functions can be very handy in a number of situations.

Anyway, this wasn’t supposed to be a discussion of partial in general, but of one problem I’ve hit when trying to partially apply a call to a Java static method. So, let’s implement our trivial add method in plain old Java:

public class Functions {
    public static int add(int first, int second) {
        return first + second;
    }
}

Then try using partial as before:

user=> (import com.xxx.yyy.Functions)
com.xxx.yyy.Functions
user=> (Functions/add 1 2)
3
user=> (def increment-by-two (partial Functions/add 1))
java.lang.Exception: Unable to find static field: add in class com.xxx.yyy.Functions (NO_SOURCE_FILE:3)
user=> 

So it seems like the partial call can’t handle static calls in this situation. But what if I wrap the call in another function?

user=> (defn java-add [arg1 arg2] (Functions/add arg1 arg2))
#'user/java-add
user=> (def increment-by-two (partial java-add 2))
#'user/increment-by-two
user=> (increment-by-two 10)
12

Which works. There is probably a reason why, but I can’t quite work it out right now.
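
If nothing else, the #() reader shorthand gives a terser way to create the wrapping function – the key point seems to be that the static method has to be wrapped in a real Clojure fn before it can be handed to partial as a value:

user=> (def increment-by-two (partial #(Functions/add %1 %2) 2))
#'user/increment-by-two
user=> (increment-by-two 10)
12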

Posted in the “I hope no-one else has to go through this” category in the hope that Google surfaces this for some other poor soul.

Picture a rather trivial split function:

(defn split [str delimiter]
  ((seq (.split str delimiter))))

Which helpfully spits out:

java.lang.ClassCastException: clojure.lang.ArraySeq cannot be cast to clojure.lang.IFn

The issue here is the additional set of parentheses – a hangover from a previous edit. Removing them fixed the trouble. The extra parentheses were causing Clojure to call the result of seq – an ArraySeq – as if it were a function, hence the failed cast to IFn…
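
For reference, the working version is simply the same function without the extra wrapping parentheses:

(defn split [str delimiter]
  (seq (.split str delimiter)))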