The site of Sam Newman, a Consultant at ThoughtWorks

Archive for ‘January, 2007’

When using a Continuous Integration build, before long you’ll break it. Breaking a build is not a bad thing; however, fixing it is typically the team’s top priority. Beyond the shame associated with being named as the breaker, you then have the hassle of lots of people informing you that you’ve broken it.

As a way of letting the development team know that:

  • You know the build is broken
  • You are fixing it

a simple broadcast mechanism can be highly useful.


Whilst I have seen high-tech solutions being recommended, the most effective example of a Build Fix Flag I’ve seen is simply using a giant paper flag. It was about 2 1/2 feet in height, and could be clearly seen above monitors. When a build failure was seen, a quick glance across the floor would indicate if someone was working on it.

What was nice was that before long, the same mechanism was being used for notifications about a number of development environments.

Rules for the build fix flag

When using such a flag, we quickly decided on a set of rules as to how to use it:

  1. If you saw a CI build breakage, you looked for the flag
  2. If someone had the flag, you left them alone
  3. If you couldn’t see the flag, you tried to identify the person who made the last check-in
  4. If you couldn’t find a likely culprit, you raised the flag and fixed it yourself

Listed below, not necessarily in any order, are the main reasons for the lack of activity around both this blog and London 2.0:


With XTech, OSCon, Agile 2007 and XP 2007 I’ve been busy preparing submissions, and I’ll start hearing back from them soon. Rather than continue with my Lego XP Game dog and pony show, I’ve submitted presentations on dbdeploy and Buildix, and will hopefully be helping out on a workshop with some colleagues. More information when I get the rejection letters.


Well, it was nice – as were the many mammoth Medieval Total War 2 and World of Warcraft sessions.


I spent a lot of time writing proposals (which I enjoyed), meeting new clients, and playing with Python & Django. I’m really impressed with both Django (and the very good Django book) and the mature tool set for Python development as a whole. More soon, perhaps.


Yeah, well… I had stuff on, you know? And series two of Deadwood to watch.

Expect a bit more traffic here, and a London 2.0 meeting for the end of Feb.

When running a suite of tests – either as part of a Continuous Integration build, or part of a check-in gate – speed is the enemy. You are always trying to find the balance between test coverage and time to complete the entire suite.

A check-in gate and a continuous integration build have slightly different time pressures. The optimal duration for each will probably vary from team to team. A fish-eye suite is one way of deciding which tests should be included.

In a fish-eye suite, you focus on those tests which provide the best coverage for those areas of functionality currently being developed, whilst those areas not directly being affected have only minimal coverage. The logic goes that tests are there to pick up bugs, and therefore tests which cover the functionality currently being worked on are the most important.

Regularly changing the focus

In an iterative development process, working out when to change the focus of the suite can be easy. At the beginning of each iteration, the team as a whole decides which tests (or groups of tests) should be run in each suite. During a development process where there is less segmentation, changing the suite on an ongoing basis may be more sensible.

Reactive test suite

Rather than changing the focus of the suite at the start of each iteration, you can also decide to adapt the contents of the suite in certain situations – for example when the total time for the suite to run exceeds an agreed limit, or the number of bugs reported against a functional area increases.

For example, the team may decide that the check-in gate build should take 30 seconds, and the CI build should take no more than five minutes. The moment they take longer, the team as a whole should redefine what tests should be included. When removing tests, the team should look to remove those tests which cover areas of functionality furthest away from those currently being worked on.
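The trimming rule described above can be sketched in a few lines. The test names, timings, and "distance from the current focus" scores here are all hypothetical – real figures would come from your build reports:

```python
# A minimal sketch of trimming a suite back under an agreed time budget,
# dropping tests furthest from the current area of focus first.

AGREED_LIMIT = 30.0  # seconds - e.g. the agreed check-in gate budget

# (test name, last recorded duration in seconds, distance from the current
# focus area; 0 = covers functionality being actively worked on)
tests = [
    ("test_checkout_flow", 12.0, 0),
    ("test_search_results", 9.0, 1),
    ("test_admin_reports", 11.0, 3),
    ("test_legacy_export", 8.0, 4),
]

def trim_suite(tests, limit):
    """Drop the tests furthest from the current focus until the suite fits."""
    kept = sorted(tests, key=lambda t: t[2])  # closest to the focus first
    while sum(t[1] for t in kept) > limit:
        kept.pop()  # remove the test furthest from the focus area
    return kept

suite = trim_suite(tests, AGREED_LIMIT)
```

With the figures above, the two tests furthest from the focus are dropped, bringing the suite back under the thirty-second budget.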

Likewise, if the number of bugs being raised against a certain functional area increases, the development team may decide that increasing the test coverage of that area makes sense. Again, the team may have to remove some tests in order to stay within the optimal build time, in which case tests associated with areas of functionality with low defect rates make good candidates for removal.

Source-Code Management

Being able to create and manage a fish-eye suite presupposes that you group your tests into functional areas. This can be an issue with typical packaging structures, which are defined in horizontal terms, whereas long-running functional tests should cover a vertical slice of functionality.

Even if initially you aren’t going to use fish-eye suites, grouping your tests into functional (vertical) groupings rather than system (horizontal) groupings makes sense, as it allows you to easily adopt this approach later. It also tends to make more sense to testers, who see things in terms of usable functions rather than horizontal tiers.
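Given that vertical grouping, assembling a fish-eye suite is straightforward. The area names below, and the idea of dialling every out-of-focus area down to a single smoke test, are illustrative assumptions rather than a prescribed structure:

```python
# Tests grouped by vertical slice of functionality, not by horizontal tier.
tests_by_area = {
    "ordering":  ["test_place_order", "test_cancel_order", "test_order_history"],
    "search":    ["test_basic_search", "test_filtered_search"],
    "reporting": ["test_monthly_report", "test_export_csv"],
}

def fish_eye_suite(tests_by_area, focus_areas, minimal=1):
    """Full coverage for areas in focus; token smoke coverage for the rest."""
    suite = []
    for area, tests in tests_by_area.items():
        if area in focus_areas:
            suite.extend(tests)            # everything for the current focus
        else:
            suite.extend(tests[:minimal])  # minimal coverage elsewhere
    return suite

# This iteration the team is working on ordering
suite = fish_eye_suite(tests_by_area, focus_areas={"ordering"})
```

At the start of the next iteration, changing the focus is just a matter of passing a different set of focus areas.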

It’s common practice within a team to define a set of tasks which should be run by each developer prior to checking in. The purpose of the Check-in Gate is to attempt to ensure that all code satisfies some basic level of quality.

Like all development standards, a check-in gate helps a team stay on the same page. It eases integration, and can help give confidence as to the quality of the code checked in.

Check-in Gate with Continuous Integration

A check-in gate is typically used prior to checking in to a system which uses Continuous Integration – where the tasks run during the check-in gate are the same tasks used as part of the continuous integration build. The main source of embarrassment associated with breaking a CI build tends to come down to the fact that the shamed individual completely forgot to run their check-in gate build.

The importance of speed

Check-in Gates need to be fast. The longer they take to run, the less willing developers will be to run them – this results either in less frequent check-ins or in developers not running them at all. Fewer check-ins result in more complex (and more error-prone) integrations. Not running the check-in gate at all can result in a breakdown of code quality, and can be a slippery slope to the gate being abandoned altogether.


The simplest example of a check-in gate would probably be ensuring that the code compiles prior to check-in. More often, the team will decide to run some or all of a test suite. Again, the constraining factor as to what you’ll want to run as part of a check-in gate is typically time – how and what to run should always be defined by the team.
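A check-in gate can be as simple as a script the team agrees to run before every check-in. The commands below (a compile step and a fast "gate" subset of the tests) are placeholders for whatever your team decides on:

```python
# A minimal check-in gate sketch: run each agreed step in order, and
# refuse the check-in the moment one of them fails.
import subprocess
import sys

GATE_STEPS = [
    ["python", "-m", "compileall", "-q", "src"],                   # does it compile?
    ["python", "-m", "unittest", "discover", "-s", "tests/gate"],  # fast test subset
]

def run_gate(steps):
    """Run each gate step; return False as soon as one fails."""
    for step in steps:
        result = subprocess.run(step)
        if result.returncode != 0:
            print("Check-in gate FAILED at:", " ".join(step))
            return False
    print("Check-in gate passed - safe to check in.")
    return True

# Typical usage: run_gate(GATE_STEPS) just before checking in.
```

The important thing is not the script itself but that the steps are the same ones the CI build runs, so passing the gate locally means the CI build should pass too.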

Selenium is a very good in-browser testing tool. It has bindings for many different languages (including Java, Python, Perl, C# and PHP). With it you can create a suite of tests which can be run in multiple browsers (including IE, Opera, Firefox and Safari).

The ability to run tests inside a browser is a huge boon to those of us who have to worry about cross-browser compatible websites. By having an automated test suite (and having it run regularly, perhaps using Continuous Integration) you can automatically run a set of tests repeatedly on a number of platforms, on a number of browsers, whenever code changes. Whilst this doesn’t do away with the need for normal exploratory testing, nor is it always possible (or sensible) to automate all tests, this can dramatically reduce the QA time required prior to a go-live.

Selenium is a very good tool, designed to be much simpler to use than similar tools (such as Mercury’s products). It is a very good tool that you probably don’t need all that much.

Selenium Slowdown

Selenium’s strength – that it tests web applications in the browser – is also its weakness. Testing in browsers is slow. Not only do you have the overhead of starting and marshalling an external process (the browser), but the fact that the tests have to be rendered on screen means that a sizeable Selenium test suite can take an awfully long time to run.

There are techniques which can be used to handle long-running test suites (more later, perhaps) but I suspect that most of you don’t need to worry about them.

Testing the DOM

Think about what it is you want to test in your web application. You need to simulate some user activity (clicking a link, entering text) and test that some result is displayed to the user. Selenium is as good as most things out there at doing that – but as we’ve already said, it’s slow. What is the alternative?


Figure 1: An overview of a browser driver

Well, what is it we are really testing here? Let’s start with the user input. For the most part (I’m excluding AJAX interaction here – more later), when a user interacts with a web page, they end up creating an HTTP request to the server. Your server acts on that request and returns some HTML, which the browser converts into a Document Object Model (DOM), which in turn gets rendered to the user.

So when we want to check what is displayed to the user, what is it we are actually doing? Our testing tools don’t look at the screen rendering – all they need to do is carry out assertions on the DOM itself.

So to test most web applications, we need to create an HTTP request and perform assertions on the DOM. And Selenium certainly isn’t the fastest way of doing that.

Faking the browser

The reason that Selenium is slow is the browser. We are using Selenium to drive the browser, which in turn submits a request for us. The browser then handles the response, creates (or manipulates) the DOM, and renders the response. Why not simply remove the browser altogether?

Tools like HTTPUnit (for Java) or Twill (for Python) let you do just that. With them, you can create a request, submit it directly to the server, handle the response and interrogate the DOM. HTTPUnit and Twill are effectively emulating the browser’s ability to create a DOM from a server response.

Figure 2: An overview of browser-emulation testing

Test suites using browser emulation tools like this will be an order of magnitude faster than similar Selenium test suites.

No place for Selenium?

There is certainly a place for in-browser testing. In our overview of browser testing above, we implied that the DOM for any given page is created entirely as a result of a response from the server, but the world isn’t that simple.

Using JavaScript, web developers have long been able to manipulate the DOM on the client side with no interaction with the server. Selenium (and similar tools) remain very useful for testing these kinds of situations – however, for most of us there will be much less need for these (slower) tests.


The tools available for browser testing have come on in leaps and bounds in recent years. There is a place for browser drivers (like Selenium or Sahi) and for suites based on browser emulation techniques (such as HTTPUnit or Twill). Knowing which to use, and when, can result in significant time savings when running your test suites.

And no apologies:

  • Rice Pudding & Red Wine
  • The books of Terry Pratchett
  • Cheesecake
  • The Films of Adam Sandler
  • The film Hackers