magpiebrain

Sam Newman's site, a Consultant at ThoughtWorks

Posts from the ‘Development’ category

I’m currently working on a personal project by way of learning Clojure – it’s actually a program to match up my itemised phone bill against my list of contacts to help me expense my calls. I find it best to have a real-world problem I need to solve to learn a new programming language. The problem itself is rather dull, but it did give me a chance to consider an issue I’ve hit with many other languages.

One of the core parts of my telephone expense program is the process of normalising phone numbers so I can match them up. What I am trying to do is something long the lines of:

Strip spaces, then add the missing area code, then internationalize it

So in Clojure there are a number of functions I’ve written, each of which take, and return, a string (the program is nowhere near finished, so consider this to be virtually pseudo code) :

(defn #^String normalize [str]
  (internationalize (add-missing-areacode (strip-spaces str))))

In Java, this would look like:

public String normalize(String str) {
  return internationalize(addMissingAreaCode(stripSpaces(str)));
}

The problem is that I, and most of the western world, read from left to right – with both Java and Clojure I’m having to read from right to left to determine what is being done. One system I use frequently has a construct which matches what I’m after – UNIX:

strip-spaces "44 1230 9183" | add-area-code | internationalize

So what other languages support this kind of construct? I suspect I could coax Scala into doing something like this, and it seems that it is right up Python’s alley (Django’s excellent templating system has filters which do exactly that). But if I want to use Clojure, am I stuck with this inside-out programming model? What other JVM-based languages would help me here – Ioke perhaps? It seems right up AINC’s alley, but that syntax makes me want to cry…

Update 11 Jan 2010: Thanks to Matt for pointing me towards Clojure’s ‘->‘ macro. This looks pretty close to what I’m after. So I *think* I should be able to do something along the lines of:

(-> phoneNumber stripSpaces addAreaCode internationalize)

Which is very cool.

Recently, both Paul Julius and Chris Read pointed out that I was perhaps the first person to document the concept of build pipelines, at least in terms of how it relates to continuous integration and the like. As it turns out, the original posts on the subject are from further back than I remember:

I plan to pull together my previous posts on the subject and update them a little, but in the meantime thought I’d give a bit of background as to where much of this came from.

A Harsh Introduction

My first exposure to continuous integration was by being dropped in at the deep end during my first ThoughtWorks project. The project in question was for an electronic point of sale system, and at its peak had over 50 developers in three countries working on the project. During this time I started reading up on the topic, specifically Martin Fowler & Matt Foemmel’s paper on the topic (Martin has since created an updated version).

Much of the experiences at this first, large project were dominated by long, slow build times, caused in part by an inability to separate out activities being performed by individual teams. A full discussion as to things we learnt from that project can certainly wait for another time, but I came out of that experience liking the concept of Continuous Integration, but feeling incredibly constrained by the actual implementation.

Monkeying Around

Subsequently, I worked as what we used to call a ‘Build Monkey’ at a London-based ISP. My role (which we now tend to call an Environment Specialist) was typically to identify the causes of build failure, keep the build running smoothly, as well as manage deployments to a number of different environments. Throughout this time, discussions around the theory behind managing Continuous builds for larger software teams was continuing – primarily with colleagues like Julian Simpson, Jack Bolles & others.

The challenge we seemed to face, time and again, was how you balance the various activities associated with getting software from developers machines into production, all whilst providing the fastest feedback possible.

Typically, we came at the problem from two different directions – in the first instance from the point of view of how to hammer our tools into supporting the kind of processes involved, but the more important angle was understanding what the pipeline – from developer workstation to production – actually was. This thinking can now be best thought of in terms of Continuous Deployment – although that topic is far more nuanced that the often simplistic thinking regarding systems where 50 deployments a day is possible, or even desirable.

The Present Day

Since I wrote my original articles, many other people have done work in this area, to the extent that tools like ThoughtWork’s own Cruise builds support for build pipelining & visualisation directly into the tool.

Update 1: Corrected spelling of Paul’s surname – sorry Paul!

When using a Continuous Integration build, before long you’ll break it. Breaking a build is not a bad thing, however it is typically the team’s top priority to have such a build fixed. Beyond the shame associated with having been named as the breaker, you then have the hassle of lots of people informing you you’ve broken it.

As a way of letting the development team know that:

  • You know the build is broken
  • You are fixing it

A simple broadcast mechanism can be highly useful.

Example

Whilst I have seen high-tech solutions being recommended, the most effective example of a Build Fix Flag I’ve seen is simply using a giant paper flag. It was about 2 1/2 feet in height, and could be clearly seen above monitors. When a build failure was seen, a quick glance across the floor would indicate if someone was working on it.

What was nice was that before long, the same mechanism was used for notification of a number of development environments

Rules for the build fix flag

When using such a flag, we quickly decided on a set of rules as to how to use it:

  1. If you saw a CI build breakage, you looked for the flag
  2. If someone had the flag, you left them alone
  3. If you couldn’t see the flag, you tried to identify the person who made the last check in
  4. If you couldn’t find a likely culprit, you raised the flag and fixed it yourself

When running a suite of tests – either as part of a Continuous Integration build, or part of a check-in gate – speed is the enemy. You are always trying to find the balance between test coverage and time to complete the entire suite.

Both a check-in gate and continuous integration build have slightly different time pressures. The optimal duration for both will probably vary from team to team. A fish-eye suite is one way of formulating which tests should be included.

In a fish-eye suite, you focus on those tests which provide the best coverage for those areas of functionality currently being developed, whilst those areas not directly being affected have only minimal coverage. The logic goes that tests are there to pick up bugs, and therefore tests which cover the functionality currently being worked on are the most important.

Regularly changing the focus

In an iterative development process, working out where to change the focus of the suite can be easy. At the beginning of each iteration, the team as a whole decide which tests (or group of tests) should be run in each suite. During a development process where there is less segmentation, changing the suite on an ongoing basis may be more sensible

Reactive test suite

Rather than changing the focus of the suite at the start of each iteration, you can also decide to adapt the contents of the suite in certain situations – for example when the total time for the suite to run exceeds an agreed limit, or the number of bugs reported against a functional area increases.

For example, the team may decide that the check-in gate build should take 30 seconds, and the CI build should take no more than five minutes. The moment they take longer, the team as a whole should redefine what tests should be included. When removing tests, the team should look to remove those tests which cover areas of functionality furthest away from those currently being worked on.

Likewise if the number of bugs being raised against a certain functional area increase, the development team may decide that increasing the test coverage of a certain area makes sense. Again, the team may have to remove some tests in order to still be within the optimal build time. In which case, tests associated with areas of functionality with low defect rates make a good candidate for removal.

Source-Code Management

Being able to create and manage a fish-eye suite presupposes that you group your tests into functional areas. This can be an issue with typically packaging structures which are defined in horizontal terms, whereas the long running functional tests should cover a vertical slice of functionality.

Even if initially you aren’t going to use fish-eye suites, grouping your tests into functional (vertical) groupings rather than system (horizontal) groupings makes sense as it allows to to easily use this approach later. It also tends to make more sense to testers who see things in terms of usable functions rather than horizontal tiers.

It’s common practise within a team to define a set of tasks which should be run by each developer prior to checking in. The purpose of the Check-in Gate is to attempt to ensure that all code satisfies some basic level of quality.

Like all development standards, a check-in gate helps a team stay on the same page. It eases integration, and can help give confidence as to the quality of the code checked in.

Check-in Gate with Continuous Integration

A check-in gate is typically used prior to checking in to a system which uses Continuous Integration – where the tasks in run during the check-in gate are the same tasks used as part of the continuous integration build. The main source of embarrassment associated with breaking a CI build tends to come down to the fact that the shamed individual completely forgot to run their check-in gate build.

The importance of speed

Check-in Gates need to be fast. The longer they take to run, the less developers will want to run them – this either results in less frequent check-ins or in developers not running them at all. Fewer check-ins result in more complex (and more error prone) integrations. Not running the check-in gate at all can result in a breakdown of code quality and can be a slippery slope to the gate being abandoned altogether.

Examples

The simplest example of a check-in gate would probably be ensuring that the code compiles prior to check-in. More often, the team will decide to run either some or all of a test suite. Again the constraining factor as to what you’ll want to run as part of a check-in gate is typically time – deciding how and what to run should always be defined by the team.

This guide assumes you’ve already installed Eclipse, PyDev, Python and Django. It also assumes you’re using Eclipse 3.2, PyDev 1.2.4, Django 0.95 and Python 2.4.

* Go to Window->Preferences->Preferences->PyDev->Python Interpretter and add the django source file to the PYTHONPATH settings.
* Create a new PyDev pyhon project. Make sure you uncheck the ‘create src folder’ option.
* Create project on the command line using django e.g. django-admin.py startproject mysite
* In your newly created project directory create a src directory in it, and move the django generated source files here
* In eclipse, right-click your project and select refresh
* Right-click on the project and select Properties->PyDev - PYTHONPATH, and add your src folder to the project source settings

That should be it. I still get red underlines on the Django source imports even thought PyDev seems to know about them – to test this is working properly, open up your urls.py file and ctrl click on the patterns call – it should take you to defaults.py.

Now you can go ahead and create your database & super user.

Launching to built-in server

Open up manage.py and hit F9. This should print out the usage information for the server. To actually start the server, select Run->Run..., and in the Arguments tab for manage.py enter runserver --noreload. The noreload argument gives you output.

Thanks go to PyDev creator Fabio Zadrozny for his guide which got me going.

I’ll be presenting a session entitled “Refactoring Databases for fun and profit” with a colleague at this year’s “XP Day 2006”:http://www.xpday.org/. XP Day (not just XP, not just one day…) is run by the long-established (and Olde Bank of England regulars) the “Extreme Tuesday Club”:http://www.xpdeveloper.com/. I attended last year and can highly recommend it to both beginners and seasoned practitioners alike. The conference is very vendor-lite – it is run by practitioners, for practitioners.

Our session will be feeding back experiences of using some specific database refactoring techniques, and will also be presenting a new open source tool for managing database change.

Overview

I’ve had a chance over the last few years to observe various different types of tech leads. Collated here are my views on what I think makes a successful one.

A Tech Lead Should…

  • Ensure the creation of a clear and consistent technical vision for the project which can best result in a successful project
  • Ensure all members of the team have a proper understanding of the technical vision
  • Ensure that the technical vision updates to reflect new requirements
  • Track and resolve issues where the code deviates from the technical vision
  • Create an environment in which all members of the team can contribute towards the technical vision
  • Understand and address skills gaps in the team which would result in difficulties implementing the technical vision

A Tech Lead Should Not…

  • Tell everyone what to do
  • Necessarily be the best at everything
  • Write no code
  • Write all the hard code
10 Comments

I just got this from an old colleague:

Despite all of our many successes, our architect has started making noises to the product management team that pair programming is a waste of money.

Why should he let mountains of academic research and our own project metrics get in the way of his own personal opinions!?!?!

Sometimes I just want to pack up and go home

Now first off, when people start talking in favour of Pair Programming, I tell them two things:

# From what I’ve seen, it’s the most efficient way to transfer skills, reduce “truck factor”:http://www.agileadvice.com/archives/2005/05/truck_factor.html, ensure quality and (say it quietly) stop slacking. It’s also damn hard to get some companies to try it for a number of reasons.
# Read “this Hacknot article(Hacknot – A Critical Look At Some Pair Programming Research)”:http://www.hacknot.info/hacknot/action/showEntry?eid=50 before you start quoting studies.

I don’t happen to agree with the editorial bais of the Hacknot site – my own (and my company’s own) experience has shown that we produce better software more efficiently when pairing. That said the study quoted is clearly flawed for the reasons mentioned. There have been more studies since – one day I should really get round to pulling them all together. The major problem in producing such a study is that I don’t know anyone who would spend the money creating a real world experiment over a long enough period of time to produce conclusive results.

I am however fairly sure that “pairing with a cat”:http://www.hacknot.info/hacknot/action/showEntry?eid=21 yields no benefit over pairing with an alternate programmer.