Sam Newman's site, a Consultant at ThoughtWorks

Archive for ‘March, 2004’

I had the pleasure of chatting to Jon Tirsén last night. Among the many topics we discussed was his Ruby-based Continuous Integration build tool “DamageControl”:. Designed as the replacement for “CruiseControl”:, DamageControl offers a much more flexible framework, and has been running the “codehaus”: builds for some time now.

“Continuous Integration(Continuous Integration by Martin Fowler and Matthew Foemmel)”: is the development practice whereby each CVS checkin results in the code being built and tested automatically. This encourages frequent checkins, gives everyone the most up-to-date code, and reduces integration issues.

After asking how development of DamageControl was going, it turns out that Jon is looking to re-implement it in Groovy. Given that Jon seems to be such a fan of Ruby (I’ll stop short of saying ‘zealot’ as I don’t know him that well) this was a surprising admission. The main reason cited was that Java-based programs have nicer deployment mechanisms than the plain text files of Ruby (or most scripting languages). Personally, I think the reason has a lot more to do with the current mind share of Groovy, and the fact that it is based on a far more popular platform, in the form of Java.

A Continuous Integration tool is (potentially) such an important part of an Agile development methodology that a well-maintained, flexible tool is very important. By basing it on the Java platform, the number of developers available to help develop and maintain the system must surely exceed the number of Ruby developers out there. Also, the fact that the majority of agile development itself is done using Java means that those companies using a Java-based DamageControl are more likely to have the expertise in-house to administer, support and extend the program, and would therefore be more willing to use it. I would also hazard a guess that a move from Ruby to Groovy, with Groovy’s Ruby-like features, would be much less work than a move from Ruby to the Java language.

We also briefly discussed the recent “Groovy JSR(JSR 241: The Groovy Programming Language)”:, and agreed that the important point wasn’t that Groovy itself was being submitted, but that another language based on the Java platform was being submitted. .NET has stolen a march on Java by basing multiple languages on its common runtime (C#, VB, J# etc). Java is just as suited to this kind of approach as .NET, and it is time we started seeing Java as a platform rather than a language. This is certainly something MS has achieved quite well (in marketing terms if nothing else) and is definitely something Sun et al can learn from. Hopefully we’ll start to see other languages being submitted as JSRs in this way – I just hope that they won’t need the backing of big name companies to achieve this (Groovy is backed by the Apache Software Foundation, BEA and ThoughtWorks).

Following up my “previous post(A brief history of ed, sed and Regular Expressions)”: on @sed@, I thought I’d post a little tutorial on using @sed@ and @grep@ to clean up logfiles. To follow the examples you’ll need @sed@ and @grep@ for your platform.

As “promised(A brief history of ed, sed and Regular Expressions)”:, I’ve found a nice script which will allow you to edit the original files using @sed@, making running @sed@ on multiple files much more useful. The script comes from O’Reilly’s excellent “UNIX Power Tools”: (which is probably the most useful book on actually using Unix/Linux/cygwin that I’ve ever read), and can be viewed on the “UNIX Power Tools Examples(, 3rd Edition – runsed)”: site.

The shell script @runsed@ was developed to make changes to a file permanently. It applies your @sedscr@ to an input file, creates a temporary file, then copies that file over the original. @runsed@ has several safety checks:

  • It won’t edit the @sed@ script file (if you accidentally include @sedscr@ on the command line).
  • It complains if you try to edit an empty file or something that isn’t a file (like a directory).
  • If the @sed@ script doesn’t produce any output, @runsed@ aborts instead of emptying your original file.

@runsed@ only modifies a file if your @sedscr@ made edits. So, the file’s timestamp (Section 8.2) won’t change if the file’s contents weren’t changed.
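To make those checks concrete, here is a minimal sketch of the same idea in Python. It is not the @runsed@ script itself (that is a shell script, and you should read the original on the book's site); @run_edit@, @strip_debug_lines@ and the parameter names are hypothetical, and the edit is passed in as a plain function rather than a @sed@ script.

```python
import os
import re
import shutil
import tempfile


def run_edit(path, edit, script_path=None):
    """Apply `edit` (a str -> str function) to the file at `path`, in place.

    Returns True if the file was rewritten, False if nothing changed.
    Hypothetical helper that mirrors the safety checks described above.
    """
    # Check 1: never edit the script file itself.
    if script_path is not None and os.path.abspath(path) == os.path.abspath(script_path):
        raise ValueError("refusing to edit the script file: " + path)
    # Check 2: complain about things that are not files, or are empty.
    if not os.path.isfile(path):
        raise ValueError("not a regular file: " + path)
    if os.path.getsize(path) == 0:
        raise ValueError("file is empty: " + path)

    with open(path, "r") as handle:
        original = handle.read()
    edited = edit(original)

    # Check 3: abort instead of emptying the original file.
    if edited == "":
        raise ValueError("edit produced no output; original left untouched")
    # No edits were made, so leave the file and its timestamp alone.
    if edited == original:
        return False

    # Write the result to a temporary file, then copy it over the original.
    fd, temp_path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(edited)
        shutil.copyfile(temp_path, path)
    finally:
        os.remove(temp_path)
    return True


def strip_debug_lines(text):
    # Example edit: drop every line that starts with DEBUG.
    return re.sub(r"^DEBUG.*\n", "", text, flags=re.MULTILINE)
```

Calling @run_edit("app.log", strip_debug_lines)@ would rewrite a (hypothetical) @app.log@ only if it actually contained DEBUG lines.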

Also of possible interest is “checksed(, 3rd Edition – checksed)”:, which, unlike @runsed@, simply provides a @diff@ so you can check the changes your @sed@ commands will make. Very handy before you start thinking about running @sed@ on several thousand files!
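The "preview before you commit" idea is easy to sketch too. The following is only an illustration in Python using the standard @difflib@ module, not the @checksed@ script; @preview_edit@ is a made-up name, and it works on a string rather than reading a file.

```python
import difflib


def preview_edit(text, edit, name="input"):
    """Return a unified diff of what `edit` would change; nothing is written."""
    edited = edit(text)
    diff = difflib.unified_diff(
        text.splitlines(keepends=True),
        edited.splitlines(keepends=True),
        fromfile=name,
        tofile=name + " (edited)",
    )
    # An empty string means the edit would change nothing.
    return "".join(diff)
```

Printing the result for a sample file is a cheap sanity check before letting an edit loose on a whole directory.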

After hearing the “good news(ongoing – Sunny Boy)”: that Tim Bray has joined Sun, I spent a little time reading over some of his old posts and came across a small piece on the “80/20 point(ongoing TPSM-12:80/20 Point)”:. Briefly put, the 80/20 rule defines a point in software development where you’ve achieved 80% of your requirements with 20% of the effort. Tim then looked at certain pieces of software which seem to solve most of what needs to be solved without going overboard (which have reached this 80/20 point and stopped) and compared them to software which went for every last piece of functionality. Here, the argument is that those pieces of software that solve most problems as simply as possible work, whilst those that attempt to be all things to all people invariably fail.

You may quibble over the exact percentages, or some of the comparisons made by Tim, but any of you that have developed sizable applications will have seen this yourself. Now I could proceed to rant about how TDD (actually any development process with short iterations) can help you quickly identify this point and make a judgement call as to whether or not to continue, but I won’t (apart from this sentence of course). I can however draw a simple analogy between software and books. Short books are by their definition shorter than long books (I get the feeling I may be dumbing down here, but anyhoo). They have less time to get you interested and get their point across before they finish. By the same token, software which has decided against this other 20% effort will have less to explain – it is quicker to learn because there is less of it to learn. You also don’t have to worry about the fact that it has to cater for the extra 20%. Likewise you don’t have to turn as many pages in a short book as in a long book. I kind of lose my point right around here, but I’m sure I had lots more to say when I was thinking about this during my sushi. A simile too far perhaps…

The issue of course is whether or not this extra 20% functionality results in a more complex piece of software for the other 80% of the functionality. If the answer is yes, you could make a strong argument in favour of not doing it, or even look at refactoring the application. You can, I think, always argue in favour of sushi.

Well, to show you how far behind the times I actually am, I finally got my “orkut”: invite. I’ve never been that interested in this whole ‘social networking’ thing, but after a few minutes of playing around with it I realised that all Orkut is doing is providing a framework for communities that already exist. Many of us are members of mailing lists and forums, and have people we know socially or just via the web. Theoretically a system like Orkut (and it’s by no means the only social networking system out there) can replace many of these.
I did have a few other thoughts about Orkut (and the other social networks out there):

* The market research information gathered is fantastically valuable. The value of market research information increases with its specificity – the more specific the information gathered, the more people want it. Orkut encourages you to answer as many questions as possible, as this helps you describe your persona via the system – whether to present yourself more accurately to people you already know, or to meet new people. If Orkut remains a free service, don’t expect it to be too long before you start seeing highly targeted adverts.
* Something like this could form the basis for an anti-spam solution. My filter could quite easily say “I don’t accept mail from someone I don’t know via Orkut”.
* In line with the above, stick a decent web services API on it and integrate it with your PIM software – suddenly your friends become your email contacts, community events get loaded onto your calendar…
* In line with social bookmark managers like “del.icio.us”: or social RSS aggregators like “Bloglines”:, I could store my feeds and bookmarks in Orkut – these then become resources for my friends.
* Use it to aggregate my existing blog, thereby displaying what I’m currently up to on my Orkut page.

“Rocketinfo(Rocketinfo – A Free Personal Web-based RSS News Reader)”: is a very nice looking online aggregator in a similar vein to “Bloglines”:. It certainly looks very polished, but does lack a few features, which precludes me from considering it over Bloglines:

* No ability to import/export OPML files (this site’s blogroll comes from my bloglines account)
* No desktop notifier
* Cannot save items for later
* No email notification

A “post(Test Driven Design vs. Design by Contract)”: by Doug over at Creative Karma takes issue with the fact that I consider Test-Driven Development to be “quite similar(Test-Driven Development vs Design By Contract)”: to Design by Contract. After reading over the piece, I have to agree that Doug is correct. What I meant to say is that you can almost consider TDD to be a subset (or variation of a subset) of Design by Contract.

What do I mean by that? Well, in TDD you start off with a goal. This goal tends to be fairly specific, but of course comes from a larger idea of what the system should do (most TDD texts seem to skip the part where these specific tasks actually come from, as I’ll do here). This is typically something like (to borrow Kent Beck’s example) “I need to be able to add 5 US dollars to 10 Euros and get a result of 20 Euros if the dollar to euro exchange rate is 1:2”. The developer then breaks this down into solvable tests, which lead to the code being implemented. For each little unit we are defining the tests that let us know the code is working – with Design by Contract you can define pre- and post-conditions (this needs to be true before, this needs to be true afterwards). At this level the similarity between the two is clear. However, Design by Contract goes far beyond this small scope, to define class invariants, loop invariants etc. I’m not saying one is better than the other, but I do feel they share some of the same benefits.
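Here is a rough sketch of how the two look side by side for that currency example. It is in Python purely for brevity, with plain @assert@ statements standing in for real contract support (Eiffel has this built into the language); the names @add_in_euros@ and @RATES_TO_EUR@ are my own invention, and the rate is just the 1:2 from the example.

```python
# Assumed rate from the example above: 1 US dollar buys 2 euros.
RATES_TO_EUR = {"EUR": 1, "USD": 2}


def add_in_euros(amount_a, currency_a, amount_b, currency_b):
    """Add two amounts and return the total in euros."""
    # Preconditions: what must be true before the call, for every call.
    assert currency_a in RATES_TO_EUR and currency_b in RATES_TO_EUR, "unknown currency"
    assert amount_a >= 0 and amount_b >= 0, "amounts must be non-negative"

    euros_a = amount_a * RATES_TO_EUR[currency_a]
    euros_b = amount_b * RATES_TO_EUR[currency_b]
    total = euros_a + euros_b

    # Postcondition: what must be true afterwards, for every call.
    assert total >= euros_a and total >= euros_b
    return total


def test_five_dollars_plus_ten_euros():
    # The TDD-style test: one specific, concrete case.
    assert add_in_euros(5, "USD", 10, "EUR") == 20
```

The test pins down one case; the assertions inside the function are checked on every call that is actually made, whatever the arguments.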

Doug finishes up thus:

Not only are unit tests black box, they also only test certain cases. A contract specifies the required behavior for every case, and any assertion associated with that contract verifies that the behavior is correct for every case that is encountered. That is what Bertrand Meyer was referring to when he said, “A test checks one case. A contract describes the abstract specification for all cases.”

I agree completely – and this is the point that Bertrand Meyer was making when he talked about systematic testing:

It was shown many years ago through a very simple argument that there’s no such thing as an exhaustive test, a test that exercises all possible cases. So we know we can’t have an exhaustive test, but we can have systematic tests that have a likelihood of exercising the cases that will fail. For example, if you have a parameter that must be between certain bounds, then you want to test the values close to bounds of the range. You want to test maybe the value in the middle, and maybe a few in-between. So we want to have tests that are systematic in that sense, and contracts help a lot generating such “systematic” tests.
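As a small illustration of what "systematic" might mean for a bounded parameter, this sketch picks the values at and around the bounds plus one in the middle. @boundary_values@ and @in_range@ are hypothetical names of mine, not anything from the interview, and the sketch assumes an integer range with at least three values in it.

```python
def boundary_values(low, high):
    """Pick test inputs for an integer parameter that must lie in [low, high]."""
    middle = (low + high) // 2
    # Values that should be accepted: both bounds, their neighbours, the middle.
    inside = [low, low + 1, middle, high - 1, high]
    # Values that should be rejected: just outside each bound.
    outside = [low - 1, high + 1]
    return inside, outside


def in_range(value, low, high):
    # Stand-in for the code under test.
    return low <= value <= high


def test_in_range_systematically():
    inside, outside = boundary_values(1, 100)
    assert all(in_range(value, 1, 100) for value in inside)
    assert not any(in_range(value, 1, 100) for value in outside)
```

The bounds themselves come straight from the contract, which is exactly why a contract helps in generating tests like these.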

Unit tests are not the same as a contract; they merely attempt to prove that the code satisfies the required contract. The better the tests, and the more systematic they are, the more you have proved the contract fulfilled and that the code is of decent quality. One single test is not always enough for you to strike that task off your todo list. No matter what your position is on this particular topic, I think we can all agree on these two points that Doug makes in summation, which I’ll reiterate here so I don’t forget them:

  • Testing cannot show the absence of defects, it can only show that defects are present.
  • We cannot test quality into a product, we have to design it in.

In Artima’s “third part(Contract-Driven Development – A Conversation with Bertrand Meyer, Part III)”: of their interview with Bertrand Meyer, some interesting comparisons regarding Design by Contract and Test-Driven Development are made. Bertrand Meyer makes the point that the small units of work that get undertaken in a TDD cycle (write the test, run the test, refactor) are very specific tasks that are nothing more than testing/implementing parts of a wider contract. When I use TDD, I do in fact tend to write at the top of my todo list the overall task I’m trying to achieve, which I’ve realised is the contract itself. I do not think there is much difference between the TDD and Contract-Driven approaches; certainly they try to achieve the same goal, which in Bertrand’s own words is to “build software in which reliability is built in rather than achieved after the fact through debugging”, or as Kent Beck would say (to paraphrase) “Ask the computer if it works, don’t try and work it out for yourself”.
The differences between the two are simply down to scope. The scope of a contract can be as small as a single test for a trivial example – or it can be as large as several. Both approaches still require that the larger problem be broken down and solved, and be proven to work. It seems perhaps that yet again we have another virtually identical solution for the same problem.

I’ve been away for a while (due to a debauched weekend and an illness whose effect I doubtless exaggerated), so I thought a brief roundup was in order:

* Matt Raible has posted an “interesting piece([DisplayTag] Changing a row’s CSS class based on values in the row. )”: on getting the “Display Tag Library”: to style rows based on table values.
* Simon’s posted an “overview of JRules(JRules – A brief overview)”: on the JRules business engine. It doesn’t seem as flexible as Drools, but is interesting due to its support for the ILog Rule Language (support for which could doubtless be added to Drools as a semantic module). I did find a helpful pointer to Ronald G. Ross’ “Principles of the Business Rule Approach”: which has gone on my wish list.
* I discovered “Nagios”:, which appears to do a lot of what “Big Brother”: does, only for free. Too bad that in my new job it’s unlikely I’ll be doing any system admin stuff like I am now.
* Keith Donald has “checked in”: the latest version of his and Seth Ladd’s validation framework into the Spring CVS sandbox. I’m in the process of integrating this with my command-framework.
* Daido Metal has announced a “toy car”: that uses a water-powered fuel cell to generate hydrogen. I really need to read up on this kind of stuff…
* David Miller’s “Zebra Tables”: for “A List Apart”: shows you how to style your tables nicely using some CSS and JavaScript.
* I became even more of an IoC Zealot, and as such started “defending the constructor(Daniel Bonniot’s Weblog – Are constructors useless?)”:.
* I read a book about “losing a penguin(Penguin Lost )”: and a “bigger book(Paul Auster – collected prose)”: arrived.
* I also realised my books to read stack was growing rather than shrinking, and that I had less money than I thought. I think these two facts may be related.
* I discovered a lovely little shop on the “Portobello Road”: that sells very nice cupcakes.