Java | magpiebrain

It’s been a day for revisiting previous topics, so I thought I’d readdress some of the “troubles(Strange Java regexp behaviour – grouping)”:http://www.magpiebrain.com/archives/000219.html I was having with regular expressions last week. To recap, I was writing an expression to grab the serial number from the following string:

|          Serial          |
|        1234567890        |

The regexp I was using didn’t seem to work – @Serial(?s).*([0-9]+)@ should have captured the serial number into a group, but was only capturing the last digit. Many commenters pointed out that the reason for this is that @.*@ is a greedy operator (“Doug’s”:http://www.magpiebrain.com/archives/000219.html#comment419 comment should especially be noted – rarely have I seen such effort put into a blog comment!).

Simply put, a greedy operator matches as many characters as possible – when this stuff was new to me I used to think of a greedy operator as a little Pacman, chomping his way through my string, waiting till the last possible moment before letting the next operator get a look in. In this instance, the @.*@ ate everything until it left just enough text for the @[0-9]+@ to match – which was just the last digit of the serial number. As you would expect, to balance greedy operators, you have lazy (or reluctant) operators. To further abuse a metaphor, I think of a lazy operator as a very full Pacman, who is looking for any excuse to go off for a nap. The lazy form of the @*@ operator in Java is @*?@ – in my case this operator gives up when it sees the first digit, letting the @[0-9]+@ take over. So let’s look at my fixed code:


String input = "|          Serial          |\n|         1234567890        |";
Pattern p = Pattern.compile("Serial(?s).*?([0-9]+)");
Matcher m = p.matcher(input);

while(m.find()) {
  System.out.println("Found match: " + m.group());
  System.out.println("Found serial number: " + m.group(1));
}
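To see the two quantifiers side by side, here is a small self-contained sketch that runs the greedy and the reluctant pattern against the same input. The class name @GreedyVsReluctant@ and the helper @firstGroup@ are made up for illustration; only @Pattern@ and @Matcher@ come from @java.util.regex@.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Returns the text captured by group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "|          Serial          |\n|        1234567890        |";

        // Greedy: .* swallows as much as it can, leaving one digit for the group.
        System.out.println("greedy:    " + firstGroup("Serial(?s).*([0-9]+)", input));   // prints 0

        // Reluctant: .*? stops at the first digit, so the group gets the whole number.
        System.out.println("reluctant: " + firstGroup("Serial(?s).*?([0-9]+)", input));  // prints 1234567890
    }
}
```

The only difference between the two patterns is the extra @?@ after @.*@.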

This particular mistake was quite embarrassing. I’ve always prided myself on my regexp knowledge and to make such a bonehead mistake (not to mention exhibit at least one fundamental misunderstanding about the whole thing) has gone some way to puncture my ego, which I guess is no bad thing… The moral of the story? Reach for the manual before reaching for the blog – you might still make mistakes, but at least you’re making them in private that way!

April 12, 2004

Matt Raible has encountered a performance issue with his JUnit tests, and is “advocating the use of static data(Make your JUnit Tests run faster when using Spring)”:http://raibledesigns.com/page/rd?anchor=make_your_junit_tests_run to construct his Spring @ApplicationContext@. I think in this specific instance he might be justified in his choice (I’m nothing if not pragmatic) but I am always extremely cautious about allowing the use of a shared environment between tests.

Let’s look at the crux of Matt’s problem – by the book you should define the environment for a test within @setUp()@, and should clear it up in @tearDown()@. @setUp()@ is called before each test method, and @tearDown()@ after it, in order to isolate each test from the others. In Matt’s case, the @setUp()@ call is quite slow – the creation of his @ApplicationContext@ involves file IO and XML parsing, not the fastest of processes. So why is @setUp()@ called prior to each test? Let’s look at a very simple example:



public class ExampleTest extends TestCase {
  private SharedObject sharedObject;

  public void setUp() {
    sharedObject = new SharedObject();
  }

  public void testOne() {
    //run some test using sharedObject
  }

  public void testTwo() {
    //run some test using sharedObject
  }
}

Imagine if @setUp()@ is only called once for the whole @TestCase@ – what if @testOne@ or @testTwo@ change the state of @sharedObject@? If @testOne@ is run first, then @testTwo@ will be reliant on the state of @sharedObject@ as manipulated by @testOne@ – can you be sure @testTwo@ will still work if you run it first? Many of us run our JUnit tests directly in our IDE, and we have no control over the order in which tests are run. By sharing potentially mutable data between tests we are exposing ourselves to the possibility of variable results based on the order in which tests are run.
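The order dependence is easy to reproduce without JUnit at all. The sketch below is plain Java that only imitates the lifecycle: @SharedStateDemo@, its @run@ helper and the two boolean "tests" are invented for illustration, and a mutable @List@ stands in for @sharedObject@.

```java
import java.util.ArrayList;
import java.util.List;

class SharedStateDemo {
    // Stand-in for the sharedObject above: any mutable object will do.
    static List<String> sharedObject = new ArrayList<>();

    static void setUp() {
        sharedObject = new ArrayList<>();
    }

    // Mutates the shared object, then checks its own expectation.
    static boolean testOne() {
        sharedObject.add("added by testOne");
        return sharedObject.size() == 1;
    }

    // Quietly assumes nobody has touched the shared object.
    static boolean testTwo() {
        return sharedObject.isEmpty();
    }

    // Runs both tests in the requested order and reports whether both passed.
    static boolean run(boolean oneFirst, boolean setUpBeforeEachTest) {
        setUp();                                   // the "once per TestCase" set-up
        boolean first = oneFirst ? testOne() : testTwo();
        if (setUpBeforeEachTest) {
            setUp();                               // what JUnit does by default
        }
        boolean second = oneFirst ? testTwo() : testOne();
        return first && second;
    }

    public static void main(String[] args) {
        System.out.println("shared, one then two: " + run(true, false));   // false
        System.out.println("shared, two then one: " + run(false, false));  // true
        System.out.println("fresh,  one then two: " + run(true, true));    // true
        System.out.println("fresh,  two then one: " + run(false, true));   // true
    }
}
```

With a single set-up the outcome depends on which test happens to run first. With a set-up before every test, both orders pass.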

Now I’m not advocating the notion that shared data should never be used – for example I cannot see a problem sharing immutable data between tests. However, think very carefully about sharing any data which has the capacity to be altered by one of the tests – it’s important to balance the benefits of instantiating objects once for all of the tests against the potential risk of polluting the state of one test with the operations of another.
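As a rough illustration of the "immutable is fine" case, the following sketch builds a fixture once in a static field and hands it out as an unmodifiable list. Everything here (@ImmutableFixtureDemo@, @loadExpensively@, @tryToPollute@) is a made-up name, and the two-element list is just a placeholder for something that is genuinely slow to create.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ImmutableFixtureDemo {
    // Counts how often the (pretend) expensive set-up actually runs.
    static int loads = 0;

    // Built once when the class is initialised and shared by every "test".
    static final List<String> SHARED = loadExpensively();

    static List<String> loadExpensively() {
        loads++;
        // The unmodifiable wrapper rejects add, remove and set.
        return Collections.unmodifiableList(Arrays.asList("alpha", "beta"));
    }

    // A misbehaving test: returns true only if it managed to change the fixture.
    static boolean tryToPollute() {
        try {
            SHARED.add("gamma");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("polluted: " + tryToPollute());   // false
        System.out.println("size:     " + SHARED.size());    // 2
        System.out.println("loads:    " + loads);            // 1
    }
}
```

Bear in mind that an unmodifiable list is only as immutable as the objects inside it. It works here because @String@ cannot be changed.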

April 8, 2004

I was playing around with the Regular Expression support in Java 1.4, with a view to repeating my earlier tutorials on the use of sed for text manipulation (this time with Java), when I came across a rather strange problem. Imagine my input is a multi-line string, part of which looks like this:

|          Serial          |
|        1234567890        |

I want to match the serial number, which in this case is 1234567890. First off, I want to match the title itself (I cannot just assume any numbers are serial numbers) but I also have to match the numbers themselves. The code to match the string I want to extract the number from looks like this:



String input = "|          Serial          |\n|        1234567890        |";
Pattern p = Pattern.compile("Serial(?s).*[0-9]+");
Matcher m = p.matcher(input);

while(m.find()) {
  System.out.println("Found match " + m.group());
}

Note: The use of the embedded (?s) tag forces the . to match line terminators – by default it doesn’t unless this flag or DOTALL is used.
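A quick way to convince yourself of this is to try the same pattern with and without the flag. The sketch below is only illustrative: @DotAllDemo@ and @finds@ are invented names and the input is shortened, but @Pattern.DOTALL@ and the embedded @(?s)@ are the real @java.util.regex@ options.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // True if the regex, compiled with the given flags, matches somewhere in the input.
    static boolean finds(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "Serial |\n| 1234567890 |";

        // Default: '.' stops at the line terminator, so the digits are never reached.
        System.out.println(finds("Serial.*[0-9]+", 0, input));               // false

        // Embedded flag: '.' may now cross the line break.
        System.out.println(finds("Serial(?s).*[0-9]+", 0, input));           // true

        // Same effect using the compile-time flag instead.
        System.out.println(finds("Serial.*[0-9]+", Pattern.DOTALL, input));  // true
    }
}
```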

Put simply, the pattern @Serial(?s).*[0-9]+@ reads “Match the word Serial, followed by any characters until you get to a list of numbers, and stop there”. Sure enough, running this gives the following result:


Found match Serial          |
|        1234567890

Next I group the numbers being matched using ‘(‘ and ‘)’. This gives me grouping exactly as with sed – I can now index these matching groups, using m.group(index):



  String input = "|          Serial          |\n|         1234567890        |";
  Pattern p = Pattern.compile("Serial(?s).*([0-9]+)");
  Matcher m = p.matcher(input);

  while(m.find()) {
    System.out.println("Found match: " + m.group());
    System.out.println("Found serial number: " + m.group(1));
  }

But this gives the following output:


Found match: Serial          |
|        1234567890
Found serial number: 0

For some reason the grouping is only matching the last number, not the whole list. I can’t for the life of me work out why…. Oh well, expect some tutorials on the use of regular expressions soon.

_Updated_: ditched all the dashes as they screwed up the formatting.

_Updated_: Fixed the second code fragment

April 7, 2004

“Paper prototypes”:http://www-106.ibm.com/developerworks/library/us-paper/?dwzone=usability are a great tool for quickly designing and demonstrating GUIs. Sometimes however the interactions can be a little hard to see – in which case a GUI prototype can be a boon. Problems can come however from knocking up these dummy interfaces – management and users can get the idea that the product itself is nearly done, or they may start obsessing over little UI idiosyncrasies that aren’t really the point of the exercise. Ken Arnold’s “Napkin Look and Feel”:http://napkinlaf.sourceforge.net/ is an attempt to give coded interfaces a paper-prototype feel – so users get a clear idea that this is a rough draft and nothing more. The webstart demo shows the SwingSet demo using the new look and feel, and it seems to work very well.

April 6, 2004

**Project proposal**: An IM client that synchronises seamlessly with a contact list stored in an enterprise social network system (think: MS Exchange (webdav?), Confluence).

**Initial thoughts**: Use Jabber

**Project Abstraction**: Generic notification mechanism which could be used to send email, SMS, IM. Useable by CruiseControl/DamageControl/Confluence/IM client etc.

**Rationale**: Certainly a need for this at work. Is there desire enough to do it?

_Update_: Also think friendster/orkut integration. I’m thinking a server with modular contact discovery mechanism, with client abstraction. Support for multiple IM protocols or just use Jabber? YAGNI – use Jabber. I should really stop engaging in design by blog.

_Update_: Look at JAIN – “JSR 187(JSR 187: JAIN™ Instant Messaging)”:http://www.jcp.org/en/jsr/detail?id=187.
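For what it is worth, the generic notification mechanism from the project abstraction could start out as little more than a one-method interface. The sketch below is purely hypothetical: @Notifier@, @RecordingNotifier@ and @Broadcaster@ are invented names, and the "transports" only record what they were asked to send instead of talking to Jabber, SMTP or an SMS gateway.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NotificationSketch {
    // Hypothetical abstraction: callers know nothing about the transport.
    interface Notifier {
        void send(String recipient, String message);
    }

    // Stand-in transport that only records what it was asked to send.
    static class RecordingNotifier implements Notifier {
        final String channel;
        final List<String> sent = new ArrayList<>();

        RecordingNotifier(String channel) {
            this.channel = channel;
        }

        public void send(String recipient, String message) {
            sent.add(channel + " to " + recipient + ": " + message);
        }
    }

    // Fans a single notification out to several transports.
    static class Broadcaster implements Notifier {
        private final List<Notifier> targets;

        Broadcaster(Notifier... targets) {
            this.targets = Arrays.asList(targets);
        }

        public void send(String recipient, String message) {
            for (Notifier target : targets) {
                target.send(recipient, message);
            }
        }
    }

    public static void main(String[] args) {
        RecordingNotifier email = new RecordingNotifier("email");
        RecordingNotifier im = new RecordingNotifier("im");

        // A build tool would only ever see the Notifier interface.
        Notifier notifier = new Broadcaster(email, im);
        notifier.send("sam", "build failed");

        System.out.println(email.sent);  // [email to sam: build failed]
        System.out.println(im.sent);     // [im to sam: build failed]
    }
}
```

A build tool or wiki would then depend on @Notifier@ alone, and the choice of email, SMS or IM becomes a wiring decision.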

March 31, 2004

I had the pleasure of chatting to Jon Tirsén last night. Among the many topics we discussed was his Ruby-based Continuous Integration build tool “DamageControl”:http://damagecontrol.codehaus.org/builds/damagecontrol/docs/index.html. Designed as the replacement for “CruiseControl”:http://cruisecontrol.sourceforge.net/, DamageControl offers a much more flexible framework, and has been running the “codehaus”:http://www.coudehaus.org builds for some time now.

“Continuous Integration(Continuous Integration by Martin Fowler and Matthew Foemmel)”:http://www.martinfowler.com/articles/continuousIntegration.html is the development practice whereby each CVS checkin results in the code being built and tested automatically. This encourages frequent checkins, keeps everyone on the most up-to-date code, and reduces integration issues.

After asking how development of DamageControl was going, it turns out that Jon is looking to re-implement it in Groovy. Given that Jon seems to be such a fan of Ruby (I’ll stop short of saying ‘zealot’ as I don’t know him that well) this was a surprising admission. The main reason cited was that Java-based programs have nicer deployment mechanisms than the plain text files of Ruby (or most scripting languages). Personally, I think the reason has a lot more to do with the current mind share of Groovy, and the fact it is based on a far more popular platform, in the form of Java.

A Continuous Integration tool is (potentially) such an important part of an Agile development methodology that a well-maintained, flexible tool is very important. By basing it on the Java platform, the number of developers available to help develop and maintain the system must surely outweigh the number of Ruby developers out there. Also, the fact that the majority of agile development itself is done using Java means that those companies using a Java-based DamageControl are more likely to have the expertise in-house to administer, support and extend the program, and would therefore be more willing to use it. I would also hazard a guess that a move from Ruby to Groovy, with Groovy’s Ruby-like features, would be much less work than a move from Ruby to the Java language.

We also briefly discussed the recent “Groovy JSR(JSR 241: The Groovy Programming Language)”:http://www.jcp.org/en/jsr/detail?id=241, and agreed that the important point wasn’t that it was groovy itself that was being submitted, but that another language based on the Java platform was being submitted. .net has stolen a march on Java, basing multiple languages on its common-runtime (C#, VB, J# etc). Java is just as suited to this kind of approach as .net, and it is time we started seeing Java as a platform rather than a language. This is certainly something MS has achieved quite well (in marketing terms if nothing else) and is definitely something Sun et al can learn from. Hopefully we’ll start to see other languages being submitted to the JSR in this way – I just hope that they won’t have to have the backing of big name companies to achieve this (groovy is backed by the Apache Software Foundation, BEA and ThoughtWorks).

March 18, 2004

In lieu of some actual code, I thought I’d post a high-ish level overview diagram of how commands are executed and handled. Those of you who’ve used “XWork”:http://wiki.opensymphony.com/space/XWork may notice that it looks very familiar – which is unsurprising given that I had no problems with XWork’s design itself. The real benefit of this framework over XWork (in addition to the fact that it has a more fully featured IoC framework) will be that it will have a whole host of closely integrated UI helper classes thanks to the “spring-rcp”:http://jroller.com/page/kdonald/20040225 project. Due to the nature of XWork (being the underpinning of “WebWork”:http://www.opensymphony.com/webwork/) restructuring it to fit in with my goals was a non-starter.

The code is coming together nicely – adding the @ValidationInterceptor@ is my next job, followed by the @ResultInterceptor@ – both of which should be fairly easy.

March 4, 2004

In preparation for the next “London Java meetup”:http://web1.2020media.com/j/jez/javanicuscom/londonjava/, I’ve been playing around with “Robocode”:http://www.alphaworks.ibm.com/tech/robocode with a view to embarrassing myself in front of fellow coders. For those who don’t know, it was originally developed as a teaching tool by IBM – it’s a complete development environment for coding little robots that try and blow other robots up, and it’s very fun in a “why is my robot doing that? Where is he going? Why won’t he fire, for the love of god, FIRE!” kind of a way. Hopefully Robocode will get a wider audience with the great news that IBM have “decided to opensource it”:http://www.alphaworks.ibm.com/forum/robocode.nsf/current/76296EA3B19C41F092246E236222BC66?OpenDocument.
Or they won’t yet – “apparently”:http://www.alphaworks.ibm.com/forum/robocode.nsf/current/990448567D8271246D13D1A191A50F19?OpenDocument that was a fake post, but IBM developer Mat Nelson is “working on it(Interested in opensource Robocode?)”:http://www.alphaworks.ibm.com/forum/robocode.nsf/current/91022251A742C56E71C7678B16754504?OpenDocument.

March 3, 2004

I had some problems with this and found the documentation a little thin on the ground, so I felt a little primer was in order.

So what is a proxy class?

Well, in Java terms a proxy class is a class created at runtime which implements a specified set of interfaces. “CGLIB”:http://cglib.sourceforge.net/ is capable of creating proxy classes without having interfaces – it can mimic a specific class.
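For the interface-based flavour, the JDK’s own @java.lang.reflect.Proxy@ is enough to show the idea. This is a minimal sketch: the @Greeter@ interface, the @ProxyClassDemo@ class and the greeting text are all invented for the example, and no hand-written class implementing @Greeter@ exists anywhere in it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxyClassDemo {
    // The interface the runtime-created class will implement.
    interface Greeter {
        String greet(String name);
    }

    static Greeter newGreeterProxy() {
        // Every call on the proxy ends up in this handler.
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("greet")) {
                return "Hello, " + args[0];
            }
            throw new UnsupportedOperationException(method.getName());
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeterProxy();
        System.out.println(greeter.greet("magpie"));                  // Hello, magpie
        System.out.println(Proxy.isProxyClass(greeter.getClass()));   // true
    }
}
```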

Why would you want one?

In “Spring”:http://www.springframework.org (and other frameworks) proxy classes are used to allow the use of AOP interceptors. The proxy implementation calls the Interceptors before/after invoking the underlying object.

What is an Interceptor?

An Interceptor is a class which gets executed before, after, or around a method. By around we mean both before and after. Oh, and it can also be called when a method throws an exception.
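To make "before, after and on exception" concrete, here is a sketch of an around-style interceptor built on a plain JDK dynamic proxy. It is not Spring’s AOP API: @Account@, @SimpleAccount@, @LoggingInterceptor@ and @proxyFor@ are hypothetical names, and the interceptor merely records what it sees in a list.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class InterceptorDemo {
    interface Account {
        int withdraw(int amount);
    }

    // The underlying object that does the real work.
    static class SimpleAccount implements Account {
        private int balance = 100;

        public int withdraw(int amount) {
            if (amount > balance) {
                throw new IllegalArgumentException("insufficient funds");
            }
            balance -= amount;
            return balance;
        }
    }

    // Runs code before and after the target method, and when it throws.
    static class LoggingInterceptor implements InvocationHandler {
        final List<String> log = new ArrayList<>();
        private final Object target;

        LoggingInterceptor(Object target) {
            this.target = target;
        }

        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            log.add("before " + method.getName());
            try {
                Object result = method.invoke(target, args);
                log.add("after " + method.getName());
                return result;
            } catch (InvocationTargetException e) {
                // Reflection wraps the target's exception; unwrap and rethrow it.
                log.add("threw " + e.getCause().getClass().getSimpleName());
                throw e.getCause();
            }
        }
    }

    static Account proxyFor(LoggingInterceptor interceptor) {
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(),
                new Class<?>[] { Account.class },
                interceptor);
    }

    public static void main(String[] args) {
        LoggingInterceptor interceptor = new LoggingInterceptor(new SimpleAccount());
        Account account = proxyFor(interceptor);

        System.out.println(account.withdraw(30));   // 70
        try {
            account.withdraw(1000);
        } catch (IllegalArgumentException expected) {
            // the interceptor has already recorded the failure
        }
        System.out.println(interceptor.log);
    }
}
```

The caller only ever holds an @Account@. Whether an interceptor sits in front of the real object is invisible to it.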

February 29, 2004

After looking for some time at using XWork as a command-driven framework for desktop clients, I decided to put that to one side for the moment to instead cooperate with “Keith Donald’s”:http://jroller.com/page/kdonald work on “Spring’s”:http://www.springframework.org/ Rich Client Platform (spring-rcp). As I mentioned before I was a little frustrated by XWork’s IoC framework – the fact that you had no control over how @Interceptor@, @Action@ or @Result@ objects were created. I started looking at the problem from the point of view of integrating Spring inside XWork, and ended up with the makings of a Command Framework completely implemented using Spring. The work’s very much at a prototype phase right now, and I’ll post more information as it comes together. I’m hopeful of getting it properly integrated with Keith’s first commits for spring-rcp, and as soon as I do I’ll post some example code.

February 29, 2004

magpiebrain

Sam Newman's site, a Consultant at ThoughtWorks

Posts from the ‘Java’ category
