haruki zaemon

Domain Specific Languages: Objective-C, Ruby and Java (and Groovy)

I’m forever trying to “improve” my coding and design skills; I say “improve” because there is always the risk that I’m making changes for change’s sake. One thing I try hard to do is remove any notion of getting/setting properties of objects and focus instead on behaviour.

Now I’ve always tried hard to do this but it’s a skill that definitely takes some time and constant practice. TDD certainly has made my life easier by giving me some tools for this but more recently I’ve really been making an effort to try and “design” my classes in a way that creates Domain Specific Languages (DSL). DSLs enable you to write code that is hopefully more readable and understandable and therefore easier to write, debug and maintain. (At least that’s the idea in principle anyway.)

I use a variety of languages in my day-to-day work and I’ve found the effort required to create a DSL to be vastly different depending on the language. Of the few languages I’ve actually used in anger (Smalltalk unfortunately not being one of them) Objective-C really does provide a nice syntax for creating DSLs (once you get your head around the square brackets).

For example, take the age-old problem of transferring money from one account to another. In Objective-C, assuming we have an Account class with a transfer() method, we can write something like this:

    id source = ...;
    id destination = ...;
    [source transfer:20.00 to:destination];

_Using floats for currency is generally a bad idea but it’ll do for illustrative purposes here :-)_

Notice the use of named parameters to really help convey the intent. This is actually an optional feature – you can still use positional arguments if you like – but one I use exclusively.

The transfer() method might look like this:

    - (void)transfer:(float)amount to:(id)destination {
        [self debit:amount];
        [destination credit:amount];
    }

Again, ignoring the unfamiliar method declaration (you’ll just have to trust me when I tell you that you do get used to the language and even love it), it’s all pretty nice and readable.

Essentially, when calling the method, the identifier that precedes each colon serves as the name of that argument as seen by the caller, with the method name itself serving as the name of the first argument and to as the name of the second.

Once inside the method, the identifier after the colon serves as the name of the argument to be used: amount and destination.

All up, Objective-C is very concise and allows for the creation of fairly nice DSLs without much effort at all. In fact from what I can tell, most Objective-C code turns out to be more-or-less DSL-like; it’s typical to see method calls that look like:

[report printTo:printer withPagination:YES]

The next example I whipped up uses Ruby. Now, I have to admit that I have all of about 3 days of Ruby experience so if you can come up with a better way please, please, please let me know. So, caveats aside, here is the same example in my bestest Ruby, again starting with the usage:

    source = ...
    destination = ...
    source.transfer :amount=>20.00, :to=>destination

This uses a Ruby hash – an associative array like a HashMap or Hashtable in Java. You usually need to wrap the hash definition in curly braces but these can be omitted when the hash is used as the last – or only – parameter.

Now for the method itself:

    def transfer params
      amount = params[:amount]
      self.debit amount
      params[:to].credit amount
    end

Pretty good really but I still have a preference for the Objective-C use of named parameters. I did read in Programming Ruby that named parameters were to be a feature of Ruby at some point but they haven’t yet made it in. I also read somewhere recently that Ruby 1.9 will have a slightly simpler calling syntax so the invocation will look like this:

source.transfer amount:20.00, to:destination

The only Ruby-based DSL I know of is Rake but again my experience with Ruby is somewhat limited.

And finally Java, IMHO the ugliest of the bunch. It seems to me at least (and again, somebody prove me wrong puhlease!) that achieving named parameters in Java with an even remotely useful syntax requires a combination of inner classes and method chaining. So, to achieve a calling syntax like this:

    Account source = ...;
    Account destination = ...;
    source.transfer(20.00f).to(destination);

requires something at least as complicated as this:

    public class Account {
        ...

        public Transfer transfer(float amount) {
            return new Transfer(amount);
        }

        // Inner class: holds the amount until the destination is supplied.
        public class Transfer {
            private final float amount;

            Transfer(float amount) {
                this.amount = amount;
            }

            public void to(Account destination) {
                Account.this.debit(amount);
                destination.credit(amount);
            }
        }
    }
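For anyone who wants to run the Java version, here is a minimal, self-contained sketch of the same idea. The balance field, the constructor, the balance() accessor and the bodies of debit() and credit() are assumptions added purely so that it compiles, and it swaps float for BigDecimal because of the earlier caveat about floats and currency.

```java
import java.math.BigDecimal;

// A sketch only: balance, the constructor, balance(), debit() and credit()
// are assumed here; the post elides them with "...".
class Account {
    private BigDecimal balance;

    Account(BigDecimal openingBalance) {
        this.balance = openingBalance;
    }

    // Accessor used only by the demo below to show the result.
    BigDecimal balance() {
        return this.balance;
    }

    void debit(BigDecimal amount) {
        this.balance = this.balance.subtract(amount);
    }

    void credit(BigDecimal amount) {
        this.balance = this.balance.add(amount);
    }

    // First half of the "sentence": remembers the amount.
    Transfer transfer(BigDecimal amount) {
        return new Transfer(amount);
    }

    // Inner (non-static) class, so it can reach the source account.
    class Transfer {
        private final BigDecimal amount;

        Transfer(BigDecimal amount) {
            this.amount = amount;
        }

        // Second half of the "sentence": supplies the destination.
        void to(Account destination) {
            Account.this.debit(this.amount);
            destination.credit(this.amount);
        }
    }
}

class TransferDemo {
    public static void main(String[] args) {
        Account source = new Account(new BigDecimal("100.00"));
        Account destination = new Account(new BigDecimal("5.00"));

        source.transfer(new BigDecimal("20.00")).to(destination);

        // prints "80.00 25.00"
        System.out.println(source.balance() + " " + destination.balance());
    }
}
```

The call site reads the same as before; the only cost is the extra Transfer object created for each call.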

The best example of a Java-based DSL that I can think of would probably be JMock.

And finally, by popular demand, a Groovy example. Groovy currently supports named parameters for calling methods that accept a Map so the calling syntax is a lot like Objective-C:

    def source = ...
    def destination = ...
    source.transfer(amount:20.00, to:destination)

The transfer() method itself then becomes something like:

    void transfer(params) {
        def amount = params.amount
        this.debit(amount)
        params.to.credit(amount)
    }

Which looks unsurprisingly like Ruby. Alas, I have no real-world example of a Groovy-based DSL to show you.

The Computing Disease

My iPod has been getting a real workout lately. I’ve been getting into podcasts and audio books. I never seem to make the time to read paper books anymore and when I do I always manage to fall asleep after a few paragraphs, so listening to them instead works out particularly well for me.

The current book is The Pleasure of Finding Things Out by my all-time favourite physicist (yes I have a favourite), Richard P. Feynman. I’ve always liked Feynman, mainly because his approach to life and learning seems to fit with my ideal – not necessarily reflected in reality but I do try.

Over the years I’ve read a number of his books, papers and lectures etc. and he always seems to have such a great outlook on life, learning, love, you name it. IMHO, a man with his head screwed on just right. I especially like his lectures on physics and his thoughts on computing. He has a way of making difficult material accessible to the likes of yours-truly.

Among many other things, Feynman states that knowledge without understanding is pretty much a waste of time; the idea that just being able to perform some function or know the name of something without understanding what you are doing or the nature of that something, is not only pointless but inefficient and makes you largely ineffective. A subject that is very close to my heart also.

Yesterday I was listening to one of his talks on his time at Los Alamos on the Manhattan Project. He was talking about how they developed sophisticated – even by today’s standards – algorithms for utilising many computers – adding and multiplying machines – in parallel. The problem was that although they had ordered a number of IBM machines, it would be some time before the machines would arrive and even then they would need to be assembled. In the meantime they decided to start writing and debugging their programs. To do this, they enlisted the help of a group of women to act as the adders and multipliers – like a typing pool only performing calculations instead. Amusingly, because of the way the algorithms worked and because of the state of technology at that time, the women managed to process the data as fast as the machines could. The only problem was that the women got tired and needed sleep, food, etc.

Anyway, in all this, Feynman makes a great quote which tickled my fancy:

There is a computer disease that anybody who works with computers knows about. It’s a very serious disease and it interferes completely with the work. The trouble with computers is that you ‘play’ with them! - Richard P. Feynman

Of course we all prefer to call it “innovation” ;-) but at least it’s nice to see that the phenomenon isn’t a recent one – apparently even the great Von Neumann was afflicted.

My Biggest Peeves With JavaScript

There’s really not a lot to dislike about JavaScript: it’s object oriented – in a way Java can only dream of; weak typing coupled with dynamic property objects makes mocking and testing a breeze; the list goes on. But there are two things about the language that continue to irritate me: Exceptions; and undefined/null. Ok, so I guess that’s three things but the last two are really in the same category.

First up, undefined/null. Let me start by saying that I love the fact that these are actually objects – go null object pattern – but the fact that they silently gobble up any calls made to them is less than helpful. Even less helpful is the fact that I can’t override this behaviour. Unlike every other object in the language, these two don’t seem to have a prototype I can mess with. At least in Objective-C/Smalltalk you have the option of finding out when messages are sent to null objects or to objects that don’t understand them. Thankfully, testing catches most of these problems but where it can’t, I resort to my tried and tested method: runtime assertions. Which leads me to my second peeve.

Exceptions. Well at least JavaScript has them I guess, even if they are called Errors. Sure I can throw them, I can catch them, I can put code in a finally block but I can’t get any information about them. I use runtime assertions all the time in Java, Objective-C and even JavaScript to catch things I haven’t managed to test for or, more importantly, things I can’t necessarily predict. Whenever an assertion fails, an exception is thrown which usually dumps a stack trace giving me everything – well almost everything – I need to track down the source of the problem. Unfortunately in all the web browsers I use regularly – Safari and Firefox – all I seem to get is the message "Error" printed to the JavaScript console. Great! So I know there is a problem but I don’t know where nor, more importantly, why. This usually leads me to the JavaScript debugger – probably the best thing that ever happened to the language. Not that I mind using the debugger but a stack trace is extremely useful!
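For comparison, this is roughly what the runtime-assertion style described above looks like in Java, where the stack trace comes for free. It is a minimal sketch: Check, that() and half() are hypothetical names made up for illustration, not part of any library.

```java
// Hypothetical helper (not from any library): a runtime assertion that is
// always enabled, unlike the assert keyword which needs the -ea flag.
final class Check {
    private Check() {
    }

    static void that(boolean condition, String message) {
        if (!condition) {
            throw new IllegalStateException(message);
        }
    }
}

class CheckDemo {
    // Example method guarding its own precondition.
    static int half(int value) {
        Check.that(value % 2 == 0, "value must be even: " + value);
        return value / 2;
    }

    public static void main(String[] args) {
        System.out.println(half(10)); // prints 5

        try {
            half(7);
        } catch (IllegalStateException e) {
            // The trace lists Check.that, then half, then main: enough to
            // find both the failed check and whoever triggered it.
            e.printStackTrace();
        }
    }
}
```

The message says what went wrong and the trace says where, which is exactly the information missing from a bare "Error" in the browser console.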

ActionScript: JavaScript in Flashy Pants?

One of the jobs I’m doing at the moment involves Flex – and consequently Flash – for the front-end and Java at the back-end, most of which involves writing ActionScript.

If you haven’t used ActionScript before, you’d be forgiven for thinking you were writing JavaScript. Well, in fact you are writing JavaScript with a few changes, not necessarily for the better IMHO – if you read the documentation you’ll discover that ActionScript is indeed based on ECMAScript, the same specification that was retro-fitted to JavaScript.

From what I can tell, ActionScript 1.x was pretty much exactly JavaScript. Version 2 on the other hand adds some syntactic sugar to the language; no doubt an attempt to make a prototype-based language appear more like a class-based language in the hope of attracting all the Java developers. I’ll get into these additions in a bit but in any event, I find it highly amusing when Macromedia proclaim that version 1 was somehow a functional language and that version 2 is now a fully-blown O-O language.

(As an aside, I interviewed some potential developers the other day who said they loved ActionScript but thought JavaScript was a language for doing hack-work. I can only assume that opinion comes from having only used JavaScript for handling onXXX events in a browser rather than any objective comparison of the languages themselves.)

So, on to some of the more notable changes to the language starting with type-safety. JavaScript is weakly typed. This is a topic of biblical proportions so I’ll leave alone the merits or otherwise of weak typing; suffice it to say that I like it. In their infinite wisdom, Macromedia decided that what was needed was a bit of strong (or strict) typing. This is enforced by the compiler with a combination of type declarations after variable and parameter names:

    var foo : Number;

Unfortunately this type-safety is limited to compile-time checking – nothing happens at runtime, much like Java’s generics. Not so much of a problem I guess – I didn’t want the strong typing in the first place – but amusing nonetheless.

So the next question is, how do you know what functions are available for a given type? In JavaScript these are essentially defined at runtime by adding behavior to a prototype. Well why not add some more syntax to the language?

    class MyClass {
        private var num : Number;

        public function MyClass(num : Number) {
            this.num = num;
        }

        public function getNum() : Number {
            return this.num;
        }
    }

Ok, so that’s probably a little easier for most people to understand than:

    MyClass = function(num) {
        this.num = num;
    };

    MyClass.prototype.getNum = function() {
        return this.num;
    };

But I find the “new” way actually involves a lot more typing than the good-old-fashioned way – which once you’re used to it is easy to read anyway – and hides the fact that it is still a prototype-based language and NOT a class-based one. (I also quite like the way prototype declares “classes” but I haven’t found the time to start doing it that way just yet.)

You can also declare property accessors just like in VB/C#/etc:

    class MyClass {
        private var num : Number;
        ...

        public function get num() : Number {
            return this.num;
        }

        public function set num(num : Number) : Void {
            this.num = num;
        }
    }

Again, great if you like that kind of thing; I’m still not convinced I do.

The other thing that gets people is the scoping rules. JavaScript has some pretty funky scoping rules which once you’re used to and understand work just fine. Again, because ActionScript looks like Java, developers don’t realise the importance of understanding the scoping rules.

ActionScript also introduces another new keyword: dynamic. This can be used when defining a class and indicates to the compiler that it should NOT do strong type checking when invoking methods and accessing properties of the class – either from within or from outside the class. Again, I find this a highly amusing construct as I can always defeat the type checking the old-fashioned way anyhow: myObj["doSomething"](15);. Yes, I agree, that’s a bit of malicious code but what I’d prefer is an option to turn on/off strict typing at a project level and not on a per-class basis.

Interestingly, someone recently pointed me at an open source compiler for ActionScript. I wonder if I could just download the code, remove the strict type checking and be done with it ;-). Oh and there’s also a unit testing toolkit as well.

All in all, I quite like using ActionScript. I’ve only played around a bit with Flash/Flex bindings but they seem quite nice too. I actually think it would be pretty easy to build a Cocoa-like framework (eek there’s the word!) – Cocoa being JavaBeans the way they should have worked, and something I thoroughly enjoy using at the moment.

But when it comes down to it, it has been my experience – and no doubt the experience of others in the JavaScript/Ruby/Smalltalk/Objective-C world – that I can achieve a lot more in fewer lines of readable code without all the syntactic sugar and strong typing. (The caveat being that I’m also a big fan of automated testing.) Moreover, understanding that JavaScript uses prototypes opens up a whole new world of possibilities that further increase my productivity – a fact that became apparent to the attendees of a mini presentation I gave on this subject just recently.

Managing Sensitive Data on Mac OS X

One of the clients I’m working for at the moment mandates – quite rightly so – that any data taken offsite must be encrypted. Mac OS X provides two ways to achieve this: FileVault; and Encrypted Disk Images. (If you want to know how to encrypt emails, see this article.)

FileVault (System Preferences>Security>Turn On FileVault…) is kinda cool: It allows you to encrypt your entire home directory. This might be handy if you’re super paranoid but I’m not; neither do I want the overhead of having everything I write to disk encrypted.

Encrypted Disk Images on the other hand are super cool for this kind of thing. If you’re familiar with Mac OS X then you’ve no doubt used plain disk images before. They’re the .dmg files that are used for distributing most of the software you install. The neat thing about them is that you can mount a disk image and use it just like any other disk on your machine. On top of this, you can encrypt them so that unless you have the password, no one can peek at the data.

To create a blank disk image:

  • Open Applications>Utilities>Disk Utility
  • Select File>New>Blank Disk Image…

To create a disk image from an existing folder:

  • Open Applications>Utilities>Disk Utility
  • Select File>New>Disk Image From Folder…
  • Choose the folder you wish to use as the source of your disk image; and
  • Press Image

In either case:

  • Enter a filename and location and choose an initial size – disk images dynamically resize so don’t worry too much about this
  • Select AES-128 (recommended) for Encryption; and
  • Press Create

You’ll then be prompted for a password with which to encrypt – and decrypt – the file. You’re also given the opportunity to save the password in your keychain. This just means you won’t have to remember the password when mounting it on your own machine – if you move the file to another machine or send it to someone else, a password must be entered manually.

Now anytime someone tries to mount the disk image, they’ll be prompted for a password; if they don’t have the password the data is protected, even if they try to read it using another tool.

Once mounted, you can use it as you would any other drive: as a mounted image in Finder; or under /Volumes/ using the filesystem.

MT 3.2 Atom Feed Template Problem

I noticed that since I upgraded to MT 3.2 my Atom feed – the main one since the switch – wasn’t being displayed correctly in Safari’s RSS Reader. The culprit: a slight problem with the content-type in the template for atom.xml.

To correct the problem, find where it says <content type="html" ... and change "html" to "text/html" and all should be fine again.

The index.xml template works just fine though I’ve yet to bring either the comments.xml or index.rdf up-to-date. The former is just laziness – I need to remember how I converted the original index.html template; the latter is because MT seem to have dropped support for RSS formats <2.0. Neither of these is really an issue as they worked before and continue to work now.

FWIW, I’m considering removing index.rdf sometime soon anyway as I’m pretty sure almost every RSS reader now supports at least RSS 2.0 and most likely Atom as well.

Updated 17 October 2005

Seems there are all sorts of problems with the template. I ran it through a feed validator service and I’m frankly staggered anyone can actually read my blog. It seems the content type can be left as “html” after all but I needed to change a whole-lotta other stuff to get it to pass. Interestingly, Safari now sometimes manages to read the whole feed just fine. Go figure? Luckily I recently switched to using PulpFictionLite as my news reader. Safari just doesn’t cut it I’m afraid. Anyway, it’s just another example of what can happen when I try playing with things about which I understand very little. Sigh.

Multiple Mac (Tiger) Logins for Testing

I do all my development on my Mac PowerBook which, up until today, has had only one account: mine. But this morning, as we were trying to solve the last of the MT issues, I needed another login with which to quickly try something out.

It seemed as though there was something peculiar about my login that was causing some of the problems so I opened Accounts and created a test user. So far nothing special about this – I’m sure you create test accounts all the time – however it’s always annoyed me that you have to log out to switch users. I mean really, this is supposed to be Unix, right?

Tiger (OS X 10.4) has a nifty “Fast User Switching” option which allows you to switch between users while leaving all the applications running. I had heard about this feature some time ago and thought to myself “phe! what do I care?” but I’d never thought about it in the context of testing.

To enable fast user switching, open “Accounts” in “System Preferences” and click “Login Options”. Tick the box titled “Enable fast user switching” (it’s down the bottom). You should then see a list of accounts in the top-right of the menu-bar.

The great thing about this for me is that I can quickly blow away the account and re-create it; or have multiple accounts all logged in to my apps at once, all from the one machine. Of course the fact that I need to get down to manual testing says more than anything but alas, in this case, I didn’t have much choice. Besides, as James has berated me for in the past: Just because you have automated tests, doesn’t mean you can avoid manual testing as well.

On a side note, I also love SSH as it allowed me to install a public-key for the MT support guy to login to my shell account without needing me to create new users – which I can’t do anyway because it’s not my server :)

Update (2 September 2005)

I noticed today that Redstone Software have a new version of OSXvnc that supports this, allowing you to view (and control) the other logins without even switching! It’s a free download; as is Chicken of the VNC – a VNC client – which you’ll also need.

MovableType Weirdness Again

Some of you may have noticed some decidedly odd behaviour with the blog the past couple of days: you couldn’t post comments; and I couldn’t post new entries. Somehow – it’s still a mystery – things just stopped working. The suspicion I have is that there was some problem with the Berkeley DB files after a server upgrade but I can’t be sure.

For the most part, MT behaves itself pretty well – to-date I’ve hardly had any problems – but when I do have a problem it usually requires quite a bit of effort to work out what went wrong and fix it (if possible).

So anyways, after unloading, reloading, repairing, upgrading MovableType and performing all manner of other unix witchcraft – most of it performed at 3am – things are finally back up and running again. The next step will be to migrate from Berkeley DB to Postgres in the next day or two so there may be a few more minor interruptions along the way.

The upshot of all this is of course that I’m now running on the latest (and hopefully greatest) version of Movable Type. Having said that, blogging ain’t my profession – god save you all – so I’m not sure that an upgrade of MT actually buys me much except some more long nights attempting to work out why things stopped working.

Everything seems to be in or-der but if you notice any more weirdness please let me know, otherwise I’ll get back to eating my orange sher-bert

Update (1 September 2005)

Ahh yes, I knew it was too good to be true. Robert kindly informs me that comments are still busted: you can read them just not post them. After a little investigation, it looks like there is some problem with the new version of SCode. It worked for me because I’m authenticated on my own blog but it doesn’t seem to work for anyone who isn’t authenticated. Sheesh! Maybe I need to write a suite of tests for my own blog!?

Thankfully, the solution is pretty simple. Thanks to the author of the plugin who responded to my forum posting: the file SCode.pm contains the line “$app->{requires_login} = 1;”. This should probably read: “$app->{requires_login} = 0;”.

Update (2 September 2005)

All converted to PGSQL now thanks to the truly awesome support people at MovableType. They ironed out the last little issues with upgrading and now I’m off Berkeley DB. Hopefully this now means no more file corruptions.

All Your Keystrokes Are Belong To Us

I stumbled upon this paper while doing my weekly browse through articles on CleverCS. To quote from the abstract:

… a novel attack taking as input a 10-minute sound recording of a user typing English text using a keyboard, and then recovering up to 96% of typed characters…

They even run the recovered text through a spell-checker which successfully corrected a mistake in the text as it was originally typed lol.

Pretty cool idea although I figure if you can sneak a microphone into a building – say by sending someone a bunch of flowers or a promotional desk-lamp, etc. – you can probably just as easily “upgrade” someone’s “faulty” keyboard and record keystrokes that way instead.

Attack of the Killer GPUs

After reading about the latest PlayStation, XBox, GameCube, etc. I was struck by how much raw processing power these machines have and how they manage to deliver such massively parallel computing at relatively low prices. For example, according to an article I read recently in Popular Science, the latest PlayStation sports nine dual-core processors, Rambus XDR RAM (apparently supporting data rates of 25.6 GB/sec) and a Rambus IO chip that supposedly moves data around at 76.8 GB/sec! All for what, a couple of hundred dollars?

Of course these machines sell in the millions and you pay through the nose for the games themselves but even still.

Then this morning, my brother forwarded me a link to an interesting article on Nvidia:

…in a recent contest to build the world’s fastest database server, the winner was a university professor who ported SQL software to run on an Nvidia GPU…

I can’t vouch for the facts of said article but if that quote is anything to go by, maybe all that wasted (oops I mean un-tapped) processing power might get some interesting use.

Pull Me Push Me Anyway You Want Me

I recently needed some code to parse text streams. Conceptually, the logic was pretty simple: break a stream into words and combine consecutive words into “phrases” of up to some maximum length. So for example, with a maximum of three words, the stream "have a nice day" might be broken into the phrases: "have", "have a", "a", "have a nice", "a nice", "nice", "a nice day", "nice day" and "day".

The problem is that I’m not going to own this code in the end, someone else is (permission was given to publish the code) and after numerous consulting gigs over the years, I’ve become very careful to avoid leaving behind any Alien Artifacts. So I produced two versions, each producing identical output but based on very different designs, in the hope that at least one of them would fly.

I’ll start with my first approach which, IMHO, is pretty easy to understand – well I wrote it so I guess it is for me anyway. It begins with an interface that accepts tokens as they are parsed (a Visitor of sorts):

    public interface TokenVisitor {
        public void onToken(String token);
    }

Classes that implement TokenVisitor will be notified anytime a token becomes available for processing.

Next, we want to create phrases from tokens and print them to the console so, we also need a class that takes the tokens it receives, creates phrases of between 1 and maximumPhraseSize tokens and sends them to another visitor (a Decorator):

    import java.util.LinkedList;
    import java.util.Queue;

    public class PhraseBuilder implements TokenVisitor {
        // One builder per phrase that is still being extended.
        private final Queue<StringBuilder> builders = new LinkedList<StringBuilder>();
        private final TokenVisitor output;
        private final int maximumPhraseSize;

        public PhraseBuilder(TokenVisitor output, int maximumPhraseSize) {
            assert output != null : "output can't be null";
            assert maximumPhraseSize > 0 : "maximumPhraseSize can't be < 1";
            this.output = output;
            this.maximumPhraseSize = maximumPhraseSize;
        }

        public void onToken(String token) {
            assert token != null : "token can't be null";

            // Extend every phrase in progress and push each result downstream.
            for (StringBuilder builder : this.builders) {
                builder.append(' ').append(token);
                this.output.onToken(builder.toString());
            }

            this.output.onToken(token);

            this.builders.add(new StringBuilder(token));

            // The oldest phrase has reached its maximum length; retire it.
            if (this.builders.size() == this.maximumPhraseSize) {
                this.builders.remove();
            }
        }
    }

And finally, some sample usage:

    Lexer lexer = new Lexer(_stream_, new PhraseBuilder(new TokenVisitor() {
        public void onToken(String token) {
            System.out.println(token);
        }
    }, 3));
    lexer.run();

Here, an instance of the Lexer class (not shown for the sake of brevity) simply breaks _stream_ into words (tokens) and calls TokenVisitor.onToken() passing each one in turn. The phrase builder acts as the first level visitor and passes its results on to the next level visitor, an anonymous inner class that simply prints each token to the console.
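Since the Lexer itself isn’t shown, here is one way a push-style version might look. This is only a guess at its shape: it assumes the stream is a plain String split on whitespace, and the tokensOf() helper exists purely for the demo. TokenVisitor is repeated so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Repeated from the post so the sketch compiles on its own.
interface TokenVisitor {
    void onToken(String token);
}

// Hypothetical push-style Lexer: assumes the "stream" is a String and that
// words are separated by whitespace. The real class may differ.
class Lexer {
    private final String stream;
    private final TokenVisitor output;

    Lexer(String stream, TokenVisitor output) {
        this.stream = stream;
        this.output = output;
    }

    // Walks the whole input, pushing each word to the visitor as it is found.
    void run() {
        Scanner scanner = new Scanner(this.stream);
        while (scanner.hasNext()) {
            this.output.onToken(scanner.next());
        }
    }
}

class PushLexerDemo {
    // Demo helper: collects whatever the lexer pushes into a list.
    static List<String> tokensOf(String text) {
        final List<String> tokens = new ArrayList<String>();
        new Lexer(text, new TokenVisitor() {
            public void onToken(String token) {
                tokens.add(token);
            }
        }).run();
        return tokens;
    }

    public static void main(String[] args) {
        // prints [have, a, nice, day]
        System.out.println(tokensOf("have a nice day"));
    }
}
```

Note that the lexer owns the loop: once run() is called, the visitor has no say in when, or whether, the next token arrives.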

For many people it seems, this push style of processing is not only unfamiliar, but downright peculiar. When I show this style of code to developers – especially junior developers and non-technical people – they find it hard to grasp. Not so much the logic in PhraseBuilder but what stumps many people is the “complexity” of the overall “pattern” and in particular the usage. For many it seems, this approach is all a bit “backwards”.

So for comparison, here’s an example of a more conventional pull mechanism, starting with the interface:

    public interface TokenStream {
        public String nextToken();
    }

This time we have an interface which we can call to get (pull) the next token rather than be notified (push) as in the previous example. The method nextToken() returns null to signify the end of the stream – no more tokens.

Next up, the phrase builder. Again, we’ll implement the interface – to allow chaining – but this time we’re relying on a pull rather than push mechanism to get tokens:

    import java.util.LinkedList;
    import java.util.Queue;

    public class PhraseBuilder implements TokenStream {
        // Phrases still being extended, and finished phrases awaiting a caller.
        private final Queue<StringBuilder> builders = new LinkedList<StringBuilder>();
        private final Queue<String> phrases = new LinkedList<String>();
        private final TokenStream input;
        private final int maximumPhraseSize;

        public PhraseBuilder(TokenStream input, int maximumPhraseSize) {
            assert input != null : "input can't be null";
            assert maximumPhraseSize > 0 : "maximumPhraseSize can't be < 1";
            this.input = input;
            this.maximumPhraseSize = maximumPhraseSize;
        }

        public String nextToken() {
            return this.hasNextToken() ? this.phrases.remove() : null;
        }

        // Refills the phrase queue from the input only when it has run dry.
        private boolean hasNextToken() {
            if (this.phrases.isEmpty()) {
                makePhrasesWithToken(this.input.nextToken());
            }
            return !this.phrases.isEmpty();
        }

        private void makePhrasesWithToken(String token) {
            if (token != null) {
                this.builders.add(new StringBuilder());

                for (StringBuilder builder : this.builders) {
                    if (builder.length() > 0) {
                        builder.append(' ');
                    }
                    builder.append(token);
                    this.phrases.add(builder.toString());
                }

                if (this.builders.size() == this.maximumPhraseSize) {
                    this.builders.remove();
                }
            }
        }
    }

Holy schmokes! That’s a whole lotta code with multiple queues and extra private methods. Surely it can’t be that complicated?

Ok, so how about some sample usage:

    PhraseBuilder builder = new PhraseBuilder(new Lexer(_stream_), 3);
    String token;
    while ((token = builder.nextToken()) != null) {
        System.out.println(token);
    }

Sheesh! That’s pretty simple: far simpler than the code in the first example, and pretty obvious what it’s doing really. When I show this kind of code to people, they tend to respond with “Oh, I see. That makes sense.”
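For completeness, a pull-style Lexer to sit underneath that loop might look something like this. Again it is only a sketch under the same assumptions as before (a String input split on whitespace); the countTokens() helper is made up for the demo and TokenStream is repeated so the code compiles on its own.

```java
import java.util.Scanner;

// Repeated from the post so the sketch compiles on its own.
interface TokenStream {
    String nextToken();
}

// Hypothetical pull-style Lexer: assumes a String input split on whitespace.
class Lexer implements TokenStream {
    // The scanner remembers the current position between calls.
    private final Scanner scanner;

    Lexer(String stream) {
        this.scanner = new Scanner(stream);
    }

    // Returns the next word, or null once the input is exhausted.
    public String nextToken() {
        return this.scanner.hasNext() ? this.scanner.next() : null;
    }
}

class PullLexerDemo {
    // Demo helper: drains a stream and counts what came out.
    static int countTokens(TokenStream stream) {
        int count = 0;
        while (stream.nextToken() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // prints 4
        System.out.println(countTokens(new Lexer("have a nice day")));
    }
}
```

Here the caller owns the loop and the lexer has to remember where it got up to, which in this sketch is delegated entirely to the Scanner.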

Considering the relative complexity of the second phrase builder to the first, I find this all somewhat odd: the original example took me about ten minutes to code up and test; the second probably around twenty – at first I tried to do it from scratch but I gave up in the end and resorted to a brute-force conversion approach not dissimilar to that required to convert a recursive algorithm to an iterative one.

In fact, push versus pull is very similar to recursive versus iterative: people tend to have the same comprehension difficulties with recursion as they do with a push-style calling mechanism. It’s a strange thing really because although conceptually simpler, an iterative approach can often be much harder to implement than a recursive one; similarly, a pull-mechanism can be harder to implement than a push-mechanism.

It should be obvious by now that I have a preference for recursion and push-style processing. For one thing it removes lots of getXxx() methods which is no doubt why I like closures in languages such as Smalltalk, Ruby, Groovy, JavaScript, etc. I also find it forces me to create lots of little classes that do one thing and do it well.

That said, I can also see why (and under what circumstances) pull is more attractive: it’s usually easier to manage flow-control than with push. Often the difference between the two comes down to where state is being maintained: In the case of pull, state is maintained inside the stream; for push, state is maintained inside the parser. No doubt why many people have switched from using push-parsing to pull-parsing for XML.
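The difference in where state lives can be sketched in a few lines. This is only an illustration, written in JavaScript rather than the Java used above; `pushTokens`, `PullTokens` and the whitespace tokenising are names and behaviour assumed for the example, not code from the phrase builder.

```javascript
// Push: the producer drives the loop and calls the consumer for each token.
// Anything the consumer wants to remember lives in the consumer (here: `pushed`).
function pushTokens(text, handler) {
  text.split(/\s+/).filter(Boolean).forEach(function (token) {
    handler(token);
  });
}

// Pull: the consumer drives the loop and asks for the next token.
// The current position is state held inside the stream itself.
function PullTokens(text) {
  this.tokens = text.split(/\s+/).filter(Boolean);
  this.position = 0;
}

PullTokens.prototype.nextToken = function () {
  return this.position < this.tokens.length
    ? this.tokens[this.position++]
    : null;
};

var pushed = [];
pushTokens("the quick brown fox", function (token) {
  pushed.push(token);
});

var pulled = [];
var stream = new PullTokens("the quick brown fox");
var token;
while ((token = stream.nextToken()) !== null) {
  pulled.push(token);
}
```

Both loops see the same tokens; what changes is who owns the loop, and with it who decides when to stop.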

Updated (1 September 2005)

Thanks to Kris for pointing out the typos in my code. I had copied the examples from and made some on-the-fly modifications to reduce their size but it seems I missed a few things – oops.

Godwin's Law of Java

While reading A case against Annotations, I couldn’t help but laugh-out-loud at this particular response:

RoR is quickly becoming the Godwin’s Law of Java language related discussions: “As an online Java discussion grows longer, the probability of a comparison involving Ruby or RoR approaches 1 (i.e. certainty).” – Marc Stock

Some others off the top of my head:

  • Any editor and Emacs;
  • Any programming language and Smalltalk;
  • Windows and Linux;
  • …?

Having never heard of the Law before, I did a bit of reading and found, among other things, an FAQ and a paper by Mike Godwin himself.

PGP for Mac Mail

If you’ve ever needed (perhaps need is too strong a word, how about wanted) to digitally sign – or encrypt for that matter – your emails from within the Mac Mail client, it’s pretty simple. Even though there are plenty of mail applications that support PGP, I’ve grown fond of Mail.app so this morning – for no real good reason – I installed a few plugins, etc. and was up and running in literally 5 mins (ok 10 mins – I didn’t have a clue what kind of signature I should use).

First go and get GPGMail. The download is a .dmg file and the instructions on the web site are easy to follow. In addition to GPGMail, you’ll also need GnuPG for the Mac (GPGMac and GPG Keychain Access as a minimum).

I also found I needed to go into the Preferences>PGP>Composing and switch on “By default, use OpenPGP/MIME”. This allows signed messages to be sent using MIME rather than the old-school format which surrounds the text in the signature and makes it look as though your email was sent through some kind of full-on nerdifier – a bit scary for mum.

Now that I have the ability to sign emails I’ll just, well, probably never use it really – quite frankly, I can’t imagine anyone being bothered to impersonate me in an email – but at least I feel “safer” LOL.

So anyway, here is my PGP key (valid until August 22, 2006) for anyone who cares:

    Key ID: 0xD56C95B07EA87B26
    Key Type: RSA
    Expires: 2006-08-23
    Key Size: 2048
    Fingerprint: AF94 4D3E F229 1A40 4F79 17DD D56C 95B0 7EA8 7B26
    UserID: Simon Harris <haruki_zaemon@mac.com>

    -----BEGIN PGP PUBLIC KEY BLOCK-----
    Version: GnuPG v1.4.1 (Darwin)

    mQELBEMLpCoBCAD1qKqEmXhnbg8unu7n3wL8d8qyqu9M7iIhLMn6ZIxHPe91vXCi3NxCu9J5p+nenKcwyPV4i/TA4G6lr6hGAv6yOkCKXg/kMX/oPoqVALBnzz+NNXXOv9xZ1/DnQuyMXj/JP3u/rMrGErO5BEq+RtxQ3BwdbF8TsEktaaIrPPG/ZRWlSSCnFhTbS4F+J8bNNgmB2M8AYbk5F2FJc68ikSDDKz28F/pF1Xal+O/s3nVtXgQkf6o5sV/YnIGogDP419XQ8C9tb4dqoe8oR8v25g6umNEDcUMtOS3Utyds2q7mfdlczjAuYpixCLk8QMbEfsaQqGJU7b7YjK9iaVJqS/Z3AAYptC1TaW1vbiBIYXJyaXMgPHNpbW9uQHJlZGhpbGxjb25zdWx0aW5nLmNvbS5hdT6JATcEEwECACEFAkMLpCoFCQHf4gAGCwkIBwMCAxUCAwMWAgECHgECF4AACgkQ1WyVsH6oeyZa4gf9Fus1SDOwBYG6RLiQomXWhfHibGZnrssw9ECemI6I81kgKC6rd+srxbiKit09TIMIUzZ/oecNVtxg80rgbYsOT4EGniq/As5c6xfYNcxwgGW00Xf6txvMGCRzkierHWlE0KajOW94AnuAtzHC9vsPVxTjt4dM08IHWS9VeuqGq8ULokcHh9uF4e24s/maJFUGikYm2dACKS8vNPImFHUcV17pTm4gptag4bm1+KmFWS1wnUS/I1jfmUc+xJOrXFRadkEbiYJEi1aaPyvgyvXzupia6RDyMvPfUILZLO1L4be/KGf6R6jdhE4T+9U4dNHbGnukIqkIQLRxgNqhtklgkA==
    =2DOr
    -----END PGP PUBLIC KEY BLOCK-----

Of course it occurs to me that someone could also spoof my blog and change the public key here but again, I’m thinking people have better things to waste their valuable time on like say, RAD for example ;-).

Scraps of JavaScript

Not much today but some little bits-and-pieces of stuff I’ve picked up over the last two weeks. It’s been a steep learning curve going from no JavaScript to writing a character-based terminal emulator and it’s sure been fun.

Now that I have a modicum of JavaScript under my belt, I think I’ll finally take Big Daz’ advice and have another look at prototype. I had a quick look initially – on his recommendation – but I was so new to the language that none of it made much sense. FWIW, thanks again to Big Daz, I also spent a lot of time reading quirksmode.org.

Overall, DHTML works really well. The browsers seem to handle running JavaScript pretty well – the performance is quite impressive – and it’s not that difficult to get things to work cross-browser.

So, here we go…

Rather than report an error, most browsers seem to silently fail or at best give a rather less than helpful message – either by way of a pop-up or a message to the JavaScript console.

The error messages in Mozilla – sent to the JavaScript Console – are far more useful than those generated by Safari – also sent to the JavaScript Console; MSIE is woeful when reporting (by way of a pop-up) errors in JavaScript files that have been included via <script language="javascript" src="..." type="text/javascript" />.

The debugger for Mozilla works a treat.

Methods can’t be named the same as fields – they’re really just the same thing anyway. Not really a problem but I was translating some code to JavaScript and it didn’t work out as I had planned ;-). Either use an underscore (_) for field names; make sure your method names are always prefixed with a verb such as get/is/etc.; or “allow” direct access to fields. I say “allow” because strictly speaking, it seems that field values are pretty much always accessible anyway.
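A small sketch of the clash, and of the underscore-plus-verb workaround, might look like the following. The `Account` type and its members are made up for the illustration.

```javascript
// Hypothetical Account type, used only to show the name clash.
function Account(balance) {
  // Assigning a field called `balance` on the instance...
  this.balance = balance;
  this._limit = 100;
}

// ...shadows this method of the same name defined on the prototype.
Account.prototype.balance = function () {
  return 0;
};

// Workaround from the text: underscore for the field, verb prefix for the method.
Account.prototype.getLimit = function () {
  return this._limit;
};

var account = new Account(20);
var balanceKind = typeof account.balance; // the field wins, so this is "number"
var limit = account.getLimit();
```

Because fields and methods share one namespace on the object, `account.balance()` would fail here: the name resolves to the number, not the function.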

Closures usually require that you define a variable with a value of this to ensure you can always refer back to the object that owns the function being called:

    var self = this;
    orders.each(function(order) {
        self.process(order);
    });
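For anyone who wants to try the pattern in isolation, here is a self-contained sketch. `OrderProcessor` is a hypothetical type, and the standard `forEach` stands in for the `each()` call used above.

```javascript
// Hypothetical processor that records every order it is asked to process.
function OrderProcessor() {
  this.processed = [];
}

OrderProcessor.prototype.process = function (order) {
  this.processed.push(order);
};

OrderProcessor.prototype.processAll = function (orders) {
  var self = this; // capture the owning object
  orders.forEach(function (order) {
    // Inside this plain function `this` is not the OrderProcessor,
    // so we go through the captured reference instead.
    self.process(order);
  });
};

var processor = new OrderProcessor();
processor.processAll(["a", "b"]);
```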

To ensure your onkeypress event handler is called with an event object, use something like the following to capture the event and then delegate:

    var self = this;
    document.onkeypress = function(event) {
        return self.onkeypress(event ? event : window.event);
    };

To have a keystroke ignored seems to require the following code in your onkeypress event:

    event.cancelBubble = true;
    event.returnValue = false;
    return false;

This works for most everything with the notable exception of F1 in MSIE which displays help on the browser. To prevent this, try:

    document.onhelp = function() {
        return false;
    };

The Mac generates very odd key codes for things such as Up (63232), Down (63233), Left (63234), Right (63235), etc. I say odd only because I’m used to the ones generated on PCs (38, 40, 37, 39, …). Ok, so maybe they’re not odd just different ;-)
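One way to cope with the two sets of values is a small lookup table that folds the Mac key codes into the PC ones before the rest of the handler sees them. This is just a sketch: the table only contains the four arrow keys listed above, and `normalizeKeyCode` is a name invented for the example.

```javascript
// Codes reported in the text: Mac value -> PC value.
var MAC_TO_PC_KEY_CODES = {
  63232: 38, // Up
  63233: 40, // Down
  63234: 37, // Left
  63235: 39  // Right
};

// Hypothetical helper: returns the PC-style code when a mapping is known,
// otherwise returns the code unchanged.
function normalizeKeyCode(keyCode) {
  return MAC_TO_PC_KEY_CODES.hasOwnProperty(keyCode)
    ? MAC_TO_PC_KEY_CODES[keyCode]
    : keyCode;
}
```

A handler can then switch on `normalizeKeyCode(event.keyCode)` and only deal with one set of numbers.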

MSIE seems only to allow you to modify the content (DHTML) of a div.

Even though the HTTP protocol allows you to send and receive binary data – using Content-Type: application/octet-stream and Content-Transfer-Encoding: binary for example – none of the browsers I tested would reliably allow the JavaScript code to receive that data as a string of characters, even though the browser would quite happily download the content to a file on my hard-disk and allow me to manually construct a string with identical content – using String.fromCharCode(0x1b) for example.

You can simulate Swing’s invokeLater by using window.setTimeout() with a time-out value of zero:

    var self = this;
    window.setTimeout(function() {
        self.doSomething(...);
    }, 0);
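The effect is easiest to see by recording the order in which things happen. In this sketch `invokeLater` is a hypothetical wrapper, and it uses the global `setTimeout` rather than `window.setTimeout` so that it also runs outside a browser.

```javascript
var order = [];

// Hypothetical helper mirroring invokeLater: queue the work, return at once.
function invokeLater(work) {
  setTimeout(work, 0);
}

invokeLater(function () {
  order.push("deferred");
});
order.push("immediate");

// At this point only "immediate" has been recorded; the deferred function
// runs once the currently executing script has finished.
var orderSoFar = order.slice();
```

So even with a time-out of zero, the queued function never interrupts the code that scheduled it.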

Most of the browsers I tested didn’t seem to support for .. in ..; they all accepted the syntax but produced kooky results when used.
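I didn’t note down exactly what the kooky results were, so take the following only as one possible explanation: for .. in .. walks property names rather than array values, and it also picks up any enumerable properties that a library has added to a prototype. The `first` method below is invented purely to simulate such a library.

```javascript
var letters = ["a", "b", "c"];

// Simulate a library adding an enumerable method to every array.
// `first` is a made-up name for this illustration.
Array.prototype.first = function () {
  return this[0];
};

var seenWithForIn = [];
for (var key in letters) {
  seenWithForIn.push(key); // keys are strings, and "first" shows up too
}

var seenWithIndex = [];
for (var i = 0; i < letters.length; i++) {
  seenWithIndex.push(letters[i]); // a plain index loop sees only the values
}

delete Array.prototype.first; // tidy up after the illustration
```

If that is what was going on, a plain index loop (or a `hasOwnProperty` check inside the for .. in ..) would avoid the surprise.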

All browsers I tested support using innerHTML to replace the content:

    document.getElementById(id).innerHTML = html;

Using a span with CSS classes is the simplest way to inline style changes:

    <span class="important">...</span>

Handling errors (and for that matter state changes) when using XMLHttpRequest (or in the case of MSIE, ActiveXObject("Microsoft.XMLHTTP")) differs between browsers:

  • Safari and MSIE seem to always set request.status and request.statusText;
  • Netscape/Mozilla seem to sometimes set these variables, yet other times throw exceptions due to the variable having not been defined;
  • Most will allow any old value for request method and URL and notify you via onreadystatechange if there was an error – such as 404 Not Found for example – though sometimes (under what circumstances I don’t recall) they will throw an exception on open() and sometimes on send().
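Given those differences, one defensive option is to wrap every read of the status fields in a try/catch. The sketch below assumes nothing about a real browser: `readStatus` is a hypothetical helper, the fallback values (0 and "unavailable") are arbitrary choices, and the two request objects are stand-ins so the code runs anywhere.

```javascript
// Hypothetical helper: read the status fields without letting a browser
// that throws on access break the caller.
function readStatus(request) {
  try {
    return { status: request.status, statusText: request.statusText };
  } catch (e) {
    // Fallback values are an assumption made for this sketch.
    return { status: 0, statusText: "unavailable" };
  }
}

// Stand-ins for real request objects.
var wellBehaved = { status: 404, statusText: "Not Found" };
var throwing = {};
Object.defineProperty(throwing, "status", {
  get: function () {
    throw new Error("status not available");
  }
});

var ok = readStatus(wellBehaved);
var fallback = readStatus(throwing);
```

The same wrapping would apply to the calls to open() and send(), since either may throw depending on the browser.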

Both Netscape/Mozilla and MSIE append a CRLF (0x0d0a) to the end of any content you send, leaving the Content-Length field two-bytes short; Safari seems to leave the content as-is. Not really a problem but interesting as the data already had the CRLF as usually recommended for sending content via HTTP.

To change the colour of a horizontal-rule (<hr class="a_style" />) in a browser-neutral manner, you need to set your CSS style as:

    hr.a_style {
        background-color: #NNNNNN;
        color: #MMMMMM;
        border: 0;
        height: 1px;
    }

You can call a method using a string for the name, allowing a switch-like calling mechanism:

    var methodName = (this.insertMode) ? "insert" : "overwrite";
    this[methodName](aCharacter);
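Here is the same idea as a runnable sketch. `LineEditor`, its fields and the bodies of `insert` and `overwrite` are all assumptions made for the example; only the lookup-by-name line comes from the snippet above.

```javascript
// Hypothetical single-line editor with an insert/overwrite toggle.
function LineEditor() {
  this.text = "abc";
  this.cursor = 1;
  this.insertMode = true;
}

LineEditor.prototype.insert = function (aCharacter) {
  this.text = this.text.slice(0, this.cursor) + aCharacter + this.text.slice(this.cursor);
  this.cursor++;
};

LineEditor.prototype.overwrite = function (aCharacter) {
  this.text = this.text.slice(0, this.cursor) + aCharacter + this.text.slice(this.cursor + 1);
  this.cursor++;
};

LineEditor.prototype.type = function (aCharacter) {
  var methodName = this.insertMode ? "insert" : "overwrite";
  this[methodName](aCharacter); // look the method up by name, then call it
};

var editor = new LineEditor();
editor.type("X");          // insert mode: text becomes "aXbc"
editor.insertMode = false;
editor.type("Y");          // overwrite mode: replaces the "b"
```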

More to come I’m sure. Add any more you can think of or let me know of better ways to do these things as I’m truly ignorant in this space.

Invasion of the Battery-Life Snatchers

Over the past 6-9 months, I’ve been doing a lot of development using my PowerBook running on battery. For the most part it works really well: The performance is just fine (with some tweaking of the power-saving settings), giving me around 2 hours of editing, browsing, emailing, etc. That is unless I’m developing code using IntelliJ.

When running in the foreground, IntelliJ seems to use anywhere between 3.0% and 6.0% of the CPU. Not too bad you might think but that is when I’m not doing anything with it. Even when IntelliJ is in the background – either hidden or minimized – it still uses around 3.0% of the CPU as this screen-shot from top clearly shows:

         PID COMMAND      %CPU   TIME   #TH #PRTS #MREGS RPRVT  RSHRD  RSIZE  VSIZE
        1259 top         14.5%  0:29.70   1    18    22   776K   372K  2.62M  26.9M
    -->  670 idea         2.7% 11:11.22  26   >>>   485   270M  28.3M   231M   831M <--
        1248 Safari       0.2%  0:06.22   6   127   238  8.89M  27.9M  18.8M   243M
        1247 SyncServer   0.0%  0:03.65   2    53    48  12.2M  3.36M  15.1M  47.3M
        1225 mdimport     0.0%  0:01.11   4    66    67  1.39M  3.27M  5.16M  39.7M
         668 bash         0.0%  0:00.02   1    14    17   220K   820K   904K  27.1M

A quick check of the Java version indicates that I am running JDK 1.5 by default:

    simon$ java -version
    java version "1.5.0_02"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_02-56)
    Java HotSpot(TM) Client VM (build 1.5.0_02-36, mixed mode, sharing)

And a quick check of the About Box in IntelliJ confirms this.

Now admittedly I haven’t tried any other Java applications so I’m not too sure if it’s a Java issue, a Java on OS X issue, an IntelliJ issue or what but it is driving me nuts because it pretty much turns my 2+ hours of battery life into 30-45 minutes.

So if anyone has any idea how I might get IntelliJ to stop using CPU when it’s idle, please, please, please let me know.

Oh and Jeaves, “use Eclipse” and “get a real computer” will not be considered helpful answers ;-)

Update (30 August 2005)

Vote for the bug.

Getting Soft or Getting Smarter?

Sixteen or so years have passed since I started my first job – a programming gig writing language interpreters and compilers – even though I didn’t really know how to program. On day one I was given an Intel 80386 machine code instruction manual, a System/370 machine code instruction manual, a desktop PC, a copy of Microsoft assembler, access to a mainframe via a 3270 terminal emulator and told to pretty-much work it out myself. And that’s pretty-much what I did.

Sixteen years later, I’m doing some work for the same company. And while a lot of the industry has moved on (or was that gone ‘round in circles?) they’re still developing the same software all written in assembler and their proprietary development languages. They do great business selling software for 7-digit figures whilst competing with the likes of IBM, CA, etc. The proprietary language they use – all written in assembler remember – runs on Windows, Linux, OS/2 and z/OS and has many of the features that “modern” languages have: dynamic dispatch, loose typing, etc. Not bad for a company with three people, doing it all The Wrong Way™.

My job has been two-fold: write some DHTML and JavaScript communicating back and forth with a web server written in the proprietary language; and add some new features to said language. And, all-in-all it’s been pretty good fun. The DHTML and JavaScript stuff is easy peasy (like we need an acronym for this stuff, sheesh!) and getting back into assembler programming after all these years of Java has been downright good geeky fun. That is of course until something goes wrong.

JavaScript debugging is still a bit lame, even with the Mozilla plugin but that’s easy enough to fix with a few carefully placed calls to alert(). The biggest issue with JavaScript unfortunately is the difference in browser behaviour, especially with respect to keyboard events and XMLHttpRequest. No matter though, we overcame those issues pretty easily and moved on to other things: adding new features to the proprietary language.

It’s probably been 10+ years since I did any assembler programming and I feel it; I’ve become soft. I make my changes and run the application. It touches a bit of memory it shouldn’t and BOOM, memory violation. Right. Register dump. Ick! I remember those. Urgh. Start up the debugger – gdb – and let’s try that again. BOOM. This time though we’re inside gdb so I can start poking around. Right. Where are we? Hmm…let’s look at where eip points – Linux on Pentium hardware. Ok, where is that relative to the load-point for the module. Ok, <snip>

My forensic skills have become soft. I’ve become too accustomed to exception handling, stack traces, automatic buffer overrun detection, garbage collection, no pointer arithmetic, unlimited numbers of variables, no little- vs. big-endian issues, etc. No peeking into memory to see what the processor stack might look like anymore. Instead, just look at the line numbers and there’s most of what you need to know already in front of you.

I’m not complaining, mind you – I like not needing to think so hard about debugging – but it is interesting to see how my skills have changed over the years and how my current development ideas (and ideology?) have been shaped (for better or for worse) by having a knowledge of the underlying execution architecture. Forget 80x86 or System/370 processors, these days Java, .Net (and no doubt countless others) are built on _virtual machines_ with their own instruction sets, stacks, etc. How many developers actually understand the workings of the underlying VM? How much does a developer gain (or lose) by having this understanding?

Update (23rd August 2005)

Just to show you how soft I’m getting, I changed the design to one that didn’t involve me needing to use gdb LOL.

Prevent Mac Droppings on Network Drives

In my current gig, I’m using my PowerBook in an all M$ Windoze (with the exception of Linux servers) environment and it’s working a treat. Except for one thing: all those .DS_Store files.

Finder in Mac OS X (and probably previous versions too I imagine) creates .DS_Store files whenever you browse a directory. The file seems pretty harmless - apparently it contains little more than window preferences, etc. - and Finder hides them from view.

Unfortunately, it does get the back up of some of the other developers. Not really because the files exist, but more because, for some reason, the files get created with a timestamp in the future which causes all manner of problems for the guys writing their MFC applications - with asserts turned on the applications barf all over the place.

So, I did a quick hunt on Google and found this article that explains how to prevent the behaviour. It’s a pretty simple fix and involves entering the following at the command-line (possibly followed by a re-boot?):

> defaults write com.apple.desktopservices DSDontWriteNetworkStores true

Now if someone could only tell me why I seem to end up with all these ._XXX files lying around that don’t appear in Finder, nor when running ls -la but do end up in my zip and tar balls.

You See I Am Trying To Be Objective About It.

It was an unsettling feeling, one I was sure I had experienced before.

Seven days ago today, I boarded the good-ship Xcode bound for the land of Cocoa. I was promised a bumpy but nevertheless enjoyable ride over well charted sea, enjoyable smalltalk and no bugs. Life would be wonderful and applications would spew forth from the finger tips of those who chose to take the journey.

Seven days later, I’m in pain. I’m hurting. I’m surrounded by square brackets. I have a hard time keeping track of my memory (maybe it’s Alzheimer’s?). I wish I knew what was wrong with me but even the good Dr. GCC seems to have a heck of a time diagnosing my condition. I have a sneaking suspicion it’s a form of jet-lag - I feel as though I’ve travelled back in time (about 15 years to be precise) but I’m not sure.

Humour (such that it is) aside, I’ve been getting into Objective-C the past week as there’s a project I’m about to start working on that is targeted at Mac OS X and will be written in said programming language. So, I thought I’d go head first and use Xcode as well. Ouch, ouch, hurty, hurty. Why do I feel like someone just cut off my arms and stuck red hot pokers in my eyes? To be honest, I think I should have started off with TextMate and GCC. But nevertheless I’m persevering.

I do like the Smalltalk-like features such as dynamic binding and the calling syntax is quite similar too. I don’t care much for the use of square brackets - though I’m not sure I could come up with anything much better for a language that is (or at least was) essentially a pre-processor for a C compiler. The fact that I have to use pointers all over the place is particularly irritating too.

Ahh yes, I was once a member of the K&R club: First C; then C++. All those calls to malloc & free, new and delete. Endless hours of trying to remember what happens when I use array indexing on something declared as int **. In the end though, I guess it comes down to what you’re used to and for the last 6 years (I think?) that’s been Java; a language that for all its short-comings (of which there are plenty), is a pretty simple language to learn and understand - at least it was for me anyway. I’ve become accustomed to garbage collection, packages and object references (the last being syntactic sugar mostly) and I like them. And of course then there’s IntelliJ and Eclipse to help me out, suggest class names, think for me.

So it came as rather a shock to the system to go back to an IDE that seems to require soooo much configuration for apparently soooo little benefit, a language that forced me to think about when to call release to cleanup unused objects, and an overall environment that let me compile and run something that, on subsequent inspection, didn’t even seem to be valid code - more likely a “feature” of Xcode than any underlying problem with GCC or the language itself.

Of course many of you will argue that I’ve forgotten what it’s like to be a Real Programmer™; that it’s character building. To this I say phooey. I was once an assembler programmer and I loved it. Then I became a C/C++ developer and I loved it. Then I moved to Java and never looked back (well maybe once but nobody saw me do it so you can’t prove a thing). Moving to Objective-C is thus a painful yet disturbingly rewarding experience; like pulling out the old Commodore-64 and playing Galaga: The graphics suck but they’re cute and the game is nevertheless fun to play.

So, the things I like about Objective-C so far:

  • Dynamic dispatch - i.e. run-time method binding;
  • Categories - adding methods to existing classes at run-time;
  • Smalltalk-like calling syntax; and
  • self - I don’t know why but it’s somehow more appealing (or just different) than this.

Things I dislike about Objective-C so far:

  • Manual (yes as far as I’m concerned even reference counting is manual) memory management;
  • Square brackets;
  • + and - for marking static and instance methods respectively;
  • @whatever - smacks of pre-processor hackedy hack hack hack; and
  • Needing to manually call the super “constructor” - it’s not really a constructor so, yes, I know why it’s necessary but it still sucks.

But in the end, who can resist the sexy look-and-feel of Mac OS X apps? Not I. And I want to try my hand at creating some so what better way to do so than Objective-C/Cocoa. I’m assured that once I get into using the Cocoa stuff, life gets a lot more fun and interesting, fingers crossed. Or perhaps I’m just too old and grumpy ;-)

If anyone has tips, links, experience they’d like to share on how to make this a more pleasant journey (no, shut-up and take it like a man doesn’t count), please please please let me know.

Not Writing Myself Off Just Yet

Finally, after some travelling, I’m back in Oz having spent some time in Ladakh (Northern India), France, U.K. and Canada, all the while trying to write my first technical book.

The book (yet to actually be published mind you) was handed to me by Jon Eaves, no doubt meant as a cruel joke or perhaps punishment for some as-yet unspecified crime I must have committed ;-)

It has been a bit disheartening at times. For a start, I knew that I knew nothing about writing before I started but I now know how little that actually was. I take some comfort in the knowledge that as a complete newbie to the whole writing game, I wasn’t supposed to have known anything anyway. Having never completed a degree (I left school at age 17 to start working) and thus never having written much if anything except for my occasional blog entries (and whatever is necessary for work), my capacity to fill pages with words was (and probably still is) severely limited.

Then of course there is the tragic development process that is book publishing. For a start, you have no idea what they actually want so you do your best and start handing in chapters. As the weeks go by and you hear nothing in the way of feedback, you start to worry so, inevitably you spend more time on each chapter trying to get it “right”. Eventually the chapters are returned to you. Some are great, some need a bit of work and others are unrecognisable, let alone unintelligible to even me, the original author! I’m not sure how to resolve this problem though as everyone I speak to complains about the same thing.

I should also never have travelled while writing my first book: I discovered that I’m a single-threading processor, just as the stereotype for my gender suggests. Though to be fair, I’m not sure this really contributed as much as simply Not Having a Clue™.

I learned that standing up in front of people and giving a talk, a presentation or even teaching a class, pales into insignificance when compared to writing a book (at least for me anyway). Get my gums a flappin’ and there’s no stopping me; put me in front of a computer to type out a chapter and ………………………………. Exactly! Nothing. Nada. Bupkis. Zero. Zilch. Nanimo. I found myself having to talk out loud to an imaginary person just to get my brain into gear.

Then of course there is the subject matter: Algorithms. A topic covered by so many books that surely only a mad-man would attempt another. This one is different though. It attempts to cover each algorithm at a conceptual level in a way that most people should understand. The code itself is unit-tested and the implementations are not merely a re-hashing of C-code just to get the word Java on the cover. So in that sense I’m quite proud of the effort - I’m yet to see the final product.

On another positive note, I think my coding has improved. I’d like to say that I experienced some epiphany and suddenly became a better coder but alas, the motivation was far more down-to-earth: I’m lazy. Having to explain code in words can be difficult and at times a little, shall we say, uninteresting so, the smaller and simpler I can make the code, the easier it is to explain. The down-side is that my coding style changed about 2/3 of the way through writing the chapters. Oh well.

Fortunately I also had James (another book-writing newbie) helping me out. Unfortunately we ended up on different sides of the planet. Fortunately, against all advice to the contrary, our friendship survived with flying colours; if anything it’s made me realise how much better it is to regularly pair on things.

And lastly, I think I’d like to write another book though my approach would be very different. We’ll see if anyone is willing to risk it a second time. In the meantime, it’s back to my greatly-missed training schedule, get a haircut, and get a real job.

Dis-appointment

I do quite a bit of travel and I’ve always wished that travel agents would send my itinerary as an electronic invitation but they never do. In fact none of my customers do either. Actually, come to think of it, pretty much no one does. So why don’t more people use iCalendar and vCalendar when sending out meeting or appointment invitations?

At one point in my career I was forced - against my will - to use Lotus Notes and yet one thing we all did was send each other meeting invitations. However, rarely if ever have I seen them sent outside of an organisation. This seems rather strange to me as almost every email application I can think of supports one or other of the formats: Mac Mail; Outlook; Thunderbird; Eudora; Evolution; even Lotus Notes!

For example, Mail+iCal on my PowerBook allow me to send and receive invitations. My calendar is automatically updated when an invitation arrives and updates are automatically sent out when I change one that I’ve initiated.

Have I missed something or is there some reason I have to manually create and update all my appointments?