haruki zaemon

The Sound Of One Man Snapping

Nothing like waking up after a night of disturbing dreams of zombies drinking bottles of warmed-up coca-cola. It’s official. Last night’s blog entry was me losing the plot. It’s happened twice now in the last couple of weeks and is a sign of me becoming someone I despise. It’s an indication that my ability to cope with being asked to be responsible for things over which I have little or no control is non-existent. In all my years of software development, I’ve honestly never felt this way. It’s certainly not in my nature. I woke up this morning wishing I could have the last week all over again.

So to all those who were offended by it, I unreservedly apologise. I have deleted the entry and will make sure I go and have a beer instead next time.

FWIW, I make mistakes. Everyone makes mistakes. Almost everything I’ve ever blogged about I’ve done at some stage as well. That’s why I write about it. I grew up with the belief that it’s not a person that is wrong/stupid/whatever but the things they do. This is why I try so hard to document here all the stupid things I have done in the hope that others won’t repeat them.

Mac OS X House Keeping

Having been a linux weenie for a few years now, I had become accustomed to running various housekeeping jobs on a regular basis and I wanted to do the same thing on my new PowerBook.

In particular, I use locate for quickly finding files, which, to be of any use, requires the indexer (updatedb) to be run periodically. A quick grep through the man pages and I discovered the OS X version was /usr/libexec/locate.updatedb so the next step was to get it to run as a batch job.

Whilst searching for the appropriate place to put my daily system cron jobs (/etc/daily.local), I ran across this little gem in /etc/daily: # Clean up NFS turds. May be useful on NFS servers.

Don't Panic!

Apparently, “after hours” batch jobs don’t require load testing. Yes, you heard it. Supposedly jobs that run when no users are logged in are pretty much free to do whatever they like, all 57 of them! Is it some weird side effect of Heisenberg’s Uncertainty Principle that I’ve never heard of, whereby it’s possible either to be using the system interactively or to have limitless computing power, but not both at the same time? Who knows, but excuse me for suggesting otherwise.

You’ll also be relieved to learn that there is no need to load test your applications together on the same box even if they will be co-located in production because we can extrapolate from the results obtained by running a single application stand-alone. That’ll be a good cost saving I’m sure.

Oh and as for including the generation and downloading of PDF documents, bah! That does nothing more than test all that pesky “network bandwidth stuff”. There’s nothing we can do about that anyway so why bother testing it right?

Phew! That’s a load off my mind (no pun intended). I had thought that we might end up doubling the load on the production box but it seems I was somewhat misguided. Glad they’ve got it all sorted out. With less than 6 weeks till go-live and the application only just now limping into System Test, I was beginning to worry. Silly me. What was I thinking?

Now where did I put my double pair of Joo Janta 200 Super-Chromatic Peril Sensitive Sunglasses? I’m sure they’re around here somewhere…

Speculative Optimisation

or pre-factoring as Dave likes to call it, is a common practice. It’s an easy trap to fall into. Take a look at any piece of code and I’m sure you will see a way to make it run faster. The problem is that performance bottlenecks are almost never where you would expect them to be. Sure, we might be able to double the speed of a piece of code but if it only accounts for <1% of the overall running time, then it doesn’t really matter. Just recently I had someone recommend that we add in some caching of database results because “It will be a performance problem.” The question I had was: when compared with what?

Performance optimisation often (but not always) involves obfuscating the code in some way to achieve the desired performance. Maybe we need to inline some code or unroll a loop here or there. Whatever it is, it can lead to code that is hard to read and hard to understand and, as we have discussed before, therefore hard to maintain. Ironically, our so-called optimisations can potentially lead to worse performance. If the algorithm is difficult to understand or the code simply hard to follow, we might actually introduce unnecessary overhead without even realising it. And if we have no baseline, no benchmark with which to compare our results, we will never know whether we are improving or degrading the performance.

For this we need a profiler. There are plenty around, some free and some you’d have to sell the kids to afford. Quest have a free version of JProbe for use with Linux and Windows that James and I have been using to profile Drools. It’s missing some features but certainly nothing we can’t live without (how many negatives can a man use in one sentence?). There really is no magic involved. Run it, see where the biggest slice of the pie is and start there. Keep doing that until you’ve knocked off all the big ticket items. Chances are that’ll get you most of the way. Anything beyond that probably requires a fundamental shift in the design. But hopefully, because you have a clean design, that shouldn’t be too much of a problem ;-)

Interestingly, one of the simplest things you can do with your design is to make things as close to immutable as possible. So, for example, rather than have lots of JavaBeans with setters, use constructors. Mark your fields final. Not because that in itself is a performance enhancement (although it may be) but to ensure that the state of your objects is as stable as possible. It also makes it much easier to find out who’s messing with the state. To achieve this, you may find you need to decompose those monolithic classes into smaller ones. I’ve found it helpful to introduce Builders to accumulate state before constructing your objects. You can think of mutable objects as having many moving parts and the more moving parts a system has, the harder it is to work out what’s happening and the harder it will be to re-factor when you finally perform your profiling.
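
By way of illustration, here’s a minimal sketch of the idea (the class and its fields are invented for the example):

```java
// A minimal sketch of an immutable object built via a Builder.
// The Order class and its fields are invented for illustration.
public final class Order {

    private final String customerId;
    private final int quantity;

    private Order(Builder builder) {
        this.customerId = builder.customerId;
        this.quantity = builder.quantity;
    }

    public String getCustomerId() {
        return customerId;
    }

    public int getQuantity() {
        return quantity;
    }

    // The Builder is the only mutable part: it accumulates state,
    // then hands over a fully-formed, immutable Order.
    public static class Builder {
        private String customerId;
        private int quantity;

        public Builder customerId(String customerId) {
            this.customerId = customerId;
            return this;
        }

        public Builder quantity(int quantity) {
            this.quantity = quantity;
            return this;
        }

        public Order build() {
            return new Order(this);
        }
    }
}
```

Once constructed, an Order can be handed around freely; there’s no setter to pull the rug out from under you half way through a calculation.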

Experience has taught me over and over again that correct code is much easier to optimise than clever code. This is why I’m a firm believer in Make It Work, Make It Right, Then Make It Fast.

Occam Need Not Apply

Wanted: Software developers for long-term, large-scale enterprise application project. Complex solutions to complex problems. Ability to justify largely redundant framework development to senior management a must.

Why is it that when left to their own devices, and given more than one way to implement something, developers will almost certainly undertake the most complicated?

Death To Blog Spam Arrgghhh

I’ve been using MT-Blacklist for some time now and while it does a good job of moderating the spam, I’d rather it didn’t even get that far. So in a last ditch effort to eradicate comment spam altogether, I’ve just installed a different kind of solution. This plugin puts up a security code graphic that you must enter in order to submit the comment. Although there have been some complaints about this technique on the grounds that it is discriminatory towards people with impaired vision, I’m going to give it a whirl anyway and see how it goes. Apparently the guy who wrote the plugin has also recently written a bayesian filter but personally, as with MT-Blacklist, I don’t have the time to sift through all the comments, deleting the spam.

UPDATE 1: Seems to be working a treat. I’ve had not one blog spam comment in the last 24 hours but people have successfully commented manually. I usually get around 6-10 spam comments in the same period.

UPDATE 2: It’s amusing to look at my web logs and see all the access attempts from dodgy sites, no doubt attempting to post comment spam and failing dismally!

When Corporates Embrace Open Source

It is common for organisations to justify the use of popular Open Source Frameworks on the basis that developers with these skills are easy to come by. In addition, because the source code is readily accessible, it’s easy to make bug fixes and patches whenever needed. This is clearly justification enough that no analysis need be performed in order to ascertain if said framework actually fits the technical requirements of the application.

The next step always seems to be to download the source code and check it into a local repository. Then, have a core group of developers maintain it internally. This team will be responsible for checking out the source code, building it and distributing it to all the other teams ensuring that changes are controlled and all teams keep up to date with the correct version.

After using the framework for a few months, it becomes obvious that the way the code was originally written is either: broken; wrong; or doesn’t quite fit with The Way We Do Projects Here ™. This then requires massive changes to “simplify” the design and add enhancements wherever “necessary” - like masking all those pesky exceptions that get thrown and instead returning null.

Of course now that so many changes have been made, and coupled with the requirement that all projects be uniform in quality, it becomes necessary to ensure that project teams cannot and will not use the version(s) available from the original project site but instead are forced to use the highly tailored internal version. In fact it’s probably a good idea to make the framework a “black box”. I mean, why would the non-core developers need or want access to the source code? The core team are providing a service after all and that is all that’s important, so access to the internal repository must be on an as-needs basis.

And finally, after 12 months of development and hard work, it is customary to allow The Architect who made ALL the proprietary changes (to the supposedly open framework) to go on 4 weeks holiday, just prior to delivery to System Test, leaving the project team to fend for themselves so that when a bug is found, the only solution is to fork the code (again) and check it in to the project repository on the proviso that the changes make their way back into the core ASAP.

Why Type When I Can Skype

Throw out Yahoo! Messenger; if you’re not using Skype, I’m no longer your “buddy” :P. I’ve tried voice chat before but nothing even close to as good as this. I can’t believe I’ve never heard (pardon the pun) of it before.

I plugged my headphones in, “called up” a friend and started speaking at my laptop (there’s a mic there somewhere though I’ve no idea where). The sound quality is astonishingly good. My friend might as well have been sitting next to me.

So I started calling up everyone I could. My brother travels a lot for work and has two kids and I figured it would be really useful for him. “brb (be right back)” he says so I start playing some of my newly ripped CDs as “hold music”. When he got back I asked him what the sound quality was like. “I thought a CD had started playing on my computer” he replied.

Apparently you can have up to a 4-way chat. And though I’ve not tried it, if you’re prepared to pay, it allows you to make international, and possibly even local, calls. I’d be interested to hear from anyone who has. Even better, it has a text-based IM client built in so if you like to type instead you can; though why would you bother, Jon? :P

The next bit is to work out how to get VNC working over a VPN across The ‘Net so that James and I can do a bit of remote pair programming…mmmm

Now if only I could find a way to have multiple voice chat conversations going at once without having my brain explode. Oh well I guess it’s a little too early to throw away the IM client after all. DOH!

Project Risks

A few weeks ago I gave a lecture to some second year university students here in Melbourne. The talk was titled “e-Business In The Real World” but really it was me yabbering on about my experiences delivering software. Anyway, a couple of people have asked me to publish the slides so [here they are](/blog/archives/RMIT eBusiness Lecture.pdf), all done using NeoOffice/J on my brand spanking new PowerBook. They’re not much, nothing fancy, but they really summarise the risks associated with delivering software.

If I could sum it all up I would say that if your problems are largely imposed by entities external to The Team then that’s about normal; you just have to identify the risks and mitigate them somehow. If, on the other hand, your major problems are technical, i.e. within The Team, you’re in deep doggie doodoo; fire them all and start again ;-P

UPDATE: Having been asked to present again, I revised the slides slightly using Keynote. The content may be much the same (a few changes here and there) but it sure does look sexier now ;-). Unfortunately Keynote produces an enormous PDF so I actually exported to PPT, then imported into NeoOffice/J and re-exported to PDF, producing a file that is less than 10% the size!

Beware The Cross-Product Join

An interesting discussion started on the Drools user mailing list regarding some problems writing a rule. The particular problem is not unique to business rules though. Rete-based inference engines share much in common with relational databases and in fact this particular problem can affect SQL queries in the same way as it affects business rules.

Let’s say we wanted to find all pairs of people that were maternal siblings (i.e. that had the same mother). In SQL we could write a query like this*:

```sql
SELECT *
FROM Child c1, Child c2
WHERE c1.motherId = c2.motherId
```

If we imagine we have only two children in our database, Bob (childId = 1) and Mary (childId = 2), both having the same mother, this query would generate four rows:

  • Bob, Mary
  • Mary, Bob
  • Bob, Bob
  • Mary, Mary

This is called a cross-product; every row is joined to every other row. This results in rows we’re not interested in: Bob, Bob and Mary, Mary. So the first thing we would do is try and ignore rows where the child was the same:

```sql
SELECT *
FROM Child c1, Child c2
WHERE c1.motherId = c2.motherId
  AND c1.childId != c2.childId   -- new: ignore self-pairs
```

Which results in:

  • Bob, Mary
  • Mary, Bob

The next thing you’ll notice is that we still have redundant rows - rows that mean the same thing. There are a few “tricks” to avoiding this and they really come down to a knowledge of the underlying attributes of the tables involved. The simplest in our case is to change the condition:

```sql
SELECT *
FROM Child c1, Child c2
WHERE c1.motherId = c2.motherId
  AND c1.childId < c2.childId    -- new: impose an arbitrary ordering
```

By imposing an arbitrary ordering, we prevent rows being joined to themselves and ensure that for any two siblings, we only get one row. Best of all, this technique translates directly into the implementation of business rules.

Not only do cross-products produce redundant and possibly incorrect results, the extra tuples (rows) generated as a consequence can cause your rule engine to grind to a halt.
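
In code, the same guard translates to something like this (a sketch only; the Child class and its fields are invented for illustration):

```java
// Hypothetical domain class, sketched to show how the SQL ordering
// trick carries over to rule predicates: the ordering guard removes
// self-joins (Bob, Bob) and mirror-image pairs (Mary, Bob) in one hit.
public class Child {

    private final int childId;
    private final int motherId;

    public Child(int childId, int motherId) {
        this.childId = childId;
        this.motherId = motherId;
    }

    public boolean isMaternalSiblingOf(Child other) {
        return this.motherId == other.motherId
            && this.childId < other.childId;
    }
}
```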

* I realise that no one is going to model Children and Mothers in different tables but please cut me some creative slack ;-)

Paste Your Code

Anyone who’s used TinyURL will understand how cool this is. One of the guys (Mark Proctor) over at the [haus](http://www.haus.org) put me on to it.

As the title suggests, it allows you to paste your code and generate a unique URL for it. You can select a language, choose a “nickname”, enter a description and even convert tabs to spaces if you so desire.

The result is formatted code, with line numbers, that you can easily share with others. Pretty neat.

Drools Schmokes! - Part II

So once we’d worked out what the major hot spot in Drools was, it was time to find an alternative method of conflict resolution.

As a bit of background, in simple terms, as facts are asserted, new items (or activations) are added to the agenda. In the general sense, all agenda items are equal. But some are more equal than others.

Although you should stay away from attempting to infer or impose ordering on rules, sometimes it is necessary. Sometimes you just need a couple of “cleanup” or “setup” rules, that are guaranteed to fire before or after all others. In Drools (and JESS) this is known as salience. In JRules it’s called priority.

There are other reasons to order the agenda and Drools has a number of different strategies: Random; Complexity; Load Order; etc. These are then chained together. Each Resolver then gets a chance to add the item to the agenda. If it succeeds, no more resolvers are called. If however the item conflicts with one or more existing ones, all are returned and passed to the next resolver to, well, resolve LOL.

Confused? Here’s a better explanation.

Looking at the implementation, it was apparent that the complexity was O(n^2). Each resolver seemed to be doing a similar thing. It had also been optimised quite a bit, meaning there was necessarily some duplicated code.

My initial gut feeling was that a priority queue was what we needed but how would we do the chaining of the different concerns?

Maybe something like a Red-Black Tree would be useful. Maybe we could implement a comparator for each strategy. Conceptually at least, we would insert into the tree using the first comparator until we found items that were equal, and from then on continue to insert using the next comparator, and so on. This seemed too complicated and I don’t do complicated very well. Makes my head hurt.

It seemed that each of the strategies was really just using a different dimension or aspect of the item to perform a sort. It was like a composite key. So what’s the easiest way to sort on a composite key? Use a composite comparator. Something like:

```java
import java.util.Comparator;
import java.util.List;

public class CompositeComparator implements Comparator {

    private final Comparator[] _comparators;

    public CompositeComparator(List comparators) {
        this((Comparator[]) comparators.toArray(
            new Comparator[comparators.size()]));
    }

    public CompositeComparator(Comparator[] comparators) {
        _comparators = comparators;
    }

    // Ask each comparator in turn; the first one to find a
    // difference wins.
    public int compare(Object o1, Object o2) {
        int result = 0;
        for (int i = 0; result == 0 && i < _comparators.length; ++i) {
            result = _comparators[i].compare(o1, o2);
        }
        return result;
    }
}
```

I tried it out using a TreeSet but it performed just as badly. Maybe I was wrong I thought to myself. So I jumped online and chatted to some of the Drools guys, Mark Proctor in particular. I described my ideas and he seemed to like them.

We did a bit of searching around for implementations we could use. I found one here but the license wasn’t right. Next we thought of Doug Lea’s stuff but it was overkill. Finally [Peter Royal](http://fotap.org/~osi/) suggested looking at the commons-collections stuff and voila, there it was - PriorityBuffer - and it took a Comparator!

Hackedy, hackedy, hack and we’d replaced the original stuff with the priority queue. Time to give it a whirl.

The first step was to run the queue with a simple Comparator. Although it doesn’t really do anything much, it would at least allow us to see what the basic overhead of the queue implementation was:

```java
import java.util.Comparator;

public class ApatheticComparator implements Comparator {

    // Deliberately apathetic: always answering -1 means no real
    // ordering work is done, so we see only the queue's own overhead.
    public int compare(Object o1, Object o2) {
        return -1;
    }
}
```

Hit run. Damn that’s quick! Once more to be sure. Yup. Hmmm. Still not convinced. Add a breakpoint and run in the debugger. Sure enough it’s being called. Cool! Ok now to try LoadOrder and Salience.

```java
import java.util.Comparator;

public class SalienceComparator implements Comparator {

    // Order activations by the salience of the rule that spawned them.
    public int compare(Object o1, Object o2) {
        return ((Activation) o1).getRule().getSalience()
             - ((Activation) o2).getRule().getSalience();
    }
}
```

Same deal. All works just fine and after implementing a few more I was convinced that this was going to be a winner.

Now we have O(n log n). Even with all the comparators chained in, the performance doesn’t change one bit. What’s more, the different strategies are simple one-liners, making new strategies almost trivial to implement!
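
To give a feel for how it slots together, here’s a rough sketch of the wiring using the comparators above (how Drools actually wires them in may well differ):

```java
import java.util.Comparator;

import org.apache.commons.collections.buffer.PriorityBuffer;

// A rough sketch of the wiring: salience first, with the apathetic
// comparator as a final tie-breaker. Inserts and removals against the
// composite ordering are O(log n).
public class AgendaSketch {

    public static void main(String[] args) {
        Comparator conflictResolver = new CompositeComparator(new Comparator[] {
            new SalienceComparator(),
            new ApatheticComparator()
        });

        PriorityBuffer agenda = new PriorityBuffer(conflictResolver);
        System.out.println("agenda ready: " + agenda.isEmpty());
    }
}
```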

So once more I must applaud the Drools guys for a flexible and performant design!

Drools Schmokes!

We’re about to open source a new rule-based project and up until now, we’d been using various closed source rule engines to get us going. Of course this won’t cut it once we open source, so we hoped that Drools would come to our rescue.

And it did. With some caveats, I can safely say that Drools is incredibly fast. Not bad for a code base that by their own admission has, quite rightly, favoured stability over performance and as such has had little or no profiling done.

Luckily we had built joodi, short for Java-Based O-O Design Inferometer (just had to get the word Inferometer into a project somehow!), test-first and as such the guts of the app was based on interfaces, so cutting over to Drools was pretty easy. It took me about an hour I guess to convert the application, rules, tests and all, to run with Drools. We fired it up. All tests passed. Hooray! How happy were we!?

Next, to run a “benchmark”. We ran the application over the Struts classes using the closed source engine first and it finished in around 9 seconds. COOL! Performance had been one of our unknowns and this was certainly well within tolerances.

Then we switched over to Drools and ran the same test. 20 minutes later it still hadn’t finished. Another ten minutes I’d say and I was fast asleep. So when morning came around I leapt up and ran into the lounge to see if it had finished. It had. In 78 minutes!!!

Yikes, we thought. This ain’t going to cut it. Elation turned to dismay. But no real profiling of Drools had been done, so surely there was room for improvement?

After a bit of chatting with the peeps in da haus, I decided to check out the source and use JMP to do some profiling. Run it, we thought, find the lowest hanging fruit, fix it, then keep doing that until we’ve done all the obvious stuff.

So I cranked it up and it didn’t take long to find a hot-spot. In fact it appeared that nearly 50% of the time was being spent in one small area - conflict resolution. A quick look at the source code was all that was needed to confirm my suspicions. Lots of unnecessary iteration. But again, I’m not taking anyone to task over it. I’d rather it was stable and functional first.

Looking more closely at the code, I realised that the functionality provided by the classes under scrutiny was not actually necessary, yet, for me to get joodi running. Thankfully, due to the thoughtful design, it was pretty easy to stub out without even touching the Drools source code.

Time to run again…holy-cow! 5 seconds! That can’t be right. Run it again. Nope 5 seconds again. Quick look at the output to verify it was actually working correctly. Yup. Run all the joodi unit tests just to be sure. Yup they run just fine. It had gone from being 300 times slower to almost twice as fast!

Damn I’ll try running joodi against another, bigger, project - xerces. With Drools plugged in, joodi ran in around 9 seconds. With the closed source product I gave up after 5 minutes and stopped it.

So hats off to the Drools team. Damn fine job! I’ll be submitting my patches ASAP and hope to see some of that other code re-factored soon :-)

Care-Factor Nine Mister Spock

This started as a reply to a very pertinent comment on a blog entry of mine but it grew to the point where I thought it deserved an entry of its own.

First to the original comment, I always appreciate a good rant. How could I not LOL. And I agree whole-heartedly with the sentiment. I don’t tend to blog about my personal life because, well, it’s personal hehehe. I don’t really get much from writing about my life experiences, yet. Maybe one day but until then I do get a lot from writing about software development. It’s an area of my life where lots of discussion and debate seems to make a big difference.

So for the curious, I teach and train martial arts most week nights. I spend most weekends with my family except for the occasional geek session here and there. I work for 9 months of the year and take 3 months off mostly to travel - I’ve lived a total of 3 years in Japan off and on over the past 17 years. I speak Japanese. I ride my motorbike whenever I can. I ride my mountain bike whenever the weather permits…. But rather than bore you with my “I’m a Leo I enjoy cooking and dancing” story, let me summarise by saying that I do believe that life is about living and NOT about software development.

Don’t get me wrong, I don’t dislike software development. As far as a job goes it’s the best one I could hope for right now. It’s interesting. It’s challenging. It keeps my mind active. And I get to meet loads of interesting people in the process. But every year I go to Japan to train or I go hiking in New Zealand and I don’t miss the internet nor email nor mobile phones nor any technology to be honest. When it comes down to it, if I were independently wealthy I could turn my back on computers and never look back.

But that was not and is not the point. The point is that no matter whether it be software development, house keeping, whatever, all I ask is that you GIVE A SHIT about what it is you are doing and that you take some care and some responsibility. If you don’t, won’t or can’t, then STOP, CEASE, DESIST! You will do more harm than good so please go away, we don’t need nor want you.

My Aikido instructor is famous for ripping shreds through students correcting their technique. Hearing him scream “DAME!” (Japanese for “wrong”) across the mat can be a bit much for some students. But he once said to us that “there are only two reasons you’ll never receive a DAME from me. Either you’re so good that you don’t need it; or you’re so bad I’ve given up and I don’t care about you anymore.”

So I hope you’ll understand that I intend to continue ranting and writing about software development, and anything else I feel passionate and enthusiastic about, BECAUSE I GIVE A SHIT. :-)

Stop Calling Me Shirley

The lack of documentation is disturbing. Requirements in the form of code or, often, reverse engineered from the code. Phooey! Seemingly ad hoc changes to the spec by the architects. Cowboy developers making changes here and there whenever they feel like it to hack in some new feature. Dependencies between developers forcing them to pair up to write code. What a ludicrous idea! Nothing seems to get done until the last minute. We’ll be lucky to limp across the line. Whatever that line may be. With no real acceptance criteria, how does anyone know when we’re finished?

But wait a minute…it’s a waterfall project. Oops! XP was used on the previous project. Let’s try that again shall we?

All those tests slowing down my build. How dare they make me ensure my code works. All those story cards on the Wiki to read. Boring! Would you believe I was even *gasp* forced to understand what I was doing by consulting with the business rep. Sheesh, gimme a break. Imagine allowing the customer to change their mind at the last minute and still delivering on time. Bah! And what’s with asking me for revised estimates every day? I signed up for anarchy. Instead I got micro-management!

Programmeurs Sans Responsabilité

In all my years of software development, I have honestly never encountered a developer who really just wanted to be a drone. Someone who wanted to code directly off the spec without a care for what they were doing or why. Until today.

But then it’s not so surprising when they’re given advice like “Learn from me. Always remember my 3 rules: I didn’t do it; It’s not my area; and I don’t know anything about it.”

Surely it’s not too much to ask that a developer actually understands the rationale behind the code they are producing. Surely it is reasonable to assume that a developer has questioned the design and implementation to the point they are at least happy that it conforms to their understanding of the problem domain and that it is traceable to some functional spec.

How can so many people have such a low care-factor for the work they do and the software they produce?

java.util.ThrashMap

I received a very interesting post from the JESS mailing list last night and thought it was worth a mention.

There’s a weird threshold that can occur with any Java data structure that uses large arrays of Object references (Object[]). If the size of the Object array exceeds the maximum size of objects that can be placed in the “new generation”, garbage collection performance can be severely impacted…

I’m not sure if I’m likely to see this problem really but I do use HashMaps a fair bit so I thought it was interesting. In general I don’t find the need to use arrays much if at all these days. In fact, for some inexplicable reason, I tend to use LinkedList over ArrayList and, except for Simian which uses lovingly hand-crufted data structures, I can’t recall ever holding maps of data large enough to exhibit this behaviour. But then again I’m not implementing a Rules Engine.

Business Rules != Scripting

As Business Rules come into vogue (again?) and the tools proliferate, there will be the usual fumbling about as many come to terms with what it all really means. How do we use these things? What should I look out for, the pitfalls, the traps? Are there any “patterns”? But above all, the greatest difficulty it seems, is coming to terms with the idea that Rule Engines ARE NOT procedural scripting languages.

The Rete Algorithm (pronounced REE-tee; Latin for net) was developed by Charles L. Forgy at Carnegie-Mellon University in the 1970s and is used in most modern high-performance rule engines. Rete is able to efficiently handle very large numbers of rules.

One of the most important features of the Rete algorithm lies in its ability to identify and subsume rules with similar predicates. Because of this, predicates need only be evaluated once. This differs from procedural (Java-coded) rules, where every predicate in every rule must be independently evaluated, regardless of whether the same predicate might already have been evaluated in another rule. It can also locate conflicting rules - something that’s almost impossible in traditional, procedural languages.
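
To make the difference concrete, here’s an invented sketch of the procedural version, where the shared predicate is re-evaluated once per rule; a Rete network would evaluate it once and share the node:

```java
// Invented example: three "rules" that share the predicate
// total > 100. Coded procedurally, the predicate is evaluated three
// times; a Rete network would evaluate it once and share the result.
public class DiscountRules {

    public void apply(double total, boolean newCustomer, boolean overseas) {
        if (total > 100) {                    // evaluation #1
            System.out.println("5% discount");
        }
        if (total > 100 && newCustomer) {     // evaluation #2
            System.out.println("welcome gift");
        }
        if (total > 100 && overseas) {        // evaluation #3
            System.out.println("free shipping");
        }
    }
}
```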

When it comes to codifying business rules, even well-factored Java code can be rather difficult to understand. After a couple of weeks away, it can often take the original developer some time to get back up to speed with their own code, let alone someone else’s. On the other hand, Rules are declarative statements of fact. That means no trudging through tens or even hundreds of lines of procedural code to understand what will happen under various conditions. Weeks, months or even years later you can go back to the rule definitions and immediately understand their meaning and intent.

Rule engines share much in common with Relational Databases. They are based on tuples and predicate calculus. You don’t navigate Relations (Tables), you join them. Similarly you don’t navigate facts, you join them. Both suffer (or at least have suffered) similar problems in terms of performance and optimisation.

Business Rules should be simple and atomic. They should make inferences. They should not be calling out to databases nor making countless remote calls. That’s what application code is for. Much like the difference between queries and stored procedures.

Analogies aside, the fact remains (no pun intended) that rules are not procedural, they are declarative statements of fact! Writing business rules requires very clear, concise and logical thought, as much if not more so than procedural code.

Rule-flow, priority, salience, etc. are mechanisms that allow some degree of procedural control and should therefore be considered a last resort, not the basis for a rule engine framework. While sometimes useful, all are frowned upon by rule advocates in much the same way as OO design frowns upon public variables.

If you can’t or won’t make the necessary shift from a procedural to a declarative mind set then I suggest you try BeanShell, Rhino, Groovy or any of the myriad scripting languages available. There is nothing to be ashamed of with this approach but it is most certainly NOT the same thing.

Build Watermarking

Desktop software products, especially of the Windows variety, invariably come with an “About” Dialog listing, among other things, the version of the software. The product version number helps support staff and developers solve problems when they occur out in the wild. Without a version number, tracking down a problem can often be rather difficult.

Especially on web-based applications, making the product build number, build date, and other configuration information (was it a production or a training build, etc.) accessible to the end user is an invaluable aid to developers, testers and support staff. In fact if you look to the side-bar on this blog, you’ll find the version of MovableType used.

On our last project we made this meta-info available as, funnily enough, META tags (though we could just as easily have used comments) in our JSPs and HTML files. To source the info we simply passed the CruiseControl build number and date through into our Ant scripts to use as replacement parameters when copying the Struts (sigh) application.properties file into the web archive. The person responsible for deployment can always be sure that the correct version of the application has actually been deployed. Then, when testers and users need to report a problem, they simply view the source for the page they are on and, hey-presto, there it is.
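
The runtime side amounts to little more than reading the filtered properties back out. Something like this sketch (the property names here are invented, not the ones we actually used):

```java
import java.io.InputStream;
import java.util.Properties;

// Sketch only: read the watermark that the build filtered into
// application.properties, ready to drop into a META tag or comment.
// The property names are invented for illustration.
public class BuildInfo {

    public static String watermark() throws Exception {
        Properties properties = new Properties();
        InputStream in =
            BuildInfo.class.getResourceAsStream("/application.properties");
        try {
            properties.load(in);
        } finally {
            in.close();
        }
        return "build " + properties.getProperty("build.number", "unknown")
             + " (" + properties.getProperty("build.date", "unknown") + ")";
    }
}
```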

On a project I worked on with Dave some time back, we stored the current build number in the database as well. The build number was inserted into a table at the end of the database schema update scripts. A DBA could then visually inspect the data in the table to ensure the correct updates had been applied. Our update scripts also checked this table to ensure a script could not be applied again. At runtime, the application would also double check that it was running against the expected database schema ensuring we never had a, possibly catastrophic, mismatch.
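
The runtime check is a one-query affair. A sketch (table and column names invented for illustration):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of the start-up sanity check: compare the build number the
// schema update scripts recorded with the one this release expects.
// Table and column names are invented for illustration.
public class SchemaVersionCheck {

    public static void check(Connection connection, int expectedBuild)
            throws Exception {
        Statement statement = connection.createStatement();
        try {
            ResultSet rs = statement.executeQuery(
                "SELECT MAX(build_number) FROM schema_version");
            rs.next();
            int actualBuild = rs.getInt(1);
            if (actualBuild != expectedBuild) {
                throw new IllegalStateException("Expected schema build "
                    + expectedBuild + " but found " + actualBuild);
            }
        } finally {
            statement.close();
        }
    }
}
```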

You can add build watermarks to templates used to generate documents such as PDFs, to XML messages sent between systems, in emails as header tags, you name it. In fact anytime traceability back to a particular version of an application might be useful, consider adding some kind of meta-information about your application. Your support staff will love you for it!

How The Other Half Work

It’s amazing how after just 3 days on a motorbike, out on the open road, breathing in clean ocean air, all the worries of the world seem to disappear; how the old “care-factor” rapidly falls away to hover tantalisingly above zero; only to be brutally re-awakened by my brain hitting the ground with a thud as I begin day one of a new, 3-month long, project. And what a project it is!

The last project I was on kept my brain working overtime. I had to push myself to stay afloat, to keep up with the brains on the team. It was fantastic. We all pushed each other, striving to get as close as possible to the “ideal” solution, continually honing and refining the design to remove all seemingly unnecessary fluff. The Simplest Thing That Could Possibly Work While Still Being Good Design ™ was King.

But now, as if by way of some bizarre supernatural yin-yang thing struggling to balance the forces of nature, I find myself on a project where the Powers That Be have skillfully managed to find complexity where there rightfully should be none.

And yet I’m baffled that, at some level, it’s a very simplistic design. A design that even with its strange technology choices and inexplicable coding “standards”, requires very little brain power to comprehend. I suppose that by adhering to all those wonderful™ J2EE Core Patterns, it can’t help but be simplistic. It really is the epitome of a Simple Arrangement of Complex Things. Need a service? JNDI is your friend. And for your troubles you’ll get back an EJB. Need to extract a piece of data from some XML? Here’s a static helper class we prepared earlier.

The other developers on the team (I’ve come in 3/4 of the way through) have no trouble wading through the hundred+ line methods that make up the half dozen or so EJBs mixed in amongst the various classes that together comprise the 20 or so WSAD “modules”. In fact I found myself in a meeting honestly feeling stupid that I just didn’t “get” the seemingly overly complex application architecture. I figured I must be missing something, something significant. And maybe I have.

With the exception of my good buddy Phil, it made perfect sense to all the rest to store XML as a CLOB in a relational database only to unpack it, extract a subset of the fields and use Lucene to index the records. No one else seemed to think that spawning threads inside the app server was a bad thing. I mean “we don’t want to re-invent MQ”. Of course not. Silly me. Are you sure 15 days isn’t too fine-grained an estimate for you? We still have 6 weeks left; surely your Gantt chart would look a lot simpler if we just made everything finish then. :-P

I’m not sure if I simply suffer from a particularly severe form of geek snobbery (in the same way that Linux users look down on Windows users) or perhaps I expect too much from my work. Whatever it is, it’s interesting to see another perspective on J2EE development. I will try to suspend for a while my preconceptions and at least give it a go. But I’m not giving up IntelliJ…yet!

My only consolation is that it will surely be an intellectual walk in the park and give me more than enough food for blog, not to mention oodles of free cerebral cycles to work on my various out-of-hours pet projects - the ones I had been neglecting for the past 6 or so months.