haruki zaemon

How I tell a story

As far as I can tell, the most widely accepted format for agile user stories looks like this:

As a [role]
I would like to [action]
So that [reason for action]

I’m not entirely certain of the origins. Some have suggested Dan North; others have suggested much earlier use. In any event, its use is so common that I can’t recall a single “agile” project where that wasn’t the format. Yet I’ve never really been comfortable with it, and until recently I didn’t really know why.

Here’s an example:

As a domain novice
I want the domain experts’ names made prominent on a domain listing
So that I know who to turn to for help on a given domain

Though entirely contrived for the purpose of this discussion, it demonstrates the kind of stories people are inclined to write.

On the positive side, the format seems quite natural: people often have an implementation in mind and can articulate it quite easily, but find it much harder to articulate the intent. The format seems to reflect this, with the primary focus (“I want”) expressed as the desired behaviour and the intent (“So that”) given conceptually lesser importance. The problem I have is that the story centres on how the user will interact with the system and only then describes why. It presupposes an implementation, with a cursory nod to the user’s intent.

In my various roles from analyst to developer, I’ve always wanted to understand why we’re building a new feature. Experience has taught me that when I understand the why, I am better able to build a solution that satisfies the real need. As a consequence, when running workshops, I’ve always tried to have the intent stated as the primary desire/need. For example:

As a domain novice
I want to know who to turn to for help on a given domain
So that .... [what the hell goes here?]

When people go to the trouble of putting the intent first, the “So that” becomes a bit of a WTF, often resulting in a simple re-statement of the intent just to satisfy the template:

As a domain novice
I want to know who to turn to for help on a given domain
So that I can learn more about the domain [well d'uh!]

Moreover, we’ve now lost the implementation detail which, although we’re more interested in the intent, is still of value by providing context for further discussion/understanding.

What I’ve wanted was a way to encourage people to describe their intent and leave the implementation details to be decided closer to the iteration in which the story will be delivered. Yes, training is one way to achieve this but my experience is that the standard format effectively devalues the intent in favour of implementation detail. What I’d really like is to somehow nudge people into writing a story where the primary focus is the intent.

Enter the format I’ve been using recently and really liking:

As a [role]
I want to [intent]
[For example] By [action]

It’s a small but subtle change. To some so subtle that it verges on the trivial. However, my experience is that it’s similar enough to the original format so as not to seem too radical a change and at the same time different enough to encourage the behaviour I am looking for. By way of example:

As a domain novice
I want to know who to turn to for help on a given domain
e.g. by having the domain experts’ names made prominent on a domain listing

The intent is now first and clearly specified as a higher order issue. It makes no sense to express the intent as “For example by wanting to know who to turn to for help on a given domain” so the only sensible place for it is as “I want …”.

That the last line begins with “By” identifies it as an implementation detail, acknowledging that most people have some idea of how they want a story implemented and allowing them to express that without feeling awkward. With the addition of “For example” or “e.g.” – as suggested by Mike Williams – we further clarify that it is up for debate once the story hits an iteration. Finally, by placing it last and making it optional, we indicate that it carries less weight than the intent.

The more I use this format, the clearer it becomes when the intent has been expressed as an implementation. My stories tend to read much more like a plot or narrative of the software, and I find the format nudges me towards writing stories that express the intent rather than working backwards from the implementation. I’ve also noticed I’m developing some heuristics for “validation”:

  • When the implementation contains a conjunction (and) it probably indicates the story is an epic.
  • When the intent contains a conjunction, it probably indicates more than one story/epic.

Trading Design Pain for Runtime Pain

So, since my post on functional programming in object-oriented languages I’ve continued to tread the path with a mixture of gratification and despair. This morning the latter became overwhelming; I’d just had enough. My brain hurt and I just wanted to pump out some code and run it. I threw out the concepts I had been using as my guide and fell back on years of “old-school” object-oriented code.

Unfortunately, I was no more productive. In fact, I’d argue I was less productive. Things began to fail in weird and unexpected ways. The number of tests I needed to write to catch errors at least doubled. I soon returned to the comfort of my hybrid world.

In hindsight, I had traded design pain for runtime pain. All the mental gymnastics that went into working out how to build classes that are inter-related and at the same time immutable, etc. was replaced with time spent writing tests for anticipated edge cases as well as debugging the unexpected ones.

I concluded that the “pain” I had been experiencing was largely the result of being forced to deal with the complexity of the underlying problem. Once solved, however, the code fell out with few or no bugs. By contrast, when I reverted to my previous approach, the code flowed far more freely but I spent a lot more time working out how to ensure the code didn’t do nasty things to itself.

What’s perhaps as interesting to me is that my designs are resulting in smaller and smaller classes. The more I think about problems in a functional way, the more I am able to design solutions that are essentially pipelines. The irony is that even though we think of imperative code as step-by-step, it more often than not turns out to be a big, intertwined blob. Functional code, on the other hand, is almost by definition a series of steps: transformations applied one after the other to some input.

These two observations are drawing me ever closer to just “getting over it” and using a functional language. The issue for me is the only FP language I know and actually like is Haskell and the only FP language I’d be likely to get into production is Clojure. Which is all I’ll say in public as I have no desire to start a flame war :)

Functional programming in object oriented languages

In my current job, I spend about 40% of my time with my underpants on the outside, digging around in production code, generally making stuff better. The other 60% of the time is R&D. The R&D part has some very concrete objectives but there is certainly leeway to explore different ways of developing software.

Like many programmers, my first formal introduction to OO was all about classes and inheritance. What mattered most was getting the “structure” right. Next, I came to understand the importance of encapsulation, and after that polymorphism.

Over the past 6-12 months or so I’ve become more and more interested in functional programming concepts if not functional programming languages. I’ve always been a big fan of declarative programming, business rules, etc. and yet I’ve also always been a big fan of OO. Even when I was an assembler programmer I tended to structure my code and data as if it were object-oriented, even if the self pointer was explicit.

More recently I re-read SICP, learned Haskell which I unashamedly love, played with Clojure on and off, and briefly looked at Scala. Coincidentally, I also read Domain Driven Design and Clean Code, along with a number of other very interesting articles on functional programming, immutability in general and recursion in object-oriented languages: Why Functional Programming Matters, Why Why Functional Programming Matters Matters, Functional Objects, and Why Object Oriented Languages Need Tail Calls. All of which got me thinking, once again, that perhaps I’d been doing this OO thing all wrong.

Here I’d like to present a few observations from my exploration into functional programming in an object-oriented world.

  • Immutable objects good; Mutable objects bad
  • An object is a collection of partially applied functions
  • An object’s API provides a clear separation between Commands and Queries
  • An object is a snapshot of state and possible outcomes
  • An object is a persistent data structure

Clear as mud hey?

Immutable objects good; Mutable objects bad

Classes should be immutable unless there’s a very good reason to make them mutable… If a class cannot be made immutable, limit its mutability as much as possible.

– Joshua Bloch, Effective Java

I won’t go into a lot of detail as to why I believe this to be true. There are plenty of arguments to be found both for and against. Suffice it to say, the direction I’m taking and the conclusions I draw from my experiences are predicated on this belief. What I will say however, is that the effect it has on my designs quite often gives me the same sense of satisfaction as when practising Test Driven Design.

It’s not as though immutability in object-oriented programming is anything new. Many years ago I wrote code where all my value objects were immutable as was the occasional service class, etc. The difference in my current approach is that everything is immutable except for a tiny layer at the fringes where I need to save data for later retrieval, consume HTTP requests, etc. Yes, even my “entities” are immutable.

An object is a collection of partially applied functions

The ideal number of arguments for a function is zero … More than three requires very special justification – and then shouldn’t be used anyway.

– Bob Martin, Clean Code

In functional languages, partial application of a function allows us to define a new function by “pre-populating” some of the arguments. For example, we can take a very simple Haskell function that calculates the amount of applicable sales tax:

applicableSalesTax percentage amount = (percentage / 100) * amount

and then partially apply it to create another function that has a fixed sales tax:

applicableGST = applicableSalesTax 10

The function applicableGST partially applies applicableSalesTax with the value 10. Anytime we call applicableGST it will invoke applicableSalesTax with a rate of 10% and whatever amount we pass to it.

Now consider an object-oriented approach (in Ruby). Imagine a very simple SalesTax class that holds the rate:

class SalesTax

  def initialize(percentage)
    @rate = percentage / 100.0
  end

  def applicable_for(amount)
    (amount * @rate)
  end

  def included_in(amount)
    amount - excluded_from(amount)
  end

  def excluded_from(amount)
    amount / (1 + @rate)
  end

end

Here, the constructor sets up the context within which each of the methods then operates, just the way we always read good OO code should be.

I’ve started to think of constructor arguments as the mechanism for partially applying all the methods on an object. Considering an object as a partial application of a set of methods is really quite interesting to me. It almost dictates that methods MUST operate, in some way, on the state of the object – just as we always read good OO code should – only now there’s a nice explanation as to why: if they didn’t operate on the object’s state, they wouldn’t be partially applied.
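To make that analogy concrete, the partial-application view can be sketched directly in Ruby with curried lambdas. This is a contrived parallel of my own, not a suggestion for production code:

```ruby
# A lambda version of the sales tax calculation.
applicable_sales_tax = ->(percentage, amount) { (percentage / 100.0) * amount }

# "Pre-populate" the rate, just as a constructor argument would.
applicable_gst = applicable_sales_tax.curry[10]

applicable_gst.call(200)  # => 20.0
```

The constructor of an object plays the role of `curry[10]` here: it fixes some arguments once, and every method then operates within that context.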

When I apply this principle in my designs I find I have smaller, more cohesive classes – ie the methods are all related more closely to the shared state. I also find I have far fewer private methods and more often than not, none at all. (This coincidentally fits in nicely with my relatively long held belief that private methods are a smell though this is a perhaps a topic for another discussion.) I also find that the constructor then becomes a meaningful, nay critical, part of my API.

An object’s API provides a clear separation between Commands and Queries

Methods should return a value only if they are referentially transparent and hence possess no side effects.

– Bertrand Meyer

Interestingly, when we deal with immutable objects we really have little choice but to do just this.

If we want an object to “answer something” (a query) no modification is expected and we simply return a (potentially calculated) value:

def full_name
  "#{@first_name} #{@last_name}"
end

If we want an object to “do something” (a command) we’ll be expecting some kind of representation of the new state as the result:

def set_last(new_last_name)
  Person.new(@first_name, new_last_name)
end

If we assume we only want to return one thing from a method, and that a change in state necessitates returning a handle to the new state, then a method can only ever be either a query or command but not both.
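Putting the two fragments together, a minimal immutable Person might look like this. The class shape and method names are my own sketch for illustration, not code from any particular project:

```ruby
class Person
  attr_reader :first_name, :last_name

  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name = last_name
    freeze  # no further modification once constructed
  end

  # Query: answers something, modifies nothing.
  def full_name
    "#{@first_name} #{@last_name}"
  end

  # Command: returns a handle to the new state rather than mutating in place.
  def set_last(new_last_name)
    Person.new(@first_name, new_last_name)
  end
end

person  = Person.new("Ada", "Lovelace")
renamed = person.set_last("Byron")
renamed.full_name  # => "Ada Byron"
person.full_name   # => "Ada Lovelace" (the original is untouched)
```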

After discussing this with a colleague, they suggested that in a sense commands are now queries too, e.g. ones that ask the question: “what would the system look like if I asked you to do something?” Given this definition of Commands and Queries I can certainly see it from that perspective.

In the example just given we merely returned a newly created object but it could just as easily have inserted a record into a database. So, I still like to think of commands in the original sense except they now return a handle to the new state. Much like an HTTP PUT/POST/DELETE request does when following the principles of REST.

(As an aside, one consequence of this approach is that objects are closed under modification. That is, whenever we modify an object, we receive the result in the form of a new object of the same type.)

An object is a snapshot of state and possible outcomes

Domain Events represent the state of entities at a given time when an important event occurred and decouple subsystems with event streams. Domain Events give us clearer, more expressive models in those cases.

– Eric Evans

I presume it would be largely uncontroversial to describe immutable objects as a snapshot of some state. When I also consider an object as a collection of partially applied functions, I begin to think of an object – and therefore the system as a whole – as a snapshot not only of state but also possible outcomes.

I suspect thinking about the difference between a system that undergoes state changes in situ and one that in effect represents every possible outcome simultaneously without needing to materialise them all in advance leads to a very different way of modelling a problem.

While pondering this, I recalled that as a kid I loved Choose Your Own Adventure books. At the end of each page (or few pages) you are presented with a set of choices: Do you turn left? Do you open the door? Do you run away? Depending on the choice you make, the story takes a different course. Objects seem a bit like pages in a Choose Your Own Adventure book, frozen in time. The methods are like the choices we make – each one takes us to a different page, itself frozen in time. I used to bookmark pages with my thumb, like a save-point. If I didn’t like the way the story was headed I simply turned back to the last known “good” point in the book.

An object is a persistent data structure

A persistent data structure is one that efficiently preserves the previous version of itself when it is modified. One of the simplest persistent data structures is the singly-linked (cons) list but almost any tree structure can be adapted to be persistent.

If we’re creating snapshots in order to preserve the initial state, we’ll need to make a copy that can be modified. It’s reasonable to assume that all these redundant copies will take precious memory and computing cycles and, in doing so, create quite a bit of garbage. Assuming a naive copying strategy that makes deep, nested copies, this will certainly be the case. Thankfully, there are some pretty neat optimisations we can make because every object is immutable.

When thought of as a tree with the references to other objects as the children, an object can be treated just like a persistent data structure. If we need to make a copy and modify something, all we need do is create a shallow copy and change the necessary fields. In this way, each copy is like a delta from the previous, sharing as much state as possible and reducing not only the time to copy but also the amount of memory needed.
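As a contrived illustration of that shallow-copy idea, using hypothetical Address and Contact structs of my own:

```ruby
Address = Struct.new(:street, :city)
Contact = Struct.new(:name, :address)

original = Contact.new("Alice", Address.new("1 Main St", "Melbourne"))

# A shallow copy changes one field and shares everything else.
renamed = original.dup
renamed.name = "Bob"

renamed.address.equal?(original.address)  # => true: the address object is shared
original.name                             # => "Alice": the original is untouched
```

Because everything is immutable, sharing the address between both copies is perfectly safe: neither copy can change it out from under the other.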

When I first started down this path, I used constructors for this purpose. For example, if I wanted to create a new Money object, I’d create a new version using regular object construction:

class Money

  attr_reader :currency, :amount

  def initialize(currency, amount)
    @currency = currency
    @amount = amount
  end

  def add(other)
    Money.new(@currency, @amount + @currency.convert(other.currency, other.amount))
  end

end

I found this approach worked fine for small, value-object-sized classes; however, for objects that reference more than two or three values, constructing new instances became cumbersome. Moreover, if an object had internal state that was hidden but necessary when copying, the constructor API would become polluted. Even in languages such as Java that provide method overloading, or Ruby with first-class associative arrays, it still felt overly complicated to construct a new instance just to change a single value.

Instead I’ve started using an approach that involves cloning. Whenever I need to make a change, I perform a shallow copy, update the appropriate fields and return the result. In a language such as Ruby, this is quite simple to implement and make safe. Instead of using object construction, anytime I wish to make a change I do something like this:

def add(other)
  transform do
    @amount = @amount + currency.convert(other.currency, other.amount)
  end
end

The add method now looks similar to a method you’d write if you were able to modify state: we assign a new amount based on some calculation. The transform method is the key here: it makes a shallow copy of the object, runs the block within the context of the copy, then freezes the result to prevent modification before returning it to the caller. I’m finding this approach has a number of advantages.
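The implementation of transform isn’t shown above, but a minimal sketch of how such a helper might work in Ruby could look like this. This is my own approximation (with a contrived Point class rather than Money), not the actual code:

```ruby
module Transformable
  # Shallow-copies the receiver, evaluates the block in the context of the
  # copy (so instance-variable assignments apply to the copy), then freezes
  # the copy before returning it.
  def transform(&block)
    copy = dup  # dup does not carry the frozen state across
    copy.instance_eval(&block)
    copy.freeze
  end
end

class Point
  include Transformable
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
    freeze
  end

  def translate(dx, dy)
    transform do
      @x += dx
      @y += dy
    end
  end
end

moved = Point.new(1, 2).translate(3, 4)
moved.x  # => 4
```

Note the reliance on dup returning an unfrozen copy: the block mutates only that short-lived copy, which is frozen again before anyone else can see it.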

My constructor API isn’t “polluted” with internal implementation concerns. The constructor remains a part of the public API.

The construction of the modified copy isn’t leaking into the implementation of the method itself. Instead, the method can focus on the job at hand.

Because the method only ever needs to concern itself with the data upon which it operates, the rest of the class can vary relatively independently. When we were explicitly constructing a new object, the add method had to concern itself with also copying the currency. When using transform that problem goes away. No matter how many other fields need to be copied, the add method remains unchanged.

In a sense each method describes the delta between the current state and the new state. Just like a persistent data structure. It reminds me of branching in a Version Control System.

More to explore

As I mentioned very early on, I’m working with designs that are almost completely immutable, even at the entity level. This approach results in some really positive benefits but also presents some interesting implementation challenges as well as challenging some of my long held beliefs.

Immutable designs feel easier to reason about. I have no empirical evidence for this, just a gut feel. When I’m trying to work out why something isn’t working as expected, more often than not I can just read through the code and work out what happened. I’m not trying to keep a mental model of all the side effects.

When dealing with any kind of RDBMS, immutable objects lend themselves to a model where changes to the database are written as events rather than updates to individual records. Unlike a traditional ORM, I don’t have the “luxury” of modifying an object to assign an ID. This hasn’t presented much of a problem as yet though I suspect it might become more interesting as the object model increases in complexity.

I’ve been contemplating trying out an OODB such as MagLev to see how that might fit in. I suspect it should be no more difficult than with mutable objects and perhaps even simpler.

I’m also working on a project at the moment that uses an immutable RDF store wrapped in an object model to make it actually useable. So far it’s fit really nicely.

I’m wary of my code turning into a collection of function objects, i.e. objects that effectively have a single doIt method. It doesn’t happen often enough to be a concern, but it certainly does happen and I’m still not certain how I feel about it.

I sometimes find myself exposing state I wouldn’t otherwise have needed when I used mutable objects. It feels a tad icky but thus far I haven’t found it to be much of an issue.

Testing has also been interesting. I find I’m doing far less mocking and much more state-based testing. I find the tests I’m writing to be far more declarative than when I do interaction based testing. Define the initial state of the system, run the code, and compare against the expected state. Even (especially?) at the unit level this has been very effective. Again, I’m not sure how I feel about this but so far, it’s worked out well.

Of course not everything is immutable. The database isn’t nor is the file system though in both cases I try my best to treat them as if they were by only ever writing new records/files. The system runtime isn’t immutable either – the act of creating a new object proves that.

Perhaps it really is a natural progression from here to the use of more functional languages as some of my colleagues tell me I should. However there does seem to be a general lack of distinction between the benefits of functional programming concepts and functional programming languages. I’m thoroughly enjoying incorporating functional programming concepts into object-oriented languages.

Tests as documentation

Whilst I’ve been playing around with immutable collection classes in Ruby, I’ve also been working on ways to document behaviour without writing loads of RDOC that goes stale really quickly.

Tests have always been touted as a form of documentation but I’ve rarely – if ever, come to think of it – seen that work in practice. Cucumber comes very close but I wanted something a little closer to the metal, something that allowed me to write unit tests with something like RSpec.

For some time now, I’d been structuring my specs with this in mind and I thought I was doing a reasonably good approximation. Then today I finally had cause to put that to the test. As part of a feature I was implementing in another project, I wanted to use a list method I thought provided just what I required, but I couldn’t remember exactly how it behaved or what the interface was, so, naturally, I consulted the documentation. D’oh! But wait, I thought smugly: I’ve been writing these specs and, as we all know, specs are documentation. Moreover, I’ve been putting in quite a bit of effort to make them read as such, so why not go read the specs?

Suffice to say, they didn’t live up to my expectations. After running spec -f nested spec/hamster/list/span_spec.rb the result wasn’t bad but wasn’t great either:

Hamster::List
  #span
    is lazy
    on []
      with a block
        preserves the original
        returns a tuple with two items
        correctly identifies the prefix
        correctly identifies the remainder
      without a block
        returns a tuple with two items
        returns self as the prefix
        leaves the remainder empty
    on [1]
      with a block
        preserves the original
        returns a tuple with two items
        correctly identifies the prefix
        correctly identifies the remainder
      without a block
        returns a tuple with two items
        returns self as the prefix
        leaves the remainder empty
    on [1, 2, 3, 4]
      with a block
        preserves the original
        returns a tuple with two items
        correctly identifies the prefix
        correctly identifies the remainder
      without a block
        returns a tuple with two items
        returns self as the prefix
        leaves the remainder empty

For a start, there was no narrative, nothing telling me what the desired outcome was; why do I want to use this method? Secondly, whilst the individual assertions seemed to make sense when reading the spec code, once they were in this purely textual form they were somewhat useless in helping me understand what to expect. And lastly, a purely aesthetic complaint, I didn’t really like the indentation so much. Right when all that hard work should have paid off, it failed me. But not completely. I was still convinced there was some merit in what I wanted and perhaps a little more tweaking could get me closer to my ideal.

After a few iterations of modifying the code, running the specs, and reading the output, I finally hit upon something I think is pretty close to what I’ve been after:

Hamster.list#span
  is lazy
  given a predicate (in the form of a block), splits the list into two lists
  (returned as a tuple) such that elements in the first list (the prefix) are
  taken from the head of the list while the predicate is satisfied, and elements
  in the second list (the remainder) are the remaining elements from the list
  once the predicate is not satisfied. For example:
    given the list []
      and a predicate that returns true for values <= 2
        preserves the original
        returns the prefix as []
        returns the remainder as []
      without a predicate
        returns a tuple
        returns self as the prefix
        returns an empty list as the remainder
    given the list [1]
      and a predicate that returns true for values <= 2
        preserves the original
        returns the prefix as [1]
        returns the remainder as []
      without a predicate
        returns a tuple
        returns self as the prefix
        returns an empty list as the remainder
    given the list [1, 2, 3, 4]
      and a predicate that returns true for values <= 2
        preserves the original
        returns the prefix as [1, 2]
        returns the remainder as [3, 4]
      without a predicate
        returns a tuple
        returns self as the prefix
        returns an empty list as the remainder

This time there’s a narrative describing what the method does, followed by a series of examples not only describing the behaviour but also providing concrete values. Now the output reads more like documentation, only rather than being duplicated as RDOC that rapidly becomes disconnected from reality, it’s generated from the tests and automatically stays up-to-date.

The underlying spec is not perfect by any stretch – there is certainly a modicum of duplication between the test code and the descriptive text – but I think it strikes a reasonable balance between tests that are readable as code as well as plain text documentation. I’d certainly love to know what, if anything, others have done.

require File.expand_path('../../../spec_helper', __FILE__)

require 'hamster/list'

describe "Hamster.list#span" do

  it "is lazy" do
    lambda { Hamster.stream { |item| fail }.span { true } }.should_not raise_error
  end

  describe <<-DESC do
given a predicate (in the form of a block), splits the list into two lists
  (returned as a tuple) such that elements in the first list (the prefix) are
  taken from the head of the list while the predicate is satisfied, and elements
  in the second list (the remainder) are the remaining elements from the list
  once the predicate is not satisfied. For example:
DESC

    [
      [[], [], []],
      [[1], [1], []],
      [[1, 2], [1, 2], []],
      [[1, 2, 3], [1, 2], [3]],
      [[1, 2, 3, 4], [1, 2], [3, 4]],
      [[2, 3, 4], [2], [3, 4]],
      [[3, 4], [], [3, 4]],
      [[4], [], [4]],
    ].each do |values, expected_prefix, expected_remainder|

      describe "given the list #{values.inspect}" do

        before do
          @original = Hamster.list(*values)
        end

        describe "and a predicate that returns true for values <= 2" do

          before do
            @result = @original.span { |item| item <= 2 }
            @prefix = @result.first
            @remainder = @result.last
          end

          it "preserves the original" do
            @original.should == Hamster.list(*values)
          end

          it "returns the prefix as #{expected_prefix.inspect}" do
            @prefix.should == Hamster.list(*expected_prefix)
          end

          it "returns the remainder as #{expected_remainder.inspect}" do
            @remainder.should == Hamster.list(*expected_remainder)
          end

        end

        describe "without a predicate" do

          before do
            @result = @original.span
            @prefix = @result.first
            @remainder = @result.last
          end

          it "returns a tuple" do
            @result.is_a?(Hamster::Tuple).should == true
          end

          it "returns self as the prefix" do
            @prefix.should equal(@original)
          end

          it "returns an empty list as the remainder" do
            @remainder.should be_empty
          end

        end

      end

    end

  end

end

Lazy spec task creation

I converted a Ruby project over to use Bundler for gem dependency management today. For the most part it worked flawlessly except, that is, when the CI build ran for the first time after the conversion:

LoadError: no such file to load -- vendor/gems/environment

Stacktrace:
tasks/spec.rb:8:in `require'
tasks/spec.rb:8:in `<top (required)>'
...
Rake aborted!

The short story: the spec task definition needed the RSpec gem to be loaded, but the gem wasn’t loaded until after all the tasks had been defined.

Now, I could single out the spec task definition and ensure it was loaded last but that would mean adding a bunch of code to my otherwise trivial Rakefile. The other option was to somehow defer the creation of the spec task until actually needed. After a bit of searching I couldn’t find anything particularly useful so I rolled my own:

namespace :spec do

  desc "Run specifications"
  task :run => :define do
    Rake::Task[:_run].invoke
  end

  task :define do

    require 'spec/rake/spectask'

    Spec::Rake::SpecTask.new(:_run) do |t|
      t.spec_opts << "--options" << "spec/spec.opts" if File.exists?("spec/spec.opts")
    end

  end

end

The spec:run task depends on the spec:define task to create a “hidden” spec:_run task to actually do the work.

The nice thing about this is that all the ickiness is hidden – remove the tasks/spec.rb file and nothing else really cares – and it means I can treat the spec task like any other when creating my task dependencies.

As always, YMMV.

Why Object-Oriented Languages Need Tail Calls

Disclaimer: I unashamedly stole the title after reading another article on the same topic.

Some of you may know of a little project I’ve been working on in my, albeit very limited, spare time. Hamster started out as an implementation of Hash Array Mapped Trees (HAMT) for Ruby and has since expanded to include implementations of other Persistent Data Structures such as Sets, Lists, Stacks, etc.

For those who aren’t up with HAMTs or persistent data structures in general, they have a really neat property: very efficient copy-on-write operations. This allows us to create immutable data structures that only need copying when something changes, making them very effective when writing multi-threaded code.

Hamster also contains an implementation of Cons Lists with all the usual methods you’d expect from a Ruby collection such as map, select, reject, etc. thrown in for good measure.

One of the things I really wanted to investigate was laziness. So, for example, when evaluating:

Hamster.interval(1, 1000000).filter(&:odd?).take(10)

Rather than generate a list with a million values, evaluate them all against the filter, and then select the first ten, Hamster lazily generates the list, lazily evaluates the filter, and even defers the take. In fact, as it stands, the example code won’t actually do anything; you would need to call head to kick-start anything happening at all. This behaviour extends, to the extent possible, to all other collection methods.

Hamster also supports infinite lists. For example, the following code produces an infinite list of integers:

def integers
  value = 0
  Hamster.stream { value += 1 }
end

Now we can easily generate a list of odd numbers:

integers.filter(&:odd?)

Again, rather than generate every possible integer and filter those into odd numbers, the list is generated as necessary.
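To make the mechanics concrete, here’s a minimal sketch of a lazy cons stream in plain Ruby. The names here (LazyStream, emit, naturals) are my own inventions for illustration, not Hamster’s actual implementation:

```ruby
# A minimal lazy stream: the tail is a block that isn't evaluated
# until someone asks for it. A sketch only -- not Hamster's code.
class LazyStream
  def self.emit(head, &tail)
    new(head, tail)
  end

  attr_reader :head

  def initialize(head, tail)
    @head = head
    @tail = tail
  end

  def tail
    @tail.call
  end

  # Filtering wraps the stream without walking it all the way through;
  # each element is only examined as the result is consumed.
  def filter(&pred)
    if pred.call(head)
      LazyStream.emit(head) { tail.filter(&pred) }
    else
      tail.filter(&pred)
    end
  end

  def take(n)
    n.zero? ? [] : [head] + tail.take(n - 1)
  end
end

def naturals(from = 1)
  LazyStream.emit(from) { naturals(from + 1) }
end

naturals.filter(&:odd?).take(5) # => [1, 3, 5, 7, 9]
```

Even though naturals is an infinite stream, take only forces as many cells as it actually needs.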

OK, so enough with the apparent shameless self-promotion. Let’s get to the point.

My first implementation of lists used recursion for collection methods. The code was succinct and, IMHO, elegant. It conveyed the essence of what I was trying to achieve. It was easier to understand and thus, I would surmise, easier to maintain. The problem was that for any reasonably large list, stack overflows were commonplace. The lack of Tail-Call-Optimisation (TCO) meant that the recursive code would eventually blow whatever arbitrary stack limits were in place. The solution: convert the recursive code to an equivalent iterative form.

Once all methods had been re-implemented using iteration, the code ran just fine on large lists; no more stack overflow errors. The downside was that the code had almost doubled in size – 12 lines of code became 24, or in some cases even more. The code was now harder to read and far less intention-revealing. In short, the lack of Tail-Call-Optimisation led to less maintainable and, I’d hazard a guess, more error-prone code.
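To show the trade-off concretely, here’s a hypothetical pair of implementations (my own sketch, not Hamster’s code) of a length operation over a cons list:

```ruby
# Hypothetical cons cell, purely to illustrate the trade-off.
Cell = Struct.new(:head, :tail)

def build_list(n)
  list = nil
  n.downto(1) { |i| list = Cell.new(i, list) }
  list
end

# Succinct and intention-revealing, but every element adds a stack
# frame; without TCO a large enough list raises SystemStackError.
def recursive_length(list)
  list.nil? ? 0 : 1 + recursive_length(list.tail)
end

# The iterative equivalent: safe on large lists, but longer and
# stateful -- the shape of the code no longer mirrors the data.
def iterative_length(list)
  count = 0
  until list.nil?
    count += 1
    list = list.tail
  end
  count
end

iterative_length(build_list(500_000)) # => 500000
```

(Note that recursive_length as written isn’t even tail-recursive; rewriting it with an accumulator would make it so, but without TCO it blows the stack either way.)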

The story however, doesn’t end there. Take another (albeit contrived) example that partitions integers into odds and evens:

partitions = integers.partition(&:odd?)
odds = partitions.car
evens = partitions.cadr

You would expect odds to contain [1, 3, 5, 7, 9, ...], and evens to contain [2, 4, 6, 8, 10, ...]. But with the way I initially implemented the code, they didn’t. Here’s an example to show what happened:

odds.take(5)    # => [1, 3, 5, 7, 9]
evens.take(5)   # => [2, 12, 14, 16, 18]

Confused? So was I, until it dawned on me that I had broken a fundamental principle: immutability. The underlying block that generates the list of integers has state! Enumerating the odd values first produces the expected results but once we get around to enumerating the even values, the state of the block is such that it no longer starts at 1 – reversing the order of enumeration produces a corresponding reversal of the error. Pure functional languages such as Haskell have mechanisms for dealing with this but in Ruby, the only construct I really have available to me is explicit caching of generated values.
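The caching fix can be sketched like so (again my own illustrative code, not Hamster’s): each cell memoises its value, so the stateful block is invoked at most once per position and repeated traversals observe identical values:

```ruby
# Sketch of a cached stream cell (not Hamster's implementation): the
# stateful generating block is called at most once per position.
class CachedStream
  def initialize(&block)
    @block = block
  end

  def head
    @head = @block.call unless defined?(@head)
    @head
  end

  def tail
    @tail ||= CachedStream.new(&@block)
  end

  def take(n)
    n.zero? ? [] : [head] + tail.take(n - 1)
  end
end

value = 0
stream = CachedStream.new { value += 1 }
stream.take(3) # => [1, 2, 3]
stream.take(3) # => [1, 2, 3] -- the same values the second time around
```

Without the memoisation, the second take would see the block’s mutated state and produce [4, 5, 6].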

Once I had cached the values all was well, or so I thought. I started to write some examples that used files as lists:

File.open("my_100_mb_file.txt") do |io|
  io.to_list.map(&:chomp).map(&:downcase).each do |line|
    puts line
  end
end

The code above took forever to run – much slower than the non-list equivalent. Sure, I expected a little slowdown, but nothing like what I was seeing.

At first I suspected garbage collection – perhaps the virtual machine was being crushed by the sheer number of discarded objects; I could find no evidence for this. Next, I suspected synchronisation – anything with state needs synchronisation. Again, I found no evidence for this either. A bit more fiddling and a few dozen print statements later – Ruby has no real profiling tools that I’m aware of, something that frustrates me no end at times – I realised what the problem was.

When I failed to find any evidence of garbage collection as the culprit, it had seemed a bit odd but I wasn’t sure why I felt that way and thus moved on. Had I stopped and thought about it for a while I may have realised that in fact that was exactly the problem: there was NO evidence of garbage collection at all. How could that be? Processing hundreds of thousands of lines in a 100MB text file using a linked list was sure to generate lots of garbage. Once a line had been processed, the corresponding list element should no longer have been referenced and thus made available for garbage collection, unless… unless for some mysterious reason each element was still being referenced.

My caching implementation worked like this: as each value is generated, it’s stored in an element and linked to from the previous element: [A] -> [B] -> [C]. At face value this works well – if you never hold a reference to “A” or “B”, they will become available for garbage collection. So what could possibly have been going wrong? Each line was being processed and then discarded. Surely, that meant each corresponding element should have become available for garbage collection?

Now recall that I had converted the recursive code to an iterative equivalent. This had now come back to bite me, hard! – though to be fair, the recursive code would have suffered in a similar and perhaps more obvious way. The call to map runs in the context of the very first line which, because of the caching, directly and indirectly references every other line that is processed! The lack of Tail-Call-Optimisation in Ruby means that whether I use recursion or iteration, if I process all elements from the head of a stream, the garbage collector can never reclaim anything because the head element is always referenced until the end of the process!
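The retention problem is easy to demonstrate outside Hamster. In the sketch below (hypothetical code), both traversals do the same work, but the first keeps the head bound for its entire duration – with cached lazy cells that pins the whole list in memory – while the second re-binds its only reference as it walks, leaving consumed cells unreachable:

```ruby
# Hypothetical sketch of head retention; not Hamster's code.
Node = Struct.new(:value, :next_node)

def build(n)
  head = nil
  n.downto(1) { |i| head = Node.new(i, head) }
  head
end

# Keeps `head` bound for the whole traversal: every cell stays
# reachable from it, so with cached lazy cells the collector can
# reclaim nothing until we finish.
def sum_retaining(head)
  total = 0
  node = head
  while node
    total += node.value
    node = node.next_node
  end
  total
end

# Re-binds its only reference as it walks: each consumed cell becomes
# unreachable immediately, so the collector is free to reclaim it.
def sum_releasing(node)
  total = 0
  while node
    total += node.value
    node = node.next_node
  end
  total
end
```

Both return the same answer; the difference is only in what stays reachable while they run.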

Some of my colleagues have suggested that I just get over it and use a “real” language like Clojure. Whilst I understand the sentiment, the point of Hamster is not necessarily to implement a functional language in Ruby. Rather, it is to see what can be done in object-oriented languages and, in this case, Ruby.

Hamster has allowed me to demonstrate that functional language idioms can, for the most part, translate quite well into object-oriented equivalents. However, the lack of Tail-Call-Optimisation severely limits what is possible.

Update 2009/12/30

MacRuby supports a limited form of TCO as well. I received similar results to those for YARV (see below), the differences being that you’re not limited to the call being the last statement in the method, and that there’s a bug where you receive a segmentation fault rather than a stack overflow.

Update 2009/12/27

According to this redmine ticket, YARV has some limited TCO support which is disabled by default. I performed the necessary incantations to enable it, only to discover the true meaning of “limited”: optimise calls to the same method in the same instance iff the call is the last statement in the method.

Plugins: Grab 'em while they're stale

By

I don’t like leaving unused code lying around, unused applications installed, unused clothes in the wardrobe, etc. As a consequence I’m often referred to as ‘Mr. Detritus’.

As you probably know, I’ve created a number of Ruby on Rails plugins over the years. Most of them date from when I first started out with Rails and, for that matter, Ruby. Most of them had poor (if any) test coverage and the code generally looked like a dog’s breakfast, but they satisfied a need – scratched an itch, if you like – that I had at the time.

Time marches on and although I will continue to use Ruby as a language, I no longer have any desire to use Rails. Some of my plugins ended up in Rails core, and I’ve continued to use others on recent projects, but most of them have been left to rot – some of them I wouldn’t use even if I did another Rails project, having long since considered them failed experiments.

And so it is that I will very shortly (within the next month) delete most of the plugins from my GitHub account. If you wish to continue using them, feel free to fork and keep a copy for yourself. Republish them under your own name if you wish for they will no doubt be better cared for by you than me.

UPDATE: By popular demand, a once-off, never-to-be-repeated copy of the UNSUPPORTED Rails plugins.

Random thoughts on our current Agile process

By

It’s late and I don’t seem to be able to sleep so for something to do I thought I’d jot down (ok copy from an email I sent out earlier in the week) some thoughts about our development process as it has evolved over the last few months or so.

As always, I can only speak from personal experience (one data point doesn’t really count for much) so here’s my totally subjective perspective, YMMV:

  • Full-time pairing when possible/practical

  • No more than one card per pair in dev at any time

  • No more than one card per pair in ready-for-dev at any time

  • Nothing to be blocked; If it’s blocked we work to remove the blockage

  • New cards are demand-pulled into ready-for-dev as cards are moved from dev to ready-for-test

  • We have big picture story card sessions as required

  • We have planning meetings at the start of each iteration where we discuss what’s “planned”

  • We give everything t-shirt sizes (S, M, L)

  • No technical stories; everything must be done because it delivers business value

  • We don’t measure velocity

  • We aggressively split cards

  • I repeat, we aggressively split cards

  • Daily stand-ups

  • Parking lot for post-stand up discussions

  • 2-week iterations with retro followed by a kick-off to discuss the stories

  • We try to focus on doing things as simply as possible

  • We try to focus on building things correctly rather than as fast as possible

  • We fight to have a REAL user available to better understand their needs

  • We rally against the usual cries of “but I know we’ll need it”; We trust that by building things simply we can always add on the extra functionality later

  • Trying to deliver everything to everyone leads to delivering nothing at all to anyone

  • T-shirt sizes help the business prioritise not estimate delivery dates; Business value is a function of, among other things, time/cost to build

  • Rolling technical stories into stories that deliver business value force us to change course slowly and justify changes

  • We usually have parallel implementations of some things as a consequence

  • Minimal changes are allowed on “old” implementations; anything substantial requires a migration to the “new” implementation

  • If we can deliver some business value early by splitting the cards, we do so as soon as possible; e.g. business can view existing data in the new form but editing is a new card because they can still use the old mechanism for that.

  • It’s critical that the whole team is taken on the “journey” so they understand why things are being built. Doing so brings the team into alignment and also allows the team to make informed decisions such as re-structuring work to enable splitting cards more aggressively.

  • Bringing the team along for the journey can be painful, never gets easier, and is always worth the effort.

  • Just-in-time stories really does enable the business to leave the decision as to what’s important to the last possible moment

  • Implementing the smallest amount of code possible for each story is critical to enabling just-in-time development; less code == greater flexibility; E.g. don’t use a database when the data comes from a spreadsheet and presently only ever changes in a spreadsheet.

  • We do as much forward thinking as possible/practical; We think of as many likely scenarios as we can and keep reducing the scope of the implementation so as not to preclude implementing them later on

  • Almost nothing ends up looking as we thought it would when first envisaged.

  • More important than writing the code is working out how to structure the implementation so that we get the job done without precluding possible future work; sometimes this means at least thinking through a strategy for migrating from one implementation to another later on if necessary.

  • It’s amazing how splitting stories reveals just how little the business value certain aspects of stories

  • We almost never get through everything that was “planned”

  • We almost always end up doing stories that weren’t “planned”

  • We are as close as we can get (due to the bureaucratic nature of the client’s operations group) to on-demand deployment into production. Ideally we’d like it to be automated but that’s just not going to happen anytime soon

  • We’re motivated by getting things done; call that velocity if you will but we really haven’t found a need to measure velocity. Delivering a constant stream of small but valuable stuff into production every week is VERY motivating.

  • We value delivering something over delivering nothing

  • We actively plan for change

Caveats:

  • We have a highly competent team

  • We have a fixed budget

  • We have internal and external users

  • We have UI and data-only users

  • We have an existing implementation we are evolving away from

It’s far from perfect and is constantly evolving but as Travis observed, it kinda represents a snapshot of what being Agile means to me right now.

We're Recruiting

By

If you haven’t heard already, Cogent are recruiting.

Cogent prides itself on the depth of its experience with agile software development, and its ability to leverage this experience to benefit Cogent clients. We are an open-book company, with comprehensive employee participation in decision-making.

What do we do? We’re a three part story. We go out on site as consultants to help our clients get better at producing good software, by both coaching them in agile techniques and working as integral part of their development teams. We produce high-quality websites for clients from our own premises. Finally, we build our own (mostly web-based) products, using the range of great talents that make up our team.

Right now we’re looking for people who can perform hands-on web application development both in our offices and on client sites, predominantly in Ruby on Rails.

We’re also looking for people who can provide hands-on support to clients undertaking agile transformations at both the small and large scale, as well as help out with internal product development.

In either case, you’ll need to show us that you:

  • understand the principles of agile software development and have experience working on agile projects
  • are collaborative, but willing to be a benevolent dictator when required
  • can represent us on a client site in a way that makes us proud
  • have a passion for software development and you continue your professional development outside of work hours
  • have experience with Ruby on Rails, or you are able to learn it very quickly. Extra points if you’re a Smalltalk or Haskell expert
  • understand that “done” means the software is in production
  • have a demonstrated track record of successful delivery
  • think that the time after five o’clock belongs to your family (yes, that’s contradictory - we want to know how you deal with that contradiction)
  • understand that software and process should be opinionated, but not bigoted
  • can read, you choose to read, and you understand what you read
  • are intellectually omnivorous
  • consider communication (written and verbal) to be amongst your strongest skills

In return, we’ll provide you with a collegial environment that rewards inquisitiveness rather than being an ongoing inquisition. We’ll treat you as part of the Cogent family, and give you a share of the profit and/or the products that we develop. We’ll provide an environment where you can work with your peers, be challenged, and be the best that you can be.

If this sounds like your thing, you can visit our website for more information about Cogent, or email us directly: info@cogent.co

Less delicious, yet more satisfying

By

These days, I spread my research and reading between Instapaper and Evernote. IMHO, delicious is essentially a big old shed with crap in it and no way to actually use any of it other than marvel at how much stuff I’ve collected.

On the other hand, both Instapaper and Evernote add value to the stuff I’ve collected: Instapaper allows me to read blogs and websites on my phone, and Evernote allows me to collect and organise information according to project, tags, etc.

As Travis pointed out, this makes it difficult (nay impossible) for others to see what I’m reading and is largely the reason I send out almost daily emails to colleagues on stuff I think is more generally interesting.

This morning I noticed that Instapaper helpfully provides a read-only feed of my list. So, for anyone interested, here is the link: http://www.instapaper.com/rss/175381/1rDOvQp1xwBTeMIoml2TuzPjlmM

UPDATE: If you’d rather not wade through everything I read to find the good bits, I’ve started “starring” items I found insightful and/or think are of more general interest. Here’s the feed: http://www.instapaper.com/starred/rss/175381/ttNEuQvOmmM5sX94f0HCO7ns

Problem Solving

By

It occurred to me recently that I have this notion of programming as a process that involves breaking a problem down into sets of smaller and smaller problems until I have something I know how to solve. (I mentioned this to Steve yesterday, which reminded him of a joke about an engineer and a mathematician.)

I had previously just assumed that I therefore follow this process when I’m actually problem solving; however, on reflection, I’m not so sure. More specifically, I’m either not doing it at all or, at the very least, I’m doing it intuitively.

I wonder how many people do (or have done) this as an explicit part of their own problem solving and if so, what effects they’ve noticed as a consequence.

No, Sleep, Till Bedtime

By

Or at least until all those Twitter client developers have fixed their Twitpocalypse bugs. In case you didn’t know, a few days ago the ID range used for Twitter’s messages exceeded 2^31 (approximately 2 billion) causing any apps that stored them as 32-bit integers to think they were really small negative numbers.
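The wrap-around is easy to reproduce in Ruby by forcing an id through a signed 32-bit representation (a sketch of the failure mode, not any particular client’s code):

```ruby
# Twitter status ids crossed 2**31. Any client storing them in a signed
# 32-bit integer field saw them wrap around to large negative numbers.
id = 2_147_483_648 # 2**31, one past the signed 32-bit maximum

[id].pack("l").unpack("l").first # => -2147483648
```

The pack/unpack round-trip through the "l" (32-bit signed) directive is exactly the truncation those clients were performing implicitly in their storage layer.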

My usual policy – and I say usual because I don’t always adhere to it – for storing external identifiers is to treat them as text, even when I know they are numbers. Why? Essentially because I consider it a coincidence that they’re numbers. That identifier, number though it may be, has no special significance to me over and above being an opaque handle to some entity in another system. As such, I like to treat them as text.

Discussing this with a good friend and colleague of mine, the question of column width came up. I.e. if you’re going to make it text, how long should the column be? If you’re lucky enough to be using a database such as PostgreSQL, then the answer is: it doesn’t matter – there’s no performance benefit to artificially limiting the size of the column. For other databases, the common practice is to use something like VARCHAR(255). Think about it: even if it is a number, that’s up to 10^255!

Twitter claims that its API is RESTful. And if to you, REST means nice, predictable URLs with some semantic path possibly followed by a numeric id and returning numeric ids in search results, then yes, it’s RESTful. Want to see the most recent messages for a user? There’s a simple HTTP request you can make to a nice, semantic (if you speak English) URL that returns a list of them and their identifiers. And, as expected, our Twitter clients have been dutifully squirrelling these ids away in integer fields (probably because that’s the default) and all was well until 2 days ago.

Now, without going into too much of an ideological rant, I happen to subscribe to the principle that RESTful URLs should be opaque. That is, a URL is a URL is a URL. No slicing, no dicing, no assembling, no joining. If I have a URL to a resource then that’s what I use. Period. End of story. (You can find plenty of discussion on this by Roy Fielding using Google.)

So, back to our column widths. Assuming we have a text field in our database large enough to accommodate a URL, we could go one step further. Rather than treating the identifier as text and storing that, why not go the whole hog and store the URL instead?

As far as I can tell, the only reason is that Twitter’s API, RESTful though Twitter may claim it to be, sends back numeric identifiers rather than URLs, which in turn leads developers to incorrectly assume that they should be storing them as numbers.

On their own, identifiers are meaningless and in fact, useless. To utilise an identifier requires us to know the system in which it is stored and the collection in which it belongs. If instead each piece of information was identified by a URL we get all that context for free and the power to share information grows phenomenally.

To me, the beauty and power of the internet is the ability to link together disparate systems in ways no one had previously imagined. More specifically, in ways the publishers of the information never considered.

Opaque URLs combined with idiomatic use of HTTP verbs can help reduce the coupling between producers and consumers by giving back control to producers in how and where they store information and at the same time increasing the freedom for others to share and use that information.

(That last paragraph reads like an Amnesty International commercial!)

Shameless Self Promotion

By

So for the past couple of months, I’ve finally had the luxury of starting to realise my (and Cogent’s) dream of doing product development.

We just recently launched what we hope is a very simple, easy to use and somewhat opinionated web application for Getting Things Done™ (GTD). It’s a crowded market to be sure but we really believe we understand GTD well enough to deliver a system that is more than just a to-do list with GTD inspired keywords.

Runway is still in the early stages of feature development. For those that know anything about GTD, you’ll be happy to hear that we’re working on delivering Projects, Artifacts, Agendas and of course an Inbox, to name but a few, in the very near future.

What you see now is, and will always be, free. At some point we’ll be adding pay-for features but we’ll also be doing the right thing by all our early adopters. So, if you have 5 or so minutes, we’d love for you to sign up, have a play, and of course, tell us what you think.

Web standards and all I got was this lousy website

By

Over the Easter long weekend, I had a great break from work and a great opportunity to think about and reflect on my career, my job, and my profession as a whole. It’s safe to say I became a bit disheartened and disillusioned. The one striking conclusion I kept arriving at is that we are so technology focused that we spend too much time, money and effort building things that the customer is “happy” with but not blown away by. That we artificially constrain the end user experience based on our notions of “correctness”. In particular, web application development is largely a bunch of dick-pulling technical masturbation, forever re-inventing the wheel at a ridiculously low level of abstraction, shoving our technological solutions down users’ throats in the name of “software engineering”. What’s worse is that I’ve been complicit. Not only by buying the hype but often by trying to do “the right thing” even when I felt as though I was bashing my head against a brick wall.

Rewind the clock to somewhere between 1996 and 1999. During those years I, along with a good friend and colleague, built a desktop application that was delivered to thousands of users across Australia using nothing more than good old-fashioned client-server SQL written in, of all things, PowerBuilder – kinda like VisualBasic. More than 10 years ago, the user experience was compelling and sophisticated, it performed exceptionally well over 2400 baud dialup modems, and we built the initial release with only 2 people over 3 months from scratch. As shameful as it is, especially coming from one so vocal about automated testing as I, we had nothing but manual testing but we also had few bugs and when users did find a problem, we fixed and redeployed within 24 hours – mostly because we didn’t want to interrupt users as they worked and so waited until after hours. Over the next 12 months, we were able to adapt to the user’s needs immediately. Rarely did we add the features as requested, but we always managed to produce a solution they actually needed. Fast forward a decade and I feel like I suck because I honestly don’t believe I could do the same thing again today. In fact, I challenge any of us to build the same user experience with our existing technology stack.

To those that know me well, I will no doubt sound like a broken record but I can’t help feel we’ve been trying to coerce HTML & CSS into something they just aren’t and doing so for a decade now.

Think about it, HTML: HyperText Markup Language. Does that sound like it has anything to do with layout and design? In fact, do you know any designers, even those that call themselves web designers, that do any of their design work in HTML/CSS? No – well, none that I’ve ever heard of. The closest I can think of is a colleague who does his wireframes in OmniGraffle and then generates HTML/CSS. Why? I put it to you that it’s because we don’t think in HTML/CSS. You CAN’T effectively think in HTML/CSS, and if a guy whose expertise lies in designing user interfaces can’t think in terms of HTML/CSS, why the hell do we think we should?

HTML was designed for linking documents with a modicum of layout and has served that purpose admirably. As a result, the web browser largely won the battle for desktop supremacy and almost everyone has a web browser and regularly uses a number of web sites. Similarly, pretty much everyone has a computer running an 80x86-based CPU and runs dozens of applications built specifically for it. HTML/CSS are the machine language of the web.

For those of us lucky enough to have done any assembler programming, we’ve also been lucky enough not to have had to do any for a very long time. Instead, we chose to move away from assembler to other languages. C, C++, Java, Smalltalk, Python, Perl, Ruby, literally dozens of other programming languages that have systematically improved the level of abstraction. Many of these languages now run on top of the JVM, LLVM, CLR, etc. themselves abstractions on top of the underlying CPU.

Did we move because the runtime was faster? Hardly. In fact in almost all cases outrageous claims were made early on that poor performance would be the undoing of these languages and in almost all cases these claims ultimately proved unfounded. No, we moved to these languages because we hoped they would give us a better level of abstraction. That we could code more closely to the way we think. That we would one day realise the dream of literally thinking in code.

Even within languages we constantly strive to improve the level of abstraction. In many cases we’ve created Domain-Specific-Languages in order that we are better able to think IN the language most appropriate to the task at hand rather than needing to perform some contorted mapping process. This is the reason the Ruby community has slowly moved from Test::Unit to RSpec/Shoulda: Test::Unit does the job just fine but it’s verbose and “too close to the metal”. Just like assembler. When I’m the most productive I’m literally thinking in code.

We’ve largely sorted the back-end problems: Database access layers, routing, data format conversion, validation, you name it it’s all been largely worked out in whatever framework and language combination you can imagine. The same cannot be said of the front-end WHERE IT ACTUALLY MATTERS.

Granted, HTML/CSS has undergone change but to what extent and to what end? We have JSP, ASP, ERB, HAML, SASS, Liquid, blueprint, jQuery, Prototype, MooTools, Dojo, YUI, etc. but none of them appreciably raises the level of abstraction. Most advances in the world of HTML/CSS are lipstick. They’re all constrained by the fallacy that HTML/CSS is the holy grail of web design. No, the whole problem with web development is that we haven’t abstracted away the underlying technology, instead we’ve been conned by a bunch of HTML/CSS gurus and boffins who think that designing the perfect machine code is all the world needs. There is nothing more primitive than HTML+CSS when it comes to the web.

HTML & CSS try to be all things to all people and, by doing so, much like J2EE, have left us with a set of primitive tools that are repetitive, verbose, hard to test, maintain and refactor, and that ultimately provide a user experience that can best be described as a tarted-up, 24-bit 3270 terminal. Don’t believe me? Point me at a website where the user experience feels liquid and natural. Where it literally gets out of your way so that you never even realise you’re using it? For the most part you can’t. The poster children of the Rails world provide at best a rudimentary user experience. I suspect people use them because there is no alternative, not because it’s actually a great UX. Why? IMHO because the technology choices are just plain awful. If you can find a website with a rich user experience that just melts away, you’ll likely find a bunch of developers who either had nervous breakdowns or spent many years building some superduper framework, or both!

To be fair I’m no doubt coming across as though HTML/CSS is to blame for all the world’s problems. Not at all. We suffer from similar problems across the board in software development. It just so happens that I’ve been in the world of web development for a long time now and feeling the effects.

I’m not advocating the use of any particular technology – that would kinda defeat the purpose of my argument. What I am saying is that I believe we’re stuck in a mindset that only allows us to think inside the incredibly narrow bounds of something we’re used to, IMHO, only because it’s all we’re used to.

Rather than embracing the “web paradigm” how about we embrace the user and their experience and decide what technology would best enable us to deliver that.

A Title Case Gem for Ruby

By

A project I’m working on called for some “smart” capitalisation of page titles. Essentially I wanted to take a URL slug and generate a page title.

Rails comes with a built-in String#titleize method that capitalises every word but that looked a little odd when the title was something like: “My Hovercraft Is Full Of Eels”. So I went on a hunt for something “smarter”.

After a little search I stumbled upon Marshall Elfstrand’s JavaScript, Ruby, and Objective-C ports of John Gruber’s “Title Case” algorithm and decided to turn it into a Gem that adds String#titleize and String#titleize! (aliased as #titlecase, and #titlecase! respectively). When used in a Rails environment, this effectively replaces the Rails versions.

Now my page titles look a little more human-like: “My Hovercraft is Full of Eels”.
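For the curious, the general shape of a “smart” title-casing algorithm is easy to sketch. This is my own simplification, not the gem’s code, and the small-word list is illustrative (I’ve included “is” so the output matches the example above):

```ruby
# A simplified "smart" title case: capitalise every word except a list
# of small words, but always capitalise the first and last words.
# Illustrative only -- not the gem's implementation.
SMALL_WORDS = %w[a an and as at but by for if in is of on or the to via].freeze

def smart_titlecase(title)
  words = title.downcase.split
  words.each_with_index.map { |word, i|
    if i.zero? || i == words.length - 1 || !SMALL_WORDS.include?(word)
      word.capitalize
    else
      word
    end
  }.join(" ")
end

smart_titlecase("my hovercraft is full of eels")
# => "My Hovercraft is Full of Eels"
```

The real algorithm also handles punctuation, acronyms, and words with embedded capitals, which this sketch ignores entirely.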

Plugins move

By

Following hot on the heels of my blog move, I’ve finally moved all my rails plugins off the venerable RubyForge and onto GitHub.

Since I started working at CogentConsulting – no we’re not “The Company of ex-ThoughtWorkers” unless you count all 3 of us as somehow statistically significant – I’ve had less and less time and less and less inclination to spend any appreciable effort on RedHill related stuff to the point where the company really exists just to support and market Simian.

As a consequence, I’ve also dropped the RedhillOnRails moniker in favour of publishing the plugins under my personal account.

Blog move

By

If you’re reading this then the move of my blog was successful, and thank you for putting up with a screwy RSS feed during the transition. No doubt you received double or possibly even triple posts.

Why the move? Well, even though GeekISP have been a fantastic hosting provider over the years and MovableType has been pretty reliable as a blogging platform, in my never ending quest to Do Less Stuff, I figured it was time to move the pain somewhere else.

From a technical perspective, the move was fairly easy, though not without some pain. There is no direct way to import from MT to Blogger; however, I did find a tool that helped convert the MT export file into something Blogger could import.

I also wrote a quick script to replace all internal references with new links, as well as to generate a new .htaccess file for any links from the outside world. This step was pretty easy although it took some trial and error to work out how Blogger converts titles into URLs – as near as I can tell it truncates to a maximum of 40 characters with a bias towards word boundaries. The duplicate posts appearing in the RSS feed were a direct result of me re-creating the entire blog several times fixing little things here and there.
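The truncation behaviour I observed could be approximated in Ruby something like this. It’s a guess based on trial and error, not Blogger’s documented behaviour, so treat it as a sketch:

```ruby
# Approximate how Blogger appeared to turn a post title into a URL slug:
# lowercase, strip punctuation, hyphenate, then truncate to 40 characters
# with a preference for cutting at a word boundary.
def blogger_slug(title, limit = 40)
  slug = title.downcase.gsub(/[^a-z0-9\s]/, "").strip.gsub(/\s+/, "-")
  return slug if slug.length <= limit

  truncated = slug[0, limit]
  # Drop the trailing partial word, if any, so we end on a boundary.
  truncated.sub(/-[^-]*\z/, "")
end

blogger_slug("A Title Case Gem for Ruby")
# => "a-title-case-gem-for-ruby"
```

Pairing each old MT permalink with the output of something like this made generating the .htaccess redirect rules a fairly mechanical exercise.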

And so it is that my blog comes to be here on Blogger. The next step is to move all my domain hosting to Google Sites but that’s for another day. Hopefully this will be the last move for some time and, with someone else maintaining my blogging software, there should be less stuffing around on my part.

Rails, meet Drupal.

By

If you’ve been considering integrating (or replacing) your Drupal application with a Rails application, then Drupal Fu may come in handy.

It’s pretty rough-and-ready – I essentially just ripped the code out of an existing application and cobbled it together – with, as yet, no plugin infrastructure, Rakefile, or anything else that might give you a degree of confidence in the quality of the code :)

That said, the code has been working in a production application for a while and we figured it might help out some others going through the same pain.

Acts As Teapot

By

No, it’s not April Fools yet but I thought I’d get in early this year. Acts As Teapot is a Ruby on Rails plugin that ensures your Ruby on Rails applications conform to RFC2324. My assumption here is that your application is not a coffee pot and therefore does not understand the Hyper Text Coffee Pot Control Protocol (HTCPCP/1.0). Thus, if ever a BREW request or any other request with the Content-Type set to “application/coffee-pot-command” is received, the server will respond with 418 I’m a teapot.
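Stripped of the plugin machinery, the idea fits comfortably in a small piece of Rack middleware. This is a sketch of the behaviour described above, not the plugin’s actual implementation:

```ruby
# Respond with 418 "I'm a teapot" (RFC 2324) to any BREW request, or any
# request with the HTCPCP coffee-pot content type; pass everything else
# through to the wrapped application.
class Teapot
  def initialize(app)
    @app = app
  end

  def call(env)
    if env["REQUEST_METHOD"] == "BREW" ||
       env["CONTENT_TYPE"] == "application/coffee-pot-command"
      [418, { "Content-Type" => "text/plain" }, ["I'm a teapot"]]
    else
      @app.call(env)
    end
  end
end
```

Wiring it into a Rails app is then just a matter of adding it to the middleware stack.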

TimeMachine FTW!

By

Notwithstanding the fact that I needed to restore my operating system in the first place – due to an inexplicable and catastrophic failure of the Java installation resulting in segfaults – I was able to restore my entire 100GB system in around 4 hours. For posterity:

  • Boot off the OS X System Install DVD – hold down option while the system starts
  • Connect the external drive with the TimeMachine backup – in my case a TimeCapsule attached via ethernet
  • Select “Restore from TimeMachine backup” in the Utilities menu
  • Select the specific backup (by timestamp) from which to restore
  • And away you go!

The disk is then automatically erased and a fully bootable system is restored sans temp directories and cache files. It even managed to restore my PostgreSQL databases that were running at the time – which probably says more about PostgreSQL than anything.

The one grumble I do have is that the timestamps in the names of the backups were offset from the actual backup dates by some non-obvious period. The difference wouldn’t have been much of an issue had I simply needed to restore the most recent backup, but as it turned out I needed to go back a couple of days in order to get a clean system. Thankfully I got lucky on the second attempt :)

Once I had restored the system I took a look at the backup folders and sure enough there are two timestamps: the one in the folder name, and the created date. The created timestamp was spot on but the one in the folder name – the one presented to you when restoring – was wacky. I honestly didn’t spend long enough to calculate whether the difference was consistent.

What is really interesting is that I had SuperDuper! on my list of software to start using but it would appear there is little need – at least in my case.