How do you capture that?

This email from the OpenStack mailing list is a good illustration of the design rationale capture problem:

To: Yun Mao <yunmao@xxxxxxxxx>
From: Vishvananda Ishaya <vishvananda@xxxxxxxxx>
Date: Thu, 1 Mar 2012 12:36:43 -0800

Yes it does. We actually tried to use a pool at diablo release and it was very broken. There was discussion about moving over to a pure-python mysql library, but it hasn’t been tried yet.

Vish

On Mar 1, 2012, at 11:45 AM, Yun Mao wrote:

> There are plenty eventlet discussion recently but I’ll stick my
> question to this thread, although it’s pretty much a separate
> question. 🙂
>
> How is MySQL access handled in eventlet? Presumably it’s external C
> library so it’s not going to be monkey patched. Does that make every
> db access call a blocking call? Thanks,
>
> Yun
>

The problem here is that a database query can block an entire OpenStack Compute service until the query completes. The implementation uses a green threads library (eventlet) instead of native threads, and the MySQL driver is an external C library that eventlet cannot monkey patch. The OpenStack developers tried a pool-based workaround for the Diablo release, but it broke things, so it was abandoned.
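
To make the blocking behavior concrete, here is a minimal sketch. It is not OpenStack code: it uses Python’s standard asyncio as a stand-in for eventlet (both schedule tasks cooperatively on a single OS thread), and time.sleep as a stand-in for a call into a C database driver. All function names are hypothetical. The thread-pool variant shows one common workaround, not necessarily the one the OpenStack developers tried.

```python
import asyncio
import time


def blocking_query(seconds):
    # Stand-in for a call into a C database driver: the cooperative
    # scheduler cannot interrupt it, so nothing else runs meanwhile.
    time.sleep(seconds)
    return "row"


async def heartbeat(ticks, interval):
    # Stand-in for the other work the service should keep doing.
    while True:
        await asyncio.sleep(interval)
        ticks.append(time.monotonic())


async def _run(offload):
    ticks = []
    task = asyncio.create_task(heartbeat(ticks, 0.02))
    await asyncio.sleep(0)  # yield once so the heartbeat task starts
    if offload:
        # Run the query in a native thread so the scheduler stays free.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, blocking_query, 0.3)
    else:
        # Call the query directly: the whole scheduler stalls here.
        blocking_query(0.3)
    count = len(ticks)
    task.cancel()
    return count


def ticks_during_query(offload):
    """Return how many heartbeats ran while the query was in progress."""
    return asyncio.run(_run(offload))


if __name__ == "__main__":
    print("direct call:", ticks_during_query(False))
    print("thread pool:", ticks_during_query(True))
```

With the direct call, the heartbeat never runs while the query is in progress, so the count is 0. With the query handed off to a native thread, the heartbeat keeps ticking throughout.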

This is really a challenge problem for software engineering: how do you capture this type of information so that a new developer can understand why the code was implemented that way, without depending on the existence of something like the OpenStack mailing list?

My hypothesis is that building up this knowledge incrementally is the best way to go, using a StackOverflow-style Q&A approach. It would be great if we could write a comprehensive design document, but I don’t think it’s possible to know in advance what sort of questions your future reader will want answered. If you instead build the documentation up by answering the questions people actually ask, you no longer have to guess what they will need to know.

Here’s another study I’ve always wanted to run: evaluate how well an author of software documentation can predict:

  • what sort of questions the documentation reader will want answered by the docs
  • the amount of prior knowledge the reader will already have

Lacking a frame of reference

I was glancing through one of the LinkedIn software group discussions, and noticed that the poor state of software development was being discussed. Whenever I hear these laments, the question that comes to my mind is, “compared to what?”

It isn’t obvious that software development is in much poorer shape than, say, civil or mechanical engineering, and I’m not even sure how to make a meaningful comparison. Consider IEEE’s Risk Factor blog. Yes, expensive software failures are still an all-too-common occurrence. Yet, as I write this, the second Risk Factor post from the top discusses the fatal Washington, DC subway crash in 2009, which was due to an electrical circuit failure, not a software defect. While the field of software should always strive for perfection, perfection isn’t a realistic standard for it to be judged against and found wanting.

As an aside, here’s a study I’ve always wanted to do: compare cost and schedule overruns for government IT projects versus government construction projects of similar initial budget and schedule projections. The raw data should be publicly available, assuming one knows where to look. Comparing how well the projects met their requirements across the domains would be more challenging.