How do you capture that?

This email from the OpenStack mailing list is a good illustration of the design rationale capture problem:

To: Yun Mao <yunmao@xxxxxxxxx>
From: Vishvananda Ishaya <vishvananda@xxxxxxxxx>
Date: Thu, 1 Mar 2012 12:36:43 -0800

Yes it does. We actually tried to use a pool at diablo release and it was very broken. There was discussion about moving over to a pure-python mysql library, but it hasn’t been tried yet.

Vish

On Mar 1, 2012, at 11:45 AM, Yun Mao wrote:

> There are plenty eventlet discussion recently but I’ll stick my
> question to this thread, although it’s pretty much a separate
> question. 🙂
>
> How is MySQL access handled in eventlet? Presumably it’s external C
> library so it’s not going to be monkey patched. Does that make every
> db access call a blocking call? Thanks,
>
> Yun
>

The problem here is that a database query can block an entire OpenStack Compute service until the query completes. The service uses a green threads library (eventlet) instead of native threads, and the MySQL driver is an external C library that eventlet does not monkey patch, so a query never yields control to the other green threads. The OpenStack developers tried a non-blocking solution (a pool) in the Diablo release, but it was badly broken, so it was abandoned.
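To make the blocking behavior concrete, here is a minimal sketch. It uses Python's standard asyncio module as a stand-in for eventlet (an assumption made so the example runs without extra packages; this is not eventlet's API, and it is not the OpenStack code). The names `heartbeat`, `blocking_query`, `cooperative_query` and `ticks_during` are hypothetical. The `time.sleep` call plays the role of a C database driver that never hands control back to the scheduler.

```python
import asyncio
import time


async def heartbeat(ticks, stop):
    # Stands in for the other work the service should keep doing.
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def blocking_query():
    # Like a call into a C library: it never yields to the scheduler.
    time.sleep(0.2)


async def cooperative_query():
    # Like a patched, pure-Python driver: it yields while waiting.
    await asyncio.sleep(0.2)


async def ticks_during(query):
    """Count how many heartbeat ticks happen while the query is running."""
    ticks = []
    stop = asyncio.Event()
    task = asyncio.create_task(heartbeat(ticks, stop))
    await asyncio.sleep(0)  # let the heartbeat run its first step
    start = time.monotonic()
    await query()
    end = time.monotonic()
    stop.set()
    await task
    return sum(start < t < end for t in ticks)


if __name__ == "__main__":
    print("ticks during blocking query:   ",
          asyncio.run(ticks_during(blocking_query)))
    print("ticks during cooperative query:",
          asyncio.run(ticks_during(cooperative_query)))
```

With the blocking query the heartbeat gets no chance to run until the query returns. With the cooperative query it keeps ticking. Nothing in the blocking version's source code tells a new developer that this is happening, which is exactly the kind of rationale that ends up living only in a mailing list thread.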

This is really a challenge problem for software engineering: how do you capture this type of information so that a new developer can understand why the code was implemented that way, without depending on the existence of something like the OpenStack mailing list?

My hypothesis is that building up this knowledge incrementally is the best way to go, using a StackOverflow-style Q&A approach. It would be great if we could write a comprehensive design document, but I don’t think it’s possible to know in advance what sort of questions your future reader will want answered. But if you build it based on answering people’s questions, then that frees you up from trying to guess what it is they will need to know.

Here’s another study I’ve always wanted to run: evaluate how well an author of software documentation can predict what sort of questions the reader will want the docs to answer, and how much prior knowledge the reader will already have.

Unit testing: because all of the cool kids are doing it

While it’s fun to dump on the faddishness in software development, the rise of unit testing represents real progress in the state of the practice. As an example, I spend most of my time in the Django framework, and I appreciate how much it helps me get started writing and running unit tests.

Thanks go to the test-driven development movement for making testing cool and inspiring an ecosystem of automated unit testing tools, even if it turns out that TDD itself provides no measurable benefits (the evidence to date is mixed).

Predicting individual performance is hard

If they can’t even do it in basketball, the prospects for us being able to do it in software development are bleak.

Lacking a frame of reference

I was glancing through one of the LinkedIn software group discussions, and noticed that the poor state of software development was being discussed. Whenever I hear these laments, the question that comes to my mind is, “compared to what?”

It isn’t obvious that software development is in much poorer shape than, say, civil or mechanical engineering, and I’m not even sure how to make a meaningful comparison. Consider IEEE’s Risk Factor blog. Yes, expensive software failures are still all too common. Yet, as I write this, the second Risk Factor post from the top discusses the fatal Washington DC subway crash in 2009, which was due to an electrical circuit failure, not a software defect. While the field of software should always strive for perfection, perfection isn’t a realistic standard for the field to be judged against and found wanting.

As an aside, here’s a study I’ve always wanted to do: compare cost and schedule overruns for government IT projects versus government construction projects of similar initial budget and schedule projections. The raw data should be publicly available, assuming one knows where to look. Comparing how well the projects met their requirements across the domains would be more challenging.

Sitting has become synonymous with sloth

The Wall Street Journal explains stand-up meetings and agile software development to the layperson.

Redacting results in reviews

Here’s a thought experiment: what if empirical software engineering papers were first reviewed with the results redacted? Once the reviewers had submitted their reviews of the redacted paper, they would be shown the full paper and would do a full review. I wonder how much the final reviews would change in practice.

So, what happens if the programmer makes an error?

If you’re building tools for use by developers, especially novices, you need to ask yourself this question again and again, or they’re going to get mighty confused by the error messages. Via Hacker News.

Well-deserved

Nice to see Greg Wilson win a Sloan Foundation grant to advance his Software Carpentry project, an education project that teaches much-needed software development skills to scientists.

IEEE and the newspaper industry

Ian Sommerville went on a tear the other day about IEEE not supporting open access content. I suspect that IEEE uses the revenues it gets from publication subscriptions to subsidize other activities like conferences. I think the open access model for research publications is inevitable, especially in tech-savvy fields like electrical engineering and computer science. When that happens, the IEEE may find itself in the position that the newspapers did when one of their profit centers (classifieds) got disrupted by free online services like Craigslist, making it difficult to fund the cost centers (news reporting).

Mind you, if free journals and fewer subsidized IEEE conferences mean that computer science shifts from being conference-oriented to being journal-oriented like the rest of academia, it will probably be a benefit to the CS research community. But IEEE will have to either reinvent itself or go the way of the local newspaper.

Bifurcation

The IT community:

Shared-memory programming (pthreads) is awful! Message-passing (Erlang) would make our lives so much easier!

The high-performance computing community:

Message-passing (MPI) is awful! Shared-memory programming (OpenMP, PGAS) would make our lives so much easier!