Relative confidence in scientific theories

One of the challenges of dealing with climate change is that it’s difficult to communicate to the public how much confidence the scientific community has in a particular theory. Here’s a hypothesis: people have a better intuitive grasp of relative comparisons (A is bigger than B) than of absolutes (we are 90% confident that “A” is big).

Assuming this hypothesis is true, we could run a broad survey of scientists and use the results to rank-order confidence in various scientific theories that the general public is familiar with. Possible examples of theories:

  • Plate tectonics
  • Childhood vaccinations cause autism
  • Germ theory of disease
  • Theory of relativity
  • Cigarette smoking causes lung cancer
  • Diets rich in saturated fats cause heart disease
  • AIDS is caused by HIV
  • The death penalty reduces violent crime
  • Evolution by natural selection
  • Exposure to electromagnetic radiation from high-voltage power lines causes cancer
  • Intelligence is inherited biologically
  • Government stimulus spending reduces unemployment in a recession

Assuming the survey produced a (relatively) stable rank-ordering across these theories, the end goal would be to communicate confidence in a scientific theory by saying: “Scientists are more confident in theory X than they are in theories Y and Z, but not as confident as they are in theories P and Q”.
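To make the idea concrete, here is a minimal sketch of how such a comparative statement could be generated from survey data. Everything in it is hypothetical: the respondents, the theory labels, and the choice of Borda-style mean-rank aggregation are illustrative assumptions, not a worked-out survey methodology.

```python
from statistics import mean

# Hypothetical survey responses: each scientist rank-orders the theories,
# with position 0 meaning "most confident". All data here is made up.
rankings = {
    'alice': ['germ theory', 'plate tectonics', 'relativity', 'HIV causes AIDS'],
    'bob':   ['plate tectonics', 'germ theory', 'HIV causes AIDS', 'relativity'],
    'carol': ['germ theory', 'relativity', 'plate tectonics', 'HIV causes AIDS'],
}

# Borda-style aggregation: a theory's score is its mean position across
# respondents; a lower score means higher collective confidence.
theories = rankings['alice']
score = {t: mean(r.index(t) for r in rankings.values()) for t in theories}
ordered = sorted(theories, key=score.get)

def comparative_statement(theory):
    """Phrase a theory's standing relative to the rest of the ranking."""
    i = ordered.index(theory)
    less, more = ordered[i + 1:], ordered[:i]
    return ('Scientists are more confident in %s than in %s, '
            'but not as confident as they are in %s.'
            % (theory, ', '.join(less) or '(nothing)',
               ', '.join(more) or '(nothing)'))

print(comparative_statement('relativity'))
```

Whether real survey data would produce a stable ordering, and whether mean rank is the right aggregation, are exactly the questions the proposed study would have to answer.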

How do you capture that?

This email from the OpenStack mailing list is a good illustration of the design rationale capture problem:

To: Yun Mao <yunmao@xxxxxxxxx>
From: Vishvananda Ishaya <vishvananda@xxxxxxxxx>
Date: Thu, 1 Mar 2012 12:36:43 -0800

Yes it does. We actually tried to use a pool at diablo release and it was very broken. There was discussion about moving over to a pure-python mysql library, but it hasn’t been tried yet.

Vish

On Mar 1, 2012, at 11:45 AM, Yun Mao wrote:

> There are plenty eventlet discussion recently but I’ll stick my
> question to this thread, although it’s pretty much a separate
> question. 🙂
>
> How is MySQL access handled in eventlet? Presumably it’s external C
> library so it’s not going to be monkey patched. Does that make every
> db access call a blocking call? Thanks,
>
> Yun
>

The problem here is that a database query can block an entire OpenStack Compute service until the query completes, because the implementation uses a green-threads library (eventlet) instead of native threads: the MySQL client is an external C library, so eventlet can’t monkey-patch it, and every call into it blocks the event loop. The OpenStack developers tried a non-blocking solution (a connection pool) in the Diablo release, but it broke things, so it was abandoned.
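Here is a minimal sketch of the failure mode, assuming eventlet. Since a real MySQL C driver isn’t needed to demonstrate it, the blocking C-library call is simulated with the unpatched time.sleep, which eventlet likewise cannot make cooperative:

```python
import eventlet
from eventlet import patcher

# monkey_patch() swaps blocking stdlib calls (sockets, time.sleep, ...)
# for cooperative versions, but it cannot reach inside a C extension
# such as the classic MySQLdb driver.
eventlet.monkey_patch()

# patcher.original() returns the unpatched module; its sleep() blocks
# the whole OS thread, standing in for a call into a C database library.
blocking_sleep = patcher.original('time').sleep

def fake_db_query(n):
    print('query %d: start' % n)
    blocking_sleep(1)  # no other green thread runs while this "query" executes
    print('query %d: done' % n)

pool = eventlet.GreenPool()
for i in range(3):
    pool.spawn(fake_db_query, i)
pool.waitall()
```

The three “queries” run strictly one after another, taking about three seconds in total. Replace blocking_sleep(1) with eventlet.sleep(1) and they overlap, finishing in about one second: that is the difference Yun Mao is asking about.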

This is really a challenge problem for software engineering: how do you capture this type of information so that a new developer can understand why the code was implemented that way, without depending on the existence of something like the OpenStack mailing list?

My hypothesis is that building up this knowledge incrementally, using a StackOverflow-style Q&A approach, is the best way to go. It would be great if we could write a comprehensive design document, but I don’t think it’s possible to know in advance what questions your future reader will want answered. If you instead build the documentation by answering people’s questions as they come up, you’re freed from having to guess what they will need to know.

Here’s another study I’ve always wanted to run: evaluate how well an author of software documentation can predict:

  • what sort of questions the documentation reader will want answered by the docs
  • the amount of prior knowledge the reader will already have

Lacking a frame of reference

I was glancing through one of the LinkedIn software group discussions, and noticed that the poor state of software development was being discussed. Whenever I hear these laments, the question that comes to my mind is, “compared to what?”

It isn’t obvious that software development is in much poorer shape than, say, civil or mechanical engineering, and I’m not even sure how to make a meaningful comparison. Consider IEEE’s Risk Factor blog. Yes, expensive software failures are still all too common. Yet, as I write this, the second Risk Factor post from the top discusses the fatal Washington, DC subway crash of 2009, which was caused by an electrical circuit failure, not a software defect. While the field of software should always strive for perfection, perfection isn’t a realistic standard against which to be judged and found wanting.

As an aside, here’s a study I’ve always wanted to do: compare cost and schedule overruns for government IT projects versus government construction projects of similar initial budget and schedule projections. The raw data should be publicly available, assuming one knows where to look. Comparing how well the projects met their requirements across the domains would be more challenging.