Friday, October 13, 2006

After Rereading...

Never mind the previous post on the Johns Hopkins survey of deaths in Iraq since the war began. I had another look at the report, and I now know what's wrong with it.

As I wrote in my last post:

The idea of doing random "clusters" would be a contributing factor. What this presumably means is that specific neighborhoods were chosen, and houses were randomly selected in these neighborhoods. While that may make the interviewer's job easier, it introduces confounding variables to the survey. Deaths as a result of fighting (which are reported to be the main source of "excess deaths" in the report) are likely to "cluster," after all. If one household in a neighborhood is affected, others will be as well.

I was almost right. In fact, it's worse than I suspected. Their "clustering" method didn't choose random neighborhoods and then random houses within those neighborhoods. Instead, it chose 50 random start houses across Iraq and then surveyed the 39 houses nearest to each one. From the methodology section:

First, the location of the 50 clusters was chosen according to the geographic distribution of the population in Iraq. ... This sampling process went on randomly to select the town (or section of the town), the neighborhood, and then the actual house where the survey would start. This was all done using random numbers. Once the start house was selected, an interview was conducted there and then in the next 39 nearest houses. (pp. 3-4, emphasis added)
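
To make the quoted procedure concrete, here is a minimal sketch of how a single cluster gets built: one random start house plus its 39 nearest neighbors. The house coordinates, the town size, and the function name build_cluster are all made up for illustration; the report does not spell out its procedure at this level of detail.

```python
import math
import random

def build_cluster(houses, start_index, size=40):
    """Return indices of the start house plus the (size - 1) houses nearest to it."""
    sx, sy = houses[start_index]
    # Rank every house by straight-line distance from the start house.
    by_distance = sorted(
        range(len(houses)),
        key=lambda i: math.hypot(houses[i][0] - sx, houses[i][1] - sy),
    )
    # The start house is at distance 0, so it is always among the first `size`.
    return by_distance[:size]

rng = random.Random(0)
# Hypothetical town: 1,000 houses scattered over a 1,000 x 1,000 area.
houses = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(1000)]
start = rng.randrange(len(houses))   # the one randomly chosen start house
cluster = build_cluster(houses, start)
print(len(cluster))                  # 40 households, all neighbors of each other
```

The point of the sketch is only that a single random draw fixes all 40 households in a cluster.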

So I think we can all see how this works. Neighborhoods that saw no fighting will simply reflect the baseline rate, so even in a "worst-case" (for them) scenario the researchers are guaranteed to get at least the death rate recorded before the war. But they know that out of 50 neighborhoods they will occasionally get lucky and hit one that experienced heavy fighting. When that happens, that neighborhood throws off the average significantly, because all 40 of its households sit next to each other and a large share of them can be expected to report a death. Now remember, they're reporting an average death rate of about 13 per 1,000 people per year after the war started, against a baseline of 5.5 per 1,000 in the year before the war. All they really need to get shocking numbers is a rate of 10 per 1,000, because then they can plausibly claim that the death rate has nearly doubled. So it really only takes 2-5 heavy-casualty neighborhoods to get the numbers they're shooting for. I'm sure they didn't have much trouble.
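
As a rough back-of-the-envelope check on that last step, the sketch below works out how high the rate in the "hot" neighborhoods would have to be to pull the overall average up to 10. The 5.5 baseline, the 10 threshold, and the 50 clusters come from the paragraph above; the assumptions that every cluster carries equal weight and that every other cluster sits exactly at baseline are mine, as is the function name required_hot_rate.

```python
BASELINE = 5.5   # pre-war rate quoted above
TARGET = 10.0    # the "shocking" threshold discussed above (same units)
CLUSTERS = 50    # number of clusters in the survey design

def required_hot_rate(hot_clusters, baseline=BASELINE, target=TARGET,
                      clusters=CLUSTERS):
    """Average rate the hot clusters need for the overall mean to hit target.

    Assumes equal-weight clusters and all remaining clusters exactly at baseline.
    """
    quiet = clusters - hot_clusters
    # Solve (quiet * baseline + hot_clusters * r) / clusters = target for r.
    return (target * clusters - baseline * quiet) / hot_clusters

for k in (2, 5):
    print(k, required_hot_rate(k))
```

Under those assumptions, 5 hot neighborhoods would need to average about 50.5, and 2 hot neighborhoods about 118, in the same units as the baseline.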

OK - I have a firm opinion now. This survey is bogus. I believe in a number closer to 60,000 for "excess deaths" as a result of the invasion of Iraq.


