Imprisoned by Statistics: How Poor Data Collection and Analysis Sent an Innocent Nurse to Jail

If you want to convince someone that at least a basic understanding of statistics is an essential life skill, bring up the case of Lucia de Berk. Hers is a story that's too awful to be true—except that it is completely true.

A flawed analysis irrevocably altered de Berk's life and kept her behind bars for five years, and the fact that this analysis targeted and harmed just one person makes it more frightening. When tragedy befalls many people, aggregating the harmed individuals into a faceless mass helps us cope with the horror. You can't play the same trick on yourself when you consider a single innocent woman, sentenced to life in prison, thanks to an erroneous analysis.

The Case Against Lucia

It started with an infant's unexpected death at a children's hospital in The Hague. Administrators subsequently reviewed earlier deaths and near-death incidents, and identified 9 other incidents in the previous year they believed were medically suspicious. Dutch prosecutors proceeded to press charges against pediatric nurse Lucia de Berk, who had been responsible for patient care and medication at the time of all of those incidents. In 2003, de Berk was sentenced to life in prison for the murder of four patients and the attempted murder of three.

The guilty verdict, rendered despite a glaring lack of physical or even circumstantial evidence, was based (at least in part) on a prosecution calculation that only a 1-in-342-million chance existed that a nurse's shifts would coincide with so many suspicious incidents. "In the Lucia de B. case statistical evidence has been of enormous importance," a Dutch criminologist said at the time. "I do not see how one could have come to a conviction without it." The guilty verdict was upheld on appeal, and de Berk spent the next five years in prison.

One in 342 Million...?

If an expert states that the probability of something happening by random chance is just 1 in 342 million, and you're not a statistician, perhaps you'd be convinced those incidents did not happen by random chance.

But if you are statistically inclined, perhaps you'd wonder how experts reached this conclusion. That's exactly what statisticians Richard Gill and Piet Groeneboom, among others, began asking. They soon realized that the prosecution's 1-in-342-million figure was very, very wrong.

Here's where the case began to fall apart—and not because the situation was complicated. In fact, the problems should have been readily apparent to anyone with a solid grounding in statistics.

What Prosecutors Failed to Ask

The first question in any analysis should be, "Can you trust your data?" In de Berk's case, it seems nobody bothered to ask.

Richard Gill graciously attributes this to a kind of culture clash between criminal and scientific investigation. Criminal investigation begins with the assumption that a crime occurred, and proceeds to seek out evidence that identifies a suspect. A scientific approach begins by asking whether a crime was even committed.

In Lucia's case, investigators took a decidedly non-scientific approach. In gathering data from the hospitals where she worked, they omitted incidents that didn't involve Lucia from their totals (cherry-picking), and made arbitrary and inconsistent classifications of other incidents. Incredibly, events de Berk could not have been involved in were nonetheless attributed to her. Confirmation and selection bias were hard at work on the prosecution's behalf.

Further, much of the "data" about events was based on individuals' memories, which are notoriously unreliable. In a criminal investigation, witnesses know what's being sought and may already have opinions about a suspect's guilt, so relying on their memories of events that happened weeks or months earlier is particularly dubious. Nonetheless, the prosecution's statistical experts deemed the data gathered under such circumstances trustworthy.

As Gill, one of the few heroes in this sordid and sorry mess, points out, "The statistician has to question all his clients’ assumptions and certainly not to jump to the conclusions which the client is aiming for." Clearly, that did not happen here.

Even If the Data Had Been Reliable...

So the data used against de Berk didn't pass the smell test for several reasons. But even if the data had been collected in a defensible manner, the prosecution's statement about 1-in-342-million odds was still wrong. To arrive at that figure, the prosecution's statistical expert multiplied p-values from three separate analyses. However, a product of p-values is not itself a p-value: multiplying several numbers between 0 and 1 always yields a smaller number, whether or not anything suspicious happened. The expert failed to apply the statistical correction that combining p-values requires, resulting in a figure that was far, far lower than it should have been. You can read the details about these calculations in this paper.
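To see why multiplying p-values misleads, here is a minimal Python sketch. It compares a raw product of p-values with Fisher's method, one standard way to combine independent p-values. The three input values are hypothetical and chosen only for illustration; they are not the figures from the trial, and Fisher's method is not necessarily the correction the statisticians applied to de Berk's case. The function names are also made up for this example.

```python
import math


def naive_product(p_values):
    """Multiply p-values together. The result is NOT a valid p-value."""
    return math.prod(p_values)


def fisher_combined(p_values):
    """Combine independent p-values with Fisher's method.

    Fisher's statistic is X = -2 * sum(ln p_i), which follows a chi-square
    distribution with 2k degrees of freedom when every null hypothesis is
    true. For even degrees of freedom the upper tail has a closed form:
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is x/2
    tail_sum = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail_sum


# Hypothetical p-values (an assumption for illustration, not trial data).
hypothetical = [0.05, 0.05, 0.05]

print("naive product:  ", naive_product(hypothetical))
print("Fisher combined:", fisher_combined(hypothetical))
```

With these made-up inputs, the corrected value comes out dozens of times larger than the raw product. Note that Fisher's method itself assumes the separate analyses are independent, which is one more thing an analyst would need to check before trusting the result.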

In fact, when statisticians, including Gill, analyzed the prosecution's data using the proper formulas and corrected numbers, they found that the chance a nurse would experience the pattern of events exhibited in the data could have been as high as 1 in 25.

Justice Prevails at Last (Sort Of)

Even though de Berk had exhausted her appeals, thanks to the efforts of Gill and others, the courts finally re-evaluated her case in light of the revised analyses. The nurse, now declared innocent of all charges, was released from prison (and quietly given an undisclosed settlement by the Dutch government). Still, the justice system remained blind to the statistical problems in this case for nearly 10 years and through multiple appeals, during which the innocent de Berk suffered a stress-induced stroke. It's well worth learning more about the role of statistics in her experience if you're interested in the impact data analysis can have on one person's life.

At a minimum, what happened to Lucia de Berk should be more than enough evidence that a better understanding of statistics could set you free.

Literally.