Published on March 15, 2011

Why does medical research sometimes get it so wrong?

Douglas Altman, back in 1994, wrote a highly influential paper in the British Medical Journal entitled ‘The scandal of poor medical research’ (1). It focused on the prevalence of poor design and analysis in medical research. Altman believed the reason was a general failure to appreciate the principles underlying scientific research, coupled with the ‘publish or perish’ climate – where scientific merit was measured by quantity, not quality. He argued that, since this system encouraged poor research, it needed to change. The ‘system’ proved resistant to such fundamental change, and the main outcome of his plea was the introduction of accepted standard reporting formats for randomised trials (CONSORT) and observational studies (STROBE).

Martin Bland, a prime advocate of improving standards, reviewed the situation in 2010 (2). He felt that ‘major’ journals had shown big improvements, concurrent with the drive for evidence-based medicine, extensive use of meta-analyses, trials with much larger numbers of participants, use of confidence intervals rather than P-values, and improved statistical refereeing. Bland found scant improvement among specialist clinical journals ‘where statisticians seldom venture’, and in biomedical laboratory research (3). Should this concern us? To show that such matters are not merely of academic interest, let us consider two recent examples.

(In)dependent events

In 1999 Sally Clark, a British solicitor, was accused of murdering her two infant children. Since there was no actual evidence of murder, the defence argued for sudden infant death syndrome. Nevertheless she was convicted of murder by majority verdict. The prosecution’s case was much strengthened by the ‘expert’ evidence of the paediatrician Roy Meadow – who first described ‘Munchausen syndrome by proxy’, in which mothers or other caretakers harm or even kill the children in their care (4). Meadow testified that the chance of two cases of sudden infant death syndrome occurring in her family was only 1 in 73 million. He estimated this on the basis that the risk of cot death in a family of her socio-economic status was one in 8543; hence, he argued, the risk of two infants dying by chance in one family was one in 8543², or roughly one in 73 million.
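
Meadow’s arithmetic is easy to reproduce, and a few lines of code also show how fragile it is. The sketch below (in Python; the one-in-8543 figure is the one quoted at trial, while the dependence multiplier k is purely hypothetical) previews the flaw discussed next – the answer changes dramatically once the two deaths are allowed to be dependent:

```python
# Meadow's calculation, and what happens to it once the two deaths
# are allowed to be dependent. The 1-in-8543 figure is the trial's;
# the multiplier k is invented purely for illustration.

p_first = 1 / 8543                       # quoted risk of one cot death

# Meadow's figure: treat the deaths as independent, so the joint
# probability is simply the single-death probability squared.
p_independent = p_first ** 2             # 1 in 8543^2, ~1 in 73 million
print(f"Assuming independence: 1 in {1 / p_independent:,.0f}")

# If shared genetic/environmental factors raise the risk of a second
# death by a factor of k, then P(both) = P(first) * P(second | first).
k = 10                                   # hypothetical multiplier
p_dependent = p_first * (k * p_first)
print(f"With dependence (k = {k}): 1 in {1 / p_dependent:,.0f}")
```

Even this crude adjustment shrinks the headline figure by an order of magnitude; the point is not what the right number is, but that the ‘1 in 73 million’ rests entirely on the independence assumption.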

Unfortunately, Meadow’s calculation assumed those events occurred independently, whereas in this case genetic and environmental factors were obviously shared (5). Yet it required two appeals before Sally Clark’s conviction was overturned and she was freed. Worryingly, whilst many experts strongly criticised Meadow’s incompetence, the judicial incompetence went unnoticed. The crucial point was that, in the absence of hard evidence, convicting someone on the basis of probability alone is unjustifiable. Otherwise, given that the probability of winning a fortune is generally very much lower than one in 73 million, lottery winners ought to be convicted of fraud. In 2007 Mrs Clark died accidentally of acute alcohol intoxication. A family spokesman said that Sally had been unable to come to terms with the false accusations – based on flawed medical evidence and the failures of the legal system – which debased everything she had been brought up to believe in.

Correlation is not causation

Our second case study focuses on the age-old error of assuming that correlation proves causation. Until the late 1980s, separate vaccines were used to immunise children in the UK against measles, mumps and rubella, but in 1988 the combined MMR vaccine was introduced. Coverage was very high in the 1990s, and the number of children catching these diseases fell to an all-time low. Then, in 1998, a paper appeared in the (normally) reputable Lancet reporting gastrointestinal disease and behavioural disorders (autism) in twelve previously normal children (6). In most cases, onset of symptoms occurred after the children received the MMR vaccine. On this basis alone the author – Wakefield – called for further investigation of the possible relationship between autism and the vaccine and, at a subsequent press conference, argued that the MMR vaccine should be withdrawn. Yet the MMR vaccine is always given at around 12-15 months of age, and the mean age at which parents of children with autism first report concern is 18-19 months (irrespective of whether or not the children have received the vaccine). Hence there will inevitably be a close temporal association for affected individuals.
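
A toy simulation makes the inevitability of that temporal association explicit. In the Python sketch below, the ages (vaccination at 12-15 months, first parental concern at around 18-19 months) are taken from the paragraph above, but the distributions themselves are invented purely for illustration, and the age of symptom onset is drawn without any reference to vaccination at all:

```python
# Toy simulation: vaccination age and age of first parental concern are
# drawn independently, yet onset almost always follows MMR closely.
# Distributions are illustrative only; all ages are in months.
import random

random.seed(1)
n_children = 10_000                       # hypothetical affected children
onset_after_mmr = 0

for _ in range(n_children):
    mmr_age = random.uniform(12, 15)      # MMR given at 12-15 months
    concern_age = random.gauss(18.5, 1.5) # concern at ~18-19 months,
                                          # independent of the vaccine
    if 0 < concern_age - mmr_age <= 12:   # concern within a year of MMR
        onset_after_mmr += 1

print(f"{100 * onset_after_mmr / n_children:.1f}% of affected children "
      "report onset shortly after MMR, with no causal link built in")
```

With no causal effect whatsoever built into the simulation, essentially every affected child still shows the vaccine-then-symptoms sequence that so impressed Wakefield’s readers.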

The following year, an epidemiological study was carried out to investigate whether introduction of the MMR vaccine in 1988 had produced any change in the long-term trend in the number of cases of autism (7). No such change was detected, and it was concluded that there was nothing to suggest a causal association between the two. This provoked a letter from Wakefield (8), who criticised the study and presented data purporting to show such an association. Although his data failed to do so, the media played up the argument. So-called ‘balanced’ news reports wrongly suggested that both sides had an equal weight of evidence behind them. In just one month, early in 2002, there were hundreds of media reports – described by some as a media ‘feeding frenzy’. Despite many well-conducted studies in the early 2000s showing no evidence of any association, uptake of MMR vaccination fell alarmingly, and health workers warned that measles could once again become a serious public health problem.

In this instance it is debatable whether most blame should fall upon the media, the researcher, or the journal; the Lancet editors did not retract Wakefield’s paper until 2010. The problems were certainly magnified by massively inaccurate media reporting, which gave the public a highly biased and misleading picture of the risks of MMR vaccination. Ironically, the eventual outcome was not too serious in the UK. But in South Africa, where an anti-vaccination campaign thrives on the internet, reduced uptake of MMR vaccination led to an outbreak of measles that had reached 18,290 cases (9), plus an unknown number of deaths, by November 2010.

Confirmation bias

In both of these cases, researchers’ failure to understand basic principles of the scientific method led to serious consequences. Addressing these shortcomings would entail better science education, not only of researchers but also of the public and the judiciary. But there is another factor at work, albeit one largely ignored in most research, known as ‘confirmation bias’. This is probably the most widespread and insidious form of bias: scientists (and just about everyone else) search much harder for evidence to support their pet idea than for evidence to refute it – and weight that evidence accordingly. Roy Meadow originated the term Munchausen syndrome by proxy (now a generally recognised syndrome of child abuse), but his readiness to assume, with little or no supporting evidence, that the deaths of children resulted from it strongly suggests confirmation bias. In the MMR case there is abundant evidence that Andrew Wakefield’s anti-vaccine stance resulted in severe confirmation bias – and, according to a recent article (10), even outright fraud.

When faced with cases like these, it is clear that the introduction of standard reporting formats by journals is a palliative, not a cure. Most reviews of statistical practice worry endlessly about problems such as subgroup comparisons, or whether variances are homogeneous, but largely ignore the fundamental issues and rarely, if ever, address bias. Confirmation bias cannot be eliminated by ‘good design’, although its effects are minimised in randomised, blinded, controlled trials. Confirmation bias dictates which experimental and observational studies are done in the first place – and how their end results are evaluated. Media actions aside, the exponential growth in scientific publications, plus non-peer-reviewed material such as blogs, increases the chance of another ‘Wakefield event’. Perhaps Douglas Altman was right when he noted (1) that we need less research, better research, and research done for the right reasons if we are to improve the quality of scientific papers. Or perhaps, more realistically, we would be wise to cultivate a more critical approach to what is published, and to the underlying biases – even in ‘reputable’ journals.