swansont Posted September 20, 2007 Interesting commentary on an effect of publishing novel results. I get the impression this is far more prevalent in certain biology and medicine experiments, where you are more likely to have many variables that can't be held strictly constant. http://www.marginalrevolution.com/marginalrevolution/2005/09/why_most_publis.html So (if I'm reading this right) a snippet in the paper about a small-sample study showing some novel effect is likely to be found to be wrong when a larger study is done later to confirm it; the false-positive study is more likely to get published than the small-sample study that correctly showed no effect. The claim that most findings are false may not be right, but many of them can be, even with the system working right. Which explains why we keep hearing that something is good/bad for us, only to have someone come along later and say it's wrong. The more chilling aspect of this is that "with lots of researchers every crackpot theory will have at least one scientific study that it can cite in it's [sic] support" Which is why one has to look at all of the data/studies in a field, and not fixate on one or two that support your notion — outliers (and false positives or negatives) will always exist, and cherry-picking results is not science.
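A quick back-of-the-envelope illustration of that "at least one study" effect (a sketch of mine, not from the article): if many independent groups test a false hypothesis at the usual p < 0.05 threshold, the chance that at least one of them gets a publishable positive result grows fast.

```python
# Chance that at least one of N independent tests of a false (null)
# hypothesis comes out "significant" at p < 0.05 by luck alone.
alpha = 0.05
for n_studies in (1, 5, 20, 100):
    p_any = 1 - (1 - alpha) ** n_studies
    print(f"{n_studies:3d} studies: {p_any:.0%} chance of at least one false positive")
```

With 20 groups trying, a supportive-looking study exists about two times in three even when the effect is pure noise.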
Mr Skeptic Posted September 20, 2007 The claim that most findings are false may not be right, but many of them can be, even with the system working right. One must also consider that this study is claiming that most studies are false, so if it were true, one should expect this study to be false. This is not unlike the claim that 87.31% of statistics are made up on the spot. Actually, what the study is claiming is that most published studies are false. One problem is that results deemed statistically significant are sometimes due to pure coincidence, while some real effects fail to reach significance. Both problems worsen with smaller sample sizes, and also because editors preferentially publish surprising results (which tend to be false) rather than expected results (which tend to be true, but boring). But most of us already know that we should take small sample sizes and unexpected results with a grain of salt, so this is not as big a problem as is claimed.
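A minimal simulation of both failure modes (my sketch, with made-up effect sizes, not from the study): it draws many small two-group "studies" and counts how often a rough t-statistic clears the usual significance bar.

```python
import random
import statistics as stats

random.seed(1)

def hit_rate(n, effect=0.0, trials=2000, crit=2.0):
    """Fraction of simulated studies whose |t| exceeds ~the p < 0.05 cutoff."""
    hits = 0
    for _ in range(trials):
        a = [random.gauss(0.0, 1.0) for _ in range(n)]      # control group
        b = [random.gauss(effect, 1.0) for _ in range(n)]   # treatment group
        se = ((stats.variance(a) + stats.variance(b)) / n) ** 0.5
        if abs((stats.mean(b) - stats.mean(a)) / se) > crit:
            hits += 1
    return hits / trials

print("no real effect, n=10 per group :", hit_rate(10, effect=0.0))   # ~5% false positives
print("real effect d=0.5, n=10        :", hit_rate(10, effect=0.5))   # often missed
print("real effect d=0.5, n=100       :", hit_rate(100, effect=0.5))  # usually detected
```

The false-positive rate sits near 5% regardless of sample size; what small samples lose is the power to detect real effects, which is exactly the asymmetry that publication bias then amplifies.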
BenTheMan Posted September 20, 2007 I think that there has been a recent law in America addressing the issue. The LEAST suspect are the theoretical sciences, like math and physics. Sometimes the application to reality may be suspect, but one can always check calculations for consistency. If you don't believe me that this actually happens, go to arxiv.org and look for papers that start with ``Reply to...''. These are papers where people point out others' mistakes---usually it boils down to some disagreement over assumptions; otherwise the replies are not so public (generally handled via personal communication). But the problem still remains---much of the computer code that physicists use is closed source, so there is no real external peer review. One usually counts on academic integrity, and on the fact that others do the same problem in a different way and get similar results.
D H Posted September 20, 2007 The underlying statistical paradox (can't remember the name) is the same as in this scenario: You go to the doctor. One possible cause of your symptoms is a very rare but fatal disease. The disease affects 1/1000 adults, apparently at random. The doctor sends you to the lab for a blood test for this disease. The test yields 1% false positives and 0.01% false negatives (a very good screening test). The results come back positive. Since the test has a paltry 1% false-positive rate, is it time to rewrite your will? Surprisingly, the odds remain better than 9 to 1 that you are not diseased. A similar thing is happening here. The false hypotheses outnumber the true ones in the soft sciences. Testing in the soft sciences is statistical. Just because a hypothesis passes statistical tests does not mean it is valid. What does this mean for physics, where experiments like Gravity Probe B are becoming the norm? Physics used to be blessed with the equivalent of six or more (and often many more) nines of accuracy, thereby avoiding the statistical morass that plagues the soft sciences. Just as an example, the mass of the photon has been experimentally confirmed to be less than 10⁻⁶⁴ kilograms, or 10⁻³² electron masses. In comparison, the results from Gravity Probe B have been delayed in part because of statistical issues.
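The paradox described here is the base-rate fallacy (also called the false-positive paradox), and the arithmetic is easy to check. A minimal sketch using the numbers from the post:

```python
# Numbers from the post: prevalence 1/1000, 1% false positives,
# 0.01% false negatives.
p_disease = 0.001
p_pos_if_healthy = 0.01          # false-positive rate
p_pos_if_diseased = 1 - 0.0001   # sensitivity = 1 - false-negative rate

p_positive = (p_disease * p_pos_if_diseased
              + (1 - p_disease) * p_pos_if_healthy)
p_diseased_given_positive = p_disease * p_pos_if_diseased / p_positive

print(f"P(diseased | positive) = {p_diseased_given_positive:.3f}")  # ~0.091
```

Only about 9% of positives are real, because the healthy population is a thousand times larger than the diseased one; the analogy to a field where false hypotheses vastly outnumber true ones is direct.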
swansont (Author) Posted September 20, 2007 What does this mean for physics, where experiments like Gravity Probe B are becoming the norm? Physics used to be blessed with the equivalent of six or more (and often many more) nines of accuracy, thereby avoiding the statistical morass that plagues the soft sciences. Just as an example, the mass of the photon has been experimentally confirmed to be less than 10⁻⁶⁴ kilograms, or 10⁻³² electron masses. In comparison, the results from Gravity Probe B have been delayed in part because of statistical issues. Well, I think all this is one reason why e.g. particle physics looks for more rigorous (5-sigma) confirmation. Especially when there aren't that many places that can do a followup experiment. But most of us already know that we should take small sample sizes and unexpected results with a grain of salt, so this is not as big a problem as is claimed. I think the main concern is not with the science-minded folks who know better, but with the lay people who don't. I think they will see this as a problem with science in general, and not just the reporting (at the journal level, and then again in the popular press), because they don't make a distinction between the different parts of the process. They'll just think that science can't be trusted because three years ago Twinkies helped prevent cancer and now all of a sudden they don't.
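For scale, here is the Gaussian tail probability behind those sigma thresholds (a quick calculation of mine, not from the post):

```python
from math import erfc, sqrt

# One-sided tail probability of an n-sigma fluctuation under a Gaussian null.
for sigma in (2, 3, 5):
    p = 0.5 * erfc(sigma / sqrt(2))
    print(f"{sigma} sigma: p = {p:.2e}")
```

A 5-sigma result has a per-experiment fluke probability of about 3 in 10 million, versus a few in a hundred at the roughly 2-sigma level that passes for significance in softer fields.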
Paralith Posted September 20, 2007 I think the main concern is not with the science-minded folks who know better, but with the lay people who don't. I think they will see this as a problem with science in general, and not just the reporting (at the journal level, and then again in the popular press), because they don't make a distinction between the different parts of the process. They'll just think that science can't be trusted because three years ago Twinkies helped prevent cancer and now all of a sudden they don't. I definitely agree. Especially since the crazier and more surprising (and therefore more likely to be false) the results of a study are, the more likely the popular press is to go to town with it - and probably make even more ridiculous extrapolations along the way. I once read a news snippet describing how a protein involved in the formation of long-term memory had been recently identified, and the author of the clip somehow concluded that this was a significant step toward helping smokers quit cigarettes by removing their memory associations of bars with smoking.
Dak Posted September 20, 2007 One must also consider that this study is claiming that most studies are false, so if it were true, one should expect this study to be false. as i read it, it's claiming that, dependent on field, most published research findings that use statistical confidence testing can be false*. this paper doesn't use statistical confidence testing, so it isn't prone to this phenomenon. * i.e., findings of the form "from the data, we can accept hypothesis x at the 99% confidence level"
Pangloss Posted September 21, 2007 Interesting commentary on an effect of publishing novel results. I get the impression this is far more prevalent in certain biology and medicine experiments, where you are more likely to have many variables that can't be held strictly constant. Don't let the global warming proponents hear you say that. You'll be labelled a Denier before you can blink. Oh oh, sorry, sorry everyone, I should have realized that the entire planet Earth must have fewer variables than a human body. Yeah. I apologize. Dunno what I was thinking. It just kinda slipped out. I'll do 30 "hail gaia's", I promise.
swansont (Author) Posted September 21, 2007 Don't let the global warming proponents hear you say that. You'll be labelled a Denier before you can blink. Oh oh, sorry, sorry everyone, I should have realized that the entire planet Earth must have fewer variables than a human body. Yeah. I apologize. Dunno what I was thinking. It just kinda slipped out. I'll do 30 "hail gaia's", I promise. If you read the whole thing, though, you'll find that this is part of the problem with the so-called deniers, and that your observation is 180º off from the point of the article. There are many variables, so you improve the chances of a false positive/negative result that they can latch on to, while ignoring the large body of publications that give a valid result. That's why there is the admonition to look at the body of literature, and not just a few individual studies.
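One standard way to formalize "look at the body of literature rather than one study" is to pool p-values across independent studies. Here is a sketch of Fisher's method (my example with made-up p-values, not something from the thread):

```python
from math import exp, log

def fisher_combined_p(pvalues):
    """Fisher's method: under the global null, -2*sum(ln p) ~ chi-square
    with 2k degrees of freedom; for even df the tail has a closed form."""
    k = len(pvalues)
    x = -2 * sum(log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for j in range(1, k):                 # sum_{j=0}^{k-1} (x/2)^j / j!
        term *= (x / 2) / j
        total += term
    return exp(-x / 2) * total

# One cherry-picked "hit" looks impressive alone...
print(f"{fisher_combined_p([0.04]):.3f}")                          # 0.040
# ...but pooled with the null results around it, it's unremarkable.
print(f"{fisher_combined_p([0.04, 0.60, 0.35, 0.72, 0.51]):.3f}")  # ~0.315
```

The lone p = 0.04 that a denier might latch onto dissolves once the studies that found nothing are counted alongside it.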
pioneer Posted September 21, 2007 In statistical science, cause and effect is replaced with a black box of experiments. With black-box experiments, one suspends reason and waits for the oracle to tell you what to think. If one tries to open the black box to take a peek inside, one would think they were a criminal, daring to commit blasphemy against the black box. This dynamic sort of reminds me of the scene in The Wizard of Oz, when Dorothy and her three friends go to see the wizard the last time. Dorothy's dog, Toto, pulls the curtain away from the old man who is running the Oz machine. He says, "Pay no heed to that man behind the curtain! I am the wise and powerful Oz." It turns out the wise and powerful Oz was an illusion. If one pulls the curtain by looking in the black box, the frailty of the old man is found out. The way I see it, statistics is useful where our ability to explain things in a rational way breaks down. It allows us to continue, without reason. But rather than realize that statistical studies are needed due to our rational limitations, they have become institutionalized as the all-powerful Oz. The little man behind the curtain is quite rationally frail, and he needs the Oz machine to compensate for this frailty. The little old man was only able to give up the Oz machine when the Wicked Witch of the West was subdued. The west is the place of the setting sun. The Oz machine was needed to appease a fear of the sun setting, i.e., statistical science going dark if the Oz machine failed, since there is not enough reason in these areas of science to proceed rationally. The statistical areas of science are the last strongholds of alchemy. Today we use high-tech gadgets to turn lead into gold. As the paper of this topic showed, much of this gold turned out not to be gold. We need to let the sun set, or kill the Witch of the West using water (rational thoughts), so that statistical science can wean itself away and become rational. It doesn't really need the Oz machine; it only needs a rational foundation.
Paralith Posted September 21, 2007 First of all, it's not that no one is allowed to open the "black box." It's only a black box because at this point we don't have enough information to see inside of it. But the whole point of the research is to try and find out what's in there, not just dance around it. Secondly, statistical analysis is not inherently irrational. It can be used irrationally, like most things can, but when used with a correct comprehension of what the statistics do and do not say, it can support rational steps forward into understanding the inner workings of the black box. And really, pioneer, you need to stop making the majority of your posts consist of in-depth, convoluted analogies and metaphors. Analogies can help sometimes, but you use them to the point of confusion, and it isn't helping you make your point.
swansont (Author) Posted September 21, 2007 In statistical science, cause and effect is replaced with a black box of experiments. With black-box experiments, one suspends reason and waits for the oracle to tell you what to think. [...] It doesn't really need the Oz machine; it only needs a rational foundation. That's addressed here: http://scienceblogs.com/denialism/2007/09/a_second_crank_finds_ioannidis.php (in addition to what Paralith wrote) Basically they point out that you do need to look inside the box, to the extent that you can, to help determine the reliability of the study. "Instead of testing modalities with no biological plausibility and publishing the inevitable 5% of studies showing a false-positive effect, one should have good theory going in for why an effect should occur." IOW, if you don't have any kind of mechanism in mind, then you will still see false positives, because they are all noise (which is addressed in point number 1)
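That prior-plausibility point is the core machinery of the Ioannidis paper itself: the chance that a "significant" finding is true (the positive predictive value) depends on the pre-study odds R that hypotheses in the field are true. A sketch of that formula, with illustrative R values of my choosing:

```python
# Ioannidis's positive predictive value:
#   PPV = (1 - beta) * R / ((1 - beta) * R + alpha)
# where R is the pre-study odds that a tested hypothesis is true,
# alpha the significance level, and beta the type II error rate.
def ppv(R, alpha=0.05, beta=0.2):
    return (1 - beta) * R / ((1 - beta) * R + alpha)

for R in (1.0, 0.25, 0.05, 0.01):   # from well-motivated to no-mechanism long shots
    print(f"pre-study odds R = {R:<5}: PPV = {ppv(R):.2f}")
```

With no plausible mechanism (R near 0.01), nearly 9 in 10 "positive" findings are false even with everything done honestly, which is the point about it all being noise.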
Dak Posted September 21, 2007 Don't let the global warming proponents hear you say that. You'll be labelled a Denier before you can blink. [...] I'll do 30 "hail gaia's", I promise. just to add to what swansont said, that's why the IPCC report is cited so often: specifically because it is a review of the body of research and so is less susceptible to randomly being false.
dichotomy Posted September 25, 2007 The more chilling aspect of this is that "with lots of researchers every crackpot theory will have at least one scientific study that it can cite in it's [sic] support" Which is why one has to look at all of the data/studies in a field, and not fixate on one or two that support your notion — outliers (and false positives or negatives) will always exist, and cherry-picking results is not science. I'm amazed that 'cherry picking' still goes on. It's common sense that a sizable range of studies from separate scientific groups should be looked at as a rule; more often than not, that's the best way to go. I suppose common sense isn't that common after all. It doesn't really need the Oz machine; it only needs a rational foundation. So statistics and probability are only as good as the quality and quantity of the data observed. In this respect, statistics is still more a science than an art, and not smoke and mirrors. The analogy here is a tradesman who blames his tools, or the perceived difficulty of the job at hand, when he should more often look at his technique and the tools he has selected to finish the job. Statistics are tools like any other. The correct tools need to be selected to complete a job to a 'world class' standard.
pcollins Posted October 2, 2007 That's why there is the admonition to look at the body of literature, and not just a few individual studies. Until, of course, we get to the climate model bottleneck. Not many of those out there.