Measurement error and the replication crisis

A common assumption has been that statistically significant results found in noisy data are conservative. (The intuition is that only a strong association could have made it through the noise.)

In their article in Science, lauded statisticians Loken and Gelman argue that this is not the case.

Some key arguments from the article:

To acknowledge significant amounts of uncontrolled variance is to admit that there is less information in the data than initially thought. Statistical tools cannot increase the amount of information in the raw data.

I quote from the article:

First, researchers typically have so many “researcher degrees of freedom”—unacknowledged choices in how they prepare, analyze, and report their data—that statistical significance is easily found even in the absence of underlying effects (8) and even without multiple hypothesis testing by researchers (9). In settings with uncontrolled researcher degrees of freedom, the attainment of statistical significance in the presence of noise is not an impressive feat. The second, related issue is that in noisy research settings, statistical significance provides very weak evidence for either the sign or the magnitude of any underlying effect. Statistically significant estimates are, roughly speaking, at least two standard errors from zero. In a study with noisy measurements and small or moderate sample size, standard errors will be high and statistically significant estimates will therefore be large, even if the underlying effects are small. This is known as the statistical significance filter and can be a severe upward bias in the magnitude of effects; as one of us has shown, reported estimates can be an order-of-magnitude larger than any plausible underlying effects (10).
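The statistical significance filter described in the quote can be illustrated with a small simulation. This is only a sketch under assumed numbers, not the authors' analysis: the true effect (0.1), the standard error (0.5), the number of simulated studies, and the function name `significance_filter` are all hypothetical choices made for illustration. Each simulated study draws an estimate from a normal distribution centered on the true effect, and only estimates more than 1.96 standard errors from zero are kept as "significant".

```python
import random
import statistics


def significance_filter(true_effect, std_error, n_studies=100_000, seed=0):
    """Simulate many noisy studies and summarize only the significant ones.

    All parameter values are illustrative assumptions, not figures from the article.
    """
    rng = random.Random(seed)
    threshold = 1.96 * std_error  # two-sided 5% cutoff for a normal estimate

    # Keep only the estimates that would be reported as statistically significant.
    significant = []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > threshold:
            significant.append(estimate)

    mean_abs = statistics.fmean([abs(e) for e in significant])
    wrong_sign = sum(1 for e in significant if e * true_effect < 0)

    return {
        # Fraction of studies that reach significance.
        "share_significant": len(significant) / n_studies,
        # How much larger the significant estimates are than the true effect.
        "exaggeration_ratio": mean_abs / abs(true_effect),
        # Fraction of significant estimates pointing in the wrong direction.
        "wrong_sign_share": wrong_sign / len(significant),
    }


result = significance_filter(true_effect=0.1, std_error=0.5)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these assumed numbers, every significant estimate exceeds 1.96 × 0.5 = 0.98 in absolute value, so it overstates the true effect of 0.1 by roughly a factor of ten or more, and a noticeable share of the significant estimates has the wrong sign.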

Read the full article (two pages): Loken and Gelman, "Measurement error and the replication crisis" (measurement.pdf).
