Promoting more reliable research and communication of results


The coronavirus pandemic has once again brought science to the center of public attention. We have seen how important honest communication about science is, and that research can help to reduce uncertainty but cannot eliminate it.

It seems that scientific results are a lot more uncertain than many people would like them to be. Over the last decade, headlines have often warned that we shouldn't believe that exciting new medical study, or that there is trouble at the lab. According to an online survey, most researchers would agree that there is a reproducibility crisis.

Yes, science is not perfect. Although our goal is to produce reliable knowledge, it seems that some of the current academic (or simply human) incentives favor novelty over scrutiny, quantity over quality, and coherent storytelling over full reporting of results that are often inconclusive.

However, the perceived non-replication of scientific results may partly be a problem of interpretation and communication, rather than a crisis of the scientific method per se. On a more technical level, we and others argue that the crisis of unreplicable research is mainly a crisis of overconfidence in statistical, or scientific, inference. We agree that science isn't broken. It's just a hell of a lot harder than we give it credit for.

As the saying goes, we should never waste a good crisis. This is our opportunity to take part in the discussion, and to contribute to reforming science and perhaps even some aspects of the academic system. Here are three new initiatives that deserve a closer look:

What matters ... is not replication defined by the presence or absence of statistical significance, but the evaluation of the cumulative evidence and assessment of whether it is susceptible to major biases

Goodman, Fanelli, Ioannidis (2016)

Publications on science, statistics, and reproducibility


A collection of blogs can be found on the blog by Andrew Gelman (the blog itself is often worth a look).


Reproducible, replicable, reliable?

Goodman SN, Fanelli D, Ioannidis JPA (2016) What does research reproducibility mean? Science Translational Medicine 8: 341. https://doi.org/10.1126/scitranslmed.aaf5027


There are many papers on statistics and reproducibility, and some are linked in our own blog posts and papers on the topic:

Amrhein V, Greenland S (2022) Rewriting results in the language of compatibility. Trends in Ecology & Evolution 37: 567-568. https://doi.org/10.1016/j.tree.2022.02.001

Amrhein V, Greenland S (2022) Discuss practical importance of results based on interval estimates and p-value functions, not only on point estimates and null p-values. Journal of Information Technology 37: 316-320. https://doi.org/10.1177/02683962221105904

Berner D, Amrhein V (2022) Why and how we should join the shift from significance testing to estimation. Journal of Evolutionary Biology 35: 777-787. https://doi.org/10.1111/jeb.14009

Schwab S, Janiaud P, Dayan M, Amrhein V, Panczak R, Palagi PM, Hemkens LG, Ramon M, Rothen N, Senn S, Furrer E, Held L (2022) Ten simple rules for good research practice. PLOS Computational Biology 18: e1010139. https://doi.org/10.1371/journal.pcbi.1010139

Amrhein V (2021) The role of science in the news (and elsewhere). Statisticians React to the News

Amrhein V (2020) Statistics is for statisticians. Statisticians React to the News – blog by the International Statistical Institute (ISI)

Amrhein V, Greenland S, McShane B (2019) Retire statistical significance. Nature 567: 305-307. https://doi.org/10.1038/d41586-019-00857-9

Amrhein V, Trafimow D, Greenland S (2019) Inferential statistics as descriptive statistics: There is no replication crisis if we don't expect replication. The American Statistician 73, sup1: 262-270. https://doi.org/10.1080/00031305.2018.1543137

Amrhein V (2018) Inferential statistics is not inferential. sci five, University of Basel.

Amrhein V, Greenland S (2018) Remove, rather than redefine, statistical significance. Nature Human Behaviour 2: 4. https://doi.org/10.1038/s41562-017-0224-0

Amrhein V, Korner-Nievergelt F, Roth T (2017) The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research. PeerJ 5: e3544. https://doi.org/10.7717/peerj.3544