AUTHOR=Spiegelman Eli TITLE=Esteemed Colleagues: A Model of the Effect of Open Data on Selective Reporting of Scientific Results JOURNAL=Frontiers in Psychology VOLUME=12 YEAR=2021 URL=https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2021.761168 DOI=10.3389/fpsyg.2021.761168 ISSN=1664-1078 ABSTRACT=
Open data, the practice of making available to the research community the underlying data and analysis code used to generate scientific results, facilitates verification of published results, and should thereby reduce the expected benefit (and hence the incidence) of p-hacking and other forms of academic dishonesty. This paper presents a simple signaling model of how this might work in the presence of two kinds of cost. The first is the cost of verification: reducing the cost of “checking the math” increases verification and reduces falsification. Cases where the author can choose a high or low verification-cost regime (that is, closed or open data) result in unraveling; not all authors choose the low-cost route, but the best do. The second kind of cost is the cost to authors of preparing open data. Introducing these costs leads to both high- and low-quality results being published in both open and closed data regimes, but even when the costs are independent of research quality, open data is favored by high-quality results in equilibrium. A final contribution of the model is a measure of “science welfare” that calculates the ex-post distortion of equilibrium beliefs about the quality of published results, and shows that open data will always improve the aggregate state of knowledge.