AUTHOR=Khan Shamus , Hirsch Jennifer S. , Zeltzer-Zubida Ohad TITLE=A dataset without a code book: ethnography and open science JOURNAL=Frontiers in Sociology VOLUME=9 YEAR=2024 URL=https://www.frontiersin.org/journals/sociology/articles/10.3389/fsoc.2024.1308029 DOI=10.3389/fsoc.2024.1308029 ISSN=2297-7775 ABSTRACT=

This paper reflects upon calls for “open data” in ethnography, drawing on our experiences doing research on sexual violence. The core claim of this paper is not that open data is undesirable; it is that there is a lot we must know before we presume its benefits apply to ethnographic research. The epistemic and ontological foundation of open data is grounded in a logic that is not always consistent with that of ethnographic practice. We begin by identifying three logics of open data (epistemic, political-economic, and regulatory), each of which addresses a perceived problem with knowledge production and points to open science as the solution. We then evaluate these logics in the context of the practice of ethnographic research. Claims that open data would improve data quality are, in our assessment, potentially reversed: in our own ethnographic work, open data practices would likely have compromised our data quality. And protecting subject identities would have meant creating accessible data that would not allow for replication. For ethnographic work, open data would be like having the dataset without the codebook. Before we adopt open data to improve the quality of science, we need to answer a series of questions about what open data does to data quality. Rather than blindly making a normative commitment to a principle, we need empirical work on the impact of such practices, work that must be done with respect to the different epistemic cultures’ modes of inquiry. Ethnographers, as well as the institutions that fund and regulate ethnographic research, should only embrace open data after the subject has been researched and evaluated within our own epistemic community.