AUTHOR=Sheik Cody S., Reese Brandi Kiel, Twing Katrina I., Sylvan Jason B., Grim Sharon L., Schrenk Matthew O., Sogin Mitchell L., Colwell Frederick S. TITLE=Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life JOURNAL=Frontiers in Microbiology VOLUME=9 YEAR=2018 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2018.00840 DOI=10.3389/fmicb.2018.00840 ISSN=1664-302X ABSTRACT=
Earth’s subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding which microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open-access 16S ribosomal RNA gene sequence database from diverse subsurface environments has been compiled. However, because biomass in the deep subsurface is low, the potential for incorporation of contaminants from reagents used during sample collection, processing, and/or sequencing is high. Thus, to understand the ecology of subsurface microorganisms (i.e., their distribution, richness, or survival), it is necessary to minimize, identify, and remove contaminant sequences that will skew the relative abundances of all taxa in the sample. In this meta-analysis, we identify putative contaminants associated with the CoDL dataset, recommend best practices for removing contaminants from samples, and propose a series of best practices for subsurface microbiology sampling. The most abundant putative contaminant genera observed, independent of evenness across samples, were