AUTHOR=Smith Paul E., Waters Sinead M., Gómez Expósito Ruth, Smidt Hauke, Carberry Ciara A., McCabe Matthew S. TITLE=Synthetic Sequencing Standards: A Guide to Database Choice for Rumen Microbiota Amplicon Sequencing Analysis JOURNAL=Frontiers in Microbiology VOLUME=11 YEAR=2020 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2020.606825 DOI=10.3389/fmicb.2020.606825 ISSN=1664-302X ABSTRACT=
Our understanding of complex microbial communities, such as those residing in the rumen, has drastically advanced through the use of high-throughput sequencing (HTS) technologies. Indeed, with the use of barcoded amplicon sequencing, it is now cost-effective and computationally feasible to identify individual rumen microbial genera associated with ruminant livestock nutrition, genetics, performance and greenhouse gas production. However, across all disciplines of microbial ecology, there is currently little reporting of the use of internal controls for validating HTS results. Furthermore, there is little consensus on the most appropriate reference database for analyzing rumen microbiota amplicon sequencing data. Therefore, in this study, a synthetic rumen-specific sequencing standard was used to assess the effects of database choice on results obtained from rumen microbial amplicon sequencing. Four DADA2 reference training sets (RDP, SILVA, GTDB, and RefSeq + RDP) were compared to assess their ability to correctly classify sequences included in the rumen-specific sequencing standard. In addition, two thresholds of phylogenetic bootstrapping, 50 and 80, were applied to investigate the effect of increasing stringency. Sequence classification differences were apparent amongst the databases. For example, the classification of