The final, formatted version of the article will be published soon.
ORIGINAL RESEARCH article
Front. Microbiol.
Sec. Microbial Symbioses
Volume 15 - 2024 | doi: 10.3389/fmicb.2024.1485073
This article is part of the Research Topic The Microbial Link: Exploring Oral and Gut Microbiome Connections
Optimizing Microbiome Reference Databases with PacBio Full-Length 16S rRNA Sequencing for Enhanced Taxonomic Classification and Biomarker Discovery
Provisionally accepted
- 1 Department of Oral Microbiology, School of Dentistry, Pusan National University, Busan, Republic of Korea
- 2 Department of Internal Medicine, Dongnam Institute of Radiological and Medical Sciences, Busan, Republic of Korea
The study of the human microbiome is crucial for understanding disease mechanisms, identifying biomarkers, and guiding preventive measures. Advances in sequencing platforms, particularly 16S rRNA sequencing, have revolutionized microbiome research. Despite these benefits, large microbiome reference databases (DBs) pose challenges, including high computational demands and potential inaccuracies. This study aimed to determine whether full-length 16S rRNA sequencing data produced by PacBio could be used to optimize reference DBs, and whether the optimized DBs could be applied to Illumina V3-V4 targeted sequencing data in microbiome studies.
Methods: Oral and gut microbiome data (PRJNA1049979) were retrieved from NCBI. DADA2 was applied to the full-length 16S rRNA PacBio data to obtain amplicon sequence variants (ASVs). The RDP reference DB was used to assign taxonomy to the ASVs, which were then used as a reference DB to train the classifier. QIIME2 was used to analyze the V3-V4 targeted Illumina data. BLAST was used to analyze alignment statistics. Linear discriminant analysis Effect Size (LEfSe) was employed for discriminant analysis.
Results: ASVs produced by PacBio showed coverage of the oral microbiome similar to that of the Human Oral Microbiome Database. A phylogenetic tree was trimmed at various thresholds to obtain an optimized reference DB. The same method was then applied to gut microbiome data, and the optimized gut microbiome reference DB improved taxa classification and the efficiency of biomarker discovery.
Conclusion: Full-length 16S rRNA sequencing data produced by PacBio can be used to construct a microbiome reference DB. Using an optimized reference DB can increase the accuracy of microbiome classification and enhance biomarker discovery.
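The idea of reducing a reference DB at several thresholds can be illustrated with a minimal Python sketch. The abstract does not give the tree-trimming procedure, so this is not the authors' method. As a stand-in, the sketch uses greedy identity-threshold thinning on pre-aligned sequences of equal length. The names `identity` and `thin_reference` and the toy sequences are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def thin_reference(seqs: Dict[str, str], threshold: float) -> List[str]:
    """Greedy thinning (an assumed stand-in for tree trimming).

    A sequence is kept only if its identity to every already-kept
    representative is strictly below `threshold`. Input order decides
    which member of a near-identical group becomes the representative.
    """
    kept: List[str] = []
    for name, seq in seqs.items():
        if all(identity(seq, seqs[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy, made-up sequences (not real 16S data).
toy = {
    "asv1": "ACGTACGTAC",
    "asv2": "ACGTACGTAT",  # 9/10 positions match asv1
    "asv3": "TTGTACGGAC",  # 7/10 positions match asv1
    "asv4": "ACGTACGTAC",  # identical to asv1
}

for t in (1.0, 0.9, 0.7):
    print(t, thin_reference(toy, t))
```

Lower thresholds collapse more entries into a single representative, which mirrors the trade-off the abstract describes between the size of the reference DB and its taxonomic resolution.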
Keywords: oral microbiome, gut microbiome, PacBio, Illumina, next generation sequencing, reference database
Received: 23 Aug 2024; Accepted: 28 Oct 2024.
Copyright: © 2024 Han, Choi, Kim, Park, Chung and Na. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Jin Chung, Department of Oral Microbiology, School of Dentistry, Pusan National University, Busan, Republic of Korea
Hee Sam Na, Department of Oral Microbiology, School of Dentistry, Pusan National University, Busan, Republic of Korea
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.