ORIGINAL RESEARCH article
Front. Bioinform.
Sec. Genomic Analysis
Volume 5 - 2025
doi: 10.3389/fbinf.2025.1504728
This article is part of the Research Topic From one genome to many genomes: the evolution of computational approaches for pangenomics and metagenomics analysis
MetaComBin: combining abundances and overlaps for binning metagenomics reads
Provisionally accepted
- 1 Department of Information Engineering, School of Engineering, University of Padua, Padua, Veneto, Italy
- 2 University of Padua, Padua, Italy
Metagenomics is the discipline that studies heterogeneous microbial samples extracted directly from their natural environment, for example soil, water, or the human body. The detection and quantification of the species that populate microbial communities have been the subject of many recent studies based on classification and clustering, because this is the first step in more complex pipelines (e.g. functional analysis, de novo assembly, or comparison of metagenomes). Metagenomics has an impact on both environmental studies and precision medicine, so it is crucial to improve the quality of species identification through computational tools.
In this paper we explore the idea of improving the overall quality of read-level metagenomic binning by proposing a computational framework that sequentially combines two complementary read binning approaches: one based on determining species abundances, and another that relies on read overlaps to cluster reads together. We call this approach MetaComBin (Metagenomics Combined Binning).
Our experiments with MetaComBin show that combining two tools based on different approaches can improve clustering quality in realistic conditions, where the number of species is not known beforehand.
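The idea of chaining an abundance-based stage with an overlap-based stage can be illustrated with a toy sketch. This is not the MetaComBin implementation: the abstract does not name the two tools or their parameters, so everything below is an assumption made for illustration, including the function names, the median k-mer count used as an abundance signal, the `threshold` cut-off, the choice to run the abundance stage first, and the rule that two reads belong together if they share at least one k-mer.

```python
from collections import Counter, defaultdict
from statistics import median


def kmers(read, k):
    """Return all k-mers of a read (assumes len(read) >= k)."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def abundance_bins(reads, k, threshold):
    """Stage 1 (toy assumption): group reads by median k-mer count."""
    counts = Counter(km for r in reads for km in kmers(r, k))
    groups = defaultdict(list)
    for idx, r in enumerate(reads):
        med = median(counts[km] for km in kmers(r, k))
        groups["high" if med >= threshold else "low"].append(idx)
    return groups


def overlap_bins(reads, indices, k):
    """Stage 2 (toy assumption): merge reads that share at least one k-mer."""
    parent = {i: i for i in indices}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # k-mer -> first read index in which it appeared
    for i in indices:
        for km in kmers(reads[i], k):
            if km in first_seen:
                parent[find(i)] = find(first_seen[km])
            else:
                first_seen[km] = i

    clusters = defaultdict(list)
    for i in indices:
        clusters[find(i)].append(i)
    return sorted(clusters.values())


def combined_binning(reads, k=4, threshold=2):
    """Run stage 1, then refine each abundance group with stage 2."""
    groups = abundance_bins(reads, k, threshold)
    bins = []
    for level in sorted(groups):
        bins.extend(overlap_bins(reads, groups[level], k))
    return bins


# Hypothetical reads: 0-2 come from an abundant sequence, 3-4 overlap each
# other at low coverage, and 5 overlaps nothing.
reads = ["AAAACCCC", "AACCCCGG", "AAAACCCC", "TTTTGGGA", "TGGGATCA", "CATGCATA"]
print(combined_binning(reads))  # [[0, 1, 2], [3, 4], [5]]
```

In this toy input the abundance stage alone yields two groups, and the overlap stage then splits the low-abundance group further without being told how many clusters to expect, which mirrors the setting the abstract describes where the number of species is not known beforehand.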
Keywords: Metagenomics, reads binning, abundance, overlap, K-mers, clustering
Received: 01 Oct 2024; Accepted: 27 Jan 2025.
Copyright: © 2025 Tomasella and Pizzi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Cinzia Pizzi, University of Padua, Padua, Italy