AUTHOR=Norouzi-Beirami Mohammad Hossein , Marashi Sayed-Amir , Banaei-Moghaddam Ali Mohammad , Kavousi Kaveh TITLE=Beyond Taxonomic Analysis of Microbiomes: A Functional Approach for Revisiting Microbiome Changes in Colorectal Cancer JOURNAL=Frontiers in Microbiology VOLUME=10 YEAR=2020 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2019.03117 DOI=10.3389/fmicb.2019.03117 ISSN=1664-302X ABSTRACT=

Colorectal cancer (CRC) is one of the most prevalent cancers in the world, especially in developed countries. Several studies have reported an association between CRC and dysbiosis of the gut microbiome. However, most of these works focus on the taxonomic variation of the microbiome, which offers little, if any, functional insight into the causes and/or consequences of microbiome dysbiosis. In this study, we used a previously reported metagenome dataset obtained by sequencing 156 microbiome samples of healthy individuals as the control group (Co), as well as microbiome samples of patients with advanced colorectal adenoma (Ad) and colorectal carcinoma (Ca). Features of the microbiome samples were analyzed at the level of species, as well as at four functional levels, i.e., gene, KEGG orthology (KO) group, Enzyme Commission (EC) number, and reaction. At each of these levels, certain features showed significant changing trends during cancer progression. In the next step, a list of these features was extracted and shown to predict the category of Co, Ad, and Ca samples with an accuracy of >85%. When only one group of features (species, gene, KO group, EC number, or reaction) was used, KO-related features were the most successful in classifying the three categories of samples. Notably, species-related features showed the least success in sample classification. Furthermore, by applying an independent test set, we showed that these performance trends are not limited to our original dataset. We determined the most important classification features at each of the four functional levels and propose that these features can be considered as biomarkers of CRC progression. Finally, we show that the intra-diversity of each sample at the levels of bacterial species and genes is much greater than that at the levels of the KO groups, EC numbers, and reactions of that sample.
Therefore, we conclude that microbiome diversity at the species or gene level is not necessarily associated with diversity at the functional level, which again indicates the importance of KO-, EC-, and reaction-based features in metagenome analysis. The source code of the proposed method is freely available from https://www.bioinformatics.org/mamed.
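
The comparison of intra-sample diversity across feature levels can be illustrated with a minimal sketch. The abstract does not state which diversity measure was used, so the Shannon index here is an assumption, and the function name `shannon_diversity`, the feature labels, and the toy counts are all hypothetical rather than taken from the study or its released code. The example only shows how several species can collapse onto fewer functional groups, so that diversity at the species level need not match diversity at the KO level.

```python
import math
from typing import Mapping


def shannon_diversity(abundances: Mapping[str, float]) -> float:
    """Shannon index H = -sum(p_i * ln(p_i)) over features with positive abundance.

    Assumption: the Shannon index stands in for the (unspecified) diversity
    measure mentioned in the abstract.
    """
    total = sum(v for v in abundances.values() if v > 0)
    if total <= 0:
        return 0.0
    h = 0.0
    for v in abundances.values():
        if v > 0:
            p = v / total
            h -= p * math.log(p)
    return h


# Hypothetical toy sample: four equally abundant species whose genes
# map onto only two KO groups (illustrative numbers, not study data).
species_counts = {"sp_A": 25, "sp_B": 25, "sp_C": 25, "sp_D": 25}
ko_counts = {"KO_1": 50, "KO_2": 50}

species_h = shannon_diversity(species_counts)
ko_h = shannon_diversity(ko_counts)

print(f"species-level H = {species_h:.3f}")
print(f"KO-level H      = {ko_h:.3f}")
```

In this toy case the species-level index is higher than the KO-level index, which is the kind of gap between taxonomic and functional diversity the abstract describes. Real analyses would compute such an index per sample and per feature level from the actual abundance tables.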