AUTHOR=Zhang Lujun, Wang Yanshan, Chen Jingwen, Chen Jun
TITLE=RFtest: A Robust and Flexible Community-Level Test for Microbiome Data Powerfully Detects Phylogenetically Clustered Signals
JOURNAL=Frontiers in Genetics
VOLUME=12
YEAR=2022
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.749573
DOI=10.3389/fgene.2021.749573
ISSN=1664-8021
ABSTRACT=

Random forest is considered one of the most successful machine learning algorithms and has been widely used to construct microbiome-based predictive models. However, its use as a statistical testing method has not been explored. In this study, we propose the “Random Forest Test” (RFtest), a global (community-level) test based on random forest for high-dimensional and phylogenetically structured microbiome data. RFtest is a permutation test that uses the generalization error of random forest as the test statistic. Our simulations demonstrate that RFtest controls the type I error rate, that it is more powerful than competing methods for phylogenetically clustered signals, and that it is robust to outliers and adaptive to interaction effects and non-linear associations. Finally, we apply RFtest to two real microbiome datasets to assess whether the microbial communities are associated with the outcome variables.
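
To illustrate the idea of a permutation test whose statistic is a random forest's generalization error, the following is a minimal, dependency-free Python sketch. It is not the authors' RFtest implementation. Several simplifications are assumed: the forest is reduced to bagged decision stumps with random feature subsets, the generalization error is estimated by the out-of-bag (OOB) error, phylogenetic structure is ignored, and the outcome is binary (0/1). All function names, parameter defaults, and the simulated data are hypothetical.

```python
import random

def fit_stump(X, y, rows, feats):
    """Fit a depth-1 tree (stump) on the bootstrap rows; labels must be 0/1."""
    best = None
    for j in feats:
        for t in sorted({X[i][j] for i in rows}):
            left = [y[i] for i in rows if X[i][j] <= t]
            right = [y[i] for i in rows if X[i][j] > t]
            l_lab = int(2 * sum(left) > len(left))            # majority label, left side
            r_lab = int(2 * sum(right) > len(right)) if right else l_lab
            errs = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errs < best[0]:
                best = (errs, j, t, l_lab, r_lab)
    return best[1:]

def oob_error(X, y, n_trees, mtry, rng):
    """Out-of-bag misclassification rate of a forest of bagged stumps."""
    n, p = len(X), len(X[0])
    votes = [[0, 0] for _ in range(n)]
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]            # bootstrap sample
        j, t, l_lab, r_lab = fit_stump(X, y, bag, rng.sample(range(p), mtry))
        for i in set(range(n)) - set(bag):                    # samples this tree never saw
            votes[i][l_lab if X[i][j] <= t else r_lab] += 1
    scored = [(i, v) for i, v in enumerate(votes) if sum(v) > 0]
    wrong = sum(int(v[1] > v[0]) != y[i] for i, v in scored)
    return wrong / max(len(scored), 1)

def rf_permutation_test(X, y, n_perm=49, n_trees=15, mtry=2, seed=0):
    """Return (observed OOB error, permutation p-value)."""
    rng = random.Random(seed)
    observed = oob_error(X, y, n_trees, mtry, rng)
    hits = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)                                   # break any X-y association
        if oob_error(X, y_perm, n_trees, mtry, rng) <= observed:
            hits += 1
    # Small error is evidence of association, so count permutations that do as well or better.
    return observed, (1 + hits) / (1 + n_perm)

# Hypothetical simulated data: 30 samples, 4 features.
data_rng = random.Random(1)
y = [i % 2 for i in range(30)]
X_signal = [[4 * label + data_rng.gauss(0, 1) for _ in range(4)] for label in y]
X_null = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in y]

err_signal, p_signal = rf_permutation_test(X_signal, y)
err_null, p_null = rf_permutation_test(X_null, y)
print("signal: OOB error", err_signal, "p-value", p_signal)
print("null:   OOB error", err_null, "p-value", p_null)
```

With `n_perm` permutations the smallest attainable p-value is `1 / (n_perm + 1)`, so a real analysis would use far more permutations and trees than this sketch does. In practice an established random forest library would replace the stump forest.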