ORIGINAL RESEARCH article

Front. Genet.
Sec. Statistical Genetics and Methodology
Volume 15 - 2024 | doi: 10.3389/fgene.2024.1494401
This article is part of the Research Topic Statistical Approaches, Applications, and Software for Longitudinal Microbiome Data Analysis and Microbiome Multi-Omics Data Integration

ZINQ-L: A Zero-Inflated Quantile Approach for Differential Abundance Analysis of Longitudinal Microbiome Data

Provisionally accepted
  • 1 Johns Hopkins University, Baltimore, Maryland, United States
  • 2 Weill Cornell Medical Center, NewYork-Presbyterian, New York City, United States

The final, formatted version of the article will be published soon.

    Background: Identifying bacterial taxa associated with disease phenotypes or clinical treatments over time is critical for understanding the underlying biological mechanisms. Association testing for microbiome data is already challenging because the data follow a complex distribution marked by sparsity, over-dispersion, and heavy tails. The longitudinal nature of the data adds another layer of complexity: within-subject correlations must be accounted for to avoid biased results. Existing longitudinal differential abundance approaches usually depend on strong parametric assumptions, such as a zero-inflated normal or negative binomial distribution. However, complex microbiome data frequently violate these distributional assumptions, leading to inflated false discovery rates. In addition, the existing methods are mostly mean-based and therefore unable to identify heterogeneous associations such as tail events or subgroup effects, which could be important biomedical signals.
Methods: We propose ZINQ-L, a zero-inflated quantile approach for differential abundance testing of longitudinal microbiome data. Hypothesis testing uses a mixed-effects quantile rank-score-based test with two components: a test in a mixed-effects logistic model for the presence-absence status of the investigated taxon, and a series of mixed-effects quantile rank-score tests, adjusted for zero inflation, of its abundance given presence. As a regression method with minimal distributional assumptions, it is robust to the complex distribution of microbiome data, controls the false discovery rate, and can flexibly adjust for important covariates. Its comprehensive examination of the abundance distribution enables the identification of heterogeneous associations, improving testing power.
Results: Extensive simulation studies and an application to a real kidney transplant microbiome study demonstrate the improved power of ZINQ-L in detecting true signals while controlling false discovery rates.
Conclusion: ZINQ-L is a zero-inflated quantile-based approach for detecting individual taxa associated with outcomes or exposures in longitudinal microbiome studies, providing a robust and powerful option that improves on and complements existing methods in the field.
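
The idea behind a quantile rank-score test on the non-zero abundances can be illustrated with a minimal sketch. This is not the ZINQ-L implementation. It assumes an intercept-only null model and a single binary covariate, and it replaces the mixed-effects machinery with a simple per-subject (cluster-robust) variance. The function names `empirical_quantile` and `rank_score_test` and the toy data are hypothetical. The presence-absence component and the adjustment for zero inflation are omitted.

```python
import math
from collections import defaultdict


def empirical_quantile(values, tau):
    """Smallest observed value whose empirical CDF is at least tau."""
    ordered = sorted(values)
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


def rank_score_test(abundance, group, subject, tau):
    """Simplified rank-score test at quantile level tau (illustrative only).

    Uses non-zero records only. Null model: both groups share the same
    tau-th quantile. Scores are summed within subject so that repeated
    measures from one subject enter the variance as a single cluster.
    Returns (chi-square statistic with 1 df, p-value).
    """
    rows = [(y, x, s) for y, x, s in zip(abundance, group, subject) if y > 0]
    q = empirical_quantile([y for y, _, _ in rows], tau)
    x_bar = sum(x for _, x, _ in rows) / len(rows)

    per_subject = defaultdict(float)
    for y, x, s in rows:
        psi = tau - (1.0 if y < q else 0.0)  # quantile rank score
        per_subject[s] += (x - x_bar) * psi

    score = sum(per_subject.values())
    variance = sum(v * v for v in per_subject.values())
    if variance == 0.0:
        return 0.0, 1.0
    stat = score ** 2 / variance
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value


if __name__ == "__main__":
    # Toy data: three visits for each of four subjects, two per group.
    abundance = [1, 2, 0, 2, 1, 3, 10, 20, 30, 15, 0, 25]
    group = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
    subject = ["a", "a", "a", "b", "b", "b", "c", "c", "c", "d", "d", "d"]
    for tau in (0.25, 0.5, 0.75):
        print(tau, rank_score_test(abundance, group, subject, tau))
```

Running the test at several quantile levels, as in the loop above, is what lets a quantile-based approach pick up effects confined to part of the distribution, such as a shift in the upper tail only. How the per-quantile results are combined into one test is not specified here and is left out of the sketch.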

    Keywords: Longitudinal microbiome studies, zero inflation and dispersion, mixed-effects models, quantile rank-score test, heterogeneous associations

    Received: 10 Sep 2024; Accepted: 10 Dec 2024.

    Copyright: © 2024 Li, Li, Lee, Zhao and Ling. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Wodan Ling, Weill Cornell Medical Center, NewYork-Presbyterian, New York City, United States

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.