AUTHOR=Brombacher Eva, Hackenberg Maren, Kreutz Clemens, Binder Harald, Treppner Martin
TITLE=The performance of deep generative models for learning joint embeddings of single-cell multi-omics data
JOURNAL=Frontiers in Molecular Biosciences
VOLUME=9
YEAR=2022
URL=https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2022.962644
DOI=10.3389/fmolb.2022.962644
ISSN=2296-889X
ABSTRACT=

Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics data types. Specifically, we use data featuring simultaneous measurements of gene expression at the RNA level and of cell surface protein abundance (CITE-seq), as well as data in which chromatin accessibility and RNA expression are measured jointly in thousands of cells (10x Multiome). We assess the methods' ability to learn joint embeddings using biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments.