
REVIEW article

Front. Microbiol., 22 January 2025
Sec. Systems Microbiology

Deep learning in microbiome analysis: a comprehensive review of neural network models

  • 1Faculty of Mathematics and Computer Science, Nicolaus Copernicus University in Toruń, Toruń, Pomeranian, Poland
  • 2Computational Biology Group, IMDEA Food Institute, Madrid, Spain
  • 3Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
  • 4Department of Mathematics, University of Architecture, Civil Engineering and Geodesy, Sofia, Bulgaria
  • 5Department of Applied Computer Science and Mathematical Modeling, Faculty of Mathematics and Computer Science, University of Warmia and Mazury in Olsztyn, Olsztyn, Poland
  • 6Department of Computer Networks and Systems, Silesian University of Technology, Gliwice, Poland
  • 7British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
  • 8Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
  • 9Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Brandenburg, Germany
  • 10Department of System Engineering, Kharkiv National University of Radioelectronics, Kharkiv, Ukraine

Microbiome research, the study of microbial communities in diverse environments, has seen significant advances due to the integration of deep learning (DL) methods. These computational techniques have become essential for addressing the inherent complexity and high-dimensionality of microbiome data, which consist of different types of omics datasets. Deep learning algorithms have shown remarkable capabilities in pattern recognition, feature extraction, and predictive modeling, enabling researchers to uncover hidden relationships within microbial ecosystems. By automating the detection of functional genes, microbial interactions, and host-microbiome dynamics, DL methods offer unprecedented precision in understanding microbiome composition and its impact on health, disease, and the environment. However, despite their potential, deep learning approaches face significant challenges in microbiome research, such as limited sample sizes, data sparsity, and the difficulty of interpreting model outcomes. Additionally, the biological variability in microbiome datasets requires tailored approaches to ensure robust and generalizable outcomes. As microbiome research continues to generate vast and complex datasets, addressing these challenges will be crucial for advancing microbiological insights and translating them into practical applications with DL. This review provides an overview of different deep learning models in microbiome research, discussing their strengths, practical uses, and implications for future studies. We examine how these models are being applied to solve key problems and highlight potential pathways to overcome current limitations, emphasizing the transformative impact DL could have on the field moving forward.

Introduction

The diverse microbial communities inhabiting different environments play pivotal roles in shaping ecosystem dynamics, influencing nutrient cycling, and impacting the health and wellbeing of host organisms (Sessitsch et al., 2023; Liao et al., 2024). Understanding the intricate relationships within microbiomes is crucial for various fields such as agriculture, medicine, and environmental science. Microbiome engineering, aimed at manipulating microbial communities to achieve desired outcomes, requires comprehensive knowledge of microbial community composition, function, and interdependencies (Berruto and Demirer, 2024; Cullen et al., 2020; Lee, 2023).

Conventional analytical methods often struggle to fully capture the intricate complexity and dynamics present in microbiome data. This limitation has motivated researchers to explore advanced computational approaches such as machine learning and deep learning. Microbiome data is inherently high-dimensional, sparse, and context-dependent, posing difficulties for traditional machine learning methods. Deep learning (DL) models, with their capacity to capture complex, non-linear relationships, have shown promise in overcoming these limitations, providing robust tools for extracting meaningful patterns from high-dimensional data and making them well-suited for microbiome analysis. Nevertheless, significant challenges remain: the limited number of observations, data sparsity, the interpretation of model outcomes, and model robustness across different types of microbiome data all pose ongoing hurdles.

This paper is a complementary work and a continuation of the previous efforts carried out by the COST (European Cooperation in Science and Technology) Action CA18131 on Statistical and Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome). It aims to assist microbiologists and biomedical scientists who are beginning their journey or wish to delve deeper into specialized resources that integrate machine learning techniques for the analysis of microbiome data. Previously, we described the applications of machine learning in human microbiome studies (Marcos-Zambrano et al., 2021; Moreno-Indias et al., 2021), cataloged the most common ML-based software and framework resources (Marcos-Zambrano et al., 2023) and discussed the challenges and best practices in the use of ML methods in microbiome data (Marcos-Zambrano et al., 2021; Papoutsoglou et al., 2023).

In this paper, we focus on and explore in depth the use of deep learning architectures and their applications in analyzing microbiome data, building on ML4Microbiome work where these methods were only briefly described. The rapid increase in microbiome data, driven by advances in high-throughput sequencing technologies and large-scale collaborative projects, provides a rich resource for deep learning applications. Furthermore, continuous developments in deep learning algorithms and frameworks (such as TensorFlow, PyTorch, and Keras) have made these techniques more accessible and user-friendly. New architectures and optimization techniques are being designed to address the challenges posed by high-dimensional, sparse microbiome data more effectively. These advancements collectively lower the barriers to adopting deep learning, highlighting its potential to enhance microbiome research significantly. Consequently, we anticipate a rapid increase in the use of deep learning methods in microbiome studies in the coming years. The aim of this manuscript is therefore to develop a more comprehensive understanding of how various deep learning architectures can improve our insights into microbiome dynamics, functions, and interactions within microbial communities and with hosts. The paper goes beyond previous ML-focused reviews that only briefly describe deep learning approaches for the analysis of microbiome datasets (Hernández Medina et al., 2022; Geman et al., 2018; Mathieu et al., 2022; LaPierre et al., 2019; Deng et al., 2021; Roy et al., 2024). It introduces non-specialized readers without technical background knowledge to a clear understanding of various deep learning architectures, along with their specific applications in microbiome analysis, illustrated by diverse examples and schemes. Additionally, the paper discusses their strengths, weaknesses, and challenges in microbiome analysis.

The manuscript is structured first to highlight key applications of deep learning in microbiome research, which include data preprocessing, feature extraction and engineering techniques. This is followed by microbiome analysis tasks benefiting from deep learning approaches, such as classification/prediction tasks, studying microbiome interactions, clustering analysis, and using deep learning for creating metagenome-assembled genomes. Next, we describe multiple deep learning architectures following the structure of the Neural Network Zoo, a comprehensive visual guide to different types of neural network architectures (Leijnen and Veen, 2020). For each architecture, we discuss its potential usefulness in the context of microbiome analysis, highlighting specific reasons. We provide a general overview of each architecture's concept and then discuss how it can be applied to microbiome-specific tasks, drawing from existing literature or proposing potential applications. In addition, for more enthusiastic readers, we provide additional bibliography that may serve as practical guidance and help build theoretical foundations (see Literature recommendation in the Supplementary material 1). Finally, we discuss the risks and considerations associated with using deep learning on microbiome data. This section covers various risks, potential problems, and important considerations that researchers and practitioners should be aware of when employing deep learning techniques in microbiome research.

Common microbiome data types

Various technologies are employed to explore the microbiome, with targeted sequencing (such as marker gene amplicon sequencing) and metagenomic shotgun sequencing standing out as two primary methods.

1. Targeted sequencing is a technique that focuses on specific regions of the genome to identify microbial communities accurately. This technique involves sequencing the amplified 16S ribosomal RNA (rRNA) gene to identify bacteria and archaea and the Internal Transcribed Spacer (ITS) region or 18S rDNA gene to identify eukaryotes. Sequencing the 16S rRNA gene is particularly important in identifying and quantifying the various bacterial and archaeal species within a sample. The analysis of the obtained sequencing data can be performed using either the Operational Taxonomic Units (OTUs) or Amplicon Sequence Variants (ASVs) approach, each providing different levels of taxonomic resolution and computational demands based on the goals of the study (Chiarello et al., 2022).

2. Metagenomic shotgun sequencing provides a more exhaustive analysis by sequencing all DNA in a sample, covering bacteria, archaea, eukaryotes, and viruses. Although this method delivers a broader overview of the microbiome, it demands more resources and computational effort. The data analysis process of shotgun sequencing data is intricate, involving the reconstruction of longer DNA sequences, taxonomic classification, and functional annotation.

3. Metatranscriptomic sequencing is an emerging technique that is used to study microbiomes. This technique involves the study of RNA transcripts to understand the active genes and the responses of the microbiome under different conditions. This approach provides valuable insights into the functional dynamics and gene expression profiles of microbial communities.

4. Metaproteomic analysis examines the proteins present in a microbiome, offering insights into the active metabolic processes within microbial communities. By identifying the proteins being produced, researchers can infer the functional capabilities of the microbiome.

5. Metabolomic analysis identifies small molecules, revealing metabolic activities within microbial communities and between the microbiome and host.

Integrating various types of microbiome data into multi-omics analysis is becoming increasingly common, as it provides a comprehensive understanding of the microbiome's structure, function, and dynamics. Each data type offers unique insights, collectively enhancing our knowledge about microbial communities. In this regard, data transformation prior to applying DL is crucial for effectively handling microbiome sequencing data. Such transformations help to rectify compositional issues, reduce noise, satisfy statistical assumptions, and enable meaningful analysis and interpretation. In human microbiome studies, the most commonly used data transformation methods for both targeted sequencing and shotgun data are relative abundance and normalization-based methods, followed by compositional transformations such as the centered log-ratio (CLR) and isometric log-ratio (ILR) methods (Ibrahimi et al., 2023). Microbiome data is most often represented as a matrix or table, with each row representing a sample or subject and each column representing microbial features. However, the data can also be organized as a time series, where each time step corresponds to a different point in time (e.g., longitudinal microbiome data). Supplementary Table 1 summarizes the most common ways of feeding data to the different NN architectures.
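
To make these transformations concrete, the sketch below (illustrative Python, not code from any cited tool) converts a small count table into relative abundances and applies a CLR transform with a pseudocount to avoid taking the logarithm of the zeros that are typical of microbiome tables.

```python
import numpy as np

def to_relative_abundance(counts):
    """counts: (n_samples, n_taxa) matrix of raw read counts."""
    return counts / counts.sum(axis=1, keepdims=True)

def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform; the pseudocount avoids log(0)."""
    x = counts + pseudocount
    log_x = np.log(x)
    return log_x - log_x.mean(axis=1, keepdims=True)

counts = np.array([[120, 0, 30, 850],
                   [ 10, 5, 90, 395]], dtype=float)
print(to_relative_abundance(counts))
print(clr_transform(counts))
```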

Applications of DL techniques in microbiome research

In this section, we explore key applications of deep learning in microbiome research, categorized into three main groups. First, we explore DL uses for microbiome taxonomic and functional profiling (microbial taxa, derived proteins, and metabolites). We then examine data preprocessing tasks, such as data augmentation and imputation, batch correction, feature extraction, and multi-view analysis techniques relevant to microbiome data analysis. Finally, we discuss various microbiome analysis tasks that benefit from deep learning approaches, including classification/prediction tasks, studying microbiome interactions, and clustering analysis. In the text and Table 1, you will find a general overview of suitable architectures for each task. Architectures are selected based on known applications for the analysis of microbiome data or similar contexts. Architectures highlighted in bold indicate instances where we have found examples of their usage in microbiome data analysis in the literature. The most relevant publications were selected to showcase the versatility and effectiveness of each neural network model across different microbiome-related applications.

Table 1. Applications of DL techniques in microbiome research.

Microbiome taxonomic and functional profiling

The identification of microbiome features (i.e., taxa, genes) is essential for subsequent functional studies and for the profiling of ecosystems that could be done in a metagenomic project. Numerous tools have previously been developed for these tasks (reviewed in Marcos-Zambrano et al., 2023). The spread of the shotgun sequencing method has led to the study of the functional microbiome, allowing for the characterization of microbiome small molecules (toxins, antibiotics, etc.) and their functionality (Zhang Y. et al., 2022; Ma et al., 2022). The initial step involves identifying these molecules, which typically are encoded in biosynthetic gene clusters (BGCs). To facilitate this, different models were developed, including pHMM, BLAST, and ClusterFinder (reviewed in Ak and Sy, 2018). DL has enhanced the accuracy of these algorithms while also delivering good computational performance for some of them. The emergence of deep learning models has led to the development of new models for this purpose, such as e-DeepBGC (Liu M. et al., 2022) or DeepFRI (Gligorijević et al., 2021). Another emerging aspect of microbiome taxonomic and functional profiling is the creation of metagenome-assembled genomes (MAGs). The approach is based on the reduction of reads to smaller contiguous sequences (contigs) with significant overlap and binning them, i.e., grouping them by their genome of origin. Binning is a complicated process that typically relies on the analysis of the detected sequences' co-abundance (contigs from the same organism should show high covariance of abundance across samples) or the k-mer frequencies found in the DNA. There are three main groups of binning approaches based on the features utilized: sequence composition (k-mer frequency) based, abundance (contig coverage) based, and hybrid methods (combining both k-mer frequency and coverage features). However, using these feature sets independently can generate problems such as sequence redundancy, and relying on co-abundance alone tends to produce chimeric MAGs. The emergence of deep learning-based binning methods has improved the handling of heterogeneous information in the process of MAG recovery.
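
As a toy illustration of the composition features used by these binning approaches (not code from any cited binner), the following Python sketch computes the normalized tetranucleotide (k = 4) frequency vector of a contig.

```python
from itertools import product
from collections import Counter

def kmer_frequencies(sequence, k=4):
    """Return the normalized frequency of every possible k-mer in a contig."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = max(sum(counts[km] for km in kmers), 1)   # avoid division by zero
    return [counts[km] / total for km in kmers]       # 4**k = 256 features for k = 4

contig = "ATGCGTACGTTAGCATGCGTACGTTAGC"
profile = kmer_frequencies(contig)
print(len(profile), round(sum(profile), 3))           # 256 features summing to ~1
```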

Data preprocessing

Augmentation

Microbiome data poses a significant challenge due to its high dispersion and sparsity, requiring a substantial amount of data to build statistical models effectively. However, not all microbiome studies have the resources to collect large datasets. Consequently, creating augmented datasets to train more sophisticated statistical models has become a viable approach in the microbiome field. These generated datasets exhibit similar characteristics to real microbiome data, preserving the sparsity and diversity of the microbiome while retaining important taxa-taxa correlations (Liu M. et al., 2022; Gligorijević et al., 2021).

Imputation

Data imputation is an additional method used to generate microbiome data. The microbiome is a dynamic component of organisms that evolves over time and in response to various external conditions. Therefore, longitudinal studies conducted over time or under different health conditions/treatments are particularly valuable, providing insights into the microbiome's adaptation and its impact on host health. However, such studies complicate the collection of comprehensive and complete datasets due to the need for data from different time points, adding to the intrinsic complexity of microbiome data mentioned earlier. Missing data at specific intervals is a common challenge, potentially hindering the development of robust statistical models. To address this, DL techniques have also been employed to impute these missing points (Choi et al., 2023), aiding in completing the datasets necessary for the successful development of ML models.

Batch correction

Combining various microbiome studies is a common approach to tackle the lack of large datasets and data sparsity, effectively enlarging the pool of samples. However, integrating databases coming from different sources can be a challenging task. The batch effect, alterations in data caused by external non-biological factors in the experiment, can affect the generation of ML models. Li et al. (2023) designed a DL-based algorithm based on GAN networks for this purpose. Their algorithm, coupled with a mathematical index to predict health status (GMHI), was able to remove the batch effect while preserving the particularities of the different disease statuses across several studies, improving disease discrimination in those datasets. Additionally, autoencoder-based methods can also be used for batch correction (Bank et al., 2023). They can effectively remove batch effects by compressing data and applying guided training to keep the biological variations, similar to the adversarial approach employed by GAN networks. For instance, Autoencoder-based Batch Correction (ABC) is a semi-supervised deep learning architecture designed for integrating single-cell sequencing data from multiple sources. This method removes batch effects while maintaining the biological variations in the data (Danino et al., 2024). Although designed for other purposes, this tool has great potential for use in the microbiome context.

Feature extraction and engineering

Feature extraction involves identifying, selecting, or creating meaningful data attributes from raw datasets to enhance model accuracy by capturing relevant information and patterns. Deep learning may be used as it is able to manage complex datasets and interpret non-linear patterns effectively. For example, this could involve quantifying specific bacterial groups or extracting pathways related to host-microbe interactions, simplifying data complexity, and improving predictive capabilities for disease states or ecosystem dynamics (Oh and Zhang, 2020; Shen Y. et al., 2023; Tataru et al., 2022).

General applications

In previous sections, we have primarily concentrated on preprocessing data (imputation, data generation) and identifying unique features that reveal patterns. However, the main use of deep learning with microbiome data is classifying original samples into groups or populations using various types of neural networks.

Classification/prediction

Classification and prediction are two fundamental aspects of machine learning, each serving a unique purpose in data analysis and decision-making processes. Classification involves categorizing data into predefined groups or classes based on their features; it's primarily used when the outputs are categorical, such as diagnosing diseases (healthy vs. diseased) or identifying customer sentiment (positive, negative, or neutral). On the other hand, prediction refers to forecasting continuous outcomes based on input variables, such as blood glucose levels for Type 2 diabetes (T2D). This process, often called regression in statistical contexts, uses different methodologies like linear regression or deep learning models to estimate numerical values.

Microbiome interactions

The primary use of deep learning is to predict health or disease states based on microbiome data. Determining whether a particular microbiome is linked to disease development is crucial. However, some approaches focus solely on factors influencing the microbiome's health or disease state without considering microbial interactions or environmental influences that could drive the final outcome. Models like the generalized Lotka-Volterra (gLV) have been used to understand microbial community interactions and how small changes can impact the entire community (van den Berg et al., 2022). The gLV model estimates bacteria growth rates and interactions among community members. However, it struggles with large, complex interactions, often requiring longer computational time compared to newer DL-based models.
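
For readers unfamiliar with the gLV model, the sketch below integrates its equations, dx_i/dt = x_i(mu_i + sum_j A_ij x_j), with a simple Euler scheme; the growth rates and interaction matrix are invented illustrative values, not estimates from any cited study.

```python
import numpy as np

def simulate_glv(x0, mu, A, dt=0.01, steps=5000):
    """Euler integration of the generalized Lotka-Volterra equations."""
    x = np.array(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(steps):
        x = x + dt * x * (mu + A @ x)   # dx_i/dt = x_i * (mu_i + sum_j A_ij * x_j)
        x = np.clip(x, 0.0, None)       # abundances cannot become negative
        trajectory.append(x.copy())
    return np.array(trajectory)

mu = np.array([0.8, 0.5, 0.6])                 # intrinsic growth rates (illustrative)
A = np.array([[-1.0, -0.2,  0.1],              # A_ij: effect of taxon j on taxon i
              [-0.1, -1.0, -0.3],
              [ 0.2, -0.1, -1.0]])
traj = simulate_glv(x0=[0.1, 0.1, 0.1], mu=mu, A=A)
print(traj[-1])                                # approximate steady-state abundances
```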

Clustering

Clustering is an unsupervised learning technique used in data analysis where data points are grouped into clusters based on their similarities, with the aim that items in the same cluster are more similar to each other than to those in other clusters. This method is widely used across various fields, e.g., to identify inherent structures or patterns in data without prior labeling of the points. For example, clustering can be applied to patient data to identify subgroups that share similar microbiome profiles (de Kok et al., 2024), which can help tailor specific treatments or better understand the progression of diseases. Another typical example is when researchers use clustering to group organisms or genes based on genetic similarity, which can reveal evolutionary relationships or functional similarities (Nissen et al., 2021).

Multi-view analysis

Recently, studying microbiomes using a combination of different omic approaches has become increasingly common. These multi-omics datasets, alone or together with host-specific data or environmental data, can be processed with multi-view analysis methods (also referred to as data integration), allowing for a comprehensive understanding of the microbiome's structure, function, and dynamics. Multi-omics multi-view analysis methods have been categorized into five distinct strategies: early, mixed, intermediate, late, and hierarchical (Picard et al., 2021), and general aspects of deep learning-based multi-omics data integration methods have been reviewed by Kang et al. (2022). Early fusion involves transforming all datasets into a unified representation, which is then used as input for a chosen deep learning model. In the case of late fusion, first-level models are developed from individual data types, and then the predictions from these models are combined by training a second-level model, which serves as the final predictor. Multi-view analysis using deep learning has been explored in several microbiome studies to harness the strengths of different data types and enhance our understanding of microbial communities and their interactions.
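
As a minimal sketch of the late fusion strategy described above (with random placeholder data and arbitrarily chosen first- and second-level models), one model is trained per omics view and their predicted probabilities are stacked into a second-level predictor.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
taxa_view = rng.random((n, 50))          # e.g., taxonomic relative abundances
metabolite_view = rng.random((n, 30))    # e.g., metabolite intensities
y = rng.integers(0, 2, n)                # e.g., healthy vs. diseased

# First-level models, one per omics view
m1 = RandomForestClassifier(n_estimators=100, random_state=0).fit(taxa_view, y)
m2 = RandomForestClassifier(n_estimators=100, random_state=0).fit(metabolite_view, y)

# Second-level model trained on the stacked first-level predicted probabilities
stacked = np.column_stack([m1.predict_proba(taxa_view)[:, 1],
                           m2.predict_proba(metabolite_view)[:, 1]])
fusion_model = LogisticRegression().fit(stacked, y)
print(fusion_model.predict(stacked)[:10])
```

In practice, the second-level model would be trained on cross-validated first-level predictions rather than on the training-set predictions shown here, to avoid information leakage.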

Deep learning architectures

In this section, we explore various deep learning architectures within the realm of microbiome analysis. We begin with a general overview of each architecture's concept before delving into its specific applications in microbiome analysis. By synthesizing insights from existing literature (see exact examples of architectures in Supplementary material 2) and proposing potential applications, our goal is to offer valuable perspectives on leveraging these architectures to overcome challenges and foster advancements in microbiome research.

Artificial neural networks are computer models inspired by the workings of the human brain. They consist of multiple layers, each containing units called neurons that process information. These neurons are connected by weighted links and apply activation functions, enabling the network to learn and make decisions (Figure 1) (McCulloch and Pitts, 1943).

Figure 1. The model of a neuron as proposed in 1943 by McCulloch and Pitts (1943).

There are usually three types of layers: input, hidden, and output. The input layer receives data, with each neuron representing an element like a pixel in an image or a word in a sentence. Hidden layers, positioned between the input and output layers, process and transform this data to learn complex relationships. The output layer generates the final prediction or result, for example, identifying healthy individuals and those with specific diseases based on their gut microbiome profiles. Between layers are activation functions, which are mathematical functions used in neurons that decide whether a given neuron should be “activated,” meaning it passes the signal further. They introduce non-linearity, which allows the neural network to learn complex patterns. Examples of activation functions include ReLU (Rectified Linear Unit), which passes positive input values and returns zero for negative ones, and the sigmoid function, which transforms the input value into a range from 0 to 1, useful when predicting probabilities. You can find a summary of the most commonly used activation functions in Supplementary Table 2.
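
A minimal numerical illustration of the two activation functions named above:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)        # passes positive values, returns zero otherwise

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # squashes any input into the range (0, 1)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(z))  # values between 0 and 1, usable as probabilities
```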

There are different mathematical metrics to measure the performance of a neural network model. The choice of metric depends on the task performed by the model, although some metrics can be used for the same task. For example, precision and recall are more commonly used for classification of categorical classes, while mean squared error (MSE) or root mean squared error (RMSE) are more commonly used in regression problems. See Supplementary Table 3 for a summary of the most typical evaluation metrics in DL.
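
As a brief illustration with toy labels, the metrics mentioned above can be computed with scikit-learn as follows.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, mean_squared_error

# Classification: categorical outcomes, e.g., diseased (1) vs. healthy (0)
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))

# Regression: continuous outcomes, e.g., blood glucose levels
y_true_cont = np.array([5.6, 7.1, 6.3])
y_pred_cont = np.array([5.9, 6.8, 6.1])
mse = mean_squared_error(y_true_cont, y_pred_cont)
print("MSE:", mse, "RMSE:", np.sqrt(mse))
```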

Feedforward neural networks of the multilayer perceptron type

Feedforward neural networks (FFNNs) are a type of neural network that passes information from input to output without looping back at any point (Figure 2). A notable subclass of FFNNs is the multilayer neural network, also known as Multilayer Perceptrons (MLPs or MLPNNs) (Rumelhart et al., 1986), which are made up of layers. Each layer connects only to the next layer in line, without any connections within the same layer. The training of MLPs employs the backpropagation algorithm within a supervised learning framework, where they learn from sets of known input-output pairs and measure their accuracy using metrics like mean squared error (MSE). Although they can theoretically model any relationship between inputs and outputs with enough neurons in the hidden layers (see the Cybenko theorem), their effectiveness in practical applications can vary. To improve their performance, FFNNs are often used together with other types of neural networks. The input to the FFNN is a finite-dimensional vector of a fixed length, which is derived from raw data through appropriate processing.

Figure 2. Scheme of a typical feedforward neural network architecture. The input layer receives the input data, the hidden layers consist of neurons that apply a weighted sum of inputs followed by an activation function to learn complex patterns, and the output layer provides the final output of the network. Information flows in one direction, from the input layer, through the hidden layers, to the output layer.
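
The following PyTorch sketch shows a minimal MLP of the kind described above, mapping a fixed-length abundance vector to a disease probability; all sizes, hyperparameters, and data are illustrative placeholders rather than settings taken from any cited study.

```python
import torch
import torch.nn as nn

n_taxa = 200                       # length of the input abundance vector
model = nn.Sequential(
    nn.Linear(n_taxa, 64),         # input layer -> first hidden layer
    nn.ReLU(),
    nn.Linear(64, 16),             # second hidden layer
    nn.ReLU(),
    nn.Linear(16, 1),              # output layer: one logit
)
loss_fn = nn.BCEWithLogitsLoss()   # binary classification (healthy vs. diseased)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

X = torch.rand(32, n_taxa)         # a toy batch of 32 samples
y = torch.randint(0, 2, (32, 1)).float()

for epoch in range(100):           # standard supervised training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                # backpropagation
    optimizer.step()

print(torch.sigmoid(model(X[:5])))  # predicted probabilities for 5 samples
```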

Functional annotation and metagenome-assembled genomes

FFNNs have been used for gene identification, using reference databases as a guide [e.g., NCBI Refseq, CARD (Jia et al., 2017), ARDB (Liu and Pop, 2009), or UNIPROT (Apweiler et al., 2004)], to improve gene identification and find new sequences (e.g., identifying new antibiotic resistance genes). That is the case of tools like Meta-MFDL (Zhang et al., 2017), Deep-ARG (Arango-Argoty et al., 2018), or ONN4MST (Zha et al., 2022). Regarding MAG generation, SemiBin (Pan et al., 2022) and SemiBin2 (Pan et al., 2023) are advanced binning algorithms that use DL. They work by dividing long contigs into two equal-length segments to create pairwise must-link constraints, and use taxonomic annotation information to create pairwise cannot-link constraints. SemiBin employs a semi-supervised autoencoder to extract this constraint information and generate embeddings for clustering. SemiBin2 is an upgraded version of SemiBin, which generates must-link constraints similarly, but introduces cannot-link constraints by randomly sampling pairs of contigs. COMEBin (COntrastive Multi-viEw representation learning for effective Binning of metagenomic contigs) is a binning method based on contrastive multi-view representation learning (Wang et al., 2024). COMEBin utilizes data augmentation to generate multiple fragments (views) of each contig and obtains high-quality embeddings of heterogeneous features (sequence coverage and k-mer distribution) through contrastive learning. The network structure used consists of two primary modules. The first module uses a FFNN to process contig coverage features. The second module also uses a FFNN to integrate the output of the first module and the k-mer features, generating an embedded representation of both. These embeddings are further used in the clustering process.

Classification/prediction

FFNNs can be applied to analyze microbiome data and make predictions or classifications based on the input features, data representation, feature engineering, network architecture, training and validation, evaluation, and prediction. Some of the designs used taxa abundances as input for the networks (Galkin et al., 2020; Wu et al., 2024). Others used different approaches like feeding directly k-mer distributions (Asgari et al., 2018), or combining different sources of data like taxa, metabolic and genomic abundances (Lee and Rho, 2022).

FFNNs have also been used to predict microbial community composition based on microbiome-environment interactions. For example, the MetaMLAnn algorithm infers microbial communities in unsampled city areas based on the composition of sampled locations (Zhou et al., 2019).

Multi-view analysis

MDL4Microbiome integrates three distinct features of the microbiome: conventional taxonomic profiles, genome-level relative abundance, and metabolic functional characteristics, to improve classification accuracy (Lee and Rho, 2022). Each feature is processed through a separate supervised MLPNN. The final hidden layer of each model generates embedded representations of the respective feature. By combining these representations, a new shared representation is created that retains the essential characteristics of each of the different modalities.

Recurrent neural networks

Recurrent neural networks (RNNs) are a type of neural network that adds a time dimension to data processing (Figure 3). They can remember information from previous inputs because they connect across different time steps. This ability makes them effective for tasks that rely on past information, such as predicting the next word in a sentence. However, RNNs are particularly susceptible to the common neural network issues of vanishing and exploding gradients, wherein the gradient either diminishes or increases exponentially across time steps due to the characteristics of activation functions. This phenomenon can lead to substantial information loss during training. In the field of microbiology, RNNs and LSTMs are useful for studying the dynamics of microbial communities over time. They have been used to predict changes in the composition of microbiomes, forecast how populations of microbes change, and understand how microbes interact with their hosts over time.

Figure 3. (A) Scheme showing a recurrent neural network (RNN) architecture. The input is a sequence of finite-dimensional vectors, each of fixed length, which are derived from raw data through appropriate processing. This type of architecture uses recurrent units in the hidden layer. (B) Depicts the structure of the hidden layer: a single recurrent neuron (cell).

Imputation

According to Choi et al. (2023) the specific attributes of RNN architecture render it suitable for adaptation in tasks such as missing data imputation in longitudinal studies, where occasional data points are absent.

Classification/prediction

RNNs handle sequences incrementally, retaining a memory of past inputs via hidden states, which is advantageous for classification tasks requiring analysis of variable-length sequential data and capturing temporal dependencies (Ditzler et al., 2015).

Long short-term memory

Long Short-Term Memory (LSTM) networks are specialized recurrent neural networks designed to solve the problems of vanishing and exploding gradients using a system of gates and a memory cell (Figure 4). This system, more reminiscent of an electrical circuit than of biological structures, includes three gates: input, output, and forget. The input gate decides how much of the new information to store in the memory cell, the output gate controls what the next layer should know about the current state, and the forget gate lets the network discard unnecessary information, like irrelevant details, when learning something new. LSTMs are trained with sequences of labeled data and are widely used in tasks that require an understanding of how things change over time. They are particularly good at handling data where the timing of events matters, such as analyzing temporal changes in microbiome data. Similarly to RNNs, the input to an LSTM is a sequence of finite-dimensional vectors, each of fixed length, which are derived from raw data through appropriate processing.

Figure 4. Image illustrating the unfolding of a Long Short-Term Memory (LSTM) recurrent neural layer over time. The LSTM cell, highlighted in the middle, contains three gates: input, forget, and output. These gates regulate the flow of information, enabling the cell to maintain and update its state over time. The unfolding shows how the LSTM cell is reused at each time step, effectively capturing long-term dependencies in sequential data.
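
As a hedged sketch of how longitudinal microbiome data could be fed to an LSTM classifier (dimensions and data are illustrative placeholders, not the architecture of any cited study):

```python
import torch
import torch.nn as nn

class LongitudinalLSTM(nn.Module):
    """Summarize a sequence of per-time-point abundance vectors into one outcome."""
    def __init__(self, n_taxa, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_taxa, hidden_size=hidden_size,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, time_steps, n_taxa)
        _, (h_n, _) = self.lstm(x)         # h_n: hidden state after the last time step
        return self.head(h_n[-1])          # one logit per subject

model = LongitudinalLSTM(n_taxa=100)
x = torch.rand(8, 5, 100)                  # 8 subjects, 5 time points each
print(torch.sigmoid(model(x)).shape)       # torch.Size([8, 1]) outcome probabilities
```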

Classification/prediction

This type of network can be applied to predict disease progression or treatment outcomes based on longitudinal microbiome data. They have also been utilized in time-series classification tasks, such as identifying disease onset or detecting changes in microbial composition associated with environmental factors. The work of Metwally et al. (2019), who used an LSTM to predict child allergies in a longitudinal study, illustrates well the potential of this architecture in this regard.

Microbiome interactions

Baranwal et al. (2022) proposed the use of neural networks as an alternative to gLV. They designed an architecture based on LSTMs and trained it on microbe-microbe and microbe-metabolite interactions. The model proved powerful for understanding those interactions, identifying important species that could be affecting the microbial community dynamics and their metabolite profiles. For example, they found that certain taxa are more involved in shaping metabolite production (e.g., Firmicutes) while others influence community interactions more (e.g., Bacteroides). This research opens up the possibility of shaping those community relations to modify a patient's metabolic profile and thus their health status.

Autoencoders and variational autoencoders

Autoencoders (AEs) are a type of neural network used mainly for compressing information (Figure 5). They have a distinctive hourglass shape, with the narrowest section in the middle acting as the point of maximum compression. This middle point divides the network into two sections: an encoder that compresses the data, and a decoder that reconstructs it. They're designed to minimize the difference between the input and the output through backpropagation. Autoencoders can often have symmetrical designs, which means the way they compress data mirrors the way they decompress it. In microbiome research, autoencoders help in simplifying complex data by reducing its dimensionality and highlighting important features. This makes them great for tasks like spotting outliers or transferring knowledge between different studies.

Figure 5. Autoencoder network architecture consists of two main components: the encoder, which processes the input sequence and compresses it into a fixed-size context vector, and the decoder, which generates the output sequence from this context vector. The input to the autoencoder is the raw data that you want to encode and compress, and the output is the reconstructed version of that data, emerging from the decoder. The effectiveness of an autoencoder is generally evaluated based on how accurately this output matches the original input.

Variational Autoencoders (VAEs) take autoencoders further by using a probabilistic approach. Unlike regular autoencoders that compress data to a fixed point, VAEs compress data into a distribution of possible values, making them good at generating new, realistic data samples. They optimize not only the accuracy of data reconstruction but also the regularity of the compressed representation, which is compared to a reference distribution, usually a Gaussian. This makes VAEs powerful tools for generating varied and realistic data in complex areas like microbiome research, helping scientists understand and simulate microbial ecosystems better.
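
The following PyTorch sketch illustrates the basic (non-variational) autoencoder described above: an encoder compresses an abundance vector into a low-dimensional embedding, a decoder reconstructs it, and training minimizes the reconstruction error. Sizes and data are placeholders.

```python
import torch
import torch.nn as nn

n_taxa, latent_dim = 500, 16
encoder = nn.Sequential(nn.Linear(n_taxa, 128), nn.ReLU(),
                        nn.Linear(128, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                        nn.Linear(128, n_taxa))

optimizer = torch.optim.Adam(list(encoder.parameters()) +
                             list(decoder.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.rand(64, n_taxa)                 # toy batch of abundance profiles
for epoch in range(200):
    optimizer.zero_grad()
    reconstruction = decoder(encoder(X))
    loss = loss_fn(reconstruction, X)      # difference between input and output
    loss.backward()
    optimizer.step()

embeddings = encoder(X).detach()           # compressed representation of the samples
print(embeddings.shape)                    # torch.Size([64, 16])
```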

Functional annotation and metagenome-assembled genomes

For MAG generation, contigs are normally generated using non-DL software designed for this purpose (reviewed in Marcos-Zambrano et al., 2023), and these contigs are then passed through a DL architecture for binning and classification. Nissen et al. (2021) developed a DL-based tool, VAMB (variational autoencoders for metagenomic binning), which uses a VAE to combine both parameters (co-abundance and k-mer patterns) to identify contigs belonging to a particular microbial population. Tetranucleotide frequencies (TNF) and abundances were encoded in the VAE to generate a latent layer that was later decoded into output TNF and abundance vectors. The network integrated the two data sources well, clustering better than either feature set alone and reconstructing a greater percentage of genomes than other models like Canopy, MetaBAT2, or MaxBin2 (reviewed in Marcos-Zambrano et al., 2023; Roy et al., 2024).

CLMB (Deep Contrastive Learning for Robust Metagenomic Binning) (Zhang P. et al., 2022) is an extension of VAMB that employs contrastive learning. Contrastive learning is a self-supervised technique that helps learn valuable representations of input data by bringing similar instances close while pushing dissimilar ones away. CLMB generates a pair of augmented views for each contig by introducing noise to its feature vector; in this way, it obtains integrated representations that combine heterogeneous features. AAMB (Líndez et al., 2023), another VAMB extension, is based on adversarial autoencoders. It encodes contigs into two latent spaces (one categorical and one continuous) and uses discriminators to keep each latent distribution close to its prior. Other VAMB extensions include CCVAE (Lamurias et al., 2023), which introduces graphs as representations of contigs (nodes) and k-mers (edges) to constrain the autoencoding.

Other approaches combined autoencoders with non-deep learning clustering algorithms like DBSCAN for further taxa classification (Wijegunarathna et al., 2021), or with other deep learning architectures (e.g., Adversarial Deep Embedded Clustering, also based on autoencoders) to perform the binning (Bao et al., 2022).

Feature extraction and engineering

Autoencoders are often used to reduce dimensionality from the microbiome profile, generating a low-dimensional representation. Thus, noise and unnecessary information are filtered, and data can be easily processed to build classification models. Different groups tested alternative approaches in metagenomics, coupling feature extraction using autoencoders with machine learning algorithms as final classifiers like RF (Oh and Zhang, 2020; Shen W. X. et al., 2023; Wang et al., 2023), SVM (Oh and Zhang, 2020), gradient boosting (Shen W. X. et al., 2023), or other DL architectures like FFNN (Oh and Zhang, 2020).
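
A minimal sketch of this coupling pattern, in which random placeholder values stand in for autoencoder embeddings and a scikit-learn random forest acts as the final classifier:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
features = rng.random((120, 16))            # stand-in for autoencoder embeddings
labels = rng.integers(0, 2, 120)            # e.g., case vs. control

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, features, labels, cv=5, scoring="roc_auc")
print(scores.mean())                        # cross-validated performance estimate
```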

Classification/prediction

Grazioli et al. (2022) designed a multimodal deep learning approach where data that comes from the same metagenome but with entirely different information (phylogenetic abundance, gene markers, and metabolomics) is integrated using multimodal variational information bottlenecks (MVIB). This deep network can encode the information coming from different sources, keeping the maximum information possible. This model could beat or at least match any of the previously mentioned models in various datasets, requiring less hyperparameter tuning and facilitating the interpretability of the results by revealing potential disease markers in the input data.

Microbiome interactions

García-Jiménez et al. (2020) used autoencoders to extract latent spaces from OTU relative abundance and environmental data, and trained this network to infer microbial community composition directly from the environmental data. The advantage of this strategy resides in its capability to make predictions of microbial composition without having to sequence samples and avoiding all the processing of this complex data.

Clustering

The combination of autoencoders with clustering techniques leads to methods like Deep Embedded Clustering (DEC) (de Kok et al., 2024). DEC starts by compressing the data using an autoencoder, then improves the grouping of the data by refining how it's clustered. This approach is especially good at revealing hidden patterns in microbiome data.

Hopfield networks and Boltzmann machines

Hopfield networks (HNs) (Hopfield, 1982) are unique neural networks where each neuron can act as an input, hidden, or output node at different times. Training of these networks involves setting neuron states to represent specific patterns. Then, the connections, or weights, between neurons are calculated and fixed. The network adjusts its neurons to reduce the global energy function. This process results in the formation of associative memory, as the network stabilizes into states similar to the input patterns. Each neuron in a Hopfield network can be in one of two states (spins), either −1 or 1, and the neurons can update their states all at once or one at a time using a method known as Glauber dynamics. The network stabilizes when no neuron changes its state anymore, which helps it remember patterns similar to those it learned.
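
A toy numerical sketch of this associative-memory behavior: two binary (+1/−1) patterns are stored with a Hebbian rule, and a corrupted version of one pattern is recovered by asynchronous state updates.

```python
import numpy as np

# Two stored patterns of +1/-1 spins (orthogonal, for clean recall)
patterns = np.array([[ 1,  1,  1,  1, -1, -1, -1, -1],
                     [ 1, -1,  1, -1,  1, -1,  1, -1]])
n = patterns.shape[1]

# Hebbian learning: weights are computed once and fixed, no self-connections
W = sum(np.outer(p, p) for p in patterns) / n
np.fill_diagonal(W, 0)

state = np.array([-1,  1,  1,  1, -1, -1, -1, -1])   # pattern 0 with one bit flipped
for _ in range(5):                                   # asynchronous (one-at-a-time) updates
    for i in np.random.permutation(n):
        state[i] = 1 if W[i] @ state >= 0 else -1

print(state)   # recovers the stored pattern [ 1  1  1  1 -1 -1 -1 -1]
```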

Boltzmann machines (BMs) (Hinton and Sejnowski, 1983) are similar to Hopfield networks but make a clear distinction between input and hidden neurons. They start with random weights and learn by either traditional methods like back-propagation or by a special method called contrastive divergence, which adjusts weights based on a learning process. Neurons in Boltzmann machines switch between two states, influenced by a setting called “temperature.” Lowering this temperature gradually helps the network stabilize its neuron states, allowing it to settle into a balance.

Augmentation and imputation

HNs are associative memory tools, enabling pattern recognition and the imputation of missing data by converging to learned patterns. Similarly, BMs, as stochastic neural networks, capture complex microbial feature interactions through unsupervised learning, uncovering hidden associations and statistical properties. Despite their theoretical utility for augmentation and imputation, no specific examples of their application in microbiome contexts were found.

Microbiome interactions

Sokolovska et al. (2019) proposed the use of a DRBM in combination with causal inference models to address the interactions between very different data sources, such as microbiome and health/nutritional data (glucose homeostasis markers, physical activity, etc.), in order to assess how the environment, like the nutrients in our diet, influences microbiota dynamics. The authors combined causal inference algorithms with Principal Component Analysis and the DRBM to generate an efficient interaction model between those parameters that is relatively simple and does not require intensive hyperparameter tuning. In later work, they applied this model to a different problem, detecting the effect of a common drug, metformin, on the human gut microbiome, and their architecture improved on the accuracy obtained by other methods (Sokolovska et al., 2020).

Convolutional neural networks

Convolutional Neural Networks (CNNs) work by using special layers called convolutional layers that help them extract important features from input data (Figure 6). These layers have trainable filters or kernels that move across the input, identifying patterns like edges in early stages, and more complex features deeper in the network. CNNs also use pooling layers to simplify the information by making it smaller and more manageable, while still keeping the important parts. This helps reduce the amount of work needed and speeds up processing. CNNs include non-linear activation functions, like ReLU (Rectified Linear Unit), to help them handle complex patterns, not just straight lines. Typically, CNNs end with fully connected layers, which learn to make final decisions for tasks like recognizing images or identifying objects.

Figure 6. Image showing a Convolutional Neural Network architecture. They consist of multiple layers, including convolutional layers that apply filters to extract features, pooling layers that reduce dimensionality, and fully connected layers that perform classification based on the extracted features.

The input to a CNN is generally an image or an array of images (a tensor), and the output depends on the specific task: it could be categorical class probabilities for classification, pixel-wise annotations for segmentation, etc. In the case of microbiome data, a special transformation is required to convert the data into a CNN-friendly format.

In microbiome analysis, CNNs prove highly effective for analyzing and categorizing microbial communities. They excel in tasks such as microbial community classification, microbial localization, predicting disease risks, analyzing microbiome images, facilitating drug discovery, and conducting metagenomics studies. Researchers have started using CNNs to sort through metagenomic data, which includes all the genetic material in an environmental sample, because of their ability to handle data that has a spatial layout, much like images. CNNs are helpful especially when bacterial community composition or other microbial data types are arranged in ways resembling pictures, using presence-absence matrices or phylogenetic trees. CNNs are good at finding patterns and relationships in this kind of data, making them useful for grouping similar microbial communities together.
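
As a hedged sketch (not any cited tool), the following PyTorch snippet applies a 1D CNN to an abundance vector whose taxa are assumed to be ordered so that neighboring positions are related (e.g., by phylogeny), letting convolutional filters scan local groups of taxa.

```python
import torch
import torch.nn as nn

n_taxa = 256
model = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=16, kernel_size=5, padding=2),  # local filters
    nn.ReLU(),
    nn.MaxPool1d(2),                       # pooling halves the dimensionality
    nn.Conv1d(16, 32, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Flatten(),
    nn.Linear(32 * (n_taxa // 4), 2),      # fully connected classification head
)

x = torch.rand(8, 1, n_taxa)               # 8 samples, 1 channel, 256 ordered taxa
print(model(x).shape)                      # torch.Size([8, 2]) class logits
```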

Functional annotation and metagenome-assembled genomes

The architecture most commonly used for this task, and with the best outcomes, has been the CNN. Several tools have been designed with this architecture to identify genes from metagenomes, like CNN-MGP (Al-Ajlan and El Allali, 2019), or to differentiate viral sequences in the metagenome (Fang et al., 2019, 2020; Arisdakessian et al., 2021; Ren et al., 2020; Chu et al., 2022). RNNs have also been developed to identify viral sequences (Liu et al., 2022), although CNNs outperform them for this task. CNNs have also been used to determine a metagenome profile, identifying the taxa present in a certain subject. These techniques are usually based on the emergent DL-based binning methods that aim to improve the handling of heterogeneous information in the process of genome classification, using reads directly or MAG recovery. Some of the approaches developed aim first to encode the genome information through different models and then proceed to genome classification. For example, CNNs (i.e., CNN-RAI or DeepMAsED) (Karagöz and Nalbantoglu, 2021; Mineeva et al., 2020) have been used to encode the information from sequence co-abundances, using a relative abundance index or one-hot encoding, with other architectures (e.g., FFNNs) (Busia et al., 2019) then used for classification. CNNs (Fiannaca et al., 2018) have also been applied to k-mers for encoding. In addition, other models have combined CNNs (Borgman et al., 2022) with traditional clustering algorithms like nearest-neighbor instead of performing classification with another NN, or CNNs with LSTMs (Liang and Sakakibara, 2021) to resolve the partitioning of a de Bruijn graph at chimeric contig nodes, generating longer contigs, reducing chimeric assemblies, and improving MAG resolution.

Feature extraction and engineering

Sharma and Xu (2021) implemented a CNN for feature extraction, using taxonomic information in the form of an OTU vector as input (further described in Multi architecture designs).

Classification/prediction

Reiman et al. (2017) pioneered the use of CNNs on abundance data arranged according to the phylogenetic tree of the analyzed samples. Although this method did not outperform FFNNs, it advanced the use of CNNs toward clearer neural network decision-making. Reiman and others further developed CNN applications in metagenomics by adjusting data imputation and adding feature extraction for better interpretability (Reiman et al., 2020). Li et al. (2021) continued this line of work, adding more information from the phylogenetic tree, such as the number of child nodes, the distance between nodes, or the height of layers. Fioravanti et al. (2018) coupled an OTU distance matrix based on patristic distance (the distance between two taxa) with a k-nearest-neighbors computation to generate the input for the CNN. Wang et al. (2021) used patristic distance in their correlation model to cluster taxa, which became the input for their CNN. Their model outperformed other CNNs and machine learning methods like RF in prediction tasks by optimizing for dense and large clusters, even with decreasing cluster size and density. They claimed their algorithm achieved higher performance with lower computational requirements, and was especially effective with limited sample sizes. Chen et al. (2022) applied CNNs to shotgun metagenomic data, utilizing a pre-designed CNN for classification and subsequently extracting information from the CNN's outputs using a weighted RF. Pfeil et al. (2023) generated a radial heatmap image to provide the CNN with the OTU abundance data and retrained a publicly available CNN architecture (ResNet50) to classify the microbiome data into healthy samples and samples with T2D. Sharma et al. (2020) used CNNs and explored two distance metrics for clustering the OTUs: they clustered the OTUs by phylum and then ordered them based on either the Euclidean distance to the cluster center or the correlation between bacteria. Although they achieved good results, improving on the outcomes of other methods on the same datasets, some limitations could have been addressed, such as the limited number of disease-associated OTUs considered or the fact that correlations were computed only within a phylum, ignoring potential correlations between bacteria from different phyla. Nguyen et al. (2018) used metagenomic relative abundances to generate 2D images that were later fed to a CNN. Deepening this strategy, Shen W. X. et al. (2023) combined UMAP embedding and hierarchical clustering methods, taxonomically truncated, on metagenomic data in the form of a correlation matrix. This generates a multichannel image, with each channel representing a taxonomic level with a variable number of clusters embedded in 2D maps and filled with abundance values. The images are provided to a CNN that leverages this noise-cleaned and highly processed information to classify patients. Finally, Rahman and Rangwala (2020) applied a CNN to metagenome sequences using the multiple instance learning (MIL) paradigm, where individual instances (e.g., sequences) are grouped together into a larger instance or "bag" used later in the network. In their case, they clustered sequences by k-means and created an instance (embedding) for each cluster. The embeddings were then analyzed by a CNN that determines which embeddings indicate a disease state.

Microbiome interactions

CNNs can also be employed to predict microbial community composition of one anatomical site based on the composition of another site. In their work, Rampelli et al. (2021) designed a CNN that leveraged oral microbial composition to predict the fecal microbiome.

Generative adversarial networks

Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) are made up of two interconnected networks, typically a combination of feedforward and convolutional neural networks (Figure 7). The basic idea behind GANs is based on the min-max two-player zero-sum game, where one player's gain is equivalent to the other player's loss. In GANs, these players are two networks called the generator (whose input is typically a random noise vector) and the discriminator (whose input consists of real data samples from the training set and fake data samples generated by the generator). The primary goal of the discriminator is to identify whether a sample is derived from a fake or real distribution. Conversely, the generator's objective is to trick the discriminator by creating fake samples.

Figure 7. Image representing a typical Generative Adversarial Networks architecture. They consist of two components: the Generator, which creates synthetic data, and the Discriminator, which evaluates the authenticity of the data.

In microbiome research, GANs are helpful for tasks like increasing data variety, filling in missing data, and correcting inconsistencies across data batches. They create synthetic microbiome data used to test and improve statistical and machine learning models, making these models more diverse and reliable. Some advanced GANs, known as Conditional GANs (CGANs), can also include additional information like disease conditions to make the synthetic data more relevant and useful. Furthermore, GANs are effective at reducing batch effects (variations that occur when different groups of data are collected) while maintaining important features that are specific to particular diseases, improving the accuracy of disease detection and model performance.
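
The following PyTorch sketch shows the generator-discriminator training loop in its simplest form, producing synthetic composition-like profiles; the "real" data here are random placeholders and all settings are illustrative rather than a published recipe.

```python
import torch
import torch.nn as nn

n_taxa, noise_dim = 100, 32
G = nn.Sequential(nn.Linear(noise_dim, 64), nn.ReLU(),
                  nn.Linear(64, n_taxa), nn.Softmax(dim=1))   # compositional output (sums to 1)
D = nn.Sequential(nn.Linear(n_taxa, 64), nn.LeakyReLU(0.2),
                  nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.softmax(torch.rand(64, n_taxa), dim=1)   # stand-in for real profiles
for step in range(500):
    # 1) update the discriminator on real vs. generated samples
    fake = G(torch.randn(64, noise_dim)).detach()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
    # 2) update the generator so its samples are judged "real"
    fake = G(torch.randn(64, noise_dim))
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

print(G(torch.randn(3, noise_dim)).sum(dim=1))   # each synthetic profile sums to 1
```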

Augmentation

Different publications have used GAN architectures to design synthetic microbiome datasets that can be further used by other DL or ML algorithms. Rong et al. (2021) designed an algorithm based on GANs to simulate microbiome data that could be used to test statistical methods. Reiman and Dai (2020) implemented a modification of traditional GANs, the conditional GAN (CGAN), adding side information such as the subject's disease or healthy state, generating samples with different distributions and increasing diversity. Oh and Zhang (2022) developed a Wasserstein GAN (WGAN) augmentation system based on image data (the networks include convolutional layers to handle it). They first clustered the metagenomic profiles into a visual pattern, which was then augmented by multiple GANs. This visual representation of the genomic data helps the network capture more information, enhancing the performance of the prediction models, even on data not previously used for training, and improving reproducibility.

Imputation

Choi et al. (2023) designed a Bidirectional RNN-based (BiRNN) GAN model to impute missing data in a longitudinal study. Their model, DeepMicroGen, first uses a CNN to extract features from the microbiome data, which are fed to the BiRNN that acts as a generator. A Long Short-Term Memory (LSTM) network is then used as the discriminator of the GAN model, identifying whether a sample is authentic or imputed and its position in the study's timeline. Its evaluation in a real-case study with missing values demonstrated that the model can help fill in the gaps in this kind of study.

Batch correction

As previously described, Li et al. (2023) designed a DL-based algorithm based on GAN networks for this purpose (see Batch effect section). Their algorithm reduced the batch effect and improved the disease discrimination in 34 published studies. In addition, its combination with other classification algorithms, like RF, also improved the outcome of these models.

Deep residual networks

Deep Residual Networks (DRNs) (He et al., 2015) are deep feedforward neural networks (FFNNs) that incorporate shortcut connections transferring the input of one layer to a subsequent layer, typically 2 to 5 layers ahead. Instead of learning a full mapping, each block learns the residual between its input and the desired output, and the shortcut adds the original input back, which eases the optimization of very deep models. DRNs have exhibited effectiveness in recognizing patterns in architectures more than 150 layers deep (e.g., ResNet-152). DRNs could potentially have the same applications as CNNs.
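
As an illustration of the shortcut idea, the following sketch (PyTorch; layer widths and the tabular input are arbitrary choices for this example) implements a residual block in which the block input is added back to the learned residual.

# Minimal residual block for tabular microbiome features (PyTorch sketch).
# Layer sizes are illustrative; the skip connection adds the block input
# back to the learned residual.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Linear(dim, dim),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))   # identity shortcut + learned residual

model = nn.Sequential(
    nn.Linear(200, 128),                    # e.g., 200 taxa -> 128 features
    ResidualBlock(128),
    ResidualBlock(128),
    nn.Linear(128, 2),                      # e.g., healthy vs. disease logits
)
x = torch.rand(8, 200)                      # dummy batch of abundance profiles
print(model(x).shape)                       # torch.Size([8, 2])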

Microbiome interactions

DRNs have been employed in microbiome analysis to predict microbial community composition. For instance, Michel-Mata et al. (2022) developed the cNODE algorithm. This algorithm predicts taxon abundances in a community from the relative abundances of a few training samples, instead of requiring the complex time series of absolute abundances that previous models needed to fit population dynamics. Although it has some limitations, such as being unable to predict the abundance of taxa it has never seen and lacking a mechanistic interpretation, it could be a valuable instrument to infer how changes in microbial populations, such as introducing species through fecal transplantation or perturbations caused by antibiotic treatment, may affect community composition.

Attention networks and transformer

Attention Networks (AN) address information decay by storing prior network states and allocating attention between these states. Encoding layers preserve hidden states in memory cells for each iteration. Decoding layers are linked to the encoding layers and also receive context-filtered data from memory cells. This filter enriches the decoding layers with the contextual importance of certain features. The attention network that generates this context is trained via the error signal from the decoding layer's output. Visualizing the attention context provides insights into the relationship between input and output features. Transformer networks, a type of AN introduced by Vaswani et al. (2023), rely exclusively on self-attention mechanisms instead of traditional RNNs (Figure 8). This approach enables them to effectively handle long-range dependencies. The architecture features an encoder-decoder structure, where both encoder and decoder consist of multiple layers of self-attention and feedforward neural networks. Each encoder layer comprises a multi-head self-attention mechanism followed by a position-wise feedforward network, with residual connections and layer normalization applied at each sub-layer. Decoder layers include an additional attention mechanism that attends to the encoder's output, facilitating tasks like sequence-to-sequence translation.
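
A minimal sketch of the encoder side, built from PyTorch's standard Transformer modules and applied to an arbitrary sequence of embedded features (all sizes chosen only for illustration), shows the ingredients described above: multi-head self-attention, position-wise feedforward layers, residual connections, and layer normalization.

# Sketch of a Transformer encoder over a sequence of embedded feature
# vectors (PyTorch built-ins); shapes and sizes are illustrative only.
import torch
import torch.nn as nn

d_model, n_heads, n_layers, seq_len = 64, 4, 2, 100

encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=128, batch_first=True
)  # multi-head self-attention + position-wise feedforward, with residuals and layer norm
encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

tokens = torch.rand(8, seq_len, d_model)   # batch of 8 "sequences" of 100 embedded features
contextual = encoder(tokens)               # same shape, now context-aware
pooled = contextual.mean(dim=1)            # simple pooling for a downstream classifier
logits = nn.Linear(d_model, 2)(pooled)
print(logits.shape)                        # torch.Size([8, 2])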

Figure 8. Image depicting a typical Transformer network architecture. The encoder consists of multiple layers, each with a multi-head self-attention mechanism and a feedforward neural network. The decoder also has multiple layers, each incorporating an additional attention mechanism that attends to the encoder's output. Positional encoding is added to the input embeddings to retain the order of the sequence.

Functional annotation and metagenome-assembled genomes

Transformer models have been adapted for gene prediction and functional annotation in metagenomic datasets. Their self-attention mechanism captures complex dependencies between nucleotide sequences, enhancing gene identification accuracy compared to traditional methods. For example, MetaTransformer (Wichmann et al., 2023) employs these architectures to improve metagenomic sequence annotation, facilitating the discovery of novel genes and pathways.

Feature extraction and engineering

Transformers outperform other models in generating rich, context-aware embeddings that represent microbial abundances, functional profiles, and environmental metadata. These embeddings are valuable for downstream tasks like clustering, visualization, and integration with other omics data. In this regard, MetaTransformer (Wichmann et al., 2023) uses Transformer-based embeddings to integrate metagenomic and metabolomic data, enhancing the interpretability and predictive power of microbiome studies.

Classification/prediction

Transformers excel in classification tasks by effectively modeling relationships within high-dimensional microbiome data. They are used to classify microbial communities based on taxonomic profiles, predict disease associations, and forecast environmental impacts on microbiomes. For instance, the previously mentioned MetaTransformer (Wichmann et al., 2023) leverages Transformers to predict microbial community shifts in response to environmental stressors, achieving higher accuracy than conventional machine learning models.

Microbiome interactions

The best example of AN in this application is found in Melnyk et al. (2023), who sought to understand microbial community interactions using a combination of attention mechanisms and other architectures (see Multi-architecture designs section). Moreover, Transformer-based models have been used to analyze interactions between microbial species and their metabolites, identifying key interactions that drive community structure and function and thereby providing insights into microbial ecosystem stability and resilience. The Whole Genome Transformer by Li Z. et al. (2024) exemplifies this approach by modeling gene interaction effects in microbiome habitats.

Bidirectional encoder representations from transformers

BERT, introduced by Devlin et al. (2019), is a Transformer-based model designed for natural language understanding tasks. It employs bidirectional training of Transformer encoders, enabling the model to consider both left and right context in all layers. BERT is pre-trained on large text corpora using two unsupervised tasks: Masked Language Modeling (MLM), which predicts masked words within a sentence, and Next Sentence Prediction (NSP), which assesses relationships between sentence pairs. This pre-training allows BERT to generate rich contextual embeddings that can be fine-tuned for various downstream tasks with relatively small labeled datasets (Figure 9).
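
The sketch below illustrates the masked-language-modeling idea on DNA k-mers treated as tokens, using the Hugging Face transformers library with a deliberately tiny vocabulary and model configuration; it is a conceptual toy, not the setup of DNABERT or any other published model.

# BERT-style masked language modeling over k-mer "tokens" (toy example).
# Vocabulary and model sizes are illustrative placeholders.
import torch
from transformers import BertConfig, BertForMaskedLM

kmers = ["ATGCGT", "TGCGTA", "GCGTAC", "CGTACG"]          # toy 6-mers from one read
vocab = {"[PAD]": 0, "[MASK]": 1}
vocab.update({k: i + 2 for i, k in enumerate(sorted(set(kmers)))})

config = BertConfig(vocab_size=len(vocab), hidden_size=64,
                    num_hidden_layers=2, num_attention_heads=2,
                    intermediate_size=128)
model = BertForMaskedLM(config)

input_ids = torch.tensor([[vocab[k] for k in kmers]])
labels = torch.full_like(input_ids, -100)                  # -100 = ignored by the loss
labels[0, 2] = input_ids[0, 2].item()                      # supervise only the masked slot
input_ids[0, 2] = vocab["[MASK]"]                          # hide one k-mer

out = model(input_ids=input_ids, labels=labels)            # MLM loss on the masked position
print(out.loss.item(), out.logits.shape)                   # loss value, (1, 4, vocab_size)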

Figure 9. Image showing a classical BERT network architecture. BERT consists of multiple Transformer encoder layers with bidirectional self-attention mechanisms. Input tokens are embedded with positional encodings and passed through the encoder layers to produce contextualized embeddings.

Functional annotation and metagenome-assembled genomes

BERT-based models have been adapted for functional annotation by treating gene sequences similarly to natural language. Leveraging bidirectional context, these models can more accurately predict gene functions and interactions. For instance, DNABERT (Ji et al., 2021) applies BERT to nucleotide sequences to improve the identification and annotation of functional genes within metagenomic assemblies, outperforming traditional annotation tools in accuracy and speed. In addition to this, hierarchical or small BERT models (Zhang Y. et al., 2022; Abdelkareem et al., 2018; Gwak and Rho, 2022) have been utilized not only for this purpose but also for identifying antimicrobial peptides (Ma et al., 2022), as well as predicting gene or protein domains (Zhang Y. et al., 2022).

Feature extraction and engineering

BERT excels at generating contextual embeddings. For example, DNABERT (Ji et al., 2021) uses BERT-derived embeddings to integrate metagenomic, metabolomic, and environmental data, facilitating comprehensive feature engineering for microbiome studies.

Classification/prediction

BERT's ability to generate context-aware embeddings makes it highly suitable for classification tasks in microbiome research. It has been used to classify microbiome samples based on disease states, environmental conditions, or treatment responses. BioBERT (Lee et al., 2020) has been fine-tuned for microbiome samples, achieving superior performance in classifying conditions such as inflammatory bowel disease (IBD) and obesity compared to standard machine learning classifiers.

Microbiome interactions

BERT-based models use bidirectional attention to model interactions between different microbial species and their metabolites. This approach identifies key interaction networks that influence community structure and function, providing deeper insights into microbiome ecology. Whole Genome Transformer (Li Z. et al., 2024) exemplifies this by modeling gene interactions within microbial habitats.

Kohonen networks or self organizing maps

Kohonen Networks (KN), also known as Self-Organizing Maps (SOM) (Kohonen, 2013), leverage competitive learning for unsupervised data classification. Upon receiving an input, the network determines which neurons correlate most closely with it. These neurons are subsequently adjusted to better match the input, and neighboring neurons are influenced as well. The degree to which neighboring neurons are adjusted depends on their proximity to the best-matching units, integrating spatial information into the learning process. In microbiome analysis, SOM methods enable clustering and visualization of genes from individual species with much higher resolution than traditional methods like principal component analysis, providing insights into the molecular mechanisms underlying genome signatures (Iwasaki et al., 2013).
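
As a hedged illustration, the following sketch uses the MiniSom package on randomly generated abundance profiles (grid size, iteration count, and data are placeholders) to show how samples are mapped to best-matching units on a two-dimensional grid.

# Self-organizing map on relative-abundance profiles with MiniSom (sketch).
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(0)
profiles = rng.random((300, 50))                     # 300 samples x 50 taxa (stand-in data)
profiles = profiles / profiles.sum(axis=1, keepdims=True)

som = MiniSom(x=10, y=10, input_len=50, sigma=1.0, learning_rate=0.5, random_seed=0)
som.random_weights_init(profiles)
som.train_random(profiles, num_iteration=5000)       # competitive learning with neighborhoods

# Map each sample to its best-matching unit; nearby units hold similar profiles
bmus = np.array([som.winner(p) for p in profiles])
print(bmus[:5])                                      # grid coordinates of the first samples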

Graph neural networks

Graph Neural Networks (GNNs) (Scarselli et al., 2009) are designed to process data structured as graphs, effectively capturing complex relationships among data points (Figure 10). Unlike conventional networks that require data inputs to be organized in a grid-like manner (such as images or sequences), GNNs exploit the intrinsic properties of graphs, making them suitable for irregular and complex data structures. By using node and edge representations, GNNs propagate information across nodes, allowing each node to aggregate and process information from its neighboring nodes (Lamurias et al., 2022). Over iterations, nodes gradually develop high-level representations embodying local and global structural information, enhancing their capability to perform tasks like node classification, link prediction, or graph classification. The input to a GNN typically consists of the graph structure (e.g., an adjacency matrix) together with node (and sometimes edge) features.
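
The following minimal sketch (plain PyTorch on a random toy graph; all sizes are illustrative) shows one message-passing step in which each node aggregates its neighbors' features through a normalized adjacency matrix before a learned transformation.

# Minimal one-layer message-passing sketch (PyTorch) on a random toy graph.
import torch
import torch.nn as nn

n_nodes, in_dim, out_dim = 6, 8, 4
adj = (torch.rand(n_nodes, n_nodes) > 0.6).float()
adj = ((adj + adj.T) > 0).float()                    # make the toy graph undirected
adj_hat = adj + torch.eye(n_nodes)                   # add self-loops
deg_inv = torch.diag(1.0 / adj_hat.sum(dim=1))
norm_adj = deg_inv @ adj_hat                         # row-normalized propagation matrix

features = torch.rand(n_nodes, in_dim)               # e.g., contig or taxon features
linear = nn.Linear(in_dim, out_dim)

# One propagation step: aggregate neighbor features, then transform
hidden = torch.relu(norm_adj @ linear(features))
print(hidden.shape)                                  # torch.Size([6, 4])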

Figure 10. Graph Neural Networks architecture. Each node in the graph represents an entity, and edges represent relationships between entities. The architecture includes layers that aggregate and transform information from neighboring nodes, enabling the network to learn representations that capture the graph's structural and feature information.

Functional annotation and metagenome-assembled genomes

Lamurias et al. (2022) implemented GraphMB, a neural network that leverages the properties of GNNs in the metagenomic binning process. It uses the graph produced during assembly to generate embeddings that retain information about each node's neighbors.

Features extraction and engineering

Zhu et al. (2019) explored the possibility of using GNNs for feature selection. Using relative abundances, they generated correlation networks that were fed into feedforward networks. This strategy allowed them to identify the key taxa driving the microbial community.

Classification/prediction

Some groups have tried to develop classification algorithms that take advantage of the power of GNNs. This is the case of Jiang et al. (2023) and Khan and Kelly (2019), who developed graph networks for multiclass disease prediction capable of distinguishing between 19 different diseases better than a classical FFNN.

Multi-view analysis

MOSDNET is a multi-omics classification framework that effectively extracts shared and specific representations from different omics data (Li et al., 2024). This framework leverages Simplified Multi-view Deep Discriminant Representation Learning (S-MDDR) and Dynamic Edge Graphical Convolutional Network (DEGCN) to enhance the accuracy and efficiency of disease classification.

Natural language processing models

Human languages are sets of symbols combined according to certain rules that allow us to encode information. Natural language processing (NLP) is an area of AI and DL that comprises the techniques and mechanisms that enable computers to understand natural human language (as opposed to mathematical or other formal variants) and to decode the information it contains. The field has exploded in recent years thanks in part to the development of DL and its powerful inference capacity, and to the application of models such as the Transformer (Vaswani et al., 2023), based on attention mechanisms, which looks for connections among the different elements of the data (words, in this case) and for coherence within those connections. Microbiome sequences encode their information in combinations of four symbols, the four nucleotides, that support this information storage. NLP therefore emerged as a natural instrument to make sense of and interpret those sequences, extracting the information that allows the microbiome to generate enzymes, establish relationships, and survive in its environment.

Functional annotation and metagenome-assembled genomes

NLP models have recently been adapted to annotate metagenomes. Word-embedding techniques, which involve embedding k-mers alone (Mock et al., 2022; Arango-Argoty et al., 2021) or in combination with other deep learning architectures like CNN or LSTM (Shang and Sun, 2021; Miao et al., 2023; Liu et al., 2023), have also been used to detect viral genomes. Other approaches to metagenomic profiling and taxon identification include DeepMicrobes (Liang et al., 2020), which combines an LSTM architecture with self-attention, and models that use the word2vec method to combine k-mer embeddings with taxonomy, such as NLP-MeTaxa (Matougui et al., 2021), FastDNA (Menegaux and Vert, 2019), and Metagenome2Vec (Queyrel et al., 2020). However, they have not managed to beat other models described earlier (based on VAE), mostly because of the difference between short k-mers (3-4 bases) and words, and the high computational demands that increasing k-mer length involves. These issues paved the way for models tackling those weaknesses: META2 (Georgiou et al., 2019) and BRUME (Menegaux, 2020) group longer k-mers into the same encoding by proximity (thus reducing computational requirements), while others apply different NLP methods such as LDA (Latent Dirichlet Allocation) or LSA (Latent Semantic Analysis) (Tran et al., 2022).
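
To illustrate the k-mer-as-word idea in isolation, the following sketch tokenizes a few toy sequences into overlapping k-mers and learns embeddings with gensim's word2vec; the sequences, the value of k, and the hyperparameters are arbitrary, and the snippet does not reproduce any of the tools cited above.

# Treating k-mers as "words": learn k-mer embeddings with word2vec (sketch).
from gensim.models import Word2Vec

def kmerize(seq: str, k: int = 4):
    # Split a sequence into overlapping k-mers
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

reads = [
    "ATGCGTACGTTAGC",
    "GGCATGCGTACGAA",
    "TTAGCGGCATGCGT",
]
corpus = [kmerize(r) for r in reads]                 # each read becomes a "sentence" of k-mers

model = Word2Vec(sentences=corpus, vector_size=32, window=5,
                 min_count=1, sg=1, epochs=50, seed=0)

print(model.wv["ATGC"][:5])                          # embedding of one k-mer
print(model.wv.most_similar("ATGC", topn=3))         # k-mers appearing in similar contexts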

Features extraction and engineering

Some NLP models with word embeddings, like GloVe (Roy et al., 2024), have also been applied in the microbiome feature extraction process. Tataru et al. (2022) produced an R package (GMEmbeddings) in which, using this model and studies from the American Gut Project, they calculated a "translation" (embedding) matrix that can be applied to any other 16S study to generate a new embedded matrix for further analysis. In this way they aimed to reduce the batch effect that arises in predictions when samples from different studies are used at the same time, and to provide a tool that condenses information from several studies and can then be applied to favor reproducibility (by "homogenizing" embeddings) in the analysis and prediction of other datasets. Additionally, these methods are employed for embedding and interpreting categorical variables, representing them as continuous vectors, as in Shang et al. (2022) and Ma et al. (2022), where natural language processing is applied to microbiome data to identify bacteriophages and antimicrobial peptides, respectively.

Multi architecture designs

Several groups have aimed to exploit the advantages and strengths of different architectures, while diminishing their weaknesses, by combining different models in the data analysis to achieve better results in their microbiome projects. In this section you can find multi-architecture designs applied to some of the tasks described previously. Although the combined models may involve high computational demands, the results sometimes surpass those of simpler models, especially under complex experimental designs or heavy preprocessing of the datasets.

Functional annotation and metagenome-assembled genomes

Zhao et al. developed Read2Pheno (Zhao et al., 2021), a multi-architecture network tailored to provide feature information and disease prediction directly from reads. To do so, they combined CNN, RNN, and attention mechanisms. They were thus able to leverage the information of thousands of reads for a few samples, predicting directly from the reads: (I) the taxonomic levels present in the sample, (II) the microbiome phenotype (the body site of origin of that microbiome), and (III) the host phenotype (disease diagnosis).

Microbiome interactions

Melnyk et al. (2023) combined graph algorithms, applied to microbial communities, with the Transformer (Vaswani et al., 2023), a neural network based exclusively on attention mechanisms, to obtain lower-dimensional representations of bacterial communities. They also used Layer-wise Relevance Propagation (LRP) (Bach et al., 2015) to interpret the decisions made by the network. This helped them understand the evolution of changes over time by retaining the metastable properties of those communities, and to find patterns in the generated graph neural networks that could highlight community dynamics pointing toward a shift in metastability from a healthy to a diseased state.

Classification/prediction

The combination of different NN architectures has proved to be a valuable strategy to improve classification tasks in metagenomic analysis. Ditzler et al. (2015) evaluated various neural network architectures, including an MLPNN, an RNN, and a Boltzmann machine, on two different datasets for classifying bacteria according to pH and body location, comparing them with an RF classifier. Their findings indicated that the MLPNN was competitive against the RF, while the other deep methods could not match RF performance; however, a possible advantage of these deep learning approaches on larger datasets was suggested. Lo and Marculescu (2019) designed two models using FFNN and CNN, preceded by data augmentation that generated new samples from a known distribution (the negative binomial distribution) and with Dropout layers added to the architecture to prevent overfitting. This approach was designed to deal with the limited availability of large datasets.

Mulenga et al. (2021) proposed a combination of feature extension (combining different normalization methods) and data augmentation (using VAEs) prior to an FFNN architecture to improve classification. Oh and Zhang (2020) used autoencoders to distill the data into a lower-dimensional state, reducing potential noise while retaining the important features. This processed data was then used as input for various machine learning models, including SVM, RF, and deep learning models like MLPNN. Although the MLPNN did not always beat the other methods, the use of autoencoders proved a valuable addition for improving healthy/disease classification. Moreover, the same group of researchers made another attempt to increase NN performance with augmentation by linking the classification layer to a prior GAN system. This GAN augmentation system, based on the recognition of visual patterns in metagenomic data (Oh and Zhang, 2022), significantly improved the classification of previously unseen data. The synergy of this system with both NN and SVM algorithms demonstrated superior performance compared to similar classifiers.

Zeng et al. (2022) used shotgun metagenomic data to provide taxonomic and functional information to a neural network combining CNN and LSTM-RNN models. After reshaping the functional information into 2D arrays and reducing the dimensionality of the taxonomic information by clustering, they performed a joint prediction aimed at unveiling the "theater of activity" of the microbiome. They also provided feature analysis by feeding both raw information sources to an LSTM network combined with the SHAP algorithm (Lundberg and Lee, 2017) for explainability. In a similar way, Sharma and Xu (2021) used a CNN to extract features from the input data, combined with an LSTM that retains the important information the input contained. Notably, they introduced a time-series component in which new measurements from the same patients were added to the CNN, integrating this new input with the previous LSTM output. This combination achieved an efficient system by merging the feature extraction of the CNN with the LSTM's ability to retain relevant information across timepoints. Fung et al. (2023) also combined CNN and LSTM architectures together with self-knowledge distillation, where the network learns from itself by passing information from its deeper parts to its shallower sections, to perform disease prediction. This design proved to outperform other networks trained on longitudinal studies.
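
A minimal sketch of such a CNN plus LSTM combination for longitudinal profiles is given below (PyTorch; all dimensions are illustrative and the code is not a reimplementation of phyLoSTM or any other published model): a shared 1D convolutional block extracts per-timepoint features and an LSTM integrates them across visits.

# CNN + LSTM sketch for longitudinal microbiome profiles (PyTorch).
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    def __init__(self, n_taxa=200, cnn_out=32, hidden=64, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(cnn_out),                # fixed-size per-timepoint feature
        )
        self.lstm = nn.LSTM(8 * cnn_out, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                                 # x: (batch, timepoints, taxa)
        b, t, n = x.shape
        feats = self.cnn(x.reshape(b * t, 1, n))          # shared CNN over every timepoint
        feats = feats.reshape(b, t, -1)                   # back to (batch, time, features)
        out, _ = self.lstm(feats)                         # integrate information over time
        return self.head(out[:, -1])                      # logits from the last timepoint

model = CNNLSTMClassifier()
x = torch.rand(4, 6, 200)                                 # 4 subjects, 6 visits, 200 taxa
print(model(x).shape)                                     # torch.Size([4, 2])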

Multi-view analysis

Liu M. et al. (2022) used a DL approach by designing a complex algorithm, e-DeepBGC, that leverages NLP for the identification of BGCs. This model includes protein family domain (Pfam) information embeddings, CNN, BiLSTM, and data augmentation between training epochs through synonym replacement of Pfams and random shuffling to generate artificial genomes. This architecture outperformed the prediction accuracy of all state-of-the-art models, including its previous version, DeepBGC (Hannigan et al., 2019). In addition, DeepIDA-GRU is a pipeline that utilizes both statistical and deep learning techniques to integrate cross-sectional and longitudinal data from various sources (Jain and Safo, 2024). This pipeline includes several key components: variable selection and ranking using both linear and nonlinear methods, feature extraction through functional principal component analysis and Euler characteristics, and joint integration and classification. For cross-sectional data it employs dense feedforward networks, while recurrent neural networks are used for longitudinal data.

Other architectures and novel trends

The architectures listed below have limited (or no) examples of application in the microbial sciences.

Liquid state machines and echo state machines

Liquid State Machines (LSMs) (Maass et al., 2002) and Echo State Networks (ESNs) (Jaeger, 2001) are specialized types of recurrent neural networks. LSMs, as spiking neural networks, use threshold functions instead of sigmoid activations, with each neuron acting as an accumulating memory cell that triggers a spike when a threshold is reached. ESNs, in contrast, have random inter-neuronal connections and employ a unique training method where only the output layer connections are adjusted over time, while input data primes the network.

It is unlikely that those models will be widely used in microbiome analysis due to their specific architecture and limitations for handling microbiome data. These models rely on recurrent neural networks with fixed random connections, which may not effectively capture the complex dynamics and relationships present in microbiome datasets. Additionally, these architectures often require careful tuning of parameters and may not offer significant advantages over more conventional machine learning approaches for microbiome research.

Neural turing machines and differentiable neural computers

Neural Turing Machines (NTMs) (Graves et al., 2014) are an advanced form of LSTMs that separate memory from neurons, combining neural network expressiveness with digital storage efficiency. NTMs use a neural network to interact with a content-addressable memory, making them Turing complete by enabling read, write, and state alteration functions.

Differentiable Neural Computers (DNCs) (Graves et al., 2016) improve upon NTMs by using RNNs to manage scalable memory, inspired by the human hippocampus. DNCs incorporate attention mechanisms to query input similarity, temporal memory relationships, and update recency for memory management.

However, due to their complexity, high computational demands, and specialized nature, NTMs and DNCs are unlikely to be widely adopted in microbiome analysis.

Capsule networks

Capsule Networks (CapsNet) (Sabour et al., 2017) represent an alternative to pooling in neural networks, inspired by biological systems. Unlike traditional neural connections that utilize a single weight (scalar), CapsNet employs multiple weights (vector), enabling the transfer of comprehensive information, including the detected feature's attributes like location, color, and orientation within an image. The network's learning algorithm integrates a localized form of Hebbian learning that emphasizes the importance of accurate output predictions in subsequent layers. We are not aware of any biological applications of capsule networks, but their unique features (like the ability to model complex, hierarchical feature representations, and in particular preserving spatial relationships) could enable us to disentangle and tackle the complexities of human disease.

Kolmogorov-Arnold networks

Kolmogorov-Arnold Networks (Liu et al., 2024) are neural networks based on the Kolmogorov-Arnold superposition theorem. This theorem states that any continuous function of multiple variables can be represented as a combination of functions of one variable. In neural networks, this concept involves decomposing complex functions into simpler components, which are then combined to approximate the original function. Kolmogorov-Arnold networks use this approach to learn and represent complex mappings between input and output data, making them effective for handling diverse and high-dimensional data in deep learning applications.

Although this approach is relatively new, Kolmogorov-Arnold Networks are expected to become more popular in microbiome analysis because they effectively handle complex data with many variables. They achieve this by breaking down these complicated functions into sums and combinations of simpler, single-variable functions, a method based on the Kolmogorov-Arnold superposition theorem. By simplifying complex functions into one-variable components, these networks speed up computations and make the results easier to interpret.

CRISPR guide RNA

CRISPR technology offers promising tools for microbiome engineering through targeted genetic modifications (Ramachandran and Bikard, 2019; Bai et al., 2023), while deep learning methods assist in refining target selection and optimization for potential therapeutic use. Deep learning has significantly enhanced the design of guide RNAs (gRNAs) for CRISPR/Cas12a-based diagnostics by enabling precise prediction and optimization of gRNA efficiency and specificity (Lee, 2023). Traditional gRNA design often struggles with off-target effects and variability in cleavage efficiency, which can compromise diagnostic accuracy. Deep learning models, trained on large datasets of sequence-function relationships, can predict gRNA binding affinity, cleavage activity, and off-target risks with high precision. These models account for sequence context, secondary structure, and thermodynamic properties, enabling the design of highly effective gRNAs tailored to specific targets. In the context of diagnostics, optimized gRNAs improve the sensitivity and specificity of CRISPR/Cas12a systems for detecting nucleic acids, critical for applications such as pathogen detection, genetic disorder screening, and environmental monitoring (Huang et al., 2024; Chuai et al., 2018; Zhang et al., 2023). By leveraging deep learning, researchers can accelerate the development of robust and scalable diagnostic tools, addressing diverse biological and medical challenges.

Notable advancements include

DeepCRISPR (Chuai et al., 2018) is a comprehensive computational platform that unifies sgRNA on-target and off-target site prediction into one CNN-based framework. Liu et al. (2024) developed a CNN-based deep learning model called EasyDesign to facilitate rapid and highly efficient crRNA design for Cas12a-based detection. Zhang et al. (2023) developed three deep learning models (AIdit_ON, AIdit_OFF, and AIdit_DSB) for predicting the cleavage activities, editing specificities, and repair outcomes of SpCas9/gRNA.

Risks and considerations

In the dynamic field of microbiome research, deep learning faces several critical challenges, each impacting the reliability and applicability of research outcomes. Here we describe in depth the most important challenges associated with DL in microbiome research. A shorter summary table with microbiome-data examples of the most common risks and potential solutions can be found in Supplementary Table 4.

Model overfitting

Overfitting (Lever et al., 2016) is a significant challenge in deep learning, particularly prevalent in microbiome research where small datasets are common. This issue, where a model learns too much from the specifics and noise of its training data, compromises its ability to perform effectively on new, unseen data. Various deep learning applications in microbiome research, including GANs like DeepMicroGen (Choi et al., 2023), and GAN-GMHI (Li et al., 2023), are particularly susceptible to overfitting. When trained on limited datasets, these models tend to capture noise, resulting in less effective generalization. Moreover, this can lead to misleading conclusions about the relationships within microbiome data.

Regularization and careful network architecture design are crucial to address overfitting. Autoencoders, used for predicting shifts in microbiome communities (Reiman and Dai, 2019), also face overfitting risks, which can be mitigated through dropout techniques and sparse autoencoder implementation. Similarly, in the analysis of large-scale microbiome data, Batch-Learning Self-Organizing Maps (BLSOMs) can help mitigate overfitting by aligning map size with dataset complexity and incorporating regularization (Iwasaki et al., 2013).

Interpretability

Interpretability (Teng et al., 2022) of models in machine learning, particularly in healthcare, refers to the ability to understand and explain how and why a model makes its predictions. It involves deciphering the model's decision-making process, making it transparent and understandable to humans. This is crucial because it builds trust in the model's predictions, ensures compliance with the current state of knowledge and with regulatory standards, and aids clinical decision-making. DL is known for its "black box" nature, which can obscure insights and poses challenges for interpretability and reproducibility. For example, 16S rRNA sequencing data from fecal samples of T2D patients and healthy control subjects (Pfeil et al., 2023) served to identify the relative abundances of thousands of bacterial taxa. Preprocessing, including removal of low-quality reads and contaminants, normalization, and feature selection, enabled focusing on specific and relevant bacterial taxa known or hypothesized to be associated with diabetes. To avoid reproducibility issues at this stage, it is important to track every change in the data in order to determine the impact of different preprocessing methods, such as dimensionality reduction. Interpretability is much harder to achieve, since many methods, such as PCA, have low interpretability. This lack of transparency can be problematic when researchers need to understand which specific features (e.g., particular microbial taxa or genes) are driving the associations with disease states or treatment outcomes.

For our example, the most commonly chosen DL model is a CNN, which usually includes several convolutional and pooling layers to extract and learn the most relevant features from the microbiome data, followed by one or more densely connected layers for classification. Any changes made to the model, such as the number of layers or hyperparameter values, must therefore be trackable. The standard solution for tracking the evolution of the code is a version control system such as Git, which provides the required reproducibility. The same applies to training and validating the model to monitor its performance and avoid overfitting. However, this still leaves a lack of interpretability. The CNN acts as a black box, making it difficult to understand how specific features (taxa) affect the prediction, given the many levels of transformation and non-linearity. The convolutional and dense layers might capture complex interactions between different bacterial taxa, but these interactions are not readily interpretable or easily mapped back to biological insights. In addition, there are generalization concerns: without clear insights into what the model is "learning," there is a risk that the model will not generalize well to other datasets or populations and may capture artifacts or biases in the training data. Its utility in providing interpretable insights for scientific understanding or clinical decisions is therefore limited.

One possible solution to interpretability issues is to integrate explainable AI (XAI) techniques. These techniques help uncover the reasoning behind model predictions, making the models more transparent and their findings more actionable in scientific and clinical contexts. For example, methods such as Layer-wise Relevance Propagation (LRP) (Bach et al., 2015), SHAP (SHapley Additive exPlanations) (Lundberg and Lee, 2017), or LIME (Local Interpretable Model-agnostic Explanations) (Ribeiro et al., 2016) can provide approximation-based explanations for individual predictions, showing how each feature contributes to the output for a specific sample, and thereby a better understanding of how different microbial compositions influence disease prognosis. LRP works by propagating the prediction backward through the network layers, assigning a relevance score to each neuron and ultimately to each input feature; this highlights which features have the most significant impact on the model's output. SHAP is a flexible framework based on cooperative game theory that offers consistent and locally accurate explanations of feature importance for any deep learning model. It calculates Shapley values, which represent the average contribution of each feature across all possible combinations of features. LIME interprets the predictions of complex deep learning models by approximating them locally with simpler, human-readable models.
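
As an illustration of how such explanations can be obtained in practice, the sketch below applies the model-agnostic KernelExplainer from the shap package to a small scikit-learn classifier trained on random stand-in data; the model, data, and settings are placeholders rather than a recommended analysis.

# Model-agnostic SHAP explanations for a classifier on abundance profiles (sketch).
import numpy as np
import shap
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 30))                          # 200 samples x 30 taxa (stand-in data)
y = (X[:, 0] + X[:, 5] > 1.0).astype(int)          # toy label driven by two taxa

model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

background = shap.sample(X, 50)                    # reference set for the explainer
explainer = shap.KernelExplainer(model.predict_proba, background)
shap_values = explainer.shap_values(X[:5])         # per-feature contributions for 5 samples

# Per-taxon contributions; the exact array layout depends on the shap version
print(np.array(shap_values).shape)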

Nevertheless, these methods do not always yield reliable explanations for complex models. First, the approximations might not capture the true underlying relationships, especially in highly non-linear or interaction-heavy models. Moreover, there is a risk of overinterpreting their outputs, especially if the nuances of how these methods generate explanations are not fully understood. In addition, explanations can sometimes be unstable, with small changes in the input data leading to significantly different explanations. One strategy to address these issues is to combine different interpretability techniques with sanity checks, providing a more comprehensive understanding of model behavior.

Another approach to overcoming the black box is to develop more intuitive visualization tools that help interpret model outputs and make these methods more accessible to non-experts. Pfeil et al. (2023) used a radial heatmap to visualize classified microbiome sequencing data, achieving a discrimination accuracy of 96%. Different visualizations at the genus level were used for training and classification to check robustness and generalization potential; cross-validation and the comparison between the validation and test sets revealed no particularly advantageous visualization. This method contributes significantly to interpretability and could potentially be used to predict other diseases. Finally, the multimodal variational information bottleneck (MVIB) of Grazioli et al. (2022) integrates multiple heterogeneous data modalities into a unified disease prediction framework, providing a more comprehensive understanding of the microbiome's role in various disease states. Its ability to classify diseases effectively, demonstrated through its application to diverse disease cohorts, is complemented by its interpretability: MVIB employs a saliency technique that identifies the most relevant microbial species and strain-level markers for its predictions. This interpretability is invaluable, offering insights into the specific microbial factors associated with diseases and guiding more targeted therapeutic strategies (Lundberg and Lee, 2017).

Data leakage and information leakage

Data leakage in machine learning happens when a model accidentally gains access to information that it should not have, or when sensitive information can be extracted from the model. The extent of the potential damage depends on the type of leakage; two main types can be distinguished.

1. Data leakage, where a model accidentally gains access to information from the validation or test sets (Chollet, 2021). This can occur if, for example, the entire dataset is preprocessed before being split into the different sets, causing information from the test set to influence the training data. Another common source is tuning model hyperparameters based on test-set performance, which means the model is indirectly learning from the test data. In such a case the test set no longer serves as an independent evaluation of the model's performance, resulting in biased and misleading performance metrics (Chen et al., 2020). This leakage is a significant problem because it leads to overly optimistic performance estimates; the model appears to perform better than it genuinely does because it has effectively "seen" the answers in advance. As a result, the model may not generalize well to new, unseen data, defeating the purpose of building a predictive model: it might perform exceptionally on the validation or test sets but fail in real-world applications where it encounters truly new data. To deal with this problem, it is crucial to carefully separate the dataset into three distinct sets: training, validation, and test. The training set is used to fit the model, the validation set is used to tune hyperparameters and make decisions about the model architecture, and the test set is reserved strictly for the final evaluation after all tuning is complete. By ensuring that the model does not have access to the validation or test data during training, information cannot leak and the assessment of the model's true performance is more accurate (see the sketch after this list).

2. Information leakage, where sensitive information about the original data subjects can be extracted from the model. This can happen, particularly through gradient inversion in deep learning (Hatamizadeh et al., 2023), and represents a significant risk in fields such as microbiome research. It typically occurs when samples are not properly randomized or when certain variables that correlate with the outcome are included in both datasets. This issue, involving the unintended exposure of sensitive medical data, is a major concern in the analysis of complex datasets. Advanced deep learning models such as Generative Adversarial Networks (GANs), autoencoders, and Transformers are susceptible to this risk: while effective at processing intricate data, they can inadvertently reveal sensitive information, especially if the learning gradients are exposed. The implications of such leakage in medical applications are substantial. For instance, in studies employing models like TaxoNN for disease prediction (Sharma et al., 2020) or deep representation learning techniques (Melnyk et al., 2023), the unintended exposure of patient-specific microbiome data could result in privacy violations, breaching confidentiality and raising legal and ethical issues. The highly personalized nature of microbiome data amplifies the need for stringent preventive measures. Several strategies are being implemented to mitigate this risk. Differential privacy in deep learning models ensures that outputs do not disclose sensitive individual information, which is crucial in models that might learn identifiable patterns. Secure multi-party computation facilitates collaborative deep learning without exposing individual data points, relevant in collaborative projects such as multi-layer and recursive neural networks for metagenomic classification. Homomorphic encryption1 (Munjal and Bhatia, 2023) allows computations to be performed on encrypted data without decrypting it, with the results remaining encrypted, protecting sensitive information in deep learning applications; this is a vital approach in studies identifying antimicrobial peptides or bacteriophages. Moreover, establishing robust data sharing and processing protocols, including data anonymization and secure handling practices, is essential in large-scale studies for disease prediction or microbe-disease associations.
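
For the first type of leakage, the following sketch (scikit-learn, with random stand-in data) illustrates the splitting discipline described above: the data are split before any preprocessing, scaler statistics and hyperparameter choices come only from the training and validation sets, and the test set is used exactly once.

# Leakage-aware workflow: split first, fit preprocessing on training data only (sketch).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((300, 50))                      # stand-in abundance table
y = rng.integers(0, 2, 300)

# Split before any preprocessing: 60% train, 20% validation, 20% test
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

best_C, best_score = None, -np.inf
for C in [0.01, 0.1, 1.0, 10.0]:               # hyperparameter tuning uses only train + validation
    pipe = Pipeline([("scale", StandardScaler()),
                     ("clf", LogisticRegression(C=C, max_iter=1000))])
    pipe.fit(X_train, y_train)                 # scaler statistics come from the training set only
    score = pipe.score(X_val, y_val)
    if score > best_score:
        best_C, best_score = C, score

final = Pipeline([("scale", StandardScaler()),
                  ("clf", LogisticRegression(C=best_C, max_iter=1000))])
final.fit(X_train, y_train)
print("test accuracy:", final.score(X_test, y_test))   # reported once, after all tuning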

Data imbalance

Data imbalance (Fang, 2023), where certain classes or conditions are underrepresented, can bias predictive models. This is evident in disease prediction studies such as Sharma et al. (2020), where models may favor the majority class. Moreover, generative models like MB-GAN (Rong et al., 2021), used for microbiome simulation or data imputation, also struggle with data imbalance. They may produce less diverse or skewed synthetic data, adversely affecting analyses and interpretations, especially in disease prediction and diagnosis. Additionally, data imbalance poses a challenge in capturing rare but significant microbiome events or features, potentially overlooking critical biological insights.

To tackle these challenges, the approaches demonstrated by DeepMicro (Oh and Zhang, 2020) and phyLoSTM (Sharma and Xu, 2021) provide effective strategies. DeepMicro, with its deep representation learning framework, addresses the high dimensionality and sparsity of microbiome data, a direct outcome of data imbalance. phyLoSTM's novel combination of CNNs and LSTMs, on the other hand, offers an advanced method to analyze longitudinal microbiome sequencing data: the model effectively manages variable time points across subjects and balances class weights for imbalanced cases.
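
A simple and widely used precaution, shown in the hedged sketch below (scikit-learn for the weights, PyTorch for the loss; the label vector is a toy example), is to give minority classes a larger weight in the training loss.

# Counteracting class imbalance with class weights (sketch).
import numpy as np
import torch
import torch.nn as nn
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0] * 90 + [1] * 10)                        # 90 controls, 10 cases
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y), y=y)
print(weights)                                           # approx. [0.56, 5.0]

# The weights can then be passed to a weighted loss for a neural classifier
criterion = nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))
logits = torch.randn(4, 2)                               # dummy model outputs
targets = torch.tensor([0, 1, 0, 1])
print(criterion(logits, targets).item())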

Other data biases

A prevalent issue in deep learning for microbiome research is data bias. This occurs when training data doesn't accurately reflect real-world scenarios, leading to skewed results. Biases in microbiome data that impact deep learning models arise from various stages of experimental and analytical workflows (Nearing et al., 2021). For example, sample collection methods can introduce biases based on how, when, and where samples are collected, leading to inconsistent microbial representation. DNA extraction protocols further contribute to bias since different microbes have varying cell wall strengths, resulting in unequal extraction efficiencies. Amplification biases during PCR can skew the observed abundance of certain microbes, as some DNA sequences amplify more efficiently than others. Sequencing platforms also introduce biases due to differences in error rates and read lengths. Additionally, bioinformatic processing, such as sequence filtering and taxonomic classification, can further distort the true microbial composition. In addition, models like phyLoSTM (Sharma and Xu, 2021) or DL-TODA (Cres et al., 2023), used for disease prediction, may perform inaccurately for underrepresented groups if trained on data from a specific population.

Consequently, even the most sophisticated models cannot produce reliable outcomes if trained on biased or poor-quality data, as this can lead to overfitting, reduced generalizability, and misleading predictions. Improving the reliability of deep learning applications in microbiome research requires diverse, representative training data and the application of fairness-aware machine learning techniques. Regular model auditing and interdisciplinary collaboration are also essential for effectively mitigating these biases.

Model drift

Model drift is a critical challenge in deep learning applications, where the performance of machine learning (and deep learning, in particular) models degrades over time due to changes in the underlying data or environment. This phenomenon is particularly prevalent in microbiome research, as the characteristics of microbial communities are subject to change due to environmental shifts, dietary changes, and other factors. Models like DeepMicroGen (Choi et al., 2023), used for microbiome simulation or data imputation, and disease prediction models like GAN-GMHI (Li et al., 2023) or DeepMicro (Oh and Zhang, 2020), are susceptible to accuracy loss as microbial landscapes and human-microbiome interactions evolve.

Sample size

An article by Rajput et al. (2023) suggests that when using machine learning in microbiome research, an appropriate sample size is essential to obtain reliable results. The paper proposes two criteria. First, the sample should be large enough for the effect of its analysis to be significant [an average, or grand, effect size of at least 0.5 according to Cohen's scale,2 a measure of effect size (Cohen, 1988)]. Second, the accuracy of machine learning models on this sample should be at least 80%, and additional data beyond this sample size should not significantly increase accuracy. In short, the idea is to find a "golden point" in sample size where additional data no longer significantly improves accuracy, yet the sample is large enough for the results to be reliable.

Model benchmarking

Benchmarking is crucial for evaluating the performance of any computational method prior to its release. This is equally true for microbiome DL analysis methods, which require well-designed benchmarks that accurately reflect the diverse conditions in microbiome studies. Depending on the aim, benchmarking can assess various evaluation metrics, such as model performance, runtime, or memory usage. This can be done in an absolute setting (only for a new method) or in a relative setting, when comparing the method with other approaches (Bokulich et al., 2020). Several aspects need to be considered when creating a benchmark, the most important being the selection of a representative test set, parameter tuning, and the selection of appropriate metrics.

Microbiome data present unique challenges for benchmarking deep learning models due to their compositional nature, correlation between taxa, high dimensionality, and sparsity. Test data should typically allow measurement of method accuracy, which means we need a "ground truth" of some type, e.g., samples with known composition. Several types of data can be used for this purpose: mock data, biological data, or simulated data (Bokulich et al., 2020). Mock data consist of mixtures of microbial cells combined at known ratios with known taxonomic identities (Dale et al., 2018; Highlander, 2013). Because their composition is known, mock communities are frequently used in microbiome benchmarking; however, since they require running experiments, they are expensive to generate and often of limited availability. Biological data also come from experiments, but they are typically not measured for the purpose of testing methods. Analyzing such data then requires accounting for all the challenges of preprocessing measurements obtained under real operating conditions. There are many repositories of such datasets, such as NCBI-SRA, the European Nucleotide Archive, or Qiita (Gonzalez et al., 2018), but the main issue with using such data is that we have no objective truth to compare against. Finally, simulated data are cheap to generate compared with mock or biological data. However, generating realistic datasets is challenging, as methods need to take into account the characteristics of microbiome data such as correlation between taxa, sparsity, overdispersion, and compositionality (He et al., 2024). Ideally, various datasets should be analyzed for benchmarking purposes, as different sample types (for example, gut vs. soil) can be characterized by different microbial diversity. By using many different datasets, one can avoid overfitting the method to a particular type of sample.

Training DL models usually involves searching for the best hyperparameters, and the chosen values can significantly impact model performance. In contrast to classical machine learning, hyperparameter tuning in deep learning is often more critical due to the complexity and depth of neural networks. Deep learning models have many hyperparameters, such as the learning rate, batch size, number of layers, and types of activation functions (Li et al., 2022). Proper tuning can significantly affect performance and convergence and is essential for achieving high accuracy and generalization. However, when performing parameter tuning it is crucial to avoid introducing bias and to apply the tuning procedure to all methods included in the comparison with a new method (Weber et al., 2019). Researchers should not assume that competing methods can be applied with "out-of-the-box" parameter settings, because applying any ML or DL model to a practical problem requires tuning its hyperparameters to the specific dataset. This can pose a significant challenge: given the large number of parameters and the non-linear nature of deep learning models, finding the optimal set of hyperparameters can be extremely computationally expensive (Yang and Shami, 2020).
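
As one possible way to organize such a search, the sketch below runs a small randomized search over typical neural-network hyperparameters with scikit-learn; the parameter grid, the model, and the random data are illustrative only, and in a benchmark the same tuning budget should be given to every method compared.

# Randomized hyperparameter search on a small neural classifier (sketch).
import numpy as np
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((300, 50))                      # stand-in abundance table
y = rng.integers(0, 2, 300)

param_distributions = {
    "hidden_layer_sizes": [(32,), (64,), (64, 32)],
    "learning_rate_init": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32, 64],
    "activation": ["relu", "tanh"],
}
search = RandomizedSearchCV(
    MLPClassifier(max_iter=300, random_state=0),
    param_distributions=param_distributions,
    n_iter=10, cv=3, scoring="roc_auc", random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))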

The selection of the proper metric depends on the task performed, which is typically classification or regression. Microbiome datasets are usually multi-class datasets with highly imbalanced microbial communities, and any applied metric should account for such data characteristics and be suitable for the specific problem. Usually, more than one performance metric should be analyzed in order to avoid the implicit bias of so-called selective reporting by researchers testing their own method and thereby providing over-optimistic results (Norel et al., 2011). As many works have discussed guidelines for classification and regression metrics (Marcos-Zambrano et al., 2021; Hoffmann et al., 2019; Fischer et al., 2024; Liu et al., 2018), we do not cover this topic in depth here, but a summary of the most commonly used evaluation metrics can be found in Supplementary Table 3. In addition to standard machine learning metrics, for microbiome data the observed alpha diversity (diversity within a community) or beta diversity (diversity between communities) can be compared against the expected diversity (Sinha et al., 2017; Wirbel et al., 2024). In a similar way, alpha and beta diversity values can be compared with the expected diversity measurements of simulated or mock communities (Willis and Martin, 2022).
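
The sketch below illustrates this kind of check on a toy mock community: it computes the Shannon index (alpha diversity) of the expected and observed profiles and their Bray-Curtis dissimilarity (a beta-diversity measure); the profiles are invented for illustration.

# Comparing observed and expected community diversity for a mock community (sketch).
import numpy as np
from scipy.spatial.distance import braycurtis

def shannon(p):
    # Shannon index (alpha diversity) of one relative-abundance profile
    p = np.asarray(p, dtype=float)
    p = p[p > 0] / p.sum()
    return -(p * np.log(p)).sum()

expected = np.array([0.25, 0.25, 0.25, 0.25, 0.0])   # known mock composition
observed = np.array([0.30, 0.22, 0.27, 0.19, 0.02])  # profile recovered by a method

print("alpha (expected):", round(shannon(expected), 3))
print("alpha (observed):", round(shannon(observed), 3))
print("beta (Bray-Curtis to expected):", round(braycurtis(expected, observed), 3))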

Considering the complexity of benchmarking deep learning models for microbiome studies, we have included a comparative table (Supplementary Table 5) summarizing the strengths and weaknesses of the models discussed in this review. This table is designed to assist end-users in making informed decisions when selecting models for their specific applications.

Conclusions

Deep learning has the potential to revolutionize microbiome research by providing powerful tools to manage the complexity and high dimensionality of microbiome datasets. This review presents a comprehensive overview of deep learning applications in microbiome research, highlighting the capability of these methods to enhance our understanding of microbial communities and their intricate interactions. Various deep learning architectures, including convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks, have demonstrated promising results across several microbiome-related tasks. These tasks include taxonomic profiling, functional annotation, data augmentation, and disease prediction.

Despite the clear benefits of DL in managing high-dimensional, sparse, and complex microbiome data, several challenges remain. Issues such as overfitting, data leakage, interpretability, and data imbalance continue to hinder the robustness and generalizability of these models. Addressing these challenges will require the development of more sophisticated architectures, the application of regularization techniques, and the integration of explainable AI methods to enhance transparency and trust in model outcomes.

As the field of DL progresses, the importance of rigorous benchmarking for evaluating DL models becomes increasingly clear. Benchmarking is essential to ensure the reliability, reproducibility, and robustness of DL-based microbiome methods. Selecting evaluation metrics that align with the specific task, whether it be classification or regression, is critical to avoid biased reporting and to facilitate meaningful comparisons across different methods. Future advancements in microbiome DL research will hinge on addressing these benchmarking challenges. Developing standardized, community-driven benchmarks that take into account the unique characteristics of microbiome data will be crucial. Additionally, implementing transparent hyperparameter optimization practices and conducting unbiased comparative evaluations are essential for building trust in the results produced by new methods.

The manuscript primarily focuses on applications of deep learning for the analysis of amplicon and shotgun metagenomic data sets due to the wealth of research and advancements in this area. However, we also recognize the importance of exploring DL applications in other fields of microbiome research, such as metatranscriptomics, metabolomics, and proteomics, as well as studies focused on microbial interactions and dynamics beyond taxonomic and metabolic profiling.

As microbiome research generates increasingly large and intricate datasets, DL approaches are well-positioned to drive future innovations. Integrating multi-omics data, improving model interpretability, and developing novel architectures tailored to microbiome-specific tasks will be critical in unlocking deeper insights into microbial ecosystems. By overcoming current limitations, DL has the potential to revolutionize microbiome studies across medicine, agriculture, and environmental science, ultimately leading to new diagnostic, therapeutic, and ecological applications.

Author contributions

PP: Conceptualization, Supervision, Writing – original draft, Writing – review & editing. KR: Writing – original draft, Writing – review & editing. AM-S: Writing – original draft, Writing – review & editing. JT: Writing – original draft, Writing – review & editing. EC: Writing – review & editing, Writing – original draft. MK: Writing – review & editing, Visualization. IN: Visualization, Writing – review & editing. AG: Writing – review & editing. AS: Writing – review & editing. MF: Writing – review & editing, Funding acquisition. AN: Funding acquisition, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. Adrián Martín-Segura was funded by the European Union (MSCA fellowship ref.: 101105645). Alexia Sampri was supported by core funding from the British Heart Foundation (RG/F/23/110103), NIHR Cambridge Biomedical Research Centre (NIHR203312) [*], BHF Chair Award (CH/12/2/29428), BHF Cambridge Centre for Research Excellence (RE/24/130011), and by Health Data Research UK, which is funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation and the Wellcome Trust.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author(s) declared that they were an editorial board member of Frontiers at the time of submission. This had no impact on the peer review process or the final decision.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2024.1516667/full#supplementary-material

Abbreviations

DL, Deep Learning; FFNN, Feedforward Neural Network; ML, Machine Learning; NLP, Natural Language Processing; NN, Neural Network; MLP/MLPNN, Multilayer Perceptron Neural Network; GAN, Generative Adversarial Network; CGAN, Conditional GAN; SVM, Support Vector Machines; BM, Boltzmann Machine; GNN, Graph Neural Network; RBM, Restricted Boltzmann Machine; SOM, Self-Organizing Map; DRBM, Deep Restricted Boltzmann Machine; RF, Random Forest; CNN, Convolutional Neural Network; HN, Hopfield Network; LSTM, Long Short-Term Memory; DRN, Deep Residual Network; BiLSTM, Bidirectional Long Short-Term Memory; RNN, Recurrent Neural Network; VAE, Variational Autoencoder.

Footnotes

1. ^Homomorphic encryption is a form of cryptography that enables computations to be performed directly on encrypted data, eliminating the need for decryption.

2. ^Cohen's d statistic is frequently used in estimating sample sizes for statistical testing.
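
For reference, the standard two-group definition of Cohen's d (a textbook formula, not specific to this article) is the difference in group means divided by the pooled standard deviation:

```latex
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```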

Keywords: microbiome, deep learning, classification, natural language processing, clustering

Citation: Przymus P, Rykaczewski K, Martín-Segura A, Truu J, Carrillo De Santa Pau E, Kolev M, Naskinova I, Gruca A, Sampri A, Frohme M and Nechyporenko A (2025) Deep learning in microbiome analysis: a comprehensive review of neural network models. Front. Microbiol. 15:1516667. doi: 10.3389/fmicb.2024.1516667

Received: 24 October 2024; Accepted: 16 December 2024;
Published: 22 January 2025.

Edited by:

Feng Gao, Tianjin University, China

Reviewed by:

Ahmed Moustafa, American University in Cairo, Egypt
Jan Zrimec, National Institute of Biology (NIB), Slovenia
Yong-Xin Liu, Chinese Academy of Agricultural Sciences, China

Copyright © 2025 Przymus, Rykaczewski, Martín-Segura, Truu, Carrillo De Santa Pau, Kolev, Naskinova, Gruca, Sampri, Frohme and Nechyporenko. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Adrián Martín-Segura, adrian.martin@alimentacion.imdea.org

These authors have contributed equally to this work
