- 1Center for Bioinformatics and Quantitative Biology, Richard and Loan Hill Department of Bioengineering, University of Illinois at Chicago, Chicago, IL, United States
- 2Constellation, Baltimore, MD, United States
- 3Merck & Co., Inc., Kenilworth, NJ, United States
Feed-forward loops (FFLs) are among the most ubiquitously found motifs of reaction networks in nature. However, little is known about their stochastic behavior and the variety of network phenotypes they can exhibit. In this study, we provide full characterizations of the properties of stochastic multimodality of FFLs, and how switching between different network phenotypes are controlled. We have computed the exact steady-state probability landscapes of all eight types of coherent and incoherent FFLs using the finite-butter Accurate Chemical Master Equation (ACME) algorithm, and quantified the exact topological features of their high-dimensional probability landscapes using persistent homology. Through analysis of the degree of multimodality for each of a set of 10,812 probability landscapes, where each landscape resides over 105–106 microstates, we have constructed comprehensive phase diagrams of all relevant behavior of FFL multimodality over broad ranges of input and regulation intensities, as well as different regimes of promoter binding dynamics. In addition, we have quantified the topological sensitivity of the multimodality of the landscapes to regulation intensities. Our results show that with slow binding and unbinding dynamics of transcription factor to promoter, FFLs exhibit strong stochastic behavior that is very different from what would be inferred from deterministic models. In addition, input intensity play major roles in the phenotypes of FFLs: At weak input intensity, FFL exhibit monomodality, but strong input intensity may result in up to 6 stable phenotypes. Furthermore, we found that gene duplication can enlarge stable regions of specific multimodalities and enrich the phenotypic diversity of FFL networks, providing means for cells toward better adaptation to changing environment. Our results are directly applicable to analysis of behavior of FFLs in biological processes such as stem cell differentiation and for design of synthetic networks when certain phenotypic behavior is desired.
1. Introduction
Cells with the same genetic make-ups can exhibit a variety of different behavior. They can also switch between these different phenotypes stochastically. This phenomenon has been observed in bacteria, yeast, and mammals such as neural cells (Acar et al., 2005; Choi et al., 2008; Guo and Li, 2009; Gupta et al., 2011). The ability to exhibit multiple phenotypes and switching between them is the foundation of cellular fate decision (Schultz et al., 2007; Cao et al., 2010; Ye et al., 2019), stem cell differentiation (Feng and Wang, 2012; Papatsenko et al., 2015; Zhang et al., 2019), and tumor formation (Huang et al., 2009; Shiraishi et al., 2010).
Cells exhibiting different phenotypes have different patterns of gene expression. Single-cell studies demonstrated that isogenic cells can exhibit different modes of gene expression (Shalek et al., 2013), indicating that distinct phenotypes are encoded in the wiring of the genetic regulatory networks (Liang and Qian, 2010). This phenomenon of epigenetic control of bimodality in gene expression by network architecture is well-known and has been extensively studied in earlier works of phage-lambda (Arkin et al., 1998; Ptashne, 2004; Zhu et al., 2004a,b; Cao et al., 2010).
Understanding multimodality in gene regulatory networks and its control mechanism can provide valuable insight into how different cellular phenotypes arises and how cellular programming and reprogramming proceed (Lu et al., 2007). Much of current knowledge of multimodality is derived from analysis of networks with feedback loops or cooperative interactions (Siegal-Gaskins et al., 2009). However, recent studies suggest that multimodality and phenotype switching can also arise from slow promoter binding, which may result in distinct protein expression levels of long durations (Feng and Wang, 2012; Thomas et al., 2014; Chen et al., 2015; Duncan et al., 2015; Terebus et al., 2019). Nevertheless, the nature and extent of this type of bimodality is not well-understood.
In this work, we study the network modules of feed-forward loops (FFLs) and characterize the stochastic nature of their multimodalities. FFLs are one of the most prevalent three-node network motifs in nature (Alon, 2006) and play important regulatory roles (Lee et al., 2002; Shen-Orr et al., 2002; Boyer et al., 2005; Mangan et al., 2006; Tsang et al., 2007; Ma et al., 2009; Sorrells and Johnson, 2015). They appear in stem cell pluripotency networks (Boyer et al., 2005; Papatsenko et al., 2015; Sorrells and Johnson, 2015), microRNA regulation networks (Tsang et al., 2007; Re et al., 2009; Ivey and Srivastava, 2010), and cancer networks (Re et al., 2009). The behavior of FFLs has been studied extensively using deterministic ODE models. These studies revealed important functions of FFLs in signal processing, including sign-sensitive acceleration and delay pulse generation functions, and increased cooperativity (Mangan and Alon, 2003; Ma et al., 2009). FFLs are also found to be capable of maintaining robust adaptation (François and Siggia, 2008; Ma et al., 2009) and detecting “fold-changes” (Goentoro et al., 2009).
However, analysis based on ODEs is limited in its ability to characterize probabilistic events, as they do not capture bimodality in gene expression that arises from slow promoter binding (Vellela and Qian, 2009). The stochastic behavior of FFLs is not well-characterized: Basic properties such as the number of different phenotypes FFLs are capable of exhibiting, the conditions required for their emergency, their relative prominence, and the sensitivity of different phenotypes to perturbations are not known.
Our stochastic FFL models are based on processes of Stochastic Chemical Kinetics (SCK), which provides a general framework for understanding the stochastic behavior of reaction networks. Quantitative SCK modeling can uncover different network phenotypes, the conditions for their occurrence, and the nature of the prominence of the stability peaks. However, analysis of stochastic networks is challenging. First, models based on stochastic differential equations such as Fokker–Planck and Lagenvin models may be inadequate due to their Gaussian approximations. This is further compounded by the limited number of simulation trajectories that can be generated. These difficulties are reflected in the reported failure of a Fokker–Planck model in accounting for multimodality in the simple network model of single self-regulating gene at certain reaction rates (Duncan et al., 2015). Second, the widely used Stochastic Simulation Algorithm (Gillespie simulations) can generate SCK trajectories (Gillespie, 1977), but are challenged in capturing rare events and in computing efficiency. There are also difficulties in assessing convergency and in estimating computational errors (Cao and Liang, 2013). Third, even if the probabilistic landscape can be accurately reconstructed with acceptable accuracy, detecting topological features such as peaks in high-dimensional probability landscapes and assessing their objectively prominence at large-scale remains an unsolved problem.
To characterize the stochastic behavior of FFLs using models based on SCK processes, our approach is to solve the underlying discrete Chemical Master Equation (dCME) using the ACME (Accurate Chemical Master Equation) algorithm (Cao et al., 2016a,b), and to obtain the exact probability landscapes of all 8 varieties of FFLs.
Aided by the computational efficiency of ACME, we are able to explore the behavior FFLs under broad conditions of synthesis, degradation, binding, and unbinding rates of transcription factors genes binding. Furthermore, we analyze the topological features of the exactly constructed high-dimensional probability landscapes using persistent homology, so the number of probability peaks and the prominence measured by their persistence are quantified objectively. These techniques allow us to examine details of the number of possible phenotypic states at different conditions, as well as the ranges of conditions enabling phenotypic switching. With broad exploration of the model parameter space, we are able to construct detailed phase diagrams of multimodalities under different conditions.
Our results reveal how FFL network behaves differently under varying strengths of regulations intensities and the input. In addition, we characterize quantitatively the effects of duplication of genes in the FFL network modules. We show gene duplication can significantly affect the diversity of multimodality, and can enlarge monomodal regions so FFLs can have robust phenotypes. The results we obtained can be useful for analysis of phenotypic switching in biological networks containing the FFL modules. They can also be used for construction of synthetic networks with the goal of generating certain desired phenotypic behavior.
2. Models and Methods
2.1. Architecture and Types of Feed-Forward Loop Network Modules
2.1.1. Overview
FFLs consists of three nodes representing three genes, each expresses a different protein product (Figure 1A). An FFL regulates the network output from the left input node toward the right output node via two paths; the direct path from the left node to the right node, and the indirect path from the left to the right node via an intermediate buffer node. As each of the three regulations can be either up- or downregulation, there are altogether 23 = 8 types of FFL.
Figure 1. Representation and the types of feed-forward loop (FFL) network: (A) General wiring and corresponding 3-node schematic representation of an FFL module containing three genes a, b, and c expressing three proteins A, B, and C. Protein A regulates the expressions of genes b and c through binding to their promoters. Protein B regulates the expression of gene c through promoter binding. (B) The FFL modules can be classified into eight different types. Coherent/incoherent FFLs are on the left/right, respectively.
2.1.2. Network Architecture
Specifically, we denote the three genes of an FFL module as a, b, and c, which expresses protein products A, B, and C at constant synthesis rate of sA, sB, and sC, respectively (Figure 1A). Proteins A, B, and C are degraded at rate dA, dB, and dC, respectively. Both proteins A and B function as transcription factors and can bind competitively to the promoter of gene c and regulates its expression. As the promoter of gene c can bind to either protein A or B, but not both, this type of regulation is known as the “OR” gate. In addition, protein A can bind to the promoter of gene b and regulate its expression. Specifically, protein A can bind to the promoter of gene c at rate to form complex cA, which dissociates at rate . cA expresses protein C at a rate k3-fold over the basal rate of sC. Similarly, protein B can bind to the promoter of gene c at rate to form complex cB, which dissociates at rate . cB expresses protein C at a rate k2-fold over the basal rate of sC. Furthermore, protein A binds to the promoter of gene b at rate to form gene–protein complex bA, which dissociate at rate . Upon binding protein A, bA expresses protein B at a rate k1-fold over the basal rate of sB.
The biochemical reactions of our FFL model are summarized below:
Here, we set , , , and . All reaction rate constants are of the unit s−1, while coefficients k1, k2, and k3 are ratio of reaction rates and therefore unitless. The ratios k1, k2, and k3 can take different values so the network represents different types of FFLs.
2.1.3. Types of FFL Modules
Depending on the nature of the regulations, namely, whether each of regulation intensities k1, k2, and k3 is ≥1 (activating) or <1 (inhibiting), there are 23 = 8 types of FFLs. These FFLs are classified into two classes, the coherent FFLs and the incoherent FFLs (Figure 1B) (Alon, 2006). A FFL is termed coherent (C1, C2, C3, C4 in Figure 1B), if the direct effect of protein A on the gene c has the same sign (positive or negative) as its net indirect effect through protein B. Taking the FFL model C1 (Figure 1B) as an example, protein A activates gene b, and protein B activates gene c, with an overall effect of “activation.” At the same time, the direct effect of product of gene a protein A is also activation of gene c. Therefore, C1 is a coherent FFL. When the sign of the indirect path of the regulation is opposite to that of the direct path, we have incoherent FFLs (I1, I2, I3, I4 in Figure 1B). Taking the FFL model I1 as an example, the effect of the direct path is positive, but the overall effect of the indirect path is negative. As can be seen from Figure 1B, all incoherent FFLs have an odd number of edges of inhibition.
2.1.4. Model Parameters
In order to explore broadly the behavior of all types of FFLs, we construct FFL models over the parameter space of a wide range of possible combinations of k1, k2, and k3, representing all 8 types of FFLs. The regulation intensity is set to values based on values reported in (Bu et al., 2016; Tej et al., 2019). We then altered the regulation intensities by about 10-fold to study the general behavior of different types of FFLs at the steady state. We take parameter values of k1 ∈ {0.025, 0.1, 0.4, 0.8, 1.5, 2.1, 2.4, 3.0}, k2 ∈ [0.025, 5.0] with step size of 0.25, k3 ∈ [0.025, 5.0] with step size of 0.25. In addition, for the input intensity, the values are selected based on the analysis of abundance pattern reported in (Momin and Biswas, 2020). We take , and for one and two copies of genes b and c. Details of the relationship of FFL types with k1, k2, and k3 are listed in Table 1. Over this parameter space, we study the behavior of all 8 types of FFLs. Overall, we constructed a total of 10,812 examples of FFLs and computed the steady-state probability landscape for each of them.
2.2. Computing Probability Landscape Using ACME
2.2.1. Exact Computation of Probability Landscape of FFLs
Consider a well mixed system of reaction with constant volume and temperature. This system has n species Xi, i = 1, 2, ⋯, n, in which each particle can participate in m reactions Rk, k = 1, 2, ⋯, m. A microstate of the system at time t, x(t) is a column vector representing the copy number of species: , where the values of copy numbers are non-negative integers. The state space Ω of the system includes all the possible microstate of the system from t = 0 to infinity, Ω = {x(t)|t ∈ [0, ∞)}. In this study, the size of the state space is |Ω| = 657, 900 when genes b and c are single-copy, and |Ω| = 686, 052 and 1, 289, 656 when there are two copies of gene b and c, respectively.
The reaction Rk of the system takes the form of
which brings the system from a microstate x to a new microstate x + sk, where sk is the stoichiometry vector and is defined as
In a well mixed system, the propensity function of reaction k, Ak(x) is given by the product of the intrinsic reaction rate constant rk and possible combinations of the relevant reactants in the current state x.
With the above definitions, the dCME of a network model of the SCK processes consists of a set of linear ordinary differential equations defining the changes in the probability landscape over time at each microstate x. Denote the probability of the system at a specific microstate x at time t as p(x, t) ∈ ℝ[0, 1], the probability landscape of the system over the whole state space Ω as p(t) = {p(x(t))|x(t) ∈ Ω}, the dCME of the system can be written as the general form of
where x and x − sk ∈ Ω.
The steady-state probability landscapes is obtained by solving the dCME directly. The exact solution is made possible by using the the ACME algorithm (Cao et al., 2016a,b). The ACME algorithm eliminates potential problems due to inadequate sampling, where rare events of very low probability is difficult to estimate using techniques such as the stochastic simulation algorithm (SSA) (Gillespie, 1977; Kuwahara and Mura, 2008; Daigle et al., 2011; Cao and Liang, 2013).
2.3. Identification of Multimodality by Persistent Homology
Despite its simple architecture, FFLs have a 9-dimensional probability landscape: There are three genes (a, b, and c), three proteins (A, B, and C), and three bound genes bA, cA, and cB (i.e., gene b bound to protein A, gene c bound to either protein A or protein B). Because of the high dimensionality, it is challenging to characterize the topological structures of their probability landscapes; restricting networks to only “on” and “of” state separately makes it difficult to gain insight into the overall behavior of the network.
There have been studies that analyze d-dimensional probability landscape by examining its projection onto 1-d or 2-d subspaces (e.g., 2-d heatmaps or contour plots) (Bu et al., 2016; Dey and Barik, 2021). However, projected probability surface on lower dimensional space often no longer reflects the topology of the original space, with results and interpretations likely erroneous or misleading (Manuchehrfar et al., 2021). Finding peak states by examining distinct local maxima is equivalent to locating hypercubes that are critical points of Morse index of d in the d-dimension state space. While, local maxima may be identified by comparing its probability value with those of all of its neighbors, all peaks regardless their prominence will be identified. As numerical calculation may introduce small errors, peaks of tiny magnitude will be included. It is non-trivial to decide on a proper threshold to filter them out.
Persistent homology provides a powerful method that can characterize topological features of high-dimensional probability landscapes (Edelsbrunner et al., 2002; Carlsson, 2009). Here, we use newly developed cubic complex algorithm to compute homology groups1 and quantitatively assess the exact topology of the 9-dimensional probability landscape.
2.3.1. Homology Groups
We use homology groups from algebraic topology to characterize the probability landscape. Homology group provides an unambiguous and quantitative description on how a space is connected. It returns a set of algebraic groups describing topological features of holes of various dimensions in the space. The rank of each i-th groups counts the number of linearly independent holes in the corresponding ith dimension. For example, Rank(H0) counts the number of connected components (0th dimensional holes).
2.3.2. Persistent Homology
Persistent homology measures the importance of these topological features (Edelsbrunner et al., 2002), and has been applied in studies of chemical compounds and biomolecules (Xia and Wei, 2014, 2015; Xia et al., 2015). Here, we focus on the topological features of probability peaks, including their appearance and disappearance. They are measured by persistent homology of the 0-th homology group. Specifically, we take the probability p(x) as a height function, and construct a sequence of topological spaces using thresholds {ri} for p(x):
The superlevel sets {Xi} has Xi = {x ∈ X|p(x) ≥ ri}, which corresponds to the threshold ri. The sequence {Xi} gives a sequence of subspaces, which is called filtration:
As the threshold changes, the peak of a probability landscape emerges from the sea-level at a specific threshold, which is the birth time of the corresponding 0-homology group in the filtration. It disappears as an independent component when merged with a prior peak at a particular threshold, which is called the death time. When the sea-level recedes to the ground level at p(x) = 0, only the first peak remains.
2.3.3. Persistent Diagram of Multimodality in Probability Landscape
We keep track of the probability peaks by recording the birth and death times of their corresponding 0-homology groups throughout the filtration. This relationship is depicted by the two-dimensional persistent diagram.
For the ith probability peak, when the threshold r reaches the value rb(i), the probability peak appears. We call this value the birth probability pb(i) = rb(i) of peak i. When the threshold r is lowered to a value rd(i), this peak is merged to an existing peak. We call this value the death probability pd(i) = rd(i) of peak i. The persistence of peak i is defined as:
The persistent diagram plots peak i using the birth probability pb(i) as the y-coordinate and the death probability pd(i) as the x-coordinate. The number of dots on the persistent diagram corresponds to the number of probability peaks. Those that are further off the diagonals are the more prominent probability peaks as their persistences are larger.
3. Results
3.1. Multimodality and Persistent Homology of FFLs
For each FFL network, we first compute its probability landscapes p = p(xA, xB, xC, xa, xb, xc, xbA, xcA, xcB) at the steady-state under various conditions of model parameters. Here, xA, xB, and xC are copy numbers of proteins A, B, and C, respectively; xa, xb, and xc are copy numbers of genes a, b, and c, respectively; xbA and xcA are copy numbers of genes b and c bound by protein A; xcB is the copy number of gene c bound by protein B.
Our results show that the 8 types of FFLs can exhibit up to six different phenotypes of mono- and multimodality at different conditions in the parameter spaces we investigated. An illustration of these six different types of multimodality is shown in Figure 2.
Figure 2. Examples of multimodality exhibited by feed-forward loop (FFL) network motifs. The steady-state probability landscape can exhibit up to 6 different multimodes. The illustrative examples are as follows: 1 peak (red), coherent FFL of type C1 when k1 = 1.2, k2 = 1.2, and k3 = 1.2; 2 peaks (yellow), either in protein B with coherent FFL of type C1, where k1 = 3.0, k2 = 1.2, and k3 = 1.2, or in protein C with coherent FFL of type C1, where k1 = 1.2, k2 = 6.0, and k3 = 6.0; 3 peaks (green), coherent FFL of type C1, where k1 = 1.2, k2 = 6.0, and k3 = 3.6; 4 peaks (light-blue), coherent FFL of type C1 exhibits two peaks for protein B and two peaks for protein C, where k1 = 3.0, k2 = 6.0, and k3 = 6.0; and 6 peaks (purple), coherent FFL of type C1 exhibit two peaks for B and three peaks for C, where k1 = 3.0, k2 = 6.0, and k3 = 3.6.
We further computed their 0-th homology groups at varying sea level of probability. The number of peaks, the birth, and death probability associated with each peak in Figure 2 are shown in the persistent diagrams of Figure 3.
Figure 3. Persistent diagrams (PDs) of feed-forward loop (FFL) network modules of Figure 2 exhibiting different multimodalities. Red: The probability landscape with monomodality. Yellow: These two PDs depict the two steady-state landscapes exhibiting bimodality. Green, light blue, and purple: These three PDs depict the landscape exhibiting tri-modality, 4-modality, and 6-modality, respectively.
3.1.1. Behavior of FFLs From Stochastic Models Differ From Deterministic ODE Models
The behavior of FFL network modules revealed from our stochastic models are fundamentally different from that of deterministic models of ordinary differential equations (ODEs). ODE models are based on kinetics of law of mass action and are used to calculate the mean concentrations of A, B, and C at equilibrium state. However, they do not provide accurate pictures on the degree of multimodality. For example, the steady-state ODE solutions with respect to different gene occupancy for mass action kinetics show that there are at most six phenotypic states (see Supplementary Material for more details). However, as there are no probabilistic considerations, conclusions drawn from ODE models can be problematic.
An example of the diverging results between ODE and stochastic models is shown in Figure 4A for an FFL of C1 type. The mean values of C obtained from the ODE model (vertical blue line) and the expectation computed from the probability landscape (vertical purple line) diverge from each other (Figure 4A). There are three different phenotypic states by the ODE model (green lines, Figure 4A), which are different from the bimodal probability distribution obtained from the SCK model (Figure 4A).
Figure 4. Comparing feed-forward loop (FFL) behavior by Accurate Chemical Master Equation (ACME) and by deterministic ordinary differential equation (ODE) models. (A) shows the results of FFL of C1 type for (k1, k2, k3) = (2.4, 4.5, 1.8). The exact results obtained using ACME exhibit bimodality in protein C (red curve), while trimodality is predicted by the deterministic ODE model (green vertical lines). The mean copy number from ACME (purple vertical line) is also different from the that from ODE (blue vertical line). (B) shows the results of FFL of I1 type for (k1, k2, k3) = (2.4, 0.4, 1.8). The exact results obtained using ACME exhibit monomodality in protein C (red curve), while deterministic ODE model predicts trimodality (green vertical lines), even though the mean copy number of protein C are the same between ACME and ODE models (purple and blue vertical lines, respectively).
A further example is provided by the FFL of type I1. Here, the ODE model predicts the existence of three phenotypes at k1 = 2.7, k2 = 0.4, and k3 = 1.8 (Figure 4B, green vertical lines). However, the stochastic model shows that there is only one stability peak. Although the mean value of C obtained from the ODE model and the expected C value computed from the probability landscape largely overlap, the ODE model provides no information on phenotypical variability. Overall, stochastic models provide accurate and rich information that are not possible with ODE models.
3.1.2. Behavior of FFLs From Exact Solution to dCME by ACME Can Be Differ From That by Stochastic Simulation Algorithm
Results from simulations using SSA may differ from the exact solution to dCME obtained using ACME. We illustrate this using two incoherent FFLs, one at (k1, k2, k3) = (3.0, 0.5, 5.0) of I1-FFL (Figures 5A–C) and another at (k1, k2, k3) = (0.1, 2.75, 5.0) (Figures 5D–F) of the I4-type FFL. The exact steady-state probability landscape of the I1-FFL network computed using ACME is multimodal, exhibiting two peaks in protein B and two peaks in protein C (Figure 5A). However, these peaks are not definitive when 30,000 reaction trajectories up to 2,500 s are simulated using SSA (upper plots, Figures 5B,C). Bimodality in proteins B and C becomes only definitive when simulation time is extended to 5,000 s (lower plots, Figures 5B,C).
Figure 5. Comparing landscapes from Accurate Chemical Master Equation (ACME) and reaction trajectories from the stochastic simulation algorithm (SSA). (A) Probability surface projected onto the (B, C)-plane for the feed-forward loop (FFL) with (k1, k2, k3) = (3.0, 0.5, 5.0). There is bimodality in both proteins B and C. (B,C) The reaction trajectories computed from SSA corresponding to condition in (A) for proteins C and B, respectively. The upper plots are for 2,500 s and lower plots are for 5,000 s. SSA does not capture the bimodality of proteins B and C until 2,500 s. (D) The probability surface projected onto (B − C) plain for FFL with (k1, k2, k3) = (0.1, 2.75, 5.0). There is tri-modality in protein C and bimodality in protein B. (E,F) Corresponding reaction trajectories in proteins C and B, respectively. Upper plots are for the results for 2,500 s and lower plots are for 5,000 s. SSA does not capture tri-modality of protein C until 2,500 s. In addition, SSA fails to capture bimodality in protein B.
The exact steady-state probability landscape of the I4-FFL network computed using ACME exhibits tri-modality in protein C and bimodality in protein B (Figure 5D). However, tri-modality is not clearly captured when the reaction trajectories are <2, 500 s (upper plot, Figure 5E), and becomes definitive only after 5,000 s (lower plot, Figure 5E). In addition, bimodality in protein B is not captured, even when the reaction trajectories are at 5, 000 s (upper and lower plot, Figure 5F).
3.2. Phase Diagrams of Multimodality in FFLs
Current studies of stochastic networks are limited to their behavior under a few selected conditions. Here, we explore the multimodality of all eight types of FFLs under broad conditions of synthesis, degradation, binding, and unbinding as outlined in Table 1. This is made possible by the efficiency of the multi-finite buffer ACME algorithm. The analysis using persistent homology further allows us to quantitatively characterize the exact topology of the landscape. Together, we are able to obtain the full phase diagrams on the phenotype of multimodality of FFLs at different combinations of parameter values (Figure 6).
Figure 6. Phase diagrams of multimodality of Feed-forward loop (FFL) network modules based on 10,812 steady-state probability landscapes at different condition of regulation intensities for all 8 types of FFL network modules. Monomodality occurs when 0.4 ≤ k1 ≤ 2.1 and k2, k3 intensities are moderate, i.e., 0.4 ≤ k1 ≤ 3 (blue region when k1 = 0.4, 0.8, 1.5, and 2.1). Bimodality may occur for different combinations of regulation intensities. When k1 intensity is either very high (2.4 ≤ k1) or very low (k1 ≤ 0.1), bimodality occurs when k2, k3 intensities are moderate, i.e., 0.4 ≤ k1 ≤ 3. When k1 intensity is moderate (0.4 ≤ k1 ≤ 2.1), bimodality occurs when at least one of the other regulation intensities k2 or k3 is high. Tri-modality occurs when k1 is moderate (0.4 ≤ k1 ≤ 2.1) and either k2 or k3 is moderate. Multimodality occurs when k1 is low or high (k1 ≤ 0.4 or k1 ≥ 2.1), and at least either k2 or k3 is high. Color scheme (vertical bar): Blue, green, red, orange, purple, and brow represent regions with one, two three, four, five, and six peaks, respectively.
Altogether, we compute 10,812 probability landscapes of the 8-types of FFL modules. Depending on the values of k1, k2, and k3, each phase diagram shown depicts the behavior of four types of FFLs, one for each of the four quadrants formed by the two straight lines of k2 = 1 and k3 = 1 (Figure 6), with the type of FFL labeled accordingly. The specific types also depend on k1, which is listed at the top of each plot (Figure 6). As a result, we have gained comprehensive and accurate characterization of the multimodality phenotypes of this type of important network modules.
3.2.1. Monomodality
As shown in Figure 2, the steady-state probability landscape of the FFL at k1 = k2 = k3 = 1.2 exhibits one probability peak. At this condition, it is a coherent FFL of type C1. The projected distributions of B and C exhibit monomodality and has only one peak (Figure 2, red) when the values of intensities k1, k2, and k3 are close to 1.0 (Figure 6). Overall, there is only one phenotypic state when the regulations intensities in FFL are weak.
3.2.2. Bimodality
The steady-state probability landscape of FFLs can exhibit two types of bimodality (colored yellow in Figure 2). The first type occurs when k1 < 0.4 or k1 ≥ 2.4, with bimodality in protein B while monomodality in protein C. This is illustrated as green regions in Figure 6 shown at the two top-left and the two bottom right phase diagrams where k1 ∈ {0.025, 0.1, 2.4, 3.0}. That is, if the regulation intensities of k1 and k2 are about two-fold different either way, bimodality in B arises.
The second type of bimodality occurs when 0.4 ≤ k1 < 2.4, where protein C exhibit bimodality while monomodality is maintained in B. This is illustrated as green regions in the remaining phase diagrams of Figure 6, where k1 ∈ {0.4, 0.8, 1.5, 2.1}.
3.2.3. Tri-modality
The steady-state probabilistic landscape of FFL can exhibit tri-modality (green, Figure 2). There are three possible phenotypes in protein C while monomodality in protein B is maintained. Trimodal regions are colored red in the phase diagrams of Figure 6. They arise when the difference in rates k2 and k3 is at least about two-fold and 0.4 ≤ k1 ≤ 2.1.
3.2.4. Multimodality
The steady-state probability landscape of the FFL can exhibit 4 to 6 probability peaks (orange, purple, and green, respectively, in Figure 2). Landscapes with 4 modes have bimodality in both protein B and protein C. Those with 5 modes has bimodality in B and tri-modality in C. Landscapes with 6 modes exhibit bimodality in B and tri-modality in C. Inspection on the conditions indicates that when the regulations are strong; i.e., when k1, k2, and k3 ≥ 2.1, FFLs exhibit very well-defined multimodality peaks. However, when the regulation intensity k1 is weak, the steady-state probability landscape exhibits multimodality only when the other two regulation intensities, namely, k2 and k3 are strong. As shown in Figure 6, there are two groups of FFLs based on the characteristics of the multimodality they exhibit: One group consists of FFLs of C2, C4, I1, and I3 types, where tri-modality of output protein C always exists, as long as k2 and k3 are at least about two-fold different. The other group consists of FFLs of C1, C3, I2, and I4 types where the signs of the regulations that the output node C receives from B and A are the same (both activation or both inhibition). Tri-modality occurs when the regulations k2 and k3 have very distinct values.
Overall, protein B can exhibit either mono- or bimodality, and protein C can exhibit mono-, bi-, or tri-modality on the probability landscape.
3.3. Increasing Input Intensity Amplifies Multimodality in FFL
To understand how input intensity affect the response of FFL networks, we examine their behavior under different input conditions. Specifically, we examine how different synthesis rate sA of protein A affects the number of modes in proteins B and C.
We first carry out computations and broadly survey the behavior of FFLs at strong input intensity, where sA is set to 10.0. The values of k2 and k3 are sampled broadly, and k1 is tested for three different values of k1 = 0.8, 2.1, and 2.4. The results are summarized in Figure 7 (top row). We then similarly survey the behavior of FFLs at decreased synthesis intensity of protein A, with sA = 3.0 (Figure 7, bottom row).
Figure 7. Effects of input intensity on multimodality of Feed-forward loops (FFLs). The phase diagrams of the number of stability peaks in the steady-state probability landscapes at strong input intensity sA = 10.0 (top row) and weak input intensity sA = 3.0 (bottom row) for different k2 and k3 at three different conditions of k1 = 0.8, 2.1, and 2.4. Color scheme (vertical bar): Blue, green, red, orange, purple, and brown represent regions with one, two, three, four, five, and six peaks, respectively.
There are clear changes in the mode of multimodality of FFLs. At k1 = 0.8 and k1 = 2.1 (Figure 7, left and center columns), when protein A synthesis rate sA is reduced from 10.0 (top) to 3.0 (bottom), regions with one (blue) and three (red) peaks are reduced. In addition, certain areas of the tri-stable (red) regions become bimodal (green).
At larger k1 = 2.4 (Figure 7, right column), the FFLs exhibits dramatic changes in the modes of multimodality when synthesis rate sA of protein A is reduced from 10.0 (top) to 3.0 (bottom). In many regions, one or more stability peaks disappear. There are regions with two peaks at sA = 10.0 that become monomodal. There are also regions of six peaks that become those of four peaks. This is due to the loss of one stability peak from three in protein C. In addition, large regions with four peaks (orange) disappear and become either regions with two peaks (green) or with three peaks (red). Overall, we can conclude that high-input intensity represented by high sA rate for protein A induces changed phenotypes of multimodality in FFLs.
3.4. Binding and Unbinding Dynamics Are Critical for Multiple Phenotypic Behavior
Results obtained so far are based on the assumption of slow binding () and unbinding () reactions, which we call the generic case. When the FFL network slowly switches between phenotypic states, the process of synthesis degradation of protein C has sufficient time to converge to equilibrium at each phenotypic state of gene c. An important questions is how slow the promoter dynamics need to be for FFLs to exhibit multiple phenotypes, without feedback loops or cooperatively.
To answer this question, we explore the behavior of FFLs under different binding and unbinding dynamics of gene c for an FFL of type I1. In this case, protein A activates protein B and protein C, while protein B inhibits protein C (see Figure 1B). With slow binding kinetics as described above, the output C of this FFL exhibits three stability peaks. These are at the expression level of protein C of (1) C = 0, corresponding to the condition when gene c is inhibited by B, (2) C = 9, corresponding to the basal level of C expression, and (3) C = 49, when C expression is activated by A. We then fix the regulation intensities at k1 = 3.0, k2 = 0.025, and k3 = 5.1, and examine how the number of phenotypic states is affected by gene c binding dynamics (Figure 8).
Figure 8. Effect of binding dynamics on the modality of protein C in the feed-forward loop (FFL) network of type I1, with (k1, k2, k3) = (3.0, 0.025, 5.1). (A) Effects when binding affinity between gene c and both protein A and protein B are altered by n-fold, where n ∈ {0.5, 2, 8, 16}. At slower binding (yellow line), the modes of distribution of protein C are well-distinguished. However, when the binding and unbinding rates increased to 8 (green line), the peak at C = 9 disappears. At n = 16, bimodality is observed in protein C. (B) Effects when only the binding affinity of gene c and protein B is altered by n-fold, where n ∈ {0.5, 2, 8, 16}. When the binding affinity of gen c and protein B increases, the peak at C = 9 disappears, while the peaks at C = 49 robustly remains. However, the peak at C = 49 becomes less significant. (C) Effects when only the binding affinity of gen c and protein A is altered by n-fold, where n ∈ {0.5, 2, 8, 16}. At high binding affinity, the peak at C = 9 disappears while the peak at C = 49 becomes more prominent.
We first set the binding affinities between gene c and protein A and between gene c and protein B to the same values, and change them together to n-fold of the generic case, where n ∈ {0.5, 2, 8, 16}. For slower binding and unbinding dynamics (yellow line for n = 0.5, Figure 8A), the modes of the distribution of the output of protein C are even better distinguished. However, when both binding and unbinding rates are increased to n = 8 fold (green line), the probability peak at C = 9, which corresponds to basal level of C expression, merges with the probability peak at C = 0. At n = 16, the distribution of C is bimodal.
We then keep the biding affinity between gene c and protein A unchanged and alter only the binding affinity between gene c and protein B by n-fold, where n ∈ {0.5, 2, 8, 16}. When the binding affinity increases (e.g., n = 8), the probability peak at C = 9 disappears, while the probability peak at high copy number of C = 49 robustly remains, although with less magnitude (Figure 8B).
When only the biding affinity between gene c and protein A is altered while that between gene c and protein B is held constant (Figure 8C), the probability peak at the basal level of C expression (C = 9) diminishes when the binding affinity increases (e.g., n = 8). However, the probability peak at C = 49 becomes more prominent. At n = 8, the distribution of C is tri-modal. At n = 16, it becomes bimodal. This indicates that multiple phenotypes arise in FFLs when the unbinding rate is about an order of magnitude smaller than the expression rate of the protein.
3.5. Gene Duplication Can Enrich Phenotypic Diversity and Enlarge Stable Regions of Specific Multimodality of FFLs
Gene duplication provides a basic route of evolution (Lynch and Conery, 2000) and is an important driver of phenotypical diversity in organisms (Conrad and Antonarakis, 2007). Here, we study how gene duplication affects the phenotypes of FFLs.
We examine how duplication of gene c and separately duplication of gene b affect the behavior of the FFL network modules. With two copies of gene c, there can be six possible states of gene c activation. Depending on whether the promoter sites of both copies of gene c are free or occupied by either protein A or protein B, we have for both c genes to have unoccupied, protein A bound, or protein B bound promoter site. This can be denoted as a triplet (c, cA, cB), which can take any of the possible values of (2, 0, 0), (0, 2, 0), (0, 0, 2), (1, 1, 0), (1, 0, 1), and (0, 1, 1). For the case when there are two copy number of gene B, there are three possible states of gene b activation, depending on whether the promoter site of both copies of gene b are free or occupied by protein A. This can be denoted as a duplicate (b, bA), which can take any of the possible values of (2, 0), (1, 1), or (0, 2).
The phase diagrams of the number of modes of stability peaks are shown in Figure 9, when there is only one copy of both gene b and gene c (first row), when there are two copies of gene c but one copy of gene b (second row), and when there are two copy number of gene b but one copy of gene c (third row). The conditions are k1 = 0.025, 0.8, 1.5, and 2.4, for different values of k2 ∈ [0.1, 5] and k3 ∈ [0.1, 5], where there are slow binding and unbinding (, ). Each phase diagram in Figure 9 consists of 400 steady-state probability landscapes with a total 12 × 400 = 4, 800 landscapes. This broad range of parameters allow us to study all 8 different modules of FFL network and the effects of gene c and gene b duplications.
Figure 9. Phase diagram of the effects of gene duplication on multimodality of feed-forward loops (FFLs). First row: Phase diagrams of the modality of stability peaks when there are one copy of gene c and one copy of gene b. Second row: Phase diagrams when there are one copy of gene b and two copies of gene c. Third row: Phase diagram, when there are two copy of gene b and one copy of gene c. The first, second, and third columns are for k1 = 0.025, 1.5, and 2.4, respectively. Color scheme (vertical bar): Blue, green, red, orange, purple, and brown represent regions with one, two, three, four, five, and six peaks, respectively.
We examine the behavior of FFL in three different regimes of k1: (1) When k1 ≪ 1.0 (Figure 9, first column), the bimodal regions (green) expands when there are two copies of gene c (second row), but there are no significant changes when there are two copies of gene b (third row). The overall size of multimodal regions increases in both cases. (2) When k1 ≈ 1.0 (Figure 9, second and third columns), the duplication of gene c (second row) expands the regions with three stability peaks and reduces regions with two peaks. In contrast, the duplication of gene b (third row) has no significant effects on multimodality. (3) When k1 = 2.4 (fourth column), duplication of gene c (second row) expands regions with two and six stability peaks. Duplication of gene b (third row) reduces the region with four peaks and expands the region with five peaks.
These results show that introducing additional copy of gene b or gene c not only can enrich different phenotypic behavior but can also increase the stability of specific phenotypic states, namely, enlarge regions of particular phenotypes by uniting previously different phenotypic regions together. Overall, gene duplication can increase phenotypic diversity, and enlarge stability regions of specific multimodal states.
Bacterial cells have fast binding and unbinding dynamics (Ali Al-Radhawi et al., 2019), and it is unlikely that the occurrence of multiple copies of the same gene in FFLs plays significant roles in stochastic multimodality. In contrast, mammalian cells have slower promoter dynamics (Forger and Peskin, 2003). Gene duplication in FFLs may provide a natural mechanism for enriched multimodality with enhanced stochastic phenotypic switching. This is reflected in reduced monomodal regions, and enlarged multimodal regions where there are 4 (orange), 5 (purple), and 6 (brown) phenotypic states of the output C (second and third row in Figure 9).
Assuming that initially both copies of the gene were functioning, but subsequently one gene copy lost its biochemical function due to mutations, we can expect two opposite types of scenarios to occur: If regulation intensities are strong (k2 and k3 are large), one of the phenotypic states becomes lost (e.g., green region becomes light blue, and orange region becomes red, Figure 9). If regulation intensities are weak, the duplication of gene c or gene b can lead to enlargement of the region of monomodality. It can also lead to the appearance of new regimes where there are a larger number of multimodality modes (orange, purple, and green regions in Figure 9). That is, gene duplication can create new stable states, leading to an enlarged number of high probability states. This, however, occurs only in FFL modules with strong regulations intensities. FFL modules with low regulation intensities instead lose phenotypical diversity and become more robust in monomodality with enlarged region in the parameter space.
4. Discussion
Gene regulatory networks (GRNs) play critical roles in defining cellular phenotypes but it is challenging to characterize the behavior of GRNs. Although GRNs may consist of dozens or more of genes and proteins, their functions often can be defined by smaller sub-networks called network motifs. How small network motifs are responsible for complex properties such as the maintenance of multi-phenotypic behavior or modules is poorly understood. Current widely practiced approach is studying network motifs using deterministic models. However, this approach imposes restrictions on the types of network motifs capable of exhibiting multimodal phenotype to mostly feedback networks.
In this study, we examined the FFL network motifs, one of the most ubiquitous three-node network motifs. Although their deterministic behavior is well-studied, with great understanding of their functions such as signal processing and adaptations gained, their stochastic behavior remains poorly characterized.
Here, we showed the direct regulation path from the input node to the output node and the indirect path through the intermediate buffer node provide the necessary architecture for distinct multiple modalities. Phase diagrams of FFL in Figure 6 show that FFLs of various types can exhibit different multimodality. At large copy numbers and large volume, our model of stochastic reaction kinetics are the same as those based on mass action kinetics (Kurtz, 1971, 1972; Vellela and Qian, 2007), where ordinary differential equation (ODE) models are appropriate. When ODE models are applied to enzyme–substrate reactions, they can be further approximated by Michaelis–Menten kinetics, with the additional assumption that the substrate is in instantaneous chemical equilibrium with the enzyme–substrate complex. When ODE models are applied to the reaction of one receptor and n identical simultaneously binding ligands, we arrive at the Hill equation, with the coefficient n phenomenologically characterizing cooperativity. These kinetic models based on ODE approximations, however, are not applicable to the current study, as we are examining strong stochasticity arising at low copy number of molecules, where ODE models are not valid.
FFLs play important roles in gene regulatory networks. For example, it is shown that several I1-FFL sub-networks control the process of Bacillus subtilis sporulation (Eichenberger et al., 2004; Mangan et al., 2006). In addition, C1-FFL network is found to be present in the L-arabinose (ara) utilization system of E. coli, where araBAD is the target (gene c) activated by the intermediate gene araC and the input gene CRP. Gene araC is also activated by CRP. Therefore, they form a 3-node C1 type FFL (Mangan et al., 2003). Results in this work can help to gain understanding of the behavior of these different types of FFLs found in gene regulatory networks.
In addition, we have shown that input intensity affects the multimodal behavior of various types of FFLs. Examples shown in Figure 7 demonstrate that at high k1 values, input intensity dramatically changes the multimodality as shown in the phase diagrams. Our results are consistent with previous findings that input intensity is an important factor in determining output intensity of FFLs (Mangan et al., 2003; Goentoro et al., 2009; Lin et al., 2018). Here, we further demonstrated that input intensity is also important in determining the modality of the steady-state behavior of FFLs.
In mammalian cells, slow dynamics of transcription factor binding to promoter is often observed (Dermitzakis and Clark, 2002; Hager et al., 2009; Lickwar et al., 2012; Tuǧrul et al., 2015; Hasegawa and Struhl, 2019). This is likely due to the complex process of chromatin regions opening up so they become accessible and the slow nature of events such as promoter, enhancer, and mediator binding. These physical processes result in highly stochastic behavior of networks. Stochastic models have demonstrated that complex multimodality phenotypes can naturally arise from stochastic fluctuations when genes have distinct expression levels, a phenomenon widely observed in mammalian cells (Cao et al., 2018). We showed that binding and unbinding dynamics are critical for multi-phenotypic behavior. For an I1-FFL with (k1, k2, k3) = (3.0, 0.025, 5.1), Figure 8 highlighted that binding and unbinding rates affect multiple peaks in protein C.
Results of this study indeed showed that once stochastic fluctuations between distinct expression levels due to slow promoter dynamics are considered, FFLs can exhibit complex multimodal phenotypes. When the expression levels of the output gene (gene c) at the inhibited, basal, and activated states are well-separated, three distinct phenotypes arise. Combined with two additional possible phenotypes of different levels of gene b expression, we can have up to six modalities for FFLs. Furthermore, high intensity of input amplifies multimodality in FFLs, suggesting that the FFL architecture are favored for maintaining multiple phenotypic states. In addition, we find that regulation intensities are key determinants of specific stochastic behavior of FFLs, which could be tuned in order to obtain any desired phenotypic behavior between 1 and 6 stability modes.
Our study also revealed the roles of gene duplication. When there are two copies of gene c, while one in principle could expect 2 × 6 = 12 different phenotypes for the output protein C. This is, however, not observed, as the regulation intensities or reaction rates are not well-separated. In contrast, instead of further increase in multimodality beyond six, we observe the expansion of the area of monomodality, resulting from the connectedness of regions of expression with different rates that are merged together. Our results showed that duplication of gene b and gene c not only can enrich different phenotypic behavior but can also increase the stability of certain phenotypic states, while decreasing others (Figure 9). We showed that in general, gene duplication can enrich phenotypic diversity. The presence and functional roles of gene duplication are well-known (Hurles, 2004). For example, in human-induced pluripotent stem cells (HiPSCs), chromosome 12 duplication leads to significant enrichment of cell cycle related genes (Mayshar et al., 2010), in which FFL sub-networks play important roles. This abnormality results in increase in the tumorigenicity of HiPSCs. Our findings may also shed light on how gene duplication affects cellular adaptation to changing environment (Kondrashov, 2012): As the support regions of monomodality are enlarged, smaller fluctuations in regulation intensities will not switch cells with duplicated genes to a different phenotypic state. Thus, gene duplication may help to stabilize the behavior of the network, so cells are better adapted to a changing environment.
Analysis of stochastic behavior of FFLs reported here have implications in a variety of biological problems. For example, the stem cell regulation network consisting of pluripotency transcription factors Oct4 and Nanog maintain pluripotency against differentiation (Boyer et al., 2005; Chickarmane et al., 2006; Papatsenko et al., 2015; Lin et al., 2018). A component of this network can be abstracted as an FFL: Nanog participates as the intermediate node (gene b, which is activated by Oct4 (gene a), and both regulate the expression of genes associated with the onset of differentiation or pluripotency (gene cs). In addition, regulation networks in hematopoietic stem cells are formed by two FFL networks involving β globin, GATA-1, EKLF, and FOG-1. In each network, FOG-1 and EKLF function as the intermediate genes (gene b) and are activated by GATA-1 (gene a), while all of them activate β globin (gene c) (Swiers et al., 2006). Moreover, in other stem cell differentiation networks, there are several sub-networks that exhibit behaviors of different types of FFLs. For example, Klf4 (gene a) activates Pou5f1 (gene b) and inhibits Sox2 (gene c), while Pou5f1 activated Sox2 (Onichtchouk et al., 2010; Okawa et al., 2016), as in the C3-type FFL (Figure 1).
In summary, we have constructed and analyzed the exact high-dimensional steady-state probability landscapes of FFLs under broad conditions and have constructed their phase diagrams in multimodality. These results are based on 10,812 exactly computed probability landscapes and their topological features as measured by persistent homology. With slow binding and unbinding dynamics of transcription factor binding to promoter, FFLs exhibit strong stochastic behavior that is very different from deterministic models, and can exhibit from 1 up to 6 stability peaks. In addition, input intensity play major roles in the phenotypes of FFLs: At weak input intensity, FFLs exhibit monomodality, but strong input intensity may result in up to 6 stable phenotypes. Furthermore, we found that gene duplication can enrich the diversity of FFL network phenotypes and enlarge stable regions of specific multimodalities.
Results reported here can be useful for constructing synthetic networks, and for selecting model parameters, so a particular desirable phenotypic behavior can materialize (Jones et al., 2020). Our results can also be used for analysis of behavior of FFLs in biological processes such as stem cell differentiation and for design of synthetic networks with desired phenotype behavior. We hope results reported here for different types of FFL can be tested experimentally.
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Author Contributions
AT and JL conceived and designed the study. AT designed and carried out analysis of the ODE model, multimodality, phase diagrams, slow dynamics, and gene duplication. FM designed and carried out analysis of persistent homology, and assisted in multimodality and phase diagram computation. YC participates in design and data analysis. AT and JL wrote the manuscript with significant input from FM. All authors have read and approved the final manuscript.
Funding
This work was supported by NIH grant R35 GM127084.
Conflict of Interest
YC was employed by company Merck & Co., Inc., Kenilworth, NJ, United States.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
We thank Wei Tian for assistance in computing persistent homology.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.645640/full#supplementary-material
Footnotes
1. ^Tian, W., Manuchehrfar, F., Wagner, H., Edelsbrunner, H., and Liang, J. (2021). Persistent homology and moment of probability landscapes of stochastic reaction networks and their changes.
References
Acar, M., Becskei, A., and van Oudenaarden, A. (2005). Enhancement of cellular memory by reducing stochastic transitions. Nature 435, 228–232. doi: 10.1038/nature03524
Ali Al-Radhawi, M., Del Vecchio, D., and Sontag, E. D. (2019). Multi-modality in gene regulatory networks with slow promoter kinetics. PLoS Comput. Biol. 15:e1006784. doi: 10.1371/journal.pcbi.1006784
Alon, U. (2006). An Introduction to Systems Biology: Design Principles of Biological Circuits. New York, NY: CRC Press. doi: 10.1201/9781420011432
Arkin, A., Ross, J., and McAdams, H. H. (1998). Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics 149, 1633–1648. doi: 10.1093/genetics/149.4.1633
Boyer, L. A., Lee, T. I., Cole, M. F., Johnstone, S. E., Levine, S. S., Zucker, J. P., et al. (2005). Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956. doi: 10.1016/j.cell.2005.08.020
Bu, P., Wang, L., Chen, K.-Y., Srinivasan, T., Murthy, P., Tung, K.-L., et al. (2016). A mir-34a-numb feedforward loop triggered by inflammation regulates asymmetric stem cell division in intestine and colon cancer. Cell Stem Cell 18, 189–202. doi: 10.1016/j.stem.2016.01.006
Cao, Y., Lei, X., Ribeiro, R. M., Perelson, A. S., and Liang, J. (2018). Probabilistic control of hiv latency and transactivation by the tat gene circuit. Proc. Natl. Acad. Sci. U.S.A. 115, 12453–12458. doi: 10.1073/pnas.1811195115
Cao, Y., and Liang, J. (2013). Adaptively biased sequential importance sampling for rare events in reaction networks with comparison to exact solutions from finite buffer DCME method. J. Chem. Phys. 139:025101. doi: 10.1063/1.4811286
Cao, Y., Lu, H.-M., and Liang, J. (2010). Probability landscape of heritable and robust epigenetic state of lysogeny in phage lambda. Proc. Natl. Acad. Sci. U.S.A. 107, 18445–18450. doi: 10.1073/pnas.1001455107
Cao, Y., Terebus, A., and Liang, J. (2016a). Accurate chemical master equation solution using multi-finite buffers. Multiscale Model. Simul. 14, 923–963. doi: 10.1137/15M1034180
Cao, Y., Terebus, A., and Liang, J. (2016b). State space truncation with quantified errors for accurate solutions to discrete chemical master equation. Bull. Math. Biol. 78, 617–661. doi: 10.1007/s11538-016-0149-1
Carlsson, G. (2009). Topology and data. Bull. Am. Math. Soc. 46, 255–308. doi: 10.1090/S0273-0979-09-01249-X
Chen, Y., Lv, C., Li, F., and Li, T. (2015). Distinguishing the rates of gene activation from phenotypic variations. BMC Syst. Biol. 9:1. doi: 10.1186/s12918-015-0172-0
Chickarmane, V., Troein, C., Nuber, U. A., Sauro, H. M., and Peterson, C. (2006). Transcriptional dynamics of the embryonic stem cell switch. PLoS Comput. Biol. 2:e123. doi: 10.1371/journal.pcbi.0020123
Choi, P. J., Cai, L., Frieda, K., and Xie, X. S. (2008). A stochastic single-molecule event triggers phenotype switching of a bacterial cell. Science 322, 442–446. doi: 10.1126/science.1161427
Conrad, B., and Antonarakis, S. E. (2007). Gene duplication: a drive for phenotypic diversity and cause of human disease. Annu. Rev. Genomics Hum. Genet. 8, 17–35. doi: 10.1146/annurev.genom.8.021307.110233
Daigle, B. J., Roh, M. K., Gillespie, D. T., and Petzold, L. R. (2011). Automated estimation of rare event probabilities in biochemical systems. J. Chem. Phys. 134:044110. doi: 10.1063/1.3522769
Dermitzakis, E. T., and Clark, A. G. (2002). Evolution of transcription factor binding sites in mammalian gene regulatory regions: conservation and turnover. Mol. Biol. Evol. 19, 1114–1121. doi: 10.1093/oxfordjournals.molbev.a004169
Dey, A., and Barik, D. (2021). Potential landscapes, bifurcations, and robustness of tristable networks. ACS Synth. Biol. 10, 391–401. doi: 10.1021/acssynbio.0c00570
Duncan, A., Liao, S., Vejchodský, T., Erban, R., and Grima, R. (2015). Noise-induced multistability in chemical systems: discrete versus continuum modeling. Phys. Rev. E 91:042111. doi: 10.1103/PhysRevE.91.042111
Edelsbrunner, H., Letscher, D., and Zomorodian, A. (2002). Topological persistence and simplification. Discrete Comput. Geometry 28, 511–533. doi: 10.1007/s00454-002-2885-2
Eichenberger, P., Fujita, M., Jensen, S. T., Conlon, E. M., Rudner, D. Z., Wang, S. T., et al. (2004). The program of gene transcription for a single differentiating cell type during sporulation in Bacillus subtilis. PLoS Biol. 2:e328. doi: 10.1371/journal.pbio.0020328
Feng, H., and Wang, J. (2012). A new mechanism of stem cell differentiation through slow binding/unbinding of regulators to genes. Sci. Rep. 2:550. doi: 10.1038/srep00550
Forger, D. B., and Peskin, C. S. (2003). A detailed predictive model of the mammalian circadian clock. Proc. Natl. Acad. Sci. U.S.A. 100, 14806–14811. doi: 10.1073/pnas.2036281100
François, P., and Siggia, E. D. (2008). A case study of evolutionary computation of biochemical adaptation. Phys. Biol. 5:026009. doi: 10.1088/1478-3975/5/2/026009
Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361. doi: 10.1021/j100540a008
Goentoro, L., Shoval, O., Kirschner, M. W., and Alon, U. (2009). The incoherent feedforward loop can provide fold-change detection in gene regulation. Mol. Cell 36, 894–899. doi: 10.1016/j.molcel.2009.11.018
Guo, D., and Li, C. (2009). Stochastic and coherence resonance in feed-forward-loop neuronal network motifs. Phys. Rev. E 79:051921. doi: 10.1103/PhysRevE.79.051921
Gupta, P. B., Fillmore, C. M., Jiang, G., Shapira, S. D., Tao, K., Kuperwasser, C., et al. (2011). Stochastic state transitions give rise to phenotypic equilibrium in populations of cancer cells. Cell 146, 633–644. doi: 10.1016/j.cell.2011.07.026
Hager, G. L., McNally, J. G., and Misteli, T. (2009). Transcription dynamics. Mol. Cell 35, 741–753. doi: 10.1016/j.molcel.2009.09.005
Hasegawa, Y., and Struhl, K. (2019). Promoter-specific dynamics of tata-binding protein association with the human genome. Genome Res. 29, 1939–1950. doi: 10.1101/gr.254466.119
Huang, S., Ernberg, I., and Kauffman, S. (2009). Cancer attractors: a systems view of tumors from a gene network dynamics and developmental perspective. Semin. Cell Dev. Biol. 20, 869–876. doi: 10.1016/j.semcdb.2009.07.003
Hurles, M. (2004). Gene duplication: the genomic trade in spare parts. PLoS Biol. 2:e206. doi: 10.1371/journal.pbio.0020206
Ivey, K. N., and Srivastava, D. (2010). MicroRNAs as regulators of differentiation and cell fate decisions. Cell Stem Cell 7, 36–41. doi: 10.1016/j.stem.2010.06.012
Jones, R. D., Qian, Y., Siciliano, V., DiAndreth, B., Huh, J., Weiss, R., et al. (2020). An endoribonuclease-based feedforward controller for decoupling resource-limited genetic modules in mammalian cells. Nat. Commun. 11:5690. doi: 10.1038/s41467-020-19126-9
Kondrashov, F. A. (2012). Gene duplication as a mechanism of genomic adaptation to a changing environment. Proc. R. Soc. B 279, 5048–5057. doi: 10.1098/rspb.2012.1108
Kurtz, T. G. (1971). Limit theorems for sequences of jump Markov processes approximating ordinary differential processes. J. Appl. Probabil. 8, 344–356. doi: 10.2307/3211904
Kurtz, T. G. (1972). The relationship between stochastic and deterministic models for chemical reactions. J. Chem. Phys. 57, 2976–2978. doi: 10.1063/1.1678692
Kuwahara, H., and Mura, I. (2008). An efficient and exact stochastic simulation method to analyze rare events in biochemical systems. J. Chem. Phys. 129:165101. doi: 10.1063/1.2987701
Lee, T. I., Rinaldi, N. J., Robert, F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., et al. (2002). Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298, 799–804. doi: 10.1126/science.1075090
Liang, J., and Qian, H. (2010). Computational cellular dynamics based on the chemical master equation: A Challenge for understanding complexity. J. Comput. Sci. Technol. 25, 154–168. doi: 10.1007/s11390-010-9312-6
Lickwar, C. R., Mueller, F., Hanlon, S. E., McNally, J. G., and Lieb, J. D. (2012). Genome-wide protein-dna binding dynamics suggest a molecular clutch for transcription factor function. Nature 484, 251–255. doi: 10.1038/nature10985
Lin, Y. T., Hufton, P. G., Lee, E. J., and Potoyan, D. A. (2018). A stochastic and dynamical view of pluripotency in mouse embryonic stem cells. PLoS Comput. Biol. 14:e1006000. doi: 10.1371/journal.pcbi.1006000
Lu, T., Shen, T., Bennett, M. R., Wolynes, P. G., and Hasty, J. (2007). Phenotypic variability of growing cellular populations. Proc. Natl. Acad. Sci. U.S.A. 104, 18982–18987. doi: 10.1073/pnas.0706115104
Lynch, M., and Conery, J. S. (2000). The evolutionary fate and consequences of duplicate genes. Science 290, 1151–1155. doi: 10.1126/science.290.5494.1151
Ma, W., Trusina, A., El-Samad, H., Lim, W. A., and Tang, C. (2009). Defining network topologies that can achieve biochemical adaptation. Cell 138, 760–773. doi: 10.1016/j.cell.2009.06.013
Mangan, S., and Alon, U. (2003). Structure and function of the feed-forward loop network motif. Proc. Natl. Acad. Sci. U.S.A. 100, 11980–11985. doi: 10.1073/pnas.2133841100
Mangan, S., Itzkovitz, S., Zaslaver, A., and Alon, U. (2006). The incoherent feed-forward loop accelerates the response-time of the gal system of Escherichia coli. J. Mol. Biol. 356, 1073–1081. doi: 10.1016/j.jmb.2005.12.003
Mangan, S., Zaslaver, A., and Alon, U. (2003). The coherent feedforward loop serves as a sign-sensitive delay element in transcription networks. J. Mol. Biol. 334, 197–204. doi: 10.1016/j.jmb.2003.09.049
Manuchehrfar, F., Li, H., Tian, W., Ma, A., and Liang, J. (2021). Exact topology of dynamic probability surface of an activated process by persistent homology. J. Phys. Chem. B 125, 4667–4680. doi: 10.1021/acs.jpcb.1c00904
Mayshar, Y., Ben-David, U., Lavon, N., Biancotti, J.-C., Yakir, B., Clark, A. T., et al. (2010). Identification and classification of chromosomal aberrations in human induced pluripotent stem cells. Cell Stem Cell 7, 521–531. doi: 10.1016/j.stem.2010.07.017
Momin, M. S. A., and Biswas, A. (2020). Extrinsic noise of the target gene governs abundance pattern of feed-forward loop motifs. Phys. Rev. E 101:052411. doi: 10.1103/PhysRevE.101.052411
Okawa, S., Nicklas, S., Zickenrott, S., Schwamborn, J., and del Sol, A. (2016). A generalized gene-regulatory network model of stem cell differentiation for predicting lineage specifiers. Stem Cell Rep. 7, 307–315. doi: 10.1016/j.stemcr.2016.07.014
Onichtchouk, D., Geier, F., Polok, B., Messerschmidt, D. M., Massner, R., Wendik, B., et al. (2010). Zebrafish pou5f1-dependent transcriptional networks in temporal control of early development. Mol. Syst. Biol. 6:354. doi: 10.1038/msb.2010.9
Papatsenko, D., Darr, H., Kulakovskiy, I. V., Waghray, A., Makeev, V. J., MacArthur, B. D., et al. (2015). Single-cell analyses of escs reveal alternative pluripotent cell states and molecular mechanisms that control self-renewal. Stem Cell Rep. 5, 207–220. doi: 10.1016/j.stemcr.2015.07.004
Ptashne, M. (2004). A Genetic Switch: Phage Lambda Revisited, 3rd Edn. New York, NY: Cold Spring Harbor Laboratory Press.
Re, A., Corá, D., Taverna, D., and Caselle, M. (2009). Genome-wide survey of microrna-transcription factor feed-forward regulatory circuits in human. Mol. Biosyst. 5, 854–867. doi: 10.1039/b900177h
Schultz, D., Jacob, E. B., Onuchic, J. N., and Wolynes, P. G. (2007). Molecular level stochastic model for competence cycles in Bacillus subtilis. Proc. Natl. Acad. Sci. U.S.A. 104, 17582–17587. doi: 10.1073/pnas.0707965104
Shalek, A. K., Satija, R., Adiconis, X., Gertner, R. S., Gaublomme, J. T., Raychowdhury, R., et al. (2013). Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498, 236–240. doi: 10.1038/nature12172
Shen-Orr, S. S., Milo, R., Mangan, S., and Alon, U. (2002). Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 31, 64–68. doi: 10.1038/ng881
Shiraishi, T., Matsuyama, S., and Kitano, H. (2010). Large-scale analysis of network bistability for human cancers. PLoS Comput. Biol. 6:e1000851. doi: 10.1371/journal.pcbi.1000851
Siegal-Gaskins, D., Grotewold, E., and Smith, G. D. (2009). The capacity for multistability in small gene regulatory networks. BMC Syst. Biol. 3(1):1. doi: 10.1186/1752-0509-3-96
Sorrells, T. R., and Johnson, A. D. (2015). Making sense of transcription networks. Cell 161, 714–723. doi: 10.1016/j.cell.2015.04.014
Swiers, G., Patient, R., and Loose, M. (2006). Genetic regulatory networks programming hematopoietic stem cells and erythroid lineage specification. Dev. Biol. 294, 525–540. doi: 10.1016/j.ydbio.2006.02.051
Tej, S., Gaurav, K., and Mukherji, S. (2019). Small RNA driven feed-forward loop: critical role of sRNA in noise filtering. Phys. Biol. 16:046008. doi: 10.1088/1478-3975/ab1563
Terebus, A., Cao, Y., and Liang, J. (2019). “Sensitivities of regulation intensities in feed-forward loops with multistability,” in 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (Berlin: IEEE), 1969–1972. doi: 10.1109/EMBC.2019.8856532
Thomas, P., Popović, N., and Grima, R. (2014). Phenotypic switching in gene regulatory networks. Proc. Natl. Acad. Sci. U.S.A. 111, 6994–6999. doi: 10.1073/pnas.1400049111
Tsang, J., Zhu, J., and van Oudenaarden, A. (2007). Microrna-mediated feedback and feedforward loops are recurrent network motifs in mammals. Mol. Cell 26, 753–767. doi: 10.1016/j.molcel.2007.05.018
Tuǧrul, M., Paixão, T., Barton, N. H., and Tkačik, G. (2015). Dynamics of transcription factor binding site evolution. PLoS Genet. 11:e1005639. doi: 10.1371/journal.pgen.1005639
Vellela, M., and Qian, H. (2007). A quasistationary analysis of a stochastic chemical reaction: Keizer's paradox. Bull. Math. Biol. 69, 1727–1746. doi: 10.1007/s11538-006-9188-3
Vellela, M., and Qian, H. (2009). Stochastic dynamics and non-equilibrium thermodynamics of a bistable chemical system: the schlögl model revisited. J. R. Soc. Interface 6, 925–940. doi: 10.1098/rsif.2008.0476
Xia, K., Feng, X., Tong, Y., and Wei, G. W. (2015). Persistent homology for the quantitative prediction of fullerene stability. J. Comput. Chem. 36, 408–422. doi: 10.1002/jcc.23816
Xia, K., and Wei, G.-W. (2014). Persistent homology analysis of protein structure, flexibility, and folding. Int. J. Num. Methods Biomed. Eng. 30, 814–844. doi: 10.1002/cnm.2655
Xia, K., and Wei, G.-W. (2015). Multidimensional persistence in biomolecular data. J. Comput. Chem. 36, 1502–1520. doi: 10.1002/jcc.23953
Ye, Y., Kang, X., Bailey, J., Li, C., and Hong, T. (2019). An enriched network motif family regulates multistep cell fate transitions with restricted reversibility. PLoS Comput. Biol. 15:e1006855. doi: 10.1371/journal.pcbi.1006855
Zhang, Y., Liu, N., Lin, W., and Li, C. (2019). Quantifying the interplay between genetic and epigenetic regulations in stem cell development. N. J. Phys. 21:103042. doi: 10.1088/1367-2630/ab4c82
Zhu, X.-M., Yin, L., Hood, L., and Ao, P. (2004a). Calculating biological behaviors of epigenetic states in the phage λ life cycle. Funct. Integr. Genomics 4, 188–195. doi: 10.1007/s10142-003-0095-5
Keywords: systems biology, feed forward loop, gene regulatory network, network motif, stochastic reaction network, persistent homology, ACME algorithm
Citation: Terebus A, Manuchehrfar F, Cao Y and Liang J (2021) Exact Probability Landscapes of Stochastic Phenotype Switching in Feed-Forward Loops: Phase Diagrams of Multimodality. Front. Genet. 12:645640. doi: 10.3389/fgene.2021.645640
Received: 23 December 2020; Accepted: 26 April 2021;
Published: 08 July 2021.
Edited by:
Chunhe Li, Fudan University, ChinaReviewed by:
Yang Cao, Virginia Tech, United StatesDavit Potoyan, Iowa State University, United States
Copyright © 2021 Terebus, Manuchehrfar, Cao and Liang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jie Liang, amxpYW5nJiN4MDAwNDA7dWljLmVkdQ==
†These authors have contributed equally to this work