- 1Department of Mathematics, Imperial College London, London, United Kingdom
- 2School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- 3School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
Metabolic heterogeneity is widely recognized as the next challenge in our understanding of non-genetic variation. A growing body of evidence suggests that metabolic heterogeneity may result from the inherent stochasticity of intracellular events. However, metabolism has been traditionally viewed as a purely deterministic process, on the basis that highly abundant metabolites tend to filter out stochastic phenomena. Here we bridge this gap with a general method for prediction of metabolite distributions across single cells. By exploiting the separation of time scales between enzyme expression and enzyme kinetics, our method produces estimates for metabolite distributions without the lengthy stochastic simulations that would be typically required for large metabolic models. The metabolite distributions take the form of Gaussian mixture models that are directly computable from single-cell expression data and standard deterministic models for metabolic pathways. The proposed mixture models provide a systematic method to predict the impact of biochemical parameters on metabolite distributions. Our method lays the groundwork for identifying the molecular processes that shape metabolic heterogeneity and its functional implications in disease.
1. Introduction
Non-genetic heterogeneity is a hallmark of cell physiology. Isogenic cells can display markedly different phenotypes as a result of the stochasticity of intracellular processes and fluctuations in environmental conditions. Gene expression variability, in particular, has received substantial attention thanks to robust experimental techniques for measuring transcripts and proteins at a single-cell resolution (Golding et al., 2005; Taniguchi et al., 2010). This progress has gone hand-in-hand with a large body of theoretical work on stochastic models to identify the molecular processes that affect expression heterogeneity (Swain et al., 2002; Raj and van Oudenaarden, 2008; Thomas et al., 2014; Dattani and Barahona, 2017; Tonn et al., 2019).
In contrast to gene expression, our understanding of stochastic phenomena in metabolism is still in its infancy. Traditionally, cellular metabolism has been regarded as a deterministic process on the basis that metabolites appear in large numbers that filter out stochastic phenomena (Heinemann and Zenobi, 2011). But this view is changing rapidly thanks to a growing number of single-cell measurements of metabolites and co-factors (Bennett et al., 2009; Imamura et al., 2009; Lemke and Schultz, 2011; Paige et al., 2012; Ibáñez et al., 2013; Yaginuma et al., 2014; Esaki and Masujima, 2015; Xiao et al., 2016; Mannan et al., 2017) that suggest that cell-to-cell metabolite variation is much more pervasive than previously thought. The functional implications of this heterogeneity are largely unknown but likely to be substantial given the roles of metabolism in many cellular processes, including growth (Weisse et al., 2015), gene regulation (Lempp et al., 2019), epigenetic control (Loftus and Finlay, 2016), and immunity (Reid et al., 2017). For example, metabolic heterogeneity has been linked to bacterial persistence (Radzikowski et al., 2017; Shan et al., 2017), a dormant phenotype characterized by a low metabolic activity, as well as antibiotic resistance (Deris et al., 2013) and other functional effects (Vilhena et al., 2018). In biotechnology applications, metabolic heterogeneity is widely recognized as a limiting factor on metabolite production with genetically engineered microbes (Binder et al., 2017; Schmitz et al., 2017; Liu et al., 2018).
A key challenge for quantifying metabolic variability is the difficulty in measuring cellular metabolites at a single-cell resolution (Amantonico et al., 2010; Takhaveev and Heinemann, 2018; Wehrens et al., 2018). As a result, most studies use other phenotypes as a proxy for metabolic variation, e.g., enzyme expression levels (Kotte et al., 2014; van Heerden et al., 2014), metabolic fluxes (Schreiber et al., 2016), or growth rate (Kiviet et al., 2014; Şimşek and Kim, 2018). From a computational viewpoint, the key challenge is that metabolic processes operate on two timescales: a slow timescale for expression of metabolic enzymes, and a fast timescale for enzyme catalysis. Such multiscale structure results in stiff models that are infeasible to solve with standard algorithms for stochastic simulation (Gillespie, 2007). Other strategies to accelerate stochastic simulations, such as τ-leaping (Rathinam et al., 2003), also fail to produce accurate simulation results due to the disparity in molecule numbers between enzymes and metabolites (Tonn, 2020). These challenges have motivated a number of methods to optimize stochastic simulations of metabolism (Puchałka and Kierzek, 2004; Cao et al., 2005; Labhsetwar et al., 2013; Lugagne et al., 2013; Murabito et al., 2014). Most of these methods exploit the timescale separation to accelerate simulations at the expense of some approximation error. This progress has been accompanied by a number of theoretical results on the links between molecular processes and the shape of metabolite distributions (Levine and Hwa, 2007; Oyarzún et al., 2015; Gupta et al., 2017b; Tonn et al., 2019). Yet to date there are no general methods for computing metabolite distributions that can handle inherent features of metabolic pathways such as feedback regulation, complex stoichiometries, and the high number of molecular species involved.
In this paper we present a widely applicable method for approximating single-cell metabolite distributions. Our method is founded on the timescale separation between enzyme expression and enzyme catalysis, which we employ to approximate the stationary solution of the chemical master equation. The approximate solution takes the form of mixture distributions with: (i) mixture weights that can be computed from models for gene expression or single-cell expression data, and (ii) mixture components that are directly computable from deterministic pathway models. The resulting mixture model can be employed to explore the impact of biochemical parameters on metabolite variability. We illustrate the power of the method in two exemplar systems that are core building blocks of large metabolic networks. Our theory provides a quantitative basis to draw testable hypotheses on the sources of metabolite heterogeneity, which together with the ongoing efforts in single-cell metabolite measurements, will help to re-evaluate the role of metabolism as an active source of phenotypic variation.
2. General Method for Computing Metabolite Distributions
We consider metabolic pathways composed of enzymatic reactions interconnected by sharing of metabolites as substrates or products. In general, we consider models with M metabolites Pi with i ∈ {1, 2, …, M} and N catalytic enzymes Ej with j ∈ {1, 2, …, N}. A typical enzymatic reaction has the form
where Pi and Pk are metabolites, and Ej and Cj are the free and substrate-bound forms of the enzyme. The parameters (kf, j, kb, j) and (kcat, j, krev, j) are positive rate constants specific to the enzyme. In contrast to traditional metabolic models, where the number of enzyme molecules is assumed constant, here we explicitly model enzyme expression and enzyme catalysis as stochastic processes. Our models also account for dilution of molecular species by cell growth and consumption of the metabolite products by downstream processes.
Though in principle one can readily write a Chemical Master Equation (CME) for the marginal distribution P(P1, P2, …PM) given the pathway stoichiometry, analytical solutions of the CME are tractable only in few special cases. To overcome this challenge, we propose a method for approximating metabolite distributions that can be applied in a wide range of metabolic models. We first note that using the Law of Total Probability, the marginal distribution P(P1, P2, …, PM) can be generally written as:
where P = (P1, P2, …PM) and E = (E1, E2, …, EN) are the vectors of metabolite and enzyme abundances, respectively. The equation in (2) describes the metabolite distribution in terms of fluctuations in gene expression, comprised in the distribution P(E), and fluctuations in reaction catalysis, described by conditional distribution P(P|E).
A key observation is that Equation (2) corresponds to a mixture model with weights P(E) and mixture components P(P|E). To compute the mixture weights and components, we make use of the timescale separation between gene expression and metabolism. Gene expression operates on a much slower timescale than catalysis (Cao et al., 2005; Levine and Hwa, 2007; Kuntz et al., 2013), with protein half-lives typically comparable to cell doubling times and catalysis operating in the millisecond to second range. Therefore, in the fast timescale of catalysis we can write a conservation law for the total amount of each enzyme (free and bound):
where Et, j is the total number of enzymes Ej. Note that since our models integrate enzyme kinetics with enzyme expression, the variables Et, j follow their own, independent stochastic dynamics. It is important to note that in our approach, the conservation relation in (3) holds only in the fast timescale of catalysis. This contrasts with classic deterministic models for metabolic reactions, which typically focus on the fast catalytic timescale and assume enzymes as constant model parameters (Cornish-Bowden, 2004).
As a result of the separation of timescales, the weights and components of the mixture in (2) can be computed separately. Specifically, the mixture weights P(E) can be obtained as solutions of a stochastic model for enzyme expression (Raj and van Oudenaarden, 2008), or taken from absolute single-cell measurements of enzyme expression. Such absolute measurements can be obtained from single-molecule technologies (Okumus et al., 2016), carefully calibrating fluorescence data (Rosenfeld et al., 2006; Bakker and Swain, 2019) or normalization (Taniguchi et al., 2010). The mixture components P(P|E), on the other hand, can be estimated with suitable approximation techniques. For simplicity, here we choose to employ the Linear Noise Approximation (LNA), which provides a Gaussian estimate of the stationary distribution of a stochastic chemical system (van Kampen, 1992; Elf and Ehrenberg, 2003). The use of the LNA is justified on the basis that metabolites tend to appear in large numbers per cell, a key condition for the LNA to produce accurate results. However, more accurate methods to compute P(P|E) can be used if required (Andreychenko et al., 2017; Gupta et al., 2017a). In Figure 1, we illustrate a schematic of the proposed method.
Figure 1. Computation of single-cell metabolite distributions with Gaussian mixture models. We exploit the separation of timescales to compute the weights and components of the mixture model in Equation (2). Mixture weights are computed as stationary solutions to the Chemical Master Equation (CME) for a chosen model for stochastic enzyme expression. The mixture components are computed via the Linear Noise Approximation (Elf and Ehrenberg, 2003) (LNA) applied to the pathway ODE model. The method produces a Gaussian mixture model for metabolite distributions that can be applied in a wide range of metabolic pathways.
We thus propose the following procedure for computing single-cell metabolite distributions:
1. Starting from the mixture model in Equation (2), compute the enzyme distribution P(E) from a stochastic model for gene expression, either analytically (if possible) or numerically with Gillespie's algorithm.
2. To approximate the mixture components P(P|E) with the LNA, compute the steady state solution of the deterministic rate equation for each enzyme state E:
where S is the stoichiometric matrix and v(·) is the vector of deterministic reaction rates; for ease of notation we have assumed a unit cell volume, and hence the deterministic rates are equal to the propensities of the stochastic model. Note that due to the timescale separation, Equation (4) must be solved assuming constant enzymes E, and its solution depends on the enzyme abundance, i.e., .
3. For each enzyme state E, compute the solution to the Lyapunov equation (Elf and Ehrenberg, 2003):
where A is the Jacobian of (4) evaluated at the steady state and BBT = Sdiag{v}ST. Note that, as in (4), the solution of the Lyapunov equation depends on the enzyme state, i.e., Σ = Σ(E).
4. Following the LNA, approximate the mixture components P(P|E) as a multivariate Gaussian distribution with mean and covariance matrix Σ.
5. Combine the weights P(E) and Gaussian components P(P|E) through the mixture model in (2).
In the next sections we illustrate the effectiveness of our method in two exemplar systems.
3. Reversible Michaelis-Menten Reaction
We first consider a stochastic model that integrates a reversible Michaelis-Menten reaction with a standard model for enzyme expression. As shown in Figure 2A, the Michaelis-Menten mechanism includes reversible binding of four species: a metabolic substrate S, a free enzyme E, a substrate-enzyme complex C and a metabolic product P. To model enzyme expression, we use the well-known two-stage scheme for transcription and translation (Thattai and van Oudenaarden, 2001; Shahrezaei and Swain, 2008) (Figure 2A). The complete set of reactions is:
The reactions in (6) correspond to a reversible Michaelis-Menten reaction as in (1), while reactions in (7) are the two-stage model for gene expression. We include four additional first-order reactions (8) and (9) to model consumption of the metabolite product with rate constant kc, mRNA degradation with rate constant kdeg, and dilution of all model species with rate constant δ. In what follows we assume that the substrate S remains strictly constant, for example to model cases in which the substrate represents an extracellular carbon source that evolves in much slower timescale than cell doubling times.
Figure 2. Exemplar metabolic systems. (A) Reversible Michaelis-Menten reaction; the full set of reactions are shown in Equations (6)–(9). The model accounts for reversible catalysis of a substrate S into a product P. (B) Two-step pathway with noncompetitive end-product inhibition; the reactions are shown in Equations (18)–(25). The product (P2) sequesters enzyme E1 into an inactive form , thereby reducing the rate of the first reaction. In both examples we assume a constant substrate S and linear dilution of all chemical species. Enzymes are assumed to follow the two-stage model for gene expression (Shahrezaei and Swain, 2008), which includes species for the enzymatic mRNA and protein.
Since on the fast timescale of the catalytic reaction, the total number of enzymes can be assumed in quasi-stationary state (Cornish-Bowden, 2004; Tonn et al., 2019), we have that
and therefore the general mixture model in (2) can be written as:
The mixture weights P(Etotal) can be computed from the stochastic model for gene expression in (7). Under the standard assumption that mRNAs are degraded much faster than proteins (Raj and van Oudenaarden, 2008), the stationary solution of the two-stage model can be approximated by a negative binomial distribution (Shahrezaei and Swain, 2008):
where Γ is the Gamma function and the parameters are defined as the burst frequency a = ktx/δ and burst size b = ktl/kdeg.
To compute the mixture components P(P|Etotal) with the LNA, we write the full system of deterministic rate equations [see (35) in section 6] for the three species E, C, and P. Note that in this case, we can further reduce the rate equations by (i) using the conservation law in (10), and (ii) assuming that the binding and unbinding reactions between S and E reach equilibrium faster than the product P, a condition that generally holds in metabolic reactions. After algebraic manipulations, the reduced ODE can be written as:
where
and the parameters are KmS = (kb + kcat)/kf and KmP = (kb + kcat)/krev.
The mean of each mixture component is simply given by the steady state solution of (13), which we denote as . For a given enzyme abundance Etotal, the variance Σ(Etotal) of each Gaussian component is given by the solution to the Lyapunov equation in (5):
where f′ and g′ are first-order derivatives. Combining the negative binomial in (12) with the Gaussian components, we can rewrite Equation (11) to get a Gaussian mixture model for the metabolite:
where both and Σ(x) must be computed for each value of x = Etotal in the summation. The normalization constant in (16) is
In Figure 3, we plot the mixture model (16) for realistic parameter values and compare this approximation with distributions computed from long runs of Gillespie simulations of the whole set of reactions (6)–(9). The results indicate that the mixture model provides an excellent approximation of the metabolite distribution. In the next section we test our methodology in a more complex pathway with feedback regulation.
Figure 3. Stationary product distribution of a Michaelis-Menten reaction. The proposed mixture model in (16) provides an excellent approximation for the metabolite distribution obtained with Gillespie's algorithm (Gillespie, 2007). Distributions were computed for varying values of the bursting parameter a. Note that the resulting distributions are almost identical to those predicted in our earlier work using a Poisson mixture (Tonn et al., 2019), since we have deliberately chosen parameters to produce similar distribution in both cases. All parameter values can be found in Table 1.
Table 1. Parameter values for simulations in Figure 3.
4. Pathway With End-Product Inhibition
A common regulatory motif in metabolism is end-product inhibition, in which a pathway enzyme can bind to its own substrate as well as the pathway product (see Figure 2B). The product thus sequesters enzyme molecules, which reduces the number of free enzymes available for catalysis and slows done the reaction rate. To examine the accuracy of our method in this setting, we study a fully stochastic model for a two-step pathway with noncompetitive end-product inhibition:
The two reactions in (18) and (19) are reversible Michaelis-Menten kinetics, sharing the intermediate metabolite P1 as a product and substrate, respectively. The end-product inhibition in (20) consists of reversible binding between h molecules of P2 and the first enzyme E1 into a catalytically-inactive complex E*. The remaining model reactions in (21)–(25) are analogous to the previous example in section 3: reactions in (21) and (22) describe the two-stage model for expression of both enzymes, and with reactions (23)–(25) we model first-order mRNA degradation, product consumption, and dilution by cell growth. For simplicity we also assume that both enzymes are independently expressed, but in general our method can also account for cases in which enzymes are co-expressed or co-regulated (Chubukov et al., 2014). The resulting model has two distinct pools of enzymes, which remain constant over the timescale of catalysis:
and therefore the mixture model in (2) becomes
where the summation goes through all (Et, 1, Et, 2) pairs. Since both enzymes are expressed independently, the enzyme distribution is the product of two negative binomials P(Et, 1, Et, 2) = P(Et, 1) × P(Et, 2), each one analogous to the distribution in (12).
To compute the mixture components with the LNA, we use the rate equations for the reactions in (18)–(23); the full set of ODEs is listed in Equation (36) in the Methods. As in the first example, by employing the conservation laws in (26) and assuming rapid equilibrium of the complexes C1 and C2, the deterministic model can be further simplified to a 2-dimensional ODE:
where for ease of notation we have omitted the dependency on Et, 1 and Et, 2. The nonlinear functions in (28) are
where θ = ksq/krsq is the product-enzyme binding constant and the remaining parameters are defined as κS = kcat, 1kf, 1/(kb, 1+kcat, 1), κ1 = kb, 1krev, 1/(kb, 1+kcat, 1), κ2 = kcat, 2kf, 2/(kb, 2+kcat, 2), κ3 = kb, 2krev, 2/(kb, 2+kcat, 2), Km, S = kcat, 1/κS, Km, 1 = kb, 1/κ1, Km, 2 = kcat, 2/κ2, and Km, 3 = kb, 2/κ3.
As in the previous example, the ODEs in (28) correspond to the full model (36) rewritten in terms of both metabolites assuming that the enzyme-substrate reactions reach equilibrium in a faster timescale than catalysis. This reduced model can be readily employed to obtain approximations for the mixture components with the LNA. If we denote as the steady state solution of (28), we can write the Lyapunov equation as AΣ+ΣAT+BBT = 0 with A and BBT given by
where f(·), g(·), and their derivatives are evaluated at the steady state solution . The Gaussian components of the mixture model are then
where and |·| is the matrix determinant. After combining the joint distribution of enzymes and the components into Equation (27), we get a Gaussian mixture model for the joint marginal distribution of both metabolites:
where and Σ(x, y) need to computed numerically for each pair (x, y) = (Et, 1, Et, 2) in the summation. The burst frequencies ai = ktx, i/δ and burst sizes bi = ktl, i/kdeg, i are specific to each enzyme, and the normalization constant is given by
To test the quality of the approximation, we numerically computed the mixture model in (33) for various combinations of parameter values, shown in Figure 4. We observe that the mixture model offers an excellent approximation as compared to exact Gillespie simulations of the full model (18)–(25). We note that in this case, the full stochastic model has seven species and three different timescales, and therefore the runtime of Gillespie simulations are extremely long, in the order of several hours per run.
Figure 4. Stationary distributions for the intermediate metabolite in a two-step pathway with end-product inhibition. The panels show the distribution of intermediate metabolite P1 for different combinations of parameter values. (A) Impact of enzyme bursting frequency a1 and a2. (B) Impact of binding constant between the first enzyme and the end-product. All parameter values can be found in Table 2.
Table 2. Parameter values for simulations in Figure 4.
To further illustrate the utility of our method, we employed the mixture model to study the impact of parameter perturbations on the metabolite distributions. Without an analytical solution, such a study would require the computation of long Gillespie simulations for each combination of parameter values, which quickly become infeasible due to the long simulation time. In contrast, the mixture model provides a systematic way to rapidly evaluate the influence of model parameters on metabolite distributions. In Figure 5A we show summary statistics of the marginal P(P1) for various combinations of average enzyme expression levels. The results suggest that expression levels can have a strong impact on the mean and coefficient of variation of the intermediate metabolite. Moreover, in Figure 5B we plot the distribution P(P1, P2) for combinations of bursting parameters. The results show that uncorrelated enzyme fluctuations can result in correlated metabolite distributions due to the coupling introduced by the pathway (Levine and Hwa, 2007).
Figure 5. Impact of enzyme expression on metabolite distributions. (A) We compute the mean and coefficient of variation (CV) of the intermediate metabolite P1 in model (18)–(25), for a wide range of mean enzyme expression levels. (B) Enzyme expression parameters shape the metabolite distribution; We computed the joint metabolite distribution P(P1, P2) for three combinations of enzyme bursting parameters, chosen to give the same mean expression, and assuming both enzymes are expressed independently. Shown are contour plots of the bivariate distributions of enzymes (left) and metabolites (right). The results suggest that metabolite correlations emerge even when enzymes are uncorrelated, as reported previously in the literature (Levine and Hwa, 2007). All parameter values can be found in Table 3.
Table 3. Parameter values for simulations in Figure 5.
5. Discussion
Cellular metabolism has traditionally been assumed to follow deterministic dynamics. This paradigm results largely from the observation that cellular metabolites are highly abundant. However, recent data shows that single-cell metabolite distributions can display substantial heterogeneity in their abundance across single cells (Bennett et al., 2009; Imamura et al., 2009; Lemke and Schultz, 2011; Paige et al., 2012; Ibáñez et al., 2013; Yaginuma et al., 2014; Esaki and Masujima, 2015; Xiao et al., 2016; Mannan et al., 2017). It has also been shown that expression of metabolic genes is as variable as any other component of the proteome (Taniguchi et al., 2010), and thus in principle it is plausible that such enzyme fluctuations propagate to metabolites. These observations have begun to challenge the paradigm of metabolism being a deterministic process, suggesting that metabolite fluctuations may play a role in non-genetic heterogeneity.
Here we described a new computational tool to predict the statistics of metabolite fluctuations in conjunction with gene expression. The method is based on a timescale separation argument and leads to a Gaussian mixture model for the stationary distribution of cellular metabolites. Computing distributions from this approximate model is substantially faster than through stochastic simulations, as these can be extremely slow due to the multiple timescales of metabolic pathways. Our technique can therefore be employed to efficiently explore the parameter space and predict the shape of metabolite distributions in different conditions. In earlier work we showed that the product of a single metabolic reaction can be accurately described by a Poisson mixture model (Tonn et al., 2019). Such approximation allowed the discovery of previously unknown regimes for metabolite distributions, including heavily tailed distributions and various types of bimodality and multimodality. The Poisson approximation, however, is bespoke to single reactions and not valid for more complex systems. In contrast, the Gaussian mixture model discussed here can be applied to multiple kinetic mechanisms, more complex stoichiometries, as well as post-translational regulation.
An advantage of our approach is that the mixture weights can be computed offline from stochastic models for gene expression or single-cell expression data. The model is flexible in that it can readily accommodate gene expression models of various complexity. For the sake of illustration, in our examples we used the simple two-stage model for gene expression, but other models including gene regulation can also be employed (Dattani and Barahona, 2017). Particularly relevant models are those that account for enzyme co-regulation, a widespread feature of bacterial operons (Chubukov et al., 2014), which translates into correlations between expression of different pathway enzymes and the resulting metabolite abundances. A limitation of our method is that in many cases analytic solutions of the CME are not known, particularly for large models with multiple interacting genes. In such cases, the mixture weights P(E) can be approximated through stochastic simulations (Gillespie, 2007) albeit at the expense of increased computational costs. Most recently, progress in stochastic simulation of genome-scale metabolic networks (Tourigny et al., 2020) can offer an alternative route for studying fluctuations in large metabolic models.
The effectiveness of our method relies on two conditions: the separation of timescales between enzyme expression and enzyme catalysis, and the ability of the LNA to approximate the mixture components accurately. The first condition is satisfied by the vast majority of enzymes because their kinetics operate in regimes that are orders of magnitude faster than gene expression (Chubukov et al., 2014). However, the timescale separation can fail if the metabolic substrate S, typically a carbon source, cannot be assumed to be constant, a suitable assumption in the typical case of abundant nutrient sources with low fluctuations. Our theory would need to be extended in cases when nutrient sources become another source of variability, e.g., under fluctuations dictated by the environment (Dattani and Barahona, 2017). The second condition breaks down when the LNA fails to provide good estimates of the mixture components (Thomas and Grima, 2015; Andreychenko et al., 2017). As explained in section 2, here we have deliberately chosen to employ the LNA because it provides a simple and rapid method to compute the mixture components, P(P|E), for a broad range of metabolic pathways. Yet in cases where its assumptions do not hold, e.g., low abundance of metabolites, the LNA step in our method can be replaced by more accurate approximations. Such alternative methods include, for example, the conditional system size expansion including terms beyond the LNA, maximum entropy reconstructions using the method of conditional moments, or the finite state projection algorithm (Andreychenko et al., 2017; Gupta et al., 2017a), all of which can be readily incorporated into our mixture model strategy. These methods rely on different assumptions and their approximation quality will vary depending on the specific model parameters; in some cases, estimates for their approximation errors can be obtained with suitable methods, as discussed in a recent review on this topic (Kuntz et al., 2020).
Although our method can account for a large class of metabolic models and post-translational regulation mechanisms, there are a number of promising extensions that would broaden its utility in light of recent experimental advances. First, here we have only considered stationary distributions of metabolites, and a number of experiments have revealed cases in which metabolic heterogeneity emerges during dynamic nutrient shifts (Kotte et al., 2014; van Heerden et al., 2014; Nikolic et al., 2017). Extensions of our method to time-dependent metabolite distributions require the computation of the time-dependent solution of the CME for the enzyme expression model (Shahrezaei and Swain, 2008; Cao and Grima, 2018). As long as the dynamics of gene expression is slow enough to preserve the time scale separation, the computation of the mixture components with the LNA or other methods remains unchanged.
Another promising extension is the inclusion of transcriptional feedback regulation, a topic that has received substantial attention in the literature (Zaslaver et al., 2004; Chubukov et al., 2012; Chaves and Oyarzún, 2019; Lempp et al., 2019). In these systems, some pathway metabolites can bind to transcription factors (TF) that control enzyme expression in the same pathway. Such regulation can be included by using the conditional LNA method (Thomas and Grima, 2015) at the expense of not being able to compute the mixture weights offline anymore. Specifically, this extension would model mixture weights through more elaborate enzyme expression models in which the metabolite-TF interactions are replaced by their conditional averages, leading to an effective feedback model that requires specialized solution methods (Holehouse et al., 2020). A particularly promising application of such extended analysis is in synthetic biology, where there is a growing interest in the interplay between stochastic fluctuations and experimentally tunable parameters of molecular circuits (Briat et al., 2016; Boada et al., 2017). In particular, the use of metabolite-responsive feedback can improve robustness of strains engineered for the production of high-value metabolites (Oyarzún and Stan, 2013; Stevens and Carothers, 2015). Early results in this area (Oyarzún et al., 2015) suggest complex dependencies between metabolite fluctuations and the tunable parameters of the feedback control system. Such analyses were purely based on lumped models for metabolite-TF binding, and hence a more detailed theory could reveal novel design strategies to mitigate metabolite heterogeneity in production strains.
A number of works have sought to find links between fluctuations across layers of cellular organization, such as gene expression, metabolism and cell growth (Kiviet et al., 2014; Kotte et al., 2014; van Heerden et al., 2014; Nikolic et al., 2017; Thomas et al., 2018). But since measurement of metabolites in single cells remains technically challenging, there is pressing need for computational methods to predict fluctuations in cellular metabolites. Our proposed method provides a systematic approach for such task, paving the way for the generation of hypotheses on the molecular sources of metabolic heterogeneity.
6. Methods
6.1. Model Simulation
Stochastic simulations were computed with Gillespie's algorithm over long simulation times (several hours) corresponding to thousands of cell cycles. The ODE models and Lyapunov equations were solved in Matlab. In all examples, the negative binomial distribution for gene expression in (12) was computed with its continuum approximation (Gamma distribution).
6.2. Deterministic Rate Equations
6.2.1. Reversible Michaelis Menten
The full set of rate equations for the reversible reaction in (6)–(8) is:
To further reduce the above system of ODEs to Equation (13) in the main text, we can substitute the conservation relation in Equation (10), i.e. C = Etotal−E, and use the fact that the substrate-enzyme complex (C) typically equilibrates much faster than the product P, which means that dC/dt ≈ 0 in the timescale of catalysis.
6.2.2. End-Product Inhibition
The full set of rate equations for the reactions in (18)–(23) is:
As in the previous example, we can use the rapid equilibrium assumption and the conservation relations in (26), i.e., Et, 1 = E1+E*+C1 and Et, 2 = E2+C2, to simplify the 7-dimensional ODE in (28) to the 2-dimensional system in (28) of the main text.
Data Availability Statement
The original contributions presented in the study are included in the article/supplementary materials, further inquiries can be directed to the corresponding author/s.
Author Contributions
MT carried out research, model simulation, model analysis, and wrote the paper. PT and MB contributed to model analysis and paper writing. DO designed the research, model analysis, and wrote the paper. All authors contributed to the article and approved the submitted version.
Funding
This work was funded by the Human Frontier Science Program through a Young Investigator Grant (RGY0076-2015) awarded to DO, a UKRI Future Leaders Fellowship (MR/T018429/1) awarded to PT, and the EPSRC Centre for Mathematics of Precision Healthcare (EP/N014529/1) awarded to MB.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
Amantonico, A., Urban, P. L., and Zenobi, R. (2010). Analytical techniques for single-cell metabolomics: state of the art and trends. Anal. Bioanal. Chem. 398, 2493–2504. doi: 10.1007/s00216-010-3850-1
Andreychenko, A., Bortolussi, L., Grima, R., Thomas, P., and Wolf, V. (2017). Distribution Approximations for the Chemical Master Equation: Comparison of the Method of Moments and the System Size Expansion. Springer International Publishing, 39–66.
Bakker, E., and Swain, P. S. (2019). Estimating numbers of intracellular molecules through analysing fluctuations in photobleaching. Sci. Rep. 9, 1–13. doi: 10.1038/s41598-019-50921-7
Bennett, B. D., Kimball, E. H., Gao, M., Osterhout, R., Van Dien, S. J., and Rabinowitz, J. D. (2009). Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli. Nat. Chem. Biol. 5, 593–599. doi: 10.1038/nchembio.186
Binder, D., Drepper, T., Jaeger, K. E., Delvigne, F., Wiechert, W., Kohlheyer, D., et al. (2017). Homogenizing bacterial cell factories: analysis and engineering of phenotypic heterogeneity. Metab. Eng. 42, 145–156. doi: 10.1016/j.ymben.2017.06.009
Boada, Y., Vignoni, A., and Picó, J. (2017). Engineered control of genetic variability reveals interplay among quorum sensing, feedback regulation, and biochemical noise. ACS Synth. Biol. 6, 1903–1912. doi: 10.1021/acssynbio.7b00087
Briat, C., Gupta, A., and Khammash, M. (2016). Antithetic integral feedback ensures robust perfect adaptation in noisy bimolecular networks. Cell Syst. 2, 15–26. doi: 10.1016/j.cels.2016.01.004
Cao, Y., Gillespie, D. T., and Petzold, L. R. (2005). Accelerated stochastic simulation of the stiff enzyme-substrate reaction. J. Chem. Phys. 123:144917. doi: 10.1063/1.2052596
Cao, Z., and Grima, R. (2018). Linear mapping approximation of gene regulatory networks with stochastic dynamics. Nat. Commun. 9:3305. doi: 10.1038/s41467-018-05822-0
Chaves, M., and Oyarzún, D. A. (2019). Dynamics of complex feedback architectures in metabolic pathways. Automatica 99, 323–332. doi: 10.1016/j.automatica.2018.10.046
Chubukov, V., Gerosa, L., Kochanowski, K., and Sauer, U. (2014). Coordination of microbial metabolism. Nat. Rev. Microbiol. 12, 327–340. doi: 10.1038/nrmicro3238
Chubukov, V., Zuleta, I. A., and Li, H. (2012) Regulatory architecture determines optimal regulation of gene expression in metabolic pathways. Proc. Natl. Acad. Sci. U.S.A. 109, 5127–5132. doi: 10.1073/pnas.1114235109
Dattani, J., and Barahona, M. (2017). Stochastic models of gene transcription with upstream drives: exact solution and sample path characterization. J. R. Soc. Interf. 14:20160833. doi: 10.1098/rsif.20016.833
Deris, J. B., Kim, M., Zhang, Z., Okano, H., Hermsen, R., Groisman, A., et al. (2013). The innate growth bistability and fitness landscapes of antibiotic resistant bacteria. Science 342:1237435. doi: 10.1126/science.1237435
Elf, J., and Ehrenberg, M. (2003). Fast evaluation of fluctuations in biochemical networks with the linear noise approximation. Genome Res. 13, 2475–2484. doi: 10.1101/gr.1196503
Esaki, T., and Masujima, T. (2015). Fluorescence probing live single-cell mass spectrometry for direct analysis of organelle metabolism. Analyt. Sci. 31, 1211–1213. doi: 10.2116/analsci.31.1211
Gillespie, D. T. (2007). Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 1716, 1716–1733. doi: 10.1063/1.1378322
Golding, I., Paulsson, J., Zawilski, S. M., and Cox, E. C. (2005). Real-time kinetics of gene activity in individual bacteria. Cell 123, 1025–1036. doi: 10.1016/j.cell.2005.09.031
Gupta, A., Mikelson, J., and Khammash, M. (2017a). A finite state projection algorithm for the stationary solution of the chemical master equation. J. Chem. Phys. 147:154101. doi: 10.1063/1.5006484
Gupta, A., Milias-Argeitis, A., and Khammash, M. (2017b). Dynamic disorder in simple enzymatic reactions induces stochastic amplification of substrate. J. R. Soc. 14, 1–29. doi: 10.1098/rsif.2017.0311
Heinemann, M., and Zenobi, R. (2011). Single cell metabolomics. Curr. Opin. Biotechnol. 22, 26–31. doi: 10.1016/j.copbio.2010.09.008
Holehouse, J., Cao, Z., and Grima, R. (2020). Stochastic modeling of auto-regulatory genetic feedback loops: a review and comparative study. Biophys. J. 118, 1517–1525. doi: 10.1016/j.bpj.2020.02.016
Ibáñez, A. J., Fagerer, S. R., Schmidt, A. M., Urban, P. L., Jefimovs, K., Geiger, P., et al. (2013). Mass spectrometry-based metabolomics of single yeast cells. Proc. Natl. Acad. Sci. U.S.A. 110, 8790–8794. doi: 10.1073/pnas.1209302110
Imamura, H., Nhat, K. P. H., Togawa, H., Saito, K., Iino, R., Kato-Yamada, Y., et al. (2009). Visualization of ATP levels inside single living cells with fluorescence resonance energy transfer-based genetically encoded indicators. Proc. Natl. Acad. Sci. U.S.A. 106, 15651–15656. doi: 10.1073/pnas.0904764106
Kiviet, D. J., Nghe, P., Walker, N., Boulineau, S., Sunderlikova, V., and Tans, S. J. (2014). Stochasticity of metabolism and growth at the single-cell level. Nature 514, 376–379. doi: 10.1038/nature13582
Kotte, O., Volkmer, B., Radzikowski, J. L., and Heinemann, M. (2014). Phenotypic bistability in Escherichia coli' s central carbon metabolism. Mol. Syst. Biol. 10:736. doi: 10.15252/msb.20135022
Kuntz, J., Oyarzún, D. A., and Stan, G. B. V. (2013). “Model reduction of genetic-metabolic networks via time scale separation,” in A Systems Theoretic Approach to Systems and Synthetic Biology, eds V. Kulkarni, G.-B. Stan, and K. Raman (Springer), 181–210.
Kuntz, J., Thomas, P., Stan, G. B. V., and Barahona, M. (2020). Stationary distributions of continuous-time Markov chains: a review of theory and truncation-based approximations. SIAM Rev.
Labhsetwar, P., Cole, J. A., Roberts, E., Price, N. D., and Luthey-Schulten, Z. A. (2013). Heterogeneity in protein expression induces metabolic variability in a modeled Escherichia coli population. Proc. Natl. Acad. Sci. U.S.A. 110, 14006–14011. doi: 10.1073/pnas.1222569110
Lemke, E. A., and Schultz, C. (2011). Principles for designing fluorescent sensors and reporters. Nat. Chem. Biol. 7, 480–483. doi: 10.1038/nchembio.620
Lempp, M., Farke, N., Kuntz, M., Freibert, S. A., Lill, R., and Link, H. (2019). Systematic identification of metabolites controlling gene expression in E. coli. Nat. Commun. 10:4463. doi: 10.1038/s41467-019-12474-1
Levine, E., and Hwa, T. (2007). Stochastic fluctuations in metabolic pathways. Proc. Natl. Acad. Sci. U.S.A. 104, 9224–9229. doi: 10.1073/pnas.0610987104
Liu, D., Mannan, A. A., Han, Y., Oyarzún, D. A., and Zhang, F. (2018). Dynamic metabolic control: towards precision engineering of metabolism. J. Indus. Microbiol. Biotechnol. 45, 535–543. doi: 10.1007/s10295-018-2013-9
Loftus, R. M., and Finlay, D. K. (2016). Immunometabolism: cellular metabolism turns immune regulator. J. Biol. Chem. 291, 1–10. doi: 10.1074/jbc.R115.693903
Lugagne, J. B., Oyarzún, D. A., and Stan, G. B. (2013). “Stochastic simulation of enzymatic reactions under transcriptional feedback regulation,” in Proceeding of the European Control Conference (Zurich), 3646–3651.
Mannan, A. A., Liu, D., Zhang, F., and Oyarzún, D. A. (2017). Fundamental design principles for transcription-factor-based metabolite biosensors. ACS Synth. Biol. 6, 1851–1859. doi: 10.1021/acssynbio.7b00172
Murabito, E., Verma, M., Bekker, M., Bellomo, D., Westerhoff, H. V., Teusink, B., et al. (2014). Monte-Carlo modeling of the central carbon metabolism of lactococcus lactis: insights into metabolic regulation. PLoS ONE 9:e106453. doi: 10.1371/journal.pone.0106453
Nikolic, N., Schreiber, F., Co, A. D., Kiviet, D. J., Bergmiller, T., Littmann, S., et al. (2017). Cell-to-cell variation and specialization in sugar metabolism in clonal bacterial populations. PLoS Genet. 13:e1007122. doi: 10.1371/journal.pgen.1007122
Okumus, B., Landgraf, D., Lai, G. C., Bakhsi, S., Arias-Castro, J. C., Yildiz, S., et al. (2016). Mechanical slowing-down of cytoplasmic diffusion allows in vivo counting of proteins in individual cells. Nat. Commun. 7, 1–11. doi: 10.1038/ncomms12130
Oyarzún, D. A., Lugagne, J. B., and Stan, G. B. V. (2015). Noise propagation in synthetic gene circuits for metabolic control. ACS Synth. Biol. 4, 116–125. doi: 10.1021/sb400126a
Oyarzún, D. A., and Stan, G. B. V. (2013). Synthetic gene circuits for metabolic control: design trade-offs and constraints. J. R. Soc. Interface 10:20120671. doi: 10.1098/rsif.2012.0671
Paige, J. S., Nguyen-Duc, T., Song, W., and Jaffrey, S. R. (2012). Fluorescence imaging of cellular metabolites with RNA. Science 335:1194. doi: 10.1126/science.1218298
Puchałka, J., and Kierzek, A. M. (2004). Bridging the gap between stochastic and deterministic regimes in the kinetic simulations of the biochemical reaction networks. Biophys. J. 86, 1357–1372. doi: 10.1016/S0006-3495(04)74207-1
Radzikowski, J. L., Schramke, H., and Heinemann, M. (2017). Bacterial persistence from a system-level perspective. Curr. Opin. Biotechnol. 46, 98–105. doi: 10.1016/j.copbio.2017.02.012
Raj, A., and van Oudenaarden, A. (2008). Nature, nurture, or chance: stochastic gene expression and its Consequences. Cell 135, 216–226. doi: 10.1016/j.cell.2008.09.050
Rathinam, M., Petzold, L. R., Cao, Y., and Gillespie, D. T. (2003). Stiffness in stochastic chemically reacting systems: the implicit tau-leaping method. J. Chem. Phys. 119, 12784–12794. doi: 10.1063/1.1627296
Reid, M. A., Dai, Z., and Locasale, J. W. (2017). The impact of cellular metabolism on chromatin dynamics and epigenetics. Nat. Cell Biol. 19, 1298–1306. doi: 10.1038/ncb3629
Rosenfeld, N., Perkins, T. J., Alon, U., Elowitz, M. B., and Swain, P. S. (2006). A fluctuation method to quantify in vivo fluorescence data. Biophys. J. 91, 759–766. doi: 10.1529/biophysj.105.073098
Schmitz, A. C., Hartline, C. J., and Zhang, F. (2017). Engineering microbial metabolite dynamics and heterogeneity. Biotechnol. J. 12:1700422. doi: 10.1002/biot.201700422
Schreiber, F., Littmann, S., Lavik, G., Escrig, S., Meibom, A., Kuypers, M. M. M., et al. (2016). Phenotypic heterogeneity driven by nutrient limitation promotes growth in fluctuating environments. Nat. Microbiol. 1:16055. doi: 10.1038/nmicrobiol.2016.55
Shahrezaei, V., and Swain, P. S. (2008). Analytical distributions for stochastic gene expression. Proc. Natl. Acad. Sci. U.S.A. 105, 17256–17261. doi: 10.1073/pnas.0803850105
Shan, Y., Gandt, A. B., Rowe, S. E., Deisinger, J. P., Conlon, B. P., and Lewis, K. (2017). ATP-dependent persister formation in Escherichia coli. mBIO 8, 1–14. doi: 10.1128/mBio.02267-16
Şimşek, E., and Kim, M. (2018). The emergence of metabolic heterogeneity and diverse growth responses in isogenic bacterial cells. ISME J. 12, 1199–1209. doi: 10.1038/s41396-017-0036-2
Stevens, J. T., and Carothers, J. M. (2015). Designing RNA-based genetic control systems for efficient production from engineered metabolic pathways. ACS Synth. Biol. 4, 107–115. doi: 10.1021/sb400201u
Swain, P. S., Elowitz, M. B., and Siggia, E. D. (2002). Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl. Acad. Sci. U.S.A. 99, 12795–12800. doi: 10.1073/pnas.162041399
Takhaveev, V., and Heinemann, M. (2018). Metabolic heterogeneity in clonal microbial populations. Curr. Opin. Microbiol. 45, 30–38. doi: 10.1016/j.mib.2018.02.004
Taniguchi, Y., Choi, P. J., Li, G. W., Chen, H., Babu, M., Hearn, J., et al. (2010). Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329, 533–538. doi: 10.1126/science.1188308
Thattai, M., and van Oudenaarden, A. (2001). Intrinsic noise in gene regulatory networks. Proc. Natl. Acad. Sci. U.S.A. 98, 8614–8619. doi: 10.1073/pnas.151588598
Thomas, P., and Grima, R. (2015). Approximate probability distributions of the master equation. Phys. Rev. E 92:012120. doi: 10.1103/PhysRevE.92.012120
Thomas, P., Popović, N., and Grima, R. (2014). Phenotypic switching in gene regulatory networks. Proc. Natl. Acad. Sci. U.S.A. 111, 6994–6999. doi: 10.1073/pnas.1400049111
Thomas, P., Terradot, G., Danos, V., and Weiße, A. Y. (2018). Sources, propagation and consequences of stochasticity in cellular growth. Nat. Commun. 9, 1–11. doi: 10.1038/s41467-018-06912-9
Tonn, M. K. (2020). Stochastic modelling and analysis of metabolic heterogeneity in single cells (Ph.D. thesis). Imperial College London, London, United Kingdom.
Tonn, M. K., Thomas, P., Barahona, M., and Oyarzún, D. A. (2019). Stochastic modelling reveals mechanisms of metabolic heterogeneity. Commun. Biol. 2:108. doi: 10.1038/s42003-019-0347-0
Tourigny, D., Goldberg, A., and Karr, J. (2020). Simulating single-cell metabolism using a stochastic flux-balance analysis algorithm. bioRxiv. doi: 10.1101/2020.05.22.110577
van Heerden, J. H., Wortel, M. T., Bruggeman, F. J., Heijnen, J. J., Bollen, Y. J. M., Planqué, R., et al. (2014). Lost in transition: start-up of glycolysis yields subpopulations of nongrowing cells. Science 343:1245114. doi: 10.1126/science.1245114
Vilhena, C., Kaganovitch, E., Shin, J. Y., Grünberger, A., Behr, S., Kristoficova, I., et al. (2018). A single-cell view of the BtsSR/YpdAB pyruvate sensing network in Escherichia coli and its biological relevance. J. Bacteriol. 200, 1–13. doi: 10.1128/JB.00536-17
Wehrens, M., Büke, F., Nghe, P., and Tans, S. J. (2018). Stochasticity in cellular metabolism and growth: approaches and consequences. Curr. Opin. Syst. Biol. 8, 131–136. doi: 10.1016/j.coisb.2018.02.006
Weiße, A. Y., Oyarzún, D. A., Danos, V., and Swain, P. S. (2015). Mechanistic links between cellular trade-offs, gene expression, and growth. Proc. Natl. Acad. Sci. U.S.A. 112, E1038–E1047. doi: 10.1073/pnas.1416533112
Xiao, Y., Bowen, C. H., Liu, D., and Zhang, F. (2016). Exploiting non-genetic, cell-to-cell variation for enhanced biosynthesis. Nat. Chem. Biol. 12, 339–344. doi: 10.1038/nchembio.2046
Yaginuma, H., Kawai, S., Tabata, K. V., Tomiyama, K., Kakizuka, A., Komatsuzaki, T., et al. (2014). Diversity in ATP concentrations in a single bacterial cell population revealed by. Sci. Rep. 4:6522. doi: 10.1038/srep06522
Keywords: metabolic variability, stochastic gene expression, metabolic modeling, single-cell modeling, mixture model analysis
Citation: Tonn MK, Thomas P, Barahona M and Oyarzún DA (2020) Computation of Single-Cell Metabolite Distributions Using Mixture Models. Front. Cell Dev. Biol. 8:614832. doi: 10.3389/fcell.2020.614832
Received: 07 October 2020; Accepted: 26 November 2020;
Published: 22 December 2020.
Edited by:
Ankur Sharma, Genome Institute of Singapore, SingaporeReviewed by:
Yogesh Goyal, University of Pennsylvania, United StatesAlejandro Vignoni, Max Planck Institute of Molecular Cell Biology and Genetics, Germany
Copyright © 2020 Tonn, Thomas, Barahona and Oyarzún. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Diego A. Oyarzún, RC5PeWFyenVuJiN4MDAwNDA7ZWQuYWMudWs=