- 1Pacific Northwest National Laboratory, Richland, WA, United States
- 2Department of Microbiology, The Ohio State University, Columbus, OH, United States
- 3Interdisciplinary Center for Quantitative Modeling in Biology University of California, Riverside, CA, United States
Elucidating cell regulation remains a challenging task due to the complexity of metabolism and the difficulty of experimental measurements. Here we present a method for prediction of cell regulation to maximize cell growth rate while maintaining the solvent capacity of the cell. Prediction is formulated as an optimization problem using a thermodynamic framework that can leverage experimental data. We develop a formulation and variable initialization procedure that allows for computing solutions of the optimization with an interior point method. The approach is applied to photoheterotrophic growth of Rhodospirilium rubrum using ethanol as a carbon source, which has applications to biosynthesis of ethylene production. Growth is captured as the rate of synthesis of amino acids into proteins, and synthesis of nucleotide triphoshaptes into RNA and DNA. The method predicts regulation that produces a high rate of protein and RNA synthesis while DNA synthesis is reduced close to zero in agreement with production of DNA being turned off for much of the cell cycle.
1 Introduction
Biological systems can be understood as dissipative systems analogous to tornadoes and hurricanes. Dissipative systems act by taking the most probable path to reduce the energy difference in the environment. These most probable paths result in cyclical patterns of material movement that act to transport the material from regions of high energy to regions of lower energy. In tornadoes, air movement becomes correlated and cyclical in which hot air is driven up and cool air down. Whether one considers entropy from a thermodynamic or information standpoint, the concept is the same—the most probable path allowed by physical constraints is taken. It is in this respect that we expect the dynamics of biological systems act to maximize their entropy production rates to the extent possible. In this regard, biological processes are highly constrained by physical and biological limitations—constraints that are critical for their function. Moreover, for biological systems, the adaptation of their dynamics to maximize their entropy production rates occurs on the time scale of natural selection. This is to say, through natural selection their genome structure and metabolism become updated such that they act to move to the most probable state possible by taking the least action path. On the shorter timescale of a cell’s lifetime, those adaptations also include complex mechansims for adapting to changing environmental conditions. These adaptation mechanisms serve to control the dynamics in such a way that their ability to carry out auto-catalysis (Hinshelwood, 1952) is preserved. Adaptation is accomplished through regulation of metabolic reaction pathways such that organisms remain viable in a physico-chemical sense despite rapid or dramatic shifts in the environment.
Because regulation is at the heart of understanding biological processes, discovery and understanding of the basis for regulation is critical to the field of biology. Regulation may be in the form of just-in-time regulation, graded regulation, switch-like regulation, pre-programmed regulation such as the circadian clock system, or even the choice to not regulate when it is too costly to alter the respective enzyme expression but to instead constitutively turn on an activity (Sivak and Thomson, 2014). Regulation is an important aspect of fitness in that organisms and even individual cells must regulate themselves so that they act as efficiently and as quickly as possible, otherwise they are outcompeted in their environment. Consequently, the environment over the time of evolution shapes regulation. Those cell phenotypes that operate in the most efficient manner in nutrient poor and dramatically shifting environments will tend towards fixation in the population. In rich environments in which selection pressure is not strong, dysregulation of cells can result in uncontrolled growth, manifest as cancer in metazoans.
Regulation in biological systems has been historically uncovered using experiments carefully designed to test hypotheses, typically employing isotope labeling methods to track reaction fluxes. Such meticulous but labor intensive studies have been quite successful at uncovering metabolic regulation (Larsson-Raznikiewicz, 1967; Newsholme and Start, 1973; Waygood and Sanwal, 1974; Waygood et al., 1975; 1976; Aithal et al., 1985; Jitrapakdee and Wallace, 1999; Holness and Sugden, 2003; Lehninger et al., 2005; Hallows et al., 2012; Saha et al., 2014; Wang et al., 2014), yet we are far from a complete understanding of regulation in even model organisms. High-throughput methods to uncover regulation are highly desirable, and recent work in this area has made progress (Hackett et al., 2016; Reznik et al., 2017), but we are still a long way from a routine method for determining regulation, either by experiment or prediction.
In a recent study by Hackett et al. (2016), Michealis-Menten steady-state kinetic models with regulation were used to predict reaction fluxes and concentrations. The predictions were compared to reaction fluxes inferred from 13C isotope experiments and concentrations derived from mass spectrometry and NMR measurements (Hackett et al., 2016). The correlation between simulation-predicted fluxes and experimentally-inferred fluxes were evaluated with and without regulation in the simulation. If the match was better with regulation, then regulation was assumed. The work was a tour de force in that 25 chemostat studies were used to carefully measure both absolute and relative metabolomics data while at the same time cover as much of the proteome as possible.
An approach used by Reznik et al. (2017), has less reliance on multimodal experimental designs, and instead used sophisticated informatics to develop a model of small molecule regulatory networks from curated databases of enzymes. They integrated the regulatory network with a metabolic model of Escherichia coli, and distilled information on how substrates and inhibitors contribute to metabolic flux regulation (Reznik et al., 2017).
More recently, Britton et al. (2020) have developed optimization and reinforcement learning methods that predict which enzymes need to be regulated to maintain metabolite concentrations at a level such that the diffusion process within the cell remains viable. These methods build on work by Cannon et al. (2018) that takes advantage of a maximum path-entropy/caliber (Jaynes, 1985; Dewar, 2009; Dixit et al., 2018) formulation of the law of mass action in order to predict likely metabolite concentrations. The advantage of these approaches is that minimal measurements and few experimentally-derived parameters are required to predict regulation.
Maximum entropy methods are attractive because they use the most likely values of the parameters needed to model a process (Jaynes, 1985; Dewar, 2009; Dixit et al., 2018). That is, rather than a large search through parameter space to determine the most likely kinetic parameters, a maximum entropy formulation can provide the solution directly, then if needed, reaction parameters can be back calculated. Strictly speaking, these methods maximize the path entropy of a system, meaning that they maximize the entropy subject to the stoichiometric constraints imposed by the reactions and the constraints due to the system boundary conditions. Without the stoichiometric constraints, each molecular species would move to its natural abundance determined by its standard chemical potential. Without non-equilibrium boundary constraints, the reactions would all go to the equilibrium state.
In the study by Britton et al. (2020), regulation was inferred only with regard to which reactions needed to be controlled to keep metabolites at concentrations that are not so high that they would impede diffusion (Britton et al., 2020). As mentioned above, in an environmental niche with finite nutrient resources for growth, for cells to persist they must replicate in a timely and efficient manner in order to compete with other species. Therefore in the context of evolution we expect to see regulation in cells that maximizes growth, and only produces those enzyme catalysts that contribute to growth in a timely and efficient manner.
In this study, we develop an optimization approach that allows for extension of the work by Britton et al. (2020) to be applied for more general constraint based modeling, and demonstrate our method for prediction of regulation that maximizes growth pathways while still constraining metabolite levels to physiological values. While flux balance type analyses optimize a similar objective and may include information about regulation (Covert et al., 2001) or even thermodynamic feasibility constraints (Henry et al., 2007), the method we present is fundamentally different in that we use a thermodynamic perspective to directly infer the most likely distribution of both fluxes and metabolite concentrations subject to given constraints.
We apply our method, that we term pathway-controlled optimization (PCO), to the metabolism and enzymatic activities of Rhodospirilium rubrum, a purple non-sulfur photosynthetic bacterium that is being used as a synthetic biology organism for the purpose of ethylene production (North et al., 2020). We compare the regulation and reaction fluxes predicted by this new method to the approach presented in (Britton et al., 2020) that adjust regulation only to maintain metabolite concentrations at physiological levels.
2 Methods
We begin this section with a brief introduction to the maximum path entropy approach developed in (Cannon et al., 2018) and (Britton et al., 2020). Then we present how we extend this approach to account for regulation in cells to optimize growth.
2.1 Maximum path entropy solution without regulation
For a set of molecular species
Here, k1 and k−1 are the reaction rate parameters.
Let
where for a given n, J(n) is the net flux of the set of forward and reverse reactions. The chemical species occurring on the left hand side of the equation are known as reactants and belong to the subset
For any reversible reaction with forward and reverse reactions + j and −j, the net flux is given by,
Eq. 3 is a purely kinetic description of the reaction flux. For elementary reactions, thermodynamic terms are introduced into the law of mass action by a simple factorization
where
Using the maximum path entropy solution, and representing the thermodynamic reaction forces for each reaction
the flux is then given by
and we can express the time dependence of each metabolite
2.2 Steady state
At the maximum path entropy configuration the system is in a steady state, where a set of metabolites, denoted
2.3 Controlling metabolite concentrations
Metabolite concentrations predicted using the maximum path entropy approach without regulation will produce values that are much too large to be physiologically reasonable—concentrations may approach the limit of their solubility, causing the cytoplasm to become glass-like (Parry et al., 2014; Heimlicher et al., 2019).
That regulation is needed to control concentrations in vivo was proposed early in the field of enzymology (Atkinson, 1969;Atkinson, 1977). In a previously reported method to control concentrations (Britton et al., 2020), Britton, et al. applied regulation to reactions using a scalar valued activity coefficient αj ∈ [0.0, 1.0] for each reaction j, that linearly scales the reaction rate such that the time dependence of metabolite concentration ni is given by,
where
When αj = 1.0, the reaction is fully active and when αj = 0.0, the activity, and hence reaction rate Jj (n, αj), is zero. Activities are adjusted in a deterministic manner by characterizing the sensitivity of a metabolite concentration to an enzyme activity using Metabolic Control Analysis (MCA) (Sauro, 2018). In MCA, the concentration control coefficient Ci,j measures the sensitivity of a concentration nj to an activity coefficient αj,
Using this approach, metabolites whose predicted concentrations are furthest from their experimentally observed values, as measured by the log ratio of the predicted to observed concentration, are reduced first. The concentration of a metabolite ni is reduced by adjusting the activity coefficient αj that has the largest influence on ni and all other metabolites that exceed their experimentally observed concentrations, as determined by the sensitivity analysis. This process is iteratively carried out until all concentrations are at or below the experimentally observed concentrations. Details of the approach are provided in the study by Britton et al. (2020). In the cases discussed herein, the maximum value of the experimentally observed metabolite concentrations are taken to be ni = 1.0 mM for
2.4 Controlling metabolite concentrations and maximizing growth
While controlling metabolite concentrations may be a primary role of metabolic regulation, natural selection also requires that organisms be regulated to grow fast and efficiently—using the available energy from the environment to ensure survival and compete with others. Here we develop a novel approach, we call pathway-controlled optimization (PCO), whose goal is to obtain this biological objective, and derive a formulation that allows for tractable numerical solutions.
Let
The objective seeks to maximize the flux through the growth reactions
The formulation of the PCO problem is simple to express but difficult to solve. The steady state constraints (13b) are non-linear and non-convex presenting significant challenges to optimization. Values for the flux, activity coefficients, and metabolite concentrations can also vary over many orders of magnitude, which introduces additional difficulty in employing numerical methods to compute solutions. In this work we present a more computationally tractable reformulation of the constraints and present numerical solutions from an interior point solver.
2.4.1 Representing the steady state condition
To simplify solving the optimization, we reformulate the steady state constraint (13b) to be more numerically tractable. Let
that is
We separate (14) into the identification of a vector of fluxes
The expression yj = Jj (n, αj) can be made more tractable by considering the computation of the reaction flux in terms of the log of the metabolites. For
then we have from (7) that the flux as a function of η in the maximum path entropy formulation is given by
where (S)j is the jth column of S, and ⟨⋅, ⋅⟩ is the standard inner product. It follows that in terms of the log of the metabolite counts, conditions (16) are equivalent to.
Computation of
with sgn the signum function given by
A further complication in using the steady state conditions (19) as constraints for optimization is that the value of
grows unbounded as αj and yj go to zero. In this application we consistently found solutions with αj and yj on the order of 10–9 for some reactions while others were unregulated with values of yj on the order of 103. These discrepancies in scale are challenging for numerical methods.
However, it is not necessary to explicitly compute the activity coefficients in order to characterize a steady state. Instead it is sufficient only to identify metabolite concentrations that produce flux values that can be reduced to a steady state. We can capture this criteria with the conditions
where 1 is the vector of all ones corresponding to no regulation for all reactions. If (23) is satisfied then any reaction fluxes Jj (n, 1) with magnitude greater than the steady state value yj can always be reduced to satisfy equality so long as they have the same sign, which is captured by the second condition in (23). This idea can also be captured in our modified formulation (19) for the steady state conditions.
For metabolites η and steady state fluxes y there exists an α such that (19) holds if the following hold
where g = STη. This follows from (19b)-(19c), (20), and the inequality
for all
Therefore we can formulate steady state conditions equivalent to (19) without the need to explicitly specify the activity coefficients as follows.
2.4.2 Numerical formulation of the steady state constraints
Implementing conditions (26a) - (26e) as constraints for numerical optimization requires further reformulation. The constraint (26d) is non-convex, and represents an ‘or’ condition depending on the sign of the flux for each reaction. This switching condition makes the feasible set non-convex and therefore challenging for optimization methods to search over. To find solutions we use a big M relaxation of (26d) with the constraints
where M is taken to be a large constant. For each reaction the variable uj is a “switching” term that relaxes either the constraint (27b) or (27c) depending on the sign of yj such that only the correct constraint is imposed for each reaction.
In several constraints we utilize the signum function of the flux which is a discrete function, whereas the optimization method we use here requires continuous functions of all variables for the constraints. Therefore we approximate the signum function with
for scaling parameters λ and ϵ.
To implement the constraint
and we capture the constraint
2.4.3 Maximum growth optimization numerical formulation
Using the steady state conditions as outlined in sections 2.4.2 and 2.4.1 we formulate the maximum growth pathway-controlled optimization problem as
where the constraints (30b)-(30h) represent the steady state constraint (13b). Note that the activity coefficients α for reactions are not explicitly included, however for a solution with optimal log metabolite counts η* and fluxes y*, the associated activity coefficients can be recovered as
It is important here to note that when (30) admits solutions they are not necessarily unique. For instance, those reactions that do not directly impact the objective may have a range of choices for the fluxes and constituent metabolite values that satisfy constraints (30b)–(30h) without changing the objective value. In particular constraints (30e) and (30f) are not neccesairly all that restrictive.
2.4.4 Initialization for optimization
The greatest difficulty in solving (30) is in identifying the correct sign for the fluxes. If the correct flux directions are known then the problem simplifies considerably. Using IPOPT, a general non-linear interior point solver (Wächter and Biegler, 2006), we found that the solution and convergence of the solver was extremely sensitive to the variable initialization, as is common for non-convex problems. In particular, starting with fluxes close to the optimal directions was crucial for convergence to a good solution. To initialize the variables we solve for metabolite values that produces fluxes close to the gradient direction of the objective. Using random or other initialization methods we found that the solver failed to converge, or converged to points with smaller objective values.
To compute initial variable values we first find the projection of the gradient of the objective onto the space of steady state solutions, with
where
where s is a vector of positive slack variables and ◦ is the Hadamard product. Consider that if the objective is zero at a solution η∗ then sgn (−yg) = sgn (STη∗ − log(K)) which implies that there exists a γ > 0 such that (η∗, γyg) is a feasible point of (30). Therefore this is a feasible point with fluxes in the directions of yg. In practice solutions of (33) may only be close to zero, so we choose initial fluxes y∗ as
for a ζ > 0. Larger zeta values move the fluxes further from the origin and was found to improve convergence, likely as it reduces the number of fluxes that will switch sign early in the optimization.
2.5 Numerical solution
We solved (33) with the SciPy version 1.7.1 lsq_linear routine using the default configuration (Virtanen et al., 2020). A solution of (30) was then computed with IPOPT version 3.14.4 using Pyomo (Hart et al., 2017) again using the default optimization parameters except with tol = 1.0 × 10−7 and maximum iterations set to 10,000.
The parameters for
For all variable metabolites the upper bound was
2.6 Metabolic model
The metabolic model of R. rubrum consisted of 184 reactions and 204 metabolites including central metabolism, the Calvin cycle, the oxidative and reductive TCA cycles, biosynthetic pathways for all amino acids, pyrimidines and purines, the S-adenylsyl methionine (SAM) cycles I and II, the methyl alkane reductase pathway (North et al., 2020), sulfate assimilation, and generic RNA, DNA and protein synthesis pathways. The model represents an engineered species that is predicted to not need the ethylmalonyl CoA pathway due to the inclusion of a oxaloacetate decarboxylase/malic enzyme that converts oxaloacetate to pyruvate and CO2 and the inactivation of phospho-glycerate mutase. Consequently, all 3-phophoglycerate produced by the Calvin Cycle is stays within the Calvin cycle.
Boundary conditions (fixed species concentrations) consisted of concentrations of nutrients (ethanol, diphosphate, orthophosphate, sulfate and ammonia), cell waste products (CO2, H2, ethylene), ATP derivatives to set up the energy gradient due to photoheterotrophic growth (ATP, ADP, AMP), internal redox pairs (NAD+, NADH, NADP+, NADPH, reduced and oxidized ferredoxin, reduced and oxidized thioredoxin), folates (N10-formyltetrahydrofolate, 5-methyltetrahydrofolate, 5,10-methylenetetrahydrofolate, 7,8-dihydrofolate and tetrahydrofolate), and other internal metabolites for which scavenging pathways were not modeled (coenzyme A, (4S)-4,5-dihydroxypentan-2,3-dione, adenine, adenosine, adenosine-3’,5’-bisphosphate).
The standard free energy of reaction for each reaction was calculated with eQuilibrator, version 0.3.1. The free energies for the methyl alkane reductase pathway, the SAM cycles I & II, the ferredoxin oxidoreductases for 2-oxoglutarate and pyruvate, DNA, RNA and protein polymerization reactions were not available from eQuilibrator and were estimated manually from half-reactions reactions or overall reactions for the respective pathways. If estimates for some individual reactions were available for these pathways, they were adjusted accordingly, otherwise the standard free energy change for the pathways was divided equally among the constituent reactions. Details are included in the computational notebook available as supplemental material.
The model was constructed from a custom R. rubrum BioCyc database at PNNL. The draft model is available as a Jupyter notebook along with the necessary auxillary files for running the notebook in King and Cannon (2023).
3 Results
In order to place the results in the proper context, a brief overview of the metabolism of R. rubrum, the subject of the biomass optimization, is necessary. R. rubrum is capable of growing both photoautotrophically, meaning that it only uses CO2 as a carbon source, or photoheterotrophically, meaning that it can carry out photosynthesis while simultaneously assimilating organic carbon. In this study, we are modeling photoheterotropic growth, in which energy is acquired through photosynthesis while carbon is mostly acquired by uptake and assimilation of ethanol.
During photosynthesis, R. rubrum uses the Calvin cycle to regenerate chemical precursors for the assimilation of CO2. The Calvin cycle includes reactions shared with the non-oxidative branch of the pentose phosphate cycle, some of which can act as an alternate route in the Calvin Cycle. The two routes in the Calvin cycle differ in that one route produces erythrose-4-phosphate and the other route consumes erythrose-4-phosphate. Erythrose-4-phosphate is a key metabolite for the synthesis of vitamin B6, pyridoxal-5’-phosphate, an essential cofactor in amino acid, and hence protein, synthesis. Therefore, it is important that during growth, erythrose-4-phosphate not be entirely used as a Calvin cycle precursor.
We define the biomass pathways to be optimized as the pathways regarding the incorporation of amino acids into proteins and the incorporation of nucleotide triphosphates (NTPs) into RNA and DNA. The synthesis of amino acids and NTPs from precursors, however, are not included in the biomass pathways. During growth, proteins, RNA and DNA can be simultaneously synthesized, although RNA and DNA synthesis compete for both available energy and molecular precursors, NTPs and deoxynucleotide triphosphates (dNTPs), respectively. R. rubrum is known to have a cell cycle in which DNA can only be synthesized during the S phase, while RNA is synthesized continuously throughout the cell cycle except during mitosis. Whether DNA is synthesized depends on the redox state of the cell, specifically, the ratio of NADP to NADPH.
3.1 Controlling concentrations
Because some metabolites have highly favorable free energies of formation or standard chemical potentials, their concentrations can rise to dangerously high levels if not controlled (Atkinson, 1977). The primary role of control in a biological system must be to maintain viability. Metabolism will not operate if the concentrations of metabolites becomes so great that the cytoplasm becomes too viscous for diffusion-reaction processes to occur. We previously showed that concentrations of metabolites can be controlled by reducing key reaction fluxes through the activities of the respective enzymes identified by concentration control coefficients from Metabolic Control Analysis. In this approach, described briefly in the Methods section and in detail in reference (Britton et al., 2020), sensitivity analyses are used to characterize the influence of an enzyme’s activity on the concentration of any of the models metabolites. If a metabolite concentration is too high, the enzyme with the most influence over that metabolite is chosen to be regulated. This process occurs until all metabolite concentrations are at physiological levels. The approach can be implemented either as an optimal control problem or using reinforcement learning. Analysis of central metabolism showed that both analyses produced similar results and identified known regulators of central metabolism.
In Table 1, we compare the regulation of enzymes that were identified as necessary to be controlled to maintain physiological levels of metabolites in the MCA predictions with those from the pathway-controlled optimization (PCO) predictions. Of course, in the PCO method, reactions are controlled both to maintain physiological levels and to maximize flux to the growth reactions, so any amount of regulation predicted by the PCO method may result from a combination of the two objectives of maximizing growth while constraining metabolite concentrations to physiological levels. Nevertheless, as shown in Table 1, the PCO predicted regulation of reactions is consistent with the MCA predicted regulation of reactions.
TABLE 1. Reactions regulated in the MCA approach in which the goal was to control concentrations. Ten out of twelve of these reactions are also regulated in the pathway-controlled optimization approach.
The PCO method regulated ten out of twelve of the same reactions as the MCA method, suggesting that the PCO method reduced the activity of these reactions in order to maintain physiological levels of the metabolites. However, the observation that the flux in some of these reactions is now effectively zero may indicate that this subset of reactions were additionally shutdown to redirect flux to the growth reactions.
In addition, the PCO method may need to control more reactions than the MCA method, since reaction fluxes are also redistributed due to optimizing the growth reaction rates (discussed below). Decreased or increased reaction flux may result in higher reactant or product levels, respectively, such that the increased levels need to be controlled to keep them at physiological levels. The PCO approach regulated reactions that are not regulated in MCA and yet have non-zero fluxes. These are shown in Table 2. Presumably, these additional reactions need to be regulated to maintain metabolite concentrations at physiological levels in the new regime.
TABLE 2. Reactions regulated in the PCO method but not in the MCA method and that have significant reaction flux in the PCO method. Presumably these reactions are regulated to control metabolite concentrations.
3.2 Increasing growth
In order to increase flux through the growth reactions in accordance with the PCO objective function, additional reactions are regulated to redirect flux to that end. The cumulative value of all the reaction fluxes for both the MCA-regulated reactions and the PCO-regulated reactions are shown in Figure 1. Some reactions in the PCO model are indeed reduced and some are increased, with the overall effect being a large increase in absolute rates of a subset of reactions. In general, more flux is directed to biosynthetic reactions, with the net result being a redirection of flux away from central metabolism in comparing the MCA-derived flux distribution to the PCO-derived flux distribution, as indicated in the flux maps for the two models in Figure 2.
FIGURE 1. Comparison of the cumulative distribution of the reaction fluxes for the MCA model and the PCO model. Flux through the growth reactions are maximized by turning reaction fluxes down but not off.
FIGURE 2. Comparison of the fluxes through central metabolism for the MCA model (top) and the PCO model (bottom). Flux through the growth reactions are maximized by turning reaction fluxes down but not off. The decreased flux through central metabolism in the PCO model reflects the same decrease shown below the red line for the MCA model in Figure 1.
The reactions with the highest fluxes from the PCO approach are shown in Table 3. The reaction with the highest absolute flux in the PCO regulated set of reactions is the RNA synthesis reaction, which increased to 8.11 × 103, a more than 7-fold increase as compared to the MCA regulated model for controlling only metabolite levels. Likewise, the reaction for protein synthesis is included in this group. Both of these reactions are part of the growth reaction set used in the objective function. The other ten reactions with the highest flux are all involved in uptake and processing of the carbon source, ethanol. Much of the carbon from ethanol flows into fumarate and then into the pathways for synthesis of the pyrimidine nucleosides, uridine triphosphate (UTP) and cytidine triphosphate (CTP), precursors of RNA synthesis. In contrast the purine nucleosides are readily available from the already high levels of ATP.
TABLE 3. Top 12 reactions from the constrained optimization that have the highest fluxes. The reaction that has the highest flux in the constrained optimization is the growth reaction for RNA synthesis. Likewise, the reaction for protein synthesis is included in this group. Other reactions with the highest flux are all involved in uptake and processing of the carbon source, ethanol.
Reactions that were effectively shutdown are shown in Table 4. As can be seen at the bottom of the table, DNA synthesis was minimized by significantly down regulating the synthesis of dNTPs, making more NTPs available for RNA synthesis. In addition, the alternate pathway in the Calvin cycle that consumes erythrose-4-phosphate is shutdown (top three reactions in Table 4), making more erythrose-4-phosphate available for synthesis of vitamin B6, a necessary component for production of the branched chain amino acids tryptophan and tyrosine.
TABLE 4. Reactions regulated in the PCO method but not in the MCA method and that have insignificant reaction flux in the PCO method. Presumably these reactions are shut down to redirect mass flow to the growth reactions.
4 Discussion
4.1 Biological discussion
The PCO method we present here agrees with the majority of the regulation imposed by the MCA method for maintaining solvent capacity of the cell, which was shown previously to agree with known points of regulation from experiments as reported in the literature Britton et al. (2020). However, the consideration of growth as an additional objective for regulation in our PCO approach produced significant additional regulation in line with adjusting flux through growth pathways. The inclusion of the consideration of growth is likely to be crucial for a complete understanding of regulation in cells.
While this study uses a limited model (184 reactions and 204 metabolites), that DNA synthesis was turned off, as it is during most of the cell cycle, is interesting, and presumably was done so in order to maximize RNA synthesis. Further evaluation with a more complete model is needed. It is not clear if under low NADP/NADPH conditions that DNA synthesis is likewise maximized by reducing RNA synthesis.
Of course, it will be critical to test the results of the model against experimental observations in order to draw firm biological conclusions. In this regard, previous studies of the metabolic dynamics of R. rubrum using Metabolic Flux Analysis (McCully et al., 2020), an experimental method to derive fluxes from observations of isotope distributions, will provide a good test of the regulation model.
If successful, the incorporation of thermodynamics into the optimization of growth with the PCO approach developed here, along with the ability to infer kinetic parameters from maximum path entropy distributions, will provide a way to further explore the relationship between natural selection and thermodynamics in detail. Ideally, these capabilities will allow the study of the expression patterns and activities of enzymes at various stages in the cell cycle.
4.2 Optimization discussion
One of the biggest challenges in solving the PCO problem is the range of the variable values. For the solutions presented here both metabolites concentrations and activity coefficients for reactions vary over many orders of magnitude. To make the problem more numerically tractable our formulation in essence seeks to do computations with respect to the log of the variables to mitigate the large discrepancies in scale. This is relatively straightforward with respect to the metabolite concentrations in part because they must be positive. However, to handle the flux we must separate out its magnitude and sign which complicates the formulation. It is important to note that our approach does not strictly adhere to the requirements of the IPOPT solver, namely, that the objective and constraints be twice continuously differentiable. In particular, both constraint (27a) and (30h) violate this condition.
For the constraint (27a), the original expression (19c) it is derived from is twice continuously differentiable but resulted in numerical errors. By introducing the form (20) and using the approximation (28) to the signum function a cusp is introduced into the first derivative at zero and error is introduced in an interval about the origin determined by the parameters λ and ϵ. We choose λ and ϵ to shrink this interval such that the true derivative is closely approximated at all points evaluated by the solver. We found this choice to produce more accurate solutions and better convergence behavior than using twice continuously differentiable approximations of the signum function or other forms for (19c). It may be possible to further improve performance by supplying the solver directly with alternative derivative computations rather than relying on automatic differentiation, particularly if solutions approach the origin.
The relaxed formulation (30) also uses the approximation (28) to the signum function in the constraint (30h) specifying the value of the ‘switching’ variable u. Here choices of λ and ϵ are made such that the right hand side of the constraint is effectively constant and partials with respect to the flux will evaluate to zero, in order to hold u constant at 1 or −1 so long as the sign of the flux remains the same. This formulation captures the problem well when the sign on the fluxes are in the optimal direction but it is unclear to what extent it constrains the domain reachable by the solver from an arbitrary initial condition. We also evaluated solutions including an explicit computation of the activity coefficients as in (20) but found that convergence of the solver required setting a lower bound on regulation of roughly 1 × 10−9, and solutions had a significantly lower objective value than were found using the implicit regulation formulation (30).
Due to the potential restrictions on the solutions that can be obtained with our approach using the relaxed formulation (30) we will consider methods to more thoroughly explore the feasible set in future work including stochastic search methods. Of course a more complete exploration of the space of solutions can come at significant computational cost and if other methods perform similarly to the approach we present here, then it may have a clear practical advantage in that the times-to-solution will likely be much faster. Additionally, as discussed in Section 2.4.3 solutions are not necessarily unique. Methods to explore and characterize the set of equivalent solutions, that is solutions that can achieve the same objective value, will allow for a greater understanding of the flexibility that may be possible in controlling reaction fluxes and metabolite concentrations while achieving high growth rates.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.
Author contributions
WC conceived and designed the maximum entropy approach to model metabolism and developed the R. rubrum metabolic model. EK designed and implemented the optimization method for learning regulation to maximize growth. JH contributed to the optimization development. JN contributed to the development of the metabolic model of Rhodospirilium rubrum. The manuscript was written by EK and WC.
Funding
WC and JN were supported by the U.S. Department of Energy, Office of Biological and Environmental Research through contracts 78266 and award DE-SC0022091, respectively. EK and JH were supported by the Data Model Convergence Initiative, and EK additionally by the Predictive Phenomics Initiative, under the Laboratory Directed Research and Development Program at the Pacific Northwest National Laboratory. PNNL is a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy under contract DE-AC05-76RLO 1830.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fsysb.2023.981866/full#supplementary-material
Supplementary Data Sheet S1 | Equilibrium constants.
Supplementary Data Sheet S2 | Metabolite concentrations.
Supplementary Data Sheet S3 | Stoichiometric Matrix.
References
Aithal, H. N., Walsh-Reitz, M. M., and Toback, F. G. (1985). Regulation of glyceraldehyde-3-phosphate dehydrogenase by a cytosolic protein. Am. J. Physiol. 249, C111–C116. doi:10.1152/ajpcell.1985.249.1.C111
Atkinson, D. E. (1969). “Limitation of metabolite concentrations and the conservation of solvent capacity in the living cell,” in Current topics in cellular regulation. Editors B. L. Horecker, and E. R. Stadtman (Academic Press), 1, 29–43. doi:10.1016/B978-0-12-152801-0.50007-9
Beber, M. E., Gollub, M. G., Mozaffari, D., Shebek, K. M., Flamholz, A., Milo, R., et al. (2021). eQuilibrator 3.0: a database solution for thermodynamic constant estimation. Nucleic Acids Res. 50, D603–D609. doi:10.1093/nar/gkab1106
Bennett, B. D., Kimball, E. H., Gao, M., Osterhout, R., Van Dien, S. J., and Rabinowitz, J. D. (2009). Absolute metabolite concentrations and implied enzyme active site occupancy in escherichia coli. Nat. Chem. Biol. 5, 593–599. doi:10.1038/nchembio.186
Britton, S., Alber, M., and Cannon, W. R. (2020). Enzyme activities predicted by metabolite concentrations and solvent capacity in the cell. J. Roy. Soc. Interfaces 17, 20200656. doi:10.1098/rsif.2020.0656
Cannon, W., Zucker, J., Baxter, D., Kumar, N., Baker, S., Hurley, J., et al. (2018). Prediction of metabolite concentrations, rate constants and post-translational regulation using maximum entropy-based simulations with application to central metabolism of neurospora crassa. Processes 6, 63. doi:10.3390/pr6060063
Covert, M. W., Schilling, C. H., and Palsson, B. (2001). Regulation of gene expression in flux balance models of metabolism. J. Theor. Biol. 213, 73–88. doi:10.1006/jtbi.2001.2405
Dewar, R. C. (2009). Maximum entropy production as an inference algorithm that translates physical assumptions into macroscopic predictions: Don’t shoot the messenger. Entropy 11, 931–944. doi:10.3390/e11040931
Dixit, P. D., Wagoner, J., Weistuch, C., Presse, S., Ghosh, K., and Dill, K. A. (2018). Perspective: Maximum caliber is a general variational principle for dynamical systems. J. Chem. Phys. 148, 010901. doi:10.1063/1.5012990
Hackett, S. R., Zanotelli, V. R. T., Xu, W., Goya, J., Park, J. O., Perlman, D. H., et al. (2016). Systems-level analysis of mechanisms regulating yeast metabolic flux. Science 354, aaf2786. doi:10.1126/science.aaf2786
Hallows, W. C., Yu, W., and Denu, J. M. (2012). Regulation of glycolytic enzyme phosphoglycerate mutase-1 by sirt1 protein-mediated deacetylation. J. Biol. Chem. 287, 3850–3858. doi:10.1074/jbc.M111.317404
Hart, W. E., Laird, C. D., Watson, J.-P., Woodruff, D. L., Hackebeil, G. A., Nicholson, B. L., et al. (2017). Pyomo–optimization modeling in python second edn, 67. Springer Science & Business Media.
Heimlicher, M. B., Bachler, M., Liu, M., Ibeneche-Nnewihe, C., Florin, E. L., Hoenger, A., et al. (2019). Reversible solidification of fission yeast cytoplasm after prolonged nutrient starvation. J. Cell Sci. 132, jcs231688. doi:10.1242/jcs.231688
Henry, C. S., Broadbelt, L. J., and Hatzimanikatis, V. (2007). Thermodynamics-based metabolic flux analysis. Biophysical J. 92, 1792–1805. doi:10.1529/biophysj.106.093138
Hinshelwood, C. N. (1952). 136. On the chemical kinetics of autosynthetic systems. J. Chem. Soc. 745, 745. doi:10.1039/JR9520000745
Holness, M. J., and Sugden, M. C. (2003). Regulation of pyruvate dehydrogenase complex activity by reversible phosphorylation. Biochem. Soc. Trans. 31, 1143–1151. doi:10.1042/bst0311143
Jaynes, E. T. (1985). Macroscopic prediction. Berlin Heidelberg: Springer-Verlag, 254–269. doi:10.1007/978-3-642-70795-7
Jitrapakdee, S., and Wallace, J. C. (1999). Structure, function and regulation of pyruvate carboxylase. Biochem. J. 340 (1), 1–16. doi:10.1042/bj3400001
King, E., and Cannon, W. (2023). Pathway controlled optimization. Available at: https://github.com/pnnl/pathway_controlled_optimization.
Larsson-Raznikiewicz, M. (1967). Kinetic studies on the reaction catalyzed by phosphoglycerate kinase. ii. the kinetic relationships between 3-phosphoglycerate, mgatp2-and activating metal ion. Biochim. Biophys. Acta 132, 33–40. doi:10.1016/0005-2744(67)90189-1
Lehninger, A. L., Nelson, D. L., and Cox, M. M. (2005). Lehninger principles of biochemistry 4th edn. New York: W. H. Freeman.
Marcelin, R. (1910). The mechanics of irreversible phenomenon. Comptes Rendus Hebd. Des. Seances De. L Acad. Des. Sci. 151, 1052–1055.
McCully, A. L., Onyeziri, M. C., LaSarre, B., Gliessman, J. R., and McKinlay, J. B. (2020). Reductive tricarboxylic acid cycle enzymes and reductive amino acid synthesis pathways contribute to electron balance in a rhodospirillum rubrum calvin-cycle mutant. Microbiology 166, 199–211. doi:10.1099/mic.0.000877
North, J. A., Narrowe, A. B., Xiong, W., Byerly, K. M., Zhao, G., Young, S. J., et al. (2020). A nitrogenase-like enzyme system catalyzes methionine, ethylene, and methane biogenesis. Science 369, 1094–1098. doi:10.1126/science.abb6310
Park, J. O., Rubin, S. A., Xu, Y. F., Amador-Noguez, D., Fan, J., Shlomi, T., et al. (2016). Metabolite concentrations, fluxes and free energies imply efficient enzyme usage. Nat. Chem. Biol. 12, 482–489. doi:10.1038/nchembio.2077
Parry, B. R., Surovtsev, I. V., Cabeen, M. T., O’Hern, C. S., Dufresne, E. R., and Jacobs-Wagner, C. (2014). The bacterial cytoplasm has glass-like properties and is fluidized by metabolic activity. Cell 156, 183–194. doi:10.1016/j.cell.2013.11.028
Reznik, E., Christodoulou, D., Goldford, J. E., Briars, E., Sauer, U., Segrè, D., et al. (2017). Genome-scale architecture of small molecule regulatory networks and the fundamental trade-off between regulation and enzymatic activity. Cell Rep. 20, 2666–2677. doi:10.1016/j.celrep.2017.08.066
Saha, A., Connelly, S., Jiang, J., Zhuang, S., Amador, D. T., Phan, T., et al. (2014). Akt phosphorylation and regulation of transketolase is a nodal point for amino acid control of purine synthesis. Mol. Cell 55, 264–276. doi:10.1016/j.molcel.2014.05.028
Sauro, H. M. (2018). Systems biology: An introduction to metabolic control Analysis. Ambrosius Publishing.
Sivak, D. A., and Thomson, M. (2014). Environmental statistics and optimal regulation. PLoS Comput. Biol. 10, e1003826. doi:10.1371/journal.pcbi.1003826
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., et al. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272. doi:10.1038/s41592-019-0686-2
Wächter, A., and Biegler, L. (2006). On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 106, 25–57. doi:10.1007/s10107-004-0559-y
Wang, Y. P., Zhou, L. S., Zhao, Y. Z., Wang, S. W., Chen, L. L., Liu, L. X., et al. (2014). Regulation of g6pd acetylation by sirt2 and kat9 modulates nadph homeostasis and cell survival during oxidative stress. EMBO J. 33, 1304–1320. doi:10.1002/embj.201387224
Waygood, E. B., Mort, J. S., and Sanwal, B. D. (1976). The control of pyruvate kinase of escherichia coli. binding of substrate and allosteric effectors to the enzyme activated by fructose 1,6-bisphosphate. Biochemistry 15, 277–282. doi:10.1021/bi00647a006
Waygood, E. B., Rayman, M. K., and Sanwal, B. D. (1975). The control of pyruvate kinases of escherichia coli. ii. effectors and regulatory properties of the enzyme activated by ribose 5-phosphate. Can. J. Biochem. 53, 444–454. doi:10.1139/o75-061
Keywords: metabolism, regulation, maximum entropy, optimization, modeling
Citation: King E, Holzer J, North JA and Cannon WR (2023) An approach to learn regulation to maximize growth and entropy production rates in metabolism. Front. Syst. Biol. 3:981866. doi: 10.3389/fsysb.2023.981866
Received: 29 June 2022; Accepted: 21 March 2023;
Published: 05 April 2023.
Edited by:
Yoram Vodovotz, University of Pittsburgh, United StatesReviewed by:
Loïc Paulevé, UMR5800 Laboratoire Bordelais de Recherche en Informatique (LaBRI), FranceR. Chase Cockrell, University of Vermont, United States
Copyright © 2023 King, Holzer, North and Cannon. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Ethan King, ethan.king@pnnl.gov