From Genes to Ecosystems in Microbiology: Modeling Approaches and the Importance of Individuality

Kreft, Jan-Ulrich; Plugge, Caroline M.; Prats, Clara; Leveau, Johan H. J.; Zhang, Weiwen; Hellweger, Ferdi L.

doi:10.3389/fmicb.2017.02299

REVIEW article

Front. Microbiol., 27 November 2017

Sec. Systems Microbiology

Volume 8 - 2017 | https://doi.org/10.3389/fmicb.2017.02299

This article is part of the Research Topic The Individual Microbe: Single-Cell Analysis and Agent-Based Modelling

Jan-Ulrich Kreft¹

Ferdi L. Hellweger⁶*

¹Centre for Computational Biology, Institute for Microbiology and Infection, School of Biosciences, University of Birmingham, Birmingham, United Kingdom
²Laboratory of Microbiology, Wageningen University and Research, Wageningen, Netherlands
³Department of Physics, School of Agricultural Engineering of Barcelona, Universitat Politècnica de Catalunya–BarcelonaTech, Castelldefels, Spain
⁴Department of Plant Pathology, University of California, Davis, Davis, CA, United States
⁵Laboratory of Synthetic Microbiology, Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
⁶Civil and Environmental Engineering Department, Marine and Environmental Sciences Department, Bioengineering Department, Northeastern University, Boston, MA, United States

Models are important tools in microbial ecology. They can be used to advance understanding by helping to interpret observations and test hypotheses, and to predict the effects of ecosystem management actions or a different climate. Over the past decades, biological knowledge and ecosystem observations have advanced to the molecular and in particular gene level. However, microbial ecology models have changed less and a current challenge is to make them utilize the knowledge and observations at the genetic level. We review published models that explicitly consider genes and make predictions at the population or ecosystem level. The models can be grouped into three general approaches, i.e., metabolic flux, gene-centric and agent-based. We describe and contrast these approaches by applying them to a hypothetical ecosystem and discuss their strengths and weaknesses. An important distinguishing feature is how variation between individual cells (individuality) is handled. In microbial ecosystems, individual heterogeneity is generated by a number of mechanisms including stochastic interactions of molecules (e.g., gene expression), stochastic and deterministic cell division asymmetry, small-scale environmental heterogeneity, and differential transport in a heterogeneous environment. This heterogeneity can then be amplified and transferred to other cell properties by several mechanisms, including nutrient uptake, metabolism and growth, cell cycle asynchronicity and the effects of age and damage. For example, stochastic gene expression may lead to heterogeneity in nutrient uptake enzyme levels, which in turn results in heterogeneity in intracellular nutrient levels. Individuality can have important ecological consequences, including division of labor, bet hedging, aging and sub-optimality. 
Understanding the importance of individuality and the mechanism(s) underlying it for the specific microbial system and question investigated is essential for selecting the optimal modeling strategy.

Introduction

Microbes are important drivers of biogeochemical cycles in all ecosystems and impact their environments in a plethora of ways. For example, in lakes, the harmful cyanobacterium Microcystis aeruginosa can bloom and produce toxins that make the water unsafe to drink (Paerl et al., 2011). The common gut bacterium Bacteroides fragilis produces a chemical that helps the host develop its immune system (Atarashi et al., 2013).

Models are important tools for understanding and managing ecosystems. They can be used to advance scientific understanding by interpreting field observations and aid in hypothesis testing. For example, Jöhnk et al. (2008) used a model to quantify the roles of temperature range and buoyancy regulation in the fitness of the toxic cyanobacterium Microcystis during heat waves. Buffie et al. (2015) applied the model of Stein et al. (2013) to infer an antagonistic interaction in the gut between the pathogen Clostridium difficile and another species of that genus, Clostridium scindens. For ecosystem management, models can be used to answer “what if” questions and make predictions about the effects of future environmental conditions. For example, Blumberg and Di Toro (1990) used a model to predict the effects of climate warming on phytoplankton and dissolved oxygen in a lake. Bucci et al. (2016) predicted the composition of the mouse gut microbiota following infection with C. difficile.

In the past decades, microbiology has experienced rapid advances in observational and experimental technologies, resulting in substantial progress in the understanding of microbes at the molecular level. For example, nitrogen (N) fixation by the cyanobacterium Anabaena involves a division of labor among N-fixing heterocysts and photosynthesizing vegetative cells. The nitrogen-containing β-aspartyl-arginine is produced by cyanophycinase in heterocysts and transferred intercellularly to vegetative cells, where it is converted to aspartate and arginine by isoaspartyl dipeptidase (Burnat et al., 2014). Another example involves transcription of genes to messenger RNA (mRNA) and translation to proteins, which are performed by RNA polymerase (RNAP) and the ribosome complex, respectively. In bacteria, those can form a single transcribing and translating “expressome” complex, with known 3D structure and functional consequences for transcriptional pausing, backtracking and termination (Kohler et al., 2017). Characterization of ecosystems is following the same trend. For example, lakes used to be characterized using bulk measures, like Chlorophyll a and total phosphorus concentrations, but observations are now often at the molecular level, including gene expression (transcript levels) (Vila-Costa et al., 2013). Animal and human microbiota are now routinely characterized using multiple omics technologies, such as community characterization using bacterial 16S ribosomal RNA (rRNA) polymerase chain reaction (Costello et al., 2009), and increasingly meta-genomics, transcriptomics and proteomics (Wang et al., 2015).

The development of models is lagging behind as most models still do not make use of molecular level understanding or observations. It is recognized that there is a substantial gap between our microbial ecology models and current microbiology knowledge and environmental observations (Fuhrman et al., 2013; Trivedi et al., 2013; Hellweger, 2015; Dick, 2017; Stec et al., 2017). For example, lake phytoplankton models still simulate phytoplankton biomass concentrations (e.g., μg Chlorophyll a L⁻¹) and the effect of a nutrient on the growth rate using an equation developed 75 years ago (Monod model). Likewise, most models of the gut aggregate species into functional groups based on metabolic pathways (Kettle et al., 2015). Models are now being developed that explicitly resolve genes and make predictions at the population and ecosystem level.

This paper has two parts. First, we review existing modeling approaches. Here, we focus on mechanistic models that explicitly include genes and simulate population-level properties (e.g., microbe concentration, nutrient uptake) rather than empirical models. One aspect in which the existing approaches differ is their representation of microbial individuality. The second part of our review will use examples to explain why including individuality is important.

Part 1: Review of Existing Modeling Approaches

In this section we describe three modeling approaches that have been used to bridge the gap between genes and ecosystems, namely metabolic flux, gene-centric and agent-based modeling (ABM). We illustrate each approach using a hypothetical ecosystem, where two microbial species grow and interact via three metabolites (Figure 1). We then discuss a number of examples from the literature, focusing mostly on the modeling aspects of the studies, and highlight the strengths and weaknesses of each approach. Finally, we characterize the models along a number of dimensions, including space, time, function, heterogeneity, species diversity and genes.

FIGURE 1

Figure 1. Hypothetical ecosystem used to illustrate different modeling approaches. Species 1 takes up metabolites A and B, produces metabolites C, D, E, and F and excretes metabolite C. Species 2 takes up metabolites A and C and produces metabolites D, G, and H.

Literature Selection Criteria

The review is focused on the use of gene-level models for advancing understanding and making predictions of microbial ecosystems. To keep the scope of the review manageable, we included only quantitative models, which are required for predictions, although qualitative models may be sufficient to advance understanding. We applied the following selection criteria: (1) model uses a mechanistic (vs. empirical) approach, (2) model explicitly considers at least one actual gene or protein; (3) model includes some form of direct or indirect interaction among microbes; (4) model includes multiple microbial species (or strains) or phenotypes in different locations; and (5) model makes predictions at the population level. We therefore exclude empirical models that correlate observed gene distributions to environmental factors and function (e.g., carbon export in the ocean, Guidi et al., 2016), models that use hypothetical genes or digital genomes describing behavioral traits (e.g., Lenski et al., 1999; Clark et al., 2011), scale up single-cell models using multiple independent simulations where the cells do not interact (e.g., Emonet and Cluzel, 2008; Labhsetwar et al., 2013) and studies that infer interactions from comparison of metabolic networks and do not make quantitative predictions (e.g., Levy and Borenstein, 2013; Zelezniak et al., 2015).

Metabolic Flux Modeling

Definition

This approach builds on the genome-scale, constraint-based modeling approach most commonly applied to single species (Feist et al., 2008). In this approach, the genome sequence is used to derive a network of potential metabolic reactions by a combination of automated and manual (curation) steps. Then, a flux distribution is predicted, typically by optimizing it to maximize an objective function, such as biomass production (Schuster et al., 2008). The extension of this approach to multiple species builds on efforts to extend it to multiple compartments of higher eukaryotic organisms. There are three approaches to multi-species metabolic flux modeling, which we will refer to as the environmentally coupled, directly linked and aggregated approaches. The environmentally coupled approach builds on the dynamic flux balance analysis (FBA) approach (Varma and Palsson, 1994), where the microbes and extracellular metabolites are represented using concentration state variables. The growth rate and metabolite fluxes are computed from FBA assuming a common pool for extracellular metabolites and that the system is in a steady state during each time step. The directly linked approach explicitly links the metabolic networks of the species using exchange reactions. This is conceptually the same way in which multi-compartment organisms are modeled. The aggregated approach (also referred to as the pooled, supra-organism or enzyme soup approach) involves constructing one network by combining the individual networks and removing duplicates. This ignores cellular boundaries and is most applicable to metagenomic datasets. Box 1 illustrates these three approaches for the hypothetical ecosystem shown in Figure 1. Multi-species metabolic flux modeling has also been referred to as Ecosystems Biology (Klitgord and Segrè, 2011) or Community Systems Biology (Zengler and Palsson, 2012).

Box 1. From genes to ecosystems using metabolic flux modeling.

Single species

The starting point for a metabolic flux model is a set of mass balance equations:

$$\frac{dx}{dt} = S \cdot v \qquad \text{(B1.1)}$$

where x (mmol gDW⁻¹, i.e., per gram biomass dry weight) is a vector of metabolite concentrations, S is the stoichiometric matrix and v (mmol gDW⁻¹ h⁻¹) is a vector of reaction rates for uptake, excretion, internal metabolism and growth. Typically, a steady-state is assumed so the derivatives are zero. The stoichiometric matrix (S) for species 1 of the hypothetical ecosystem is presented in Table B1, where columns are reactions and rows are metabolites. Lower and upper bounds for the reaction rates, determined based on thermodynamics, enzyme kinetics or measurements, can be included in the optimization procedure.

There are typically infinitely many solutions that satisfy the equation. For example, in species 1 (Figure B1.1), biomass (metabolite F) can be produced by any combination of two pathways (A → E → F or A → D → F). Computational methods are available that decompose the stoichiometric matrix into unique sets of functional units (pathways) such as elementary modes or extreme pathways (Papin et al., 2004). A more common approach, flux-balance analysis (FBA), involves optimizing reaction rates to maximize the value of some objective function using linear programming (LP). Several objective functions have been used, such as minimizing ATP production and maximizing production of some metabolite, but maximizing biomass production yield or rate is often considered to be the most appropriate in an ecological context. When biomass production is maximized, it is assumed that the cell regulates fluxes through its metabolic network in a way that maximizes biomass production. The corresponding objective function for species 1 of our example system is to maximize the production of metabolite F (V_MetE1 + V_PrdF1 or V_Growth1 in Table B1). This is relatively simple and real models typically use a more complex biomass growth function, e.g., a genome-scale model may include various precursors (e.g., G6P, F6P) and cofactors (e.g., ATP, NADH). Algorithms that integrate gene expression data are also available (Becker and Palsson, 2008). FBA is fundamentally a steady-state approach, but a pseudo-time-variable model can be constructed (Varma and Palsson, 1994; Mahadevan et al., 2002).
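As a minimal sketch of the FBA step described above, the following Python example solves a toy version of species 1 with `scipy.optimize.linprog`. The stoichiometry, the yields (1 F per E, 2 F per D), the requirement for B in the D pathway and all bounds are assumptions chosen for illustration; they are not the values of Table B1.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network for species 1 (an assumption, NOT Table B1).
REACTIONS = ["uptake_A", "uptake_B", "A_to_E", "AB_to_DC",
             "E_to_F", "D_to_F", "excrete_C", "growth"]
METABOLITES = ["A", "B", "C", "D", "E", "F"]

# Rows are metabolites, columns are reactions.
S = np.array([
    # upA upB A>E AB>DC E>F D>F exC growth
    [1, 0, -1, -1, 0, 0, 0, 0],    # A
    [0, 1, 0, -1, 0, 0, 0, 0],     # B
    [0, 0, 0, 1, 0, 0, -1, 0],     # C
    [0, 0, 0, 1, 0, -1, 0, 0],     # D
    [0, 0, 1, 0, -1, 0, 0, 0],     # E
    [0, 0, 0, 0, 1, 2, 0, -1],     # F: assumed yields of 1 per E and 2 per D
], dtype=float)

# Assumed flux bounds (mmol gDW^-1 h^-1): uptake of A <= 10, uptake of B <= 5.
LOWER = np.zeros(len(REACTIONS))
UPPER = np.array([10.0, 5.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0])


def run_fba(S, lower, upper, objective_index):
    """Maximize one flux subject to the steady state S v = 0 and flux bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0  # linprog minimizes, so negate to maximize
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=list(zip(lower, upper)), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


fluxes = run_fba(S, LOWER, UPPER, REACTIONS.index("growth"))
for name, value in zip(REACTIONS, fluxes):
    print(f"{name:10s} {value:6.2f}")
```

With these assumed numbers, the optimizer sends as much A through D as the supply of B allows and the remainder through E. This illustrates how the LP selects one flux distribution out of the many that satisfy the mass balance.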

Multiple species—environmentally coupled models

Figure B1.1 illustrates the dynamic, multi-species metabolic flux modeling methodology. The model includes state variables for microbial biomass (X) and extracellular metabolites (C). The microbes grow according to a growth rate (μ) and consume/produce metabolites according to specific consumption/production rates (V). Those values are calculated from the metabolic flux models, which are optimized to maximize the growth yield subject to a number of constraints, including a maximum consumption rate for each metabolite based on its concentration. A simulation will proceed in a step-wise manner: (1) Calculate the constraints based on all C. (2) Optimize the metabolic model of each species, which yields μ and V. (3) Calculate the new X for both species based on μ. (4) Calculate the new C for all metabolites based on V from both species. Repeat. When the metabolic model does not lead to a viable solution, a simple death routine can be invoked (Zhuang et al., 2011). It is conceptually straightforward to include other reactions (e.g., between extracellular compounds) and transport (Scheibe et al., 2009).
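The four steps above can be sketched as a short dynamic FBA loop. This is an illustration under stated assumptions, not the implementation of Zhuang et al. (2011): the two lumped networks, the Michaelis-Menten uptake limit, the biomass conversion factor and the explicit Euler update are all hypothetical choices, and an infeasible model simply stops growing instead of invoking a death routine.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical lumped networks (assumptions, not the models of Figure B1.1).
# Rows = intracellular metabolites, columns = reactions, last column = growth drain.
MODELS = {
    "species1": {
        # columns: uptake_A, uptake_B, A->F, A+B->C+2F, excrete_C, growth
        "S": np.array([[1, 0, -1, -1, 0, 0],      # A
                       [0, 1, 0, -1, 0, 0],       # B
                       [0, 0, 0, 1, -1, 0],       # C
                       [0, 0, 1, 2, 0, -1.0]]),   # F
        "uptake": {"A": 0, "B": 1},     # metabolite -> column of its uptake reaction
        "excretion": {"C": 4},
    },
    "species2": {
        # columns: uptake_A, uptake_C, A->H, C->H, growth
        "S": np.array([[1, 0, -1, 0, 0],      # A
                       [0, 1, 0, -1, 0],      # C
                       [0, 0, 1, 1, -1.0]]),  # H
        "uptake": {"A": 0, "C": 1},
        "excretion": {},
    },
}
BIOMASS_PER_MMOL = 0.05  # gDW per mmol of biomass precursor (assumed)


def uptake_limit(conc, total_biomass, dt, vmax=10.0, km=0.5):
    """Step 1: maximum specific uptake rate from the ambient concentration."""
    kinetic = vmax * conc / (km + conc)
    available = conc / (total_biomass * dt)  # cannot take up more than is present
    return max(0.0, min(kinetic, available))


def simulate(hours=10.0, dt=0.1):
    X = {"species1": 0.01, "species2": 0.01}   # biomass, gDW/L
    C = {"A": 10.0, "B": 2.0, "C": 0.0}        # extracellular pool, mmol/L
    for _ in range(int(round(hours / dt))):
        total = sum(X.values())
        dC = dict.fromkeys(C, 0.0)
        mu = {}
        for name, model in MODELS.items():
            S = model["S"]
            bounds = [(0.0, 1000.0)] * S.shape[1]
            for met, col in model["uptake"].items():
                bounds[col] = (0.0, uptake_limit(C[met], total, dt))
            # Step 2: optimize each species for growth, which yields mu and V.
            c = np.zeros(S.shape[1])
            c[-1] = -1.0
            res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                          bounds=bounds, method="highs")
            if not res.success:
                mu[name] = 0.0  # simplification: no growth instead of a death routine
                continue
            mu[name] = BIOMASS_PER_MMOL * res.x[-1]
            for met, col in model["uptake"].items():
                dC[met] -= res.x[col] * X[name]
            for met, col in model["excretion"].items():
                dC[met] += res.x[col] * X[name]
        # Steps 3 and 4: explicit Euler update of biomass and metabolite pools.
        for name in X:
            X[name] += mu[name] * X[name] * dt
        for met in C:
            C[met] = max(0.0, C[met] + dC[met] * dt)
    return X, C


biomass, pool = simulate()
print(biomass, pool)
```

In this toy setup species 2 can use the metabolite C only after species 1 has excreted it, so cross-feeding emerges from the shared pool without being written into either metabolic model.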

FIGURE B1.1

Figure B1.1. Multi-species metabolic flux modeling—environmentally coupled models. After Figure 2 in Zhuang et al. (2011). X (gDW L⁻¹) = microbial biomass concentration, C (mmol L⁻¹) = extracellular metabolite concentrations, μ (h⁻¹) = specific growth rates, V (mmol gDW⁻¹ h⁻¹) = specific flux velocities.

Multiple species—directly linked model

Figure B1.2 illustrates the multi-species metabolic flux modeling approach developed by Stolyar et al. (2007). The metabolic models for each species (Figure B1.1) are combined into a single model. Exchange of metabolites among the species occurs by directly linking their reactions, which constrains them to be the same. This is equivalent to assuming there is no change in the extracellular metabolite concentrations. The model is optimized to maximize a weighted combination of the biomasses.
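A directly linked model can be sketched as a single LP in which the excretion of a metabolite by one species is constrained to equal its uptake by the other. The example below is a hypothetical illustration, not the model of Stolyar et al. (2007): the two lumped networks, the supply limits and the weights are all assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical lumped networks (assumptions for illustration only).
# Species 1 columns: uptake_A, uptake_B, A->F, A+B->C+2F, excrete_C, growth
S1 = np.array([[1, 0, -1, -1, 0, 0],
               [0, 1, 0, -1, 0, 0],
               [0, 0, 0, 1, -1, 0],
               [0, 0, 1, 2, 0, -1.0]])
# Species 2 columns: uptake_A, uptake_C, A->H, C->H, growth
S2 = np.array([[1, 0, -1, 0, 0],
               [0, 1, 0, -1, 0],
               [0, 0, 1, 1, -1.0]])


def community_fba(weight1, weight2, supply_A=10.0, supply_B=5.0):
    """Maximize a weighted sum of both growth fluxes in one combined model."""
    m1, n1 = S1.shape
    m2, n2 = S2.shape
    # Block-diagonal stoichiometry: each species keeps its own steady state.
    S = np.zeros((m1 + m2, n1 + n2))
    S[:m1, :n1] = S1
    S[m1:, n1:] = S2
    # Exchange link: C excreted by species 1 equals C taken up by species 2,
    # i.e., C does not accumulate in the medium.
    link = np.zeros((1, n1 + n2))
    link[0, 4] = 1.0
    link[0, n1 + 1] = -1.0
    A_eq = np.vstack([S, link])
    b_eq = np.zeros(A_eq.shape[0])
    # Both species draw on the same limited supply of A.
    A_ub = np.zeros((1, n1 + n2))
    A_ub[0, 0] = 1.0
    A_ub[0, n1] = 1.0
    bounds = [(0.0, 1000.0)] * (n1 + n2)
    bounds[1] = (0.0, supply_B)
    # Weighted biomass objective (negated because linprog minimizes).
    c = np.zeros(n1 + n2)
    c[n1 - 1] = -weight1
    c[-1] = -weight2
    res = linprog(c, A_ub=A_ub, b_ub=[supply_A], A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[n1 - 1], res.x[-1]


for weights in [(2.0, 1.0), (1.0, 2.0)]:
    growth1, growth2 = community_fba(*weights)
    print(weights, round(growth1, 3), round(growth2, 3))
```

Running the sketch with different weights shows how the weighting and the exchange constraint together determine the predicted growth of each species in this toy system.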

Multiple species—aggregated model

Figure B1.3 illustrates the multi-species metabolic flux modeling approach developed by Taffs et al. (2009). The reactions and metabolites for the two species (as shown in Figure B1.1) are merged into a single model and a single objective function is used to determine the flux distribution.
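The merging step of the aggregated approach can be sketched with plain Python data structures. The reaction lists below are hypothetical and only loosely follow Figure 1; the helper names `aggregate` and `stoichiometric_matrix` are illustrative, and the sketch assumes that reactions with the same name have the same stoichiometry.

```python
# Hypothetical reaction lists: reaction name -> {metabolite: coefficient}.
SPECIES_1 = {
    "uptake_A": {"A": 1},
    "uptake_B": {"B": 1},
    "A_to_D": {"A": -1, "D": 1},
    "A_to_E": {"A": -1, "E": 1},
    "B_to_C": {"B": -1, "C": 1},
    "D_to_F": {"D": -1, "F": 1},
    "E_to_F": {"E": -1, "F": 1},
    "excrete_C": {"C": -1},
}
SPECIES_2 = {
    "uptake_A": {"A": 1},
    "uptake_C": {"C": 1},
    "A_to_D": {"A": -1, "D": 1},
    "C_to_G": {"C": -1, "G": 1},
    "D_to_H": {"D": -1, "H": 1},
    "G_to_H": {"G": -1, "H": 1},
}


def aggregate(*networks):
    """Merge networks into one 'supra-organism', dropping duplicate reactions."""
    merged = {}
    for network in networks:
        for name, stoich in network.items():
            key = frozenset(stoich.items())  # identical stoichiometry = duplicate
            merged.setdefault(key, name)     # keep the first name encountered
    return {name: dict(key) for key, name in merged.items()}


def stoichiometric_matrix(reactions):
    """Return metabolites (rows), reaction names (columns) and the matrix S."""
    metabolites = sorted({met for stoich in reactions.values() for met in stoich})
    names = list(reactions)
    matrix = [[reactions[name].get(met, 0) for name in names] for met in metabolites]
    return metabolites, names, matrix


community = aggregate(SPECIES_1, SPECIES_2)
metabolites, names, S = stoichiometric_matrix(community)
print(len(community), "reactions,", len(metabolites), "metabolites")
```

The merged network no longer records which species carries which reaction, which is the loss of cellular boundaries that characterizes this approach.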

FIGURE B1.2

Figure B1.2. Multi-species metabolic flux modeling—directly linked model. After Figure 2 in Stolyar et al. (2007). The metabolic models for each species (Figure B1.1) are combined into one model, with exchange reactions linking their metabolisms.

FIGURE B1.3

Figure B1.3. Multi-species metabolic flux modeling—aggregated model. After Figure S2 in Taffs et al. (2009). The metabolic models for each species (as shown in Figure B1.2) are merged into one model.

TABLE B1

Table B1. Stoichiometric Matrix (S) and lower and upper bounds for the FBA of species 1.

Examples

There have been several applications of metabolic flux models to communities of microbes. For recent reviews see Zengler and Palsson (2012), Biggs et al. (2015), Tan et al. (2015), Zomorrodi and Segrè (2016), Perez-Garcia et al. (2016), and Gottstein et al. (2016).

Environmentally coupled models

Scheibe et al. (2009) applied FBA to learn about the growth of Geobacter and uranium bioremediation in a contaminated groundwater site where Geobacter dominates the community. They coupled a genome-scale FBA model to a two-dimensional reactive transport model. The FBA model computes growth rate and fluxes based on ambient acetate, Fe(III) and ammonia concentrations in each grid element. Those growth rates and fluxes are then used by the reactive transport model to compute the Geobacter biomass, acetate, Fe(III) and ammonia concentrations, as well as other processes like U(VI) reduction. The new ambient concentrations are then again used by the FBA model to compute the growth rate and fluxes at the next time step and so on. Due to computational constraints, the FBA calculations were done a priori for 1,000 combinations of metabolite concentrations and stored in a look-up table, rather than a dynamic coupling between the models. One of the main advantages of the FBA-based approach is that it allows for variable substrate utilization and growth yields, which is not supported by conventional models. The model was able to make predictions of similar quality as the previous reactive transport model (i.e., without FBA component), but it did so without the need to calibrate rate parameters (Figure 2).
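The look-up table strategy can be sketched as follows. The grid values, the nearest-neighbor retrieval and the function `stand_in_fba` (a placeholder for an expensive FBA solve) are assumptions for illustration; the source states only that 1,000 combinations of metabolite concentrations were precomputed.

```python
import bisect
import itertools

# Hypothetical 10 x 10 x 10 concentration grid; the real grid is not given here.
AXES = {
    "acetate": [0.5 * k for k in range(10)],
    "fe3": [2.0 * k for k in range(10)],
    "ammonia": [0.05 * k for k in range(10)],
}
ORDER = ("acetate", "fe3", "ammonia")


def stand_in_fba(acetate, fe3, ammonia):
    """Placeholder for an FBA solve: growth set by the scarcest substrate."""
    return min(acetate / (0.1 + acetate), fe3 / (1.0 + fe3), ammonia / (0.01 + ammonia))


def build_table():
    """Run the (expensive) model once per grid point, before the transport run."""
    table = {}
    for index in itertools.product(*(range(len(AXES[name])) for name in ORDER)):
        concentrations = [AXES[name][i] for name, i in zip(ORDER, index)]
        table[index] = stand_in_fba(*concentrations)
    return table


def nearest_index(axis, value):
    """Index of the grid value closest to the requested concentration."""
    i = bisect.bisect_left(axis, value)
    if i == 0:
        return 0
    if i == len(axis):
        return len(axis) - 1
    return i if axis[i] - value < value - axis[i - 1] else i - 1


def lookup(table, acetate, fe3, ammonia):
    """Cheap retrieval used inside each grid element and time step."""
    values = (acetate, fe3, ammonia)
    key = tuple(nearest_index(AXES[name], v) for name, v in zip(ORDER, values))
    return table[key]


TABLE = build_table()
print(len(TABLE), lookup(TABLE, 1.1, 4.3, 0.11))
```

The design trades accuracy between grid points for speed, since the transport model then needs only a dictionary access per grid element rather than an LP solve.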

FIGURE 2

Figure 2. Comparison of observations (symbols), traditional model (solid lines) and FBA-based model (dashed lines). Reproduced from Scheibe et al. (2009) with permission. The figure shows acetate and U(VI) concentrations at a groundwater bioremediation site. Concentration time series are presented at 3.7, 7.3, and 14.6 m distance from the acetate injection gallery. Acetate increases at progressively later times as the distance from the injection gallery increases. Consistent with this, U(VI) decreases at progressively later times. Colors identify single wells.

Tzamali et al. (2009) and Tzamali et al. (2011) used the dynamic FBA approach to simulate the interaction among various E. coli strains, including wild type and single gene knockouts. For various substrates, they identified potential communities of co-existing strains. For example, growth on pyruvate supported communities with up to 6 strains. The most efficient community of 4 mutants produced 2.2% more biomass than a pure culture of the wild type.

Zhuang et al. (2011) developed a dynamic, genome-scale FBA model of two species in competition in a uranium-contaminated aquifer. Rhodoferax and Geobacter both oxidize acetate and reduce Fe(III), but only Geobacter can reduce U(VI), rendering it less soluble and therefore contributing to the clean-up of the site. The FBA models of the two species calculate growth and metabolite production/consumption rates, which are used to integrate biomass and metabolite concentration state variables. The model predicted that, under low-ammonia conditions, Rhodoferax is outcompeted by Geobacter, which can fix nitrogen, and that this promotes respiration (vs. biomass production) and associated U(VI) reduction, which are patterns consistent with observations.

Zhuang et al. (2012) expanded the model by Zhuang et al. (2011) and applied it to design remediation scenarios. In particular, they used two separate FBA models for attached and planktonic Geobacter to differentiate their functions: planktonic cells reduce U(VI) and attached cells reduce Fe(III). Attachment and detachment rates were used to transfer biomass among these two fractions. This illustrates one approach by which heterogeneity can be simulated in these types of models.

Harcombe et al. (2014) developed dynamic FBA models of two and three species on a two-dimensional grid, where biomass grows and dies, extracellular metabolites are consumed and produced, and biomass and metabolites move by diffusion. Cole et al. (2015) extended the dynamic FBA approach further to three dimensions and used it to simulate growth of E. coli in colonies on agar. The model was able to simulate the small-scale environmental heterogeneity in dissolved oxygen and nutrient concentrations, and the resulting phenotypic differentiation of the bacteria (i.e., fermenting cells in the interior). Other multi-species, environmentally coupled metabolic flux models were presented by Salimi et al. (2010), Hanly and Henson (2011), Hanly and Henson (2013), Biggs and Papin (2013), Chiu et al. (2014) and Louca and Doebeli (2015). Zomorrodi et al. (2014) presented a dynamic version of the multi-level optimization routine presented previously (Zomorrodi and Maranas, 2012, see below).

Directly linked models

Stolyar et al. (2007) developed an FBA model of two microbes that are mutualistic in the absence of sulfate, Desulfovibrio vulgaris and Methanococcus maripaludis. In the scenario evaluated, D. vulgaris grows on lactate, producing H₂, formate, CO₂ and acetate, which support the growth of M. maripaludis. The model consists of three compartments, representing the metabolism of the two species and the exchange between them. The metabolite fluxes in the central metabolism of each species and exchange reactions are represented using 89 and 82 equations, respectively. The third compartment represents the exchange flux of H₂, formate, CO₂ and acetate, where H₂ and formate were not allowed to accumulate in the medium, so that their rates of production by D. vulgaris and consumption by M. maripaludis are the same. The combined model was optimized to maximize biomass production of both species, with a larger weight for D. vulgaris, based on observations. However, the biomass ratio of the two species is constrained by the exchange reaction, so it was relatively invariant to the weights used. The model suggested that the H₂ was essential, but that formate could be eliminated.

Wintermute and Silver (2010) applied the FBA modeling approach at the genome scale to 46 E. coli mutants, each incapable of synthesizing an essential metabolite. Growth experiments were conducted with 1,035 binary strain combinations. A joint FBA of each pair allowing for exchange of all shared metabolites between the strains was developed. The models were optimized to minimize the difference between the flux distributions of the wildtype and mutant (minimization of metabolic adjustment, MOMA, Segrè et al., 2002). The idea behind this objective function is that the regulatory system is still based on the wildtype and has not yet adjusted to the mutation. The joint FBA models were consistent with the finding that pairings of mutants blocked in the same biosynthetic pathway rarely show synergistic growth (4% of the cases) while pairings of mutants in separate pathways did so in 18% of cases. The model correctly predicted that strains grow best when they require small amounts of metabolites that are cheap to produce by the other strain. The ability of simple stoichiometric models to predict fitness costs and benefits of metabolic cross-feeding is encouraging.
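The number of binary combinations follows directly from the number of mutants, as this short check shows. The strain labels are placeholders, not the actual mutant names.

```python
from itertools import combinations
from math import comb

# Placeholder labels for the 46 mutants.
mutants = [f"mutant_{i:02d}" for i in range(1, 47)]

# Every unordered pair of distinct strains is one co-culture (and one joint FBA model).
pairs = list(combinations(mutants, 2))

print(len(pairs), comb(46, 2))  # both equal 46 * 45 / 2
```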

Klitgord and Segrè (2010) applied the FBA modeling approach to binary pairs of seven species and identified the media composition that would support symbiosis. They developed genome-scale FBA models of all possible binary pairs and did a systematic search for media compositions that would support growth of the pair but not the individual species.

Huthmacher et al. (2010) generated an FBA model of the metabolism of the malaria-causing Plasmodium falciparum and its host, the erythrocyte (red blood cell). By constraining the metabolic network with gene expression data of P. falciparum, they were able to predict metabolic fluxes for different life cycle stages of the pathogen.

Zomorrodi and Maranas (2012) developed a community FBA modeling framework and applied it to a number of systems, including those of Stolyar et al. (2007) and Taffs et al. (2009). A novel aspect in this work is the consideration of multiple objective functions, including maximization of growth of each species as well as biomass production at the community level, which can be used to explore tradeoffs between selfish and altruistic driving forces.

Other multi-species, directly linked metabolic flux models were produced by Taffs et al. (2009), Bizukojc et al. (2010), Bordbar et al. (2010), Freilich et al. (2011), Khandelwal et al. (2013), Nagarajan et al. (2013), Shoaie et al. (2013), Ye et al. (2014), El-Semman et al. (2014), Merino et al. (2015) and Heinken and Thiele (2015).

Aggregated models

Taffs et al. (2009) applied different approaches to model three species (oxygenic phototrophs, filamentous anoxygenic phototrophs and sulfate-reducing bacteria) in the thermophilic, phototrophic mat communities from Octopus and Mushroom Springs in Yellowstone National Park (USA). One of their approaches does not consider compartments, but lumps all reactions into one species (see Box 1). This approach ignores compartmentalization and the fact that intermediate intracellular metabolites from one species may not be available to another. However, it does not require assigning individual enzymes or reactions to species, functional groups or guilds and is well suited for data from metagenomics. A unique aspect of this study is the use of elementary mode analysis (EMA), which is an alternative to FBA and characterizes the set of all possible flux distributions, rather than just the optimal one.

Tobalina et al. (2015) applied the aggregated approach to naphthalene-contaminated soil communities. An interesting aspect of that study was that the model was based on metaproteomics data, which implicitly accounts for regulation.

Cerqueda-García and Falcón (2016) applied the aggregated approach to study the metabolism of communities in microbial mats and microbialites (living carbonate rock structures similar to corals and stromatolites). Starting with metagenomic datasets, they reconstructed a metabolic network, and then used EMA to identify feasible pathways through this network for C and N assimilation. They identified a number of alternative CO₂ fixation pathways, which were not identified for these systems previously.

Strengths

• The FBA approach can directly utilize molecular data, genomics, transcriptomics, proteomics and metabolomics, from pure laboratory cultures and the environment (e.g., metagenomics) as long as annotations are available, which is increasingly the case.

• The approach is comprehensive in terms of functions and metabolites. This is likely to be increasingly useful, as recent observations from a number of environments suggest that bacteria have a high substrate specificity (Kindaichi et al., 2013; Salcher et al., 2013). For example, when a freshwater community was presented with 14 radiolabeled low-molecular weight (LMW) organic substrates, the two most abundant microbes belonging to the Actinobacteria ac1 and Alphaproteobacteria LD12 tribes had no overlap in their substrate acquisition spectra. The concept of dissolved organic carbon (DOC) as a common currency for heterotrophic microbes is too simplistic. One of the main applications of FBA has been to understand complex substrate uptake patterns.

Weaknesses

• The directly linked and aggregated approaches assume the system to be in a steady-state. The environmentally coupled approach also assumes steady-state flux distributions during each time step, but flux distributions can change from one time step to the next. For many cases this assumption will be sensible, but for others not. For example, planktonic bacteria experience a very heterogeneous nutrient regime and may experience nutrient patches with short durations (~60 s, Taylor and Stocker, 2012), comparable to the time required for gene expression, protein translation and maturation. Genome-scale models are being developed that go beyond steady-state metabolite fluxes (e.g., include dynamic transcript, protein and metabolite pools, Karr et al., 2012) and this technology will eventually be applied at the ecosystem scale.

• It is not always clear what objective function should be used to optimize the flux distribution (Schuster et al., 2008). Maximization of biomass production seems like a good choice from a biotechnological perspective. However, there are cases where it is advantageous to divert production away from biomass, including to storage products, toxins or EPS (Merino et al., 2015), which may conflict with the biomass objective. Moreover, in a well-mixed, stable environment, specific growth rate will likely be maximized by natural selection while in a spatially structured environment such as a biofilm, the biomass yield is likely to be maximized by natural selection (Kreft, 2004).

• The approach typically entails specifying a biomass composition, and commonly this is applied across different conditions. However, the biomass composition is known to change (Benyamini et al., 2010).

• Growth dilution of metabolites, other than the ones used in the growth equation (see above), is typically ignored (Benyamini et al., 2010). Specifically, there should be a “–μ x” on the right-hand side of Equation B1.1. Accounting for growth dilution is conceptually straightforward but it requires specifying the metabolite concentrations, which are not typically available at the genome scale. Metabolomics data can help to fill this gap, but this would be difficult for all metabolites, times and locations in the model and impossible for prediction simulations. Another hurdle is the computational cost. The metabolite dilution FBA (MD-FBA) model of Benyamini et al. (2010) uses mixed-integer linear programming (MILP, vs. LP used by FBA), which is computationally more demanding than LP. This limitation may be especially important for applications that require solutions for multiple species, times and locations.

• The approach does not account for individual heterogeneity (see Part 2).
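For the growth dilution weakness listed above, the correction can be written out with the symbols of Box 1 (x, S, v and the specific growth rate μ). The steady-state form on the right is a sketch that follows from setting the derivative to zero:

```latex
\frac{dx}{dt} = S \cdot v - \mu x
\qquad \Longrightarrow \qquad
S \cdot v = \mu x \quad \text{(steady state)}
```

Standard FBA solves S · v = 0, which amounts to neglecting the term μx for every metabolite that does not appear in the biomass equation. Retaining that term is what requires the metabolite concentrations x to be specified.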