Estimating Parameters From Multiple Time Series of Population Dynamics Using Bayesian Inference

Rosenbaum, Benjamin; Raatz, Michael; Weithoff, Guntram; Fussmann, Gregor F.; Gaedke, Ursula

doi:10.3389/fevo.2018.00234

METHODS article

Front. Ecol. Evol., 22 January 2019

Sec. Population, Community, and Ecosystem Dynamics

Volume 6 - 2018 | https://doi.org/10.3389/fevo.2018.00234

Benjamin Rosenbaum¹,²*†

Michael Raatz³^†

Guntram Weithoff³

Gregor F. Fussmann⁴

Ursula Gaedke³

¹German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
²Institute of Biodiversity, Friedrich Schiller University Jena, Jena, Germany
³Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
⁴Department of Biology, McGill University, Montreal, QC, Canada

Empirical time series of interacting entities, e.g., species abundances, are highly useful to study ecological mechanisms. Mathematical models are valuable tools to further elucidate those mechanisms and underlying processes. However, obtaining an agreement between model predictions and experimental observations remains a demanding task. As models always abstract from reality, one parameter often summarizes several properties. Parameter measurements are performed in additional experiments independent of the ones delivering the time series. Transferring these parameter values to different settings may result in incorrect parametrizations. On top of that, the properties of organisms and thus the respective parameter values may vary considerably. These issues limit the use of a priori model parametrizations. In this study, we present a method suited for a direct estimation of model parameters and their variability from experimental time series data. We combine numerical simulations of a continuous-time dynamical population model with Bayesian inference, using a hierarchical framework that allows for variability of individual parameters. The method is applied to a comprehensive set of time series from a laboratory predator-prey system that features both steady states and cyclic population dynamics. Our model predictions are able to reproduce both steady states and cyclic dynamics of the data. In addition to direct estimates of the parameter values, the Bayesian approach also provides their uncertainties. We found that fitting cyclic population dynamics, which contain more information on the process rates than steady states, yields more precise parameter estimates. We detected significant variability among parameters of different time series and identified the variation in the maximum growth rate of the prey as a source of the transition from steady states to cyclic dynamics.
By lending more flexibility to the model, our approach facilitates parametrizations and shows more easily which patterns in time series can be explained also by simple models. Applying Bayesian inference and dynamical population models in conjunction may help to quantify the profound variability in organismal properties in nature.

1. Introduction

Trophic interactions provide the elementary link in all food webs. Long-term data sets of such interactions become increasingly available, both from field observations, and highly controlled large-scale and laboratory experiments (Fussmann et al., 2000; Tirok and Gaedke, 2007; Becks et al., 2010, 2012; Magurran et al., 2010; Weigelt et al., 2010; Boit and Gaedke, 2014). To mechanistically understand these trophic interactions, models are employed with the goal of reproducing the observed population dynamics, which can either settle to a steady state, or include more complex patterns like limit cycles or chaos (Fussmann et al., 2000; Boit et al., 2012; Becks and Arndt, 2013; Barraquand et al., 2017). Obtaining an agreement of such non-static experimental observations and modeled time series remains a demanding task. Often, modelers are faced with situations where they can reproduce basic features of the dynamics, but not on the right time scales or not on biomass levels close to the data. One logical reaction would be to refine the model structure and include a higher level of biological detail. However, another reason for such disagreements between model and data may be a generally valid model structure but an incorrect parametrization. In this study, we present a method to obtain the relevant model parameters directly from the experimental data by applying Bayesian inference to a comprehensive set of time series data for a laboratory predator-prey system that features both steady states and cyclic population dynamics.

Incorrect parametrizations can arise for different reasons. Firstly, a model always has to abstract from reality. This implies that individual model parameters summarize a multitude of different ecological processes. The impact of these processes on the model parameter likely changes over time. Due to these non-modeled processes a one-to-one relationship between empirically determined parameter values and model parameters is impossible. For example, typical predator-prey models consider a conversion efficiency by which prey biomass is converted into predator growth. Among others this efficiency depends on the variable prey abundance (S. Schälicke in prep.) and the relative importance of basal and activity respiration (Kath et al., 2018). Secondly, model parameters are often obtained experimentally under slightly different settings than the actually observed population dynamics. An example would be to measure the prey's population growth rate in batch experiments and then use this parameter in a chemostat model. By design, growth conditions in batch and chemostat cultures differ in some aspects, such as the dynamics of nutrient limitation or the selection pressure, e.g., for higher nutrient affinities vs. maximum growth rates. Therefore, the parameters that were obtained for one experimental setting might be of limited value for a different one.

It is increasingly recognized that the functional traits of organisms that determine trophic interactions exhibit considerable variability (Litchman and Klausmeier, 2008; Bolnick et al., 2011; Violle et al., 2012; Bolius et al., 2017; Gaedke and Klauschies, 2017). This trait variability can have far-reaching consequences both at the population level (Abrams, 1999; Post and Palkovacs, 2009; Becks et al., 2010; Ehrlich et al., 2017; Raatz et al., 2017; Cortez, 2018) and at the community level (McGill et al., 2006; Hillebrand and Matthiessen, 2009). In such cases, employing just one parametrization for different time series of the same system may be insufficient and hamper the agreement of experimental and modeled population dynamics. Instead, the model will potentially support parts of the data or comprise certain general features, but will fail to reproduce its entire behavior. Our Bayesian framework allows us to retrieve information on such parameter-related uncertainties directly from the time series data, which might even become apparent as between-replicate differences, and to provide individual parameter estimates for each data set.

Dynamical population models have traditionally been fitted to time series data by least squares or maximum likelihood methods (Costantino et al., 2005; Cao et al., 2008; DeLong et al., 2014; Rall and Latz, 2016; Curtsdotter et al., 2018), see Bolker (2008) and Aster et al. (2012) for a general introduction. While they offer point estimates for unknown parameters, their confidence intervals rely on a local approximation and a normality assumption of the likelihood function (Bolker, 2008, Chap. 6.5).

Bayesian methods, on the other hand, quantify uncertainty more precisely by globally exploring the parameters' posterior probability distribution using Markov chain Monte Carlo (MCMC) sampling. They allow, for instance, direct inference on sought parameters and derived quantities, utilizing prior information, defining hierarchical levels among parameters, and recovering unobserved system states (Kindsvater et al., 2018).

For discrete-time population dynamics, Bayesian methods have received growing attention in recent years (Almaraz and Oro, 2011; Elderd and Miller, 2015; Wittwer et al., 2015; Compagnoni et al., 2016; Robinson et al., 2017). In a discrete setup, state-space models (SSM) are feasible and allow, e.g., the separation of process and observation error (Hefley et al., 2013), recovering latent states (Hosack et al., 2012), incorporating age-structure (Taboada and Anadón, 2016), adding environmental covariates (Almaraz et al., 2012; Koons et al., 2015), or spatially explicit models (Iijima et al., 2013). These advances were facilitated by the probabilistic programming environments BUGS (Lunn et al., 2009) and JAGS (Plummer, 2003).

The implementation of continuous-time population dynamics (described by ordinary differential equations, ODEs) is available in BUGS but not in JAGS. Until recently, modelers often combined numerical simulations of ODEs manually with MCMC routines (Gilioli et al., 2008; Toni et al., 2009; Johnson et al., 2013; Smith et al., 2015; Papanikolaou et al., 2016; Boersch-Supan et al., 2017). Like BUGS, the probabilistic programming language Stan (Carpenter et al., 2017) offers an integrated solution for ODEs. It comes with a built-in numerical ODE solver, interfaces to R, Python, Matlab and more, and a Hamiltonian Monte Carlo (HMC) sampler (Monnahan et al., 2017). Thus, it supports fitting dynamical population models to time series data in a Bayesian framework, see Fussmann et al. (2017) and Carpenter (2018) for recent applications.

In this study, we will apply Bayesian inference in Stan to a set of time series of a predator-prey system in a chemostat, i.e., a continuous flow-through culture (Novick and Szilard, 1950). The parameters of a well-established continuous-time chemostat ODE model will be estimated, yielding posterior distributions for the parameters that also allow us to quantify their uncertainties. By comparing the posteriors for the individual time series, we can deduce a variability among them that manifests in different types of population dynamics, and pinpoint specific parameters that seem to determine this variability.

2. Materials and Methods

2.1. Data Collection

Chemostat experiments were performed to obtain predator-prey time series at a high temporal resolution in a highly controlled environment (Weithoff et al., unpublished), resulting in a large collection of long-term time series with several different species. From these we selected a subset of 13 experiments, which were replicates in the sense that the same species were used at the same inflow nutrient concentration and dilution rate, and daily counts of prey and predators are available. The experiments were conducted with a metazoan predator, the rotifer Brachionus calyciflorus s.s. (Michaloudi et al., 2018; Paraskevopoulou et al., 2018), and its prey Monoraphidium minutum, a unicellular green alga. The algae grew on nitrogen-limited medium. However, daily nitrogen concentrations are not available. The experiments were performed within a time span of 7 years, with some individual experiments lasting longer than 1 year. They yielded time series which differed with respect to the degree that they exhibited more or less regular cyclic dynamics, or more or less constant predator and prey densities. From these 13 replicates we selected all shorter time series that showed either clear and pronounced predator-prey cycles or steady-state equilibria, and excluded pronounced initial transient phases. We chose a minimum sample length of 20 days which would typically allow for at least two predator-prey cycles in this system. This process resulted in a set of 18 samples, 10 of which featured a steady state and eight contained cyclic dynamics.

2.2. Dynamical Population Model

The continuous-time population dynamics of a predator-prey system in a chemostat with nitrogen S, algae A and rotifers R are described by the equations:

$$\frac{d S}{d t} = (S^{*} - S)\, \delta - \frac{1}{c_{A}} \frac{f_{A} S}{h_{A} + S} A \qquad (1)$$

$$\frac{d A}{d t} = \frac{f_{A} S}{h_{A} + S} A - \frac{1}{c_{R}} \frac{f_{R} A}{h_{R} + A} R - \delta A \qquad (2)$$

$$\frac{d R}{d t} = \frac{f_{R} A}{h_{R} + A} R - \delta R \qquad (3)$$

This formulation represents a slightly simplified version of the one originally presented by Fussmann et al. (2000) and neglects age structure of the predator. δ is the system's inflow and outflow rate, and the concentration of nutrients in the inflow is given by S*. The factors c_A and c_R define the conversion of nutrients into algal biomass and algal into rotifer biomass, respectively. The growth rate of algae is described by Monod kinetics $\frac{f_{A} S}{h_{A} + S}$ . f_A is the maximum growth rate and h_A is the half-saturation constant (the resource density at which the growth rate equals half of the maximum growth rate). The same applies to the resource-dependent growth rate of rotifers feeding on algae $\frac{f_{R} A}{h_{R} + A}$ , which is described by a type II functional response (Real, 1977). The maximum growth rates f_A[d⁻¹], f_R[d⁻¹] and the conversion factors c_A[cells μmol⁻¹], c_R[ind cells⁻¹] are free parameters, which are estimated as described in the next section. We used constant values for the following parameters from a similar system which instead used Chlorella vulgaris as the prey species: S* = 80 μmol l⁻¹ and δ = 0.55 d⁻¹, as they were carefully controlled in the experiments, and h_A = 4.3 μmol l⁻¹ and h_R = 7.5 · 10⁸ cells l⁻¹ (Fussmann et al., 2000). These values for the half-saturation constants were in the range of our predicted and observed resource states S and A, respectively (cf. Figures 2, 3).
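As an illustration of Equations 1–3, the following minimal Python sketch integrates the model with a hand-written, fixed-step fourth-order Runge-Kutta scheme. It is not the Stan implementation used in this study (which relies on Stan's adaptive solver). The function names, the step size, the initial state, and the use of the Fussmann et al. (2000) values for the free parameters are assumptions made for this example only.

```python
# Fixed values quoted in the text (Fussmann et al., 2000).
S_IN = 80.0      # S*, nutrient concentration in the inflow [umol/l]
DELTA = 0.55     # dilution rate delta [1/d]
H_A = 4.3        # half-saturation constant of algae [umol/l]
H_R = 7.5e8      # half-saturation constant of rotifers [cells/l]


def chemostat_rhs(state, f_a, f_r, c_a, c_r):
    """Right-hand side of Equations 1-3 for state = (S, A, R)."""
    s, a, r = state
    algal_growth = f_a * s / (H_A + s)      # Monod kinetics
    rotifer_growth = f_r * a / (H_R + a)    # type II functional response
    ds = (S_IN - s) * DELTA - algal_growth * a / c_a
    da = algal_growth * a - rotifer_growth * r / c_r - DELTA * a
    dr = rotifer_growth * r - DELTA * r
    return (ds, da, dr)


def rk4_step(state, dt, *params):
    """One classical fixed-step Runge-Kutta step (illustrative only)."""
    def shifted(k, h):
        return tuple(x + h * kx for x, kx in zip(state, k))
    k1 = chemostat_rhs(state, *params)
    k2 = chemostat_rhs(shifted(k1, dt / 2), *params)
    k3 = chemostat_rhs(shifted(k2, dt / 2), *params)
    k4 = chemostat_rhs(shifted(k3, dt), *params)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state0, days, dt, *params):
    """Return the trajectory [(t, S, A, R), ...] on a fixed time grid."""
    n_steps = int(round(days / dt))
    state, out = tuple(state0), [(0.0, *state0)]
    for i in range(1, n_steps + 1):
        state = rk4_step(state, dt, *params)
        out.append((i * dt, *state))
    return out


# Illustrative values for f_A, f_R, c_A, c_R (Fussmann et al., 2000);
# the initial state (S, A, R) is made up for this example.
PARAMS = (3.3, 2.25, 5.0e7, 2.5e-5)
trajectory = simulate((10.0, 1.0e8, 1.0e3), 20.0, 0.01, *PARAMS)
```

Each entry of `trajectory` holds the time and the three state variables, so predictions at the sampling days can be read off the grid. A fixed step keeps the sketch short; an adaptive solver is preferable when parameter values far from these defaults are explored.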

We chose to not use half-saturation constants h_A (or h_R) as free parameters, since the estimates can be highly correlated with maximum growth rates f_A (or f_R). Their combined effects on the resource-dependent growth rate $\frac{f_{A} S}{h_{A} + S}$ (or $\frac{f_{R} A}{h_{R} + A}$ ) can only be disentangled if the data cover a large range in the resource states S (or A) (Rosenbaum and Rall, 2018), which is not the case for the present chemostat experiments.
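The correlation between half-saturation constant and maximum growth rate can be made concrete with a small numerical sketch. The second parameter pair below is hypothetical and was chosen so that both Monod curves coincide at S = 1 μmol l⁻¹; the resource grids are likewise assumptions for illustration.

```python
def monod(resource, f_max, half_sat):
    """Resource-dependent growth rate f * S / (h + S)."""
    return f_max * resource / (half_sat + resource)


# (f_A, h_A) = (3.3, 4.3) as quoted in the text, and a hypothetical
# alternative (6.6, 9.6) that gives the same growth rate at S = 1.
PAIR_1 = (3.3, 4.3)
PAIR_2 = (6.6, 9.6)


def max_relative_gap(resources):
    """Largest relative difference between the two curves on a grid."""
    return max(
        abs(monod(s, *PAIR_1) - monod(s, *PAIR_2)) / monod(s, *PAIR_1)
        for s in resources
    )


narrow = [0.8 + 0.05 * i for i in range(9)]    # S between 0.8 and 1.2
wide = [1.0 + 5.0 * i for i in range(11)]      # S between 1 and 51
print(max_relative_gap(narrow), max_relative_gap(wide))
```

Over the narrow range the two curves differ by less than 2%, which would be hard to separate from observation error, whereas over the wide range they diverge strongly.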

2.3. Model Fitting and Inference

We combined numerical simulations of the deterministic dynamical population model (Equations 1–3) with Bayesian parameter estimation by drawing samples from the posterior probability distribution P(θ|y) of the free parameters θ given the data y, based on the likelihood P(y|θ) and the prior distribution P(θ). We used Hamiltonian Monte Carlo sampling in Stan, accessed via the RStan R-package (Stan Development Team, 2018). The Stan software comes with a built-in Runge-Kutta ODE solver with adaptive stepsize control for generating predictions $\hat{y} (θ)$ .

The likelihood calculation P(y|θ) is carried out automatically by the software when provided with predictions $\hat{y} (θ)$ and the distribution of residuals $\hat{y} (θ) - y$ . The predictions $\hat{y} (θ)$ are defined by the numerical solutions of the ODE $\hat{A} (t_{i})$ and $\hat{R} (t_{i})$ , evaluated at times t_i, for a given parameter combination θ. We chose a log-normal distribution of the residuals, i.e., $\ln (\hat{A} (t_{i})) \sim N (\ln (A_{i}), σ_{A})$ and $\ln (\hat{R} (t_{i})) \sim N (\ln (R_{i}), σ_{R})$ , with scale parameters σ_A and σ_R. This trajectory matching method technically corresponds to treating the model deterministically and to assuming pure observation errors in the data without any process error (Bolker, 2008, Chap. 11). Note that, even without data for the nitrogen concentration S, it is possible to fit the ODE model by including the initial densities of the predictions ${\hat{S}}_{0}^{(j)}$ , ${\hat{A}}_{0}^{(j)}$ and ${\hat{R}}_{0}^{(j)}$ as free parameters (Carpenter, 2018). However, one-step-ahead fitting (i.e., assuming pure process error) would only be possible for this ODE model if data for all state variables S, A and R were available. We did not consider full state-space models accounting for both process and observation error.
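The error model described above can be sketched in a few lines of Python. This is only an illustration of the stated distributional assumption, not the Stan code from the Supporting Information; the function names and the example numbers are hypothetical.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_likelihood(pred_a, pred_r, obs_a, obs_r, sigma_a, sigma_r):
    """Sum of ln-scale normal log densities for algae and rotifers.

    Follows the error model stated in the text:
    ln(prediction) ~ N(ln(observation), sigma), independently per day.
    """
    total = 0.0
    for pred, obs in zip(pred_a, obs_a):
        total += normal_logpdf(math.log(pred), math.log(obs), sigma_a)
    for pred, obs in zip(pred_r, obs_r):
        total += normal_logpdf(math.log(pred), math.log(obs), sigma_r)
    return total


# Made-up observations and predictions for three sampling days.
obs_algae, obs_rotifers = [1.0e9, 1.2e9, 0.9e9], [2.0e4, 2.5e4, 3.0e4]
pred_algae, pred_rotifers = [1.1e9, 1.1e9, 1.0e9], [2.2e4, 2.4e4, 2.8e4]
ll = log_likelihood(pred_algae, pred_rotifers,
                    obs_algae, obs_rotifers, 0.2, 0.2)
```

Because the residuals are formed on the ln-scale, a given relative deviation contributes equally to the likelihood regardless of the absolute density, which matters when algal and rotifer counts differ by orders of magnitude.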

We fitted the maximum growth rates f_A and f_R and the conversion factors c_A and c_R on their logarithmic scale (see model code, Supporting Information). The dynamics and the statistical model are equivalent to fitting them on their original scale. But since they differ by several orders of magnitude, we found that log-transforming the parameter search space makes the iterative MCMC routine more robust.

We used a hierarchical model for the maximum growth rates f_A and f_R and for the conversion factors c_A and c_R by using time series identity j = 1, …, m as a random effect. This means that every time series j is fitted with its individual set of parameters ${f_{A}^{(j)}, f_{R}^{(j)}, c_{A}^{(j)}, c_{R}^{(j)}}$ , while each of these four parameters originates from a joint distribution across all m time series replicates. Thus, some information is shared across the individual replicates via the joint distribution, therefore this technique is also known as partial pooling. In a Bayesian framework, this can be modeled via hierarchical dependencies in the prior distributions. Including logarithmic scaling, they read

$$\ln (\theta^{(j)}) \sim N (\mu_{\ln (\theta)}, \sigma_{\ln (\theta)}), \quad j = 1, \dots, m, \quad \theta = f_{A}, f_{R}, c_{A}, c_{R}, \qquad (4)$$

where μ_ln(θ) are the overall means and σ_ln(θ) the standard deviations across all m time series. μ_ln(θ) and σ_ln(θ) are also free parameters with their own prior distribution (see Table 1 for a full description of the priors).
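To make the hierarchical structure of Equation 4 tangible, the following sketch draws replicate-level parameters from it in a generative (forward) direction. It does not perform inference. The overall mean is drawn from the N(ln(3.3), 1) prior given in this article for f_A, while the across-replicate standard deviation of 0.5 is a fixed, assumed value, because the prior for σ_ln(θ) in Table 1 is not reproduced here. Function and variable names are hypothetical.

```python
import math
import random


def draw_replicate_parameters(m, prior_mean, sigma, rng):
    """Draw m replicate-level values following Equation 4.

    mu is drawn from N(ln(prior_mean), 1); sigma is passed in as a
    fixed illustrative value (an assumption of this sketch).
    """
    mu = rng.gauss(math.log(prior_mean), 1.0)
    values = [math.exp(rng.gauss(mu, sigma)) for _ in range(m)]
    return mu, values


rng = random.Random(1)   # arbitrary seed, for reproducibility only
mu_fa, f_a_values = draw_replicate_parameters(18, 3.3, 0.5, rng)
```

All 18 values share the same overall mean on the ln-scale, which is how information is partially pooled across replicates. Letting sigma approach zero recovers a single shared parameter, and a very large sigma approaches fitting every time series independently.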

Table 1. Free parameters and their prior distributions.

We tested for variation in the dynamics of the different time series by uncovering differences in parameters. For each parameter θ = f_A, f_R, c_A, c_R, pairwise contrasts θ^(j)−θ^(k) between two time series j and k were inferred. That is, we generated posterior probabilities $P_{j k} = P (θ^{(j)} > θ^{(k)})$ that quantify these differences. These quantities P_jk are computed directly from the posterior distribution by dividing the number of samples featuring θ^(j) > θ^(k) by the total number of samples.
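The computation of P_jk from posterior samples amounts to a simple count, as in this minimal sketch. The sample values are made up and the function name is hypothetical.

```python
def pairwise_contrast(samples_j, samples_k):
    """Posterior probability P(theta_j > theta_k) from paired draws.

    Both lists must hold draws from the same MCMC iterations, so that
    element i of each list belongs to the same joint posterior sample.
    """
    if len(samples_j) != len(samples_k):
        raise ValueError("sample lists must have equal length")
    wins = sum(1 for a, b in zip(samples_j, samples_k) if a > b)
    return wins / len(samples_j)


# Made-up draws for two time series, purely for illustration.
theta_j = [3.1, 3.4, 2.9, 3.8, 3.3]
theta_k = [2.8, 3.5, 2.7, 3.0, 3.1]
p_jk = pairwise_contrast(theta_j, theta_k)   # 4 of 5 draws -> 0.8
```

Pairing the draws by iteration preserves any posterior correlation between the two parameters, which would be lost if the two marginal samples were compared after shuffling.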

To further investigate the importance of variation among the parameter estimates for different time series, we also fitted the ODE model (Equations 1–3) using a single set of parameters {f_A, f_R, c_A, c_R} for all 18 time series as a null model. In contrast to the hierarchical (partial pooling) model above, this is also known as complete pooling, since all information across individual replicates is combined. Only for the initial states ${{\hat{S}}_{0}^{(j)}, {\hat{A}}_{0}^{(j)}, {\hat{R}}_{0}^{(j)}}$ we allowed distinct values for each of the time series j = 1, …, 18.

We briefly comment on numerical issues that can arise when combining numerical solutions of ODEs with MCMC sampling. When the MCMC sampler explores the parameter space, points can be sampled that make the computation of the likelihood by numerical simulation of the ODE infeasible (e.g., by requiring an immensely small integration step-size or simply by divergent state variables). Still, the sampling algorithm requires the computation of the likelihood to proceed with the iterations. To prevent the sampler from entering regions of the parameter space where, over a whole range of values, the likelihood is not available, we used two strategies. First, we implemented a numerical condition which prevents the numerical ODE solution from diverging or crossing the lower boundary of zero by setting the right-hand-side of the ODE to zero if one of the state variables exceeds a reasonable range of [10⁻⁶, 10¹⁶] (see model code in Supporting Information). Second, we used weakly informative priors on the overall mean parameters μ_ln(θ) based on measured values from Fussmann et al. (2000): $\mu_{\ln (f_{A})} \sim N (\ln (3.3), 1)$ , $\mu_{\ln (f_{R})} \sim N (\ln (2.25), 1)$ , $\mu_{\ln (c_{A})} \sim N (\ln (5.0 \cdot 10^{7}), 1)$ , $\mu_{\ln (c_{R})} \sim N (\ln (2.5 \cdot 10^{-5}), 1)$ (see also Table 1). We used the same priors for the complete pooling model.
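The first strategy can be sketched as a wrapper around an ODE right-hand side. The actual implementation is part of the Stan model code in the Supporting Information; the Python version below is only an assumed analogue, and `toy_rhs` is a hypothetical stand-in for Equations 1–3.

```python
LOWER, UPPER = 1e-6, 1e16   # admissible range quoted in the text


def guarded(rhs):
    """Wrap an ODE right-hand side so it freezes outside [LOWER, UPPER]."""
    def wrapped(state, *params):
        # The negated chained comparison also catches NaN values.
        if any(not (LOWER <= x <= UPPER) for x in state):
            return tuple(0.0 for _ in state)   # stop all dynamics
        return rhs(state, *params)
    return wrapped


def toy_rhs(state, rate):
    """Hypothetical stand-in for Equations 1-3: exponential growth."""
    return tuple(rate * x for x in state)


safe_rhs = guarded(toy_rhs)
```

Once a state variable leaves the admissible range, the trajectory stays constant, so the solver terminates quickly and the likelihood remains computable. Such parameter samples then fit the data poorly and are unlikely to be accepted.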

2.4. Simulation Study

Before fitting the presented models to the experimental chemostat data, we first validated our modeling approach in a simulation study. By fitting the hierarchical model to simulated time series, which were generated by known parameters, we tested the identifiability of model parameters. These parameters $f_{A}^{(j)}$ , $f_{R}^{(j)}$ , $c_{A}^{(j)}$ and $c_{R}^{(j)}$ were drawn randomly from lognormal distributions using the measured values from above (Fussmann et al., 2000) as means and a standard deviation of 0.5. Initial states of the time series were also assigned randomly according to Table 1. We numerically simulated ODE trajectories of Equations 1–3 for 100 days and chose 10 time series that settled to different steady states and 10 time series that featured cyclic dynamics of different frequencies and amplitudes. We used the observations of algal and rotifer states of the last 20 days (leaving out nitrogen states as in the experimental data) and added a random error with zero mean and standard deviation of 0.1 on the ln-scale (see also Figures A1, A2, Supporting Information).
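The last step, adding observation error on the ln-scale, can be sketched as follows. The constant "true" densities, the seed, and the function name are assumptions for illustration; in the simulation study the true values come from the simulated ODE trajectories.

```python
import math
import random


def add_observation_error(values, sd_log, rng):
    """Multiply each value by exp(e), with e ~ N(0, sd_log)."""
    return [v * math.exp(rng.gauss(0.0, sd_log)) for v in values]


rng = random.Random(42)                 # arbitrary seed
true_algae = [1.0e9] * 20               # made-up steady-state densities
noisy_algae = add_observation_error(true_algae, 0.1, rng)
```

A standard deviation of 0.1 on the ln-scale corresponds to a multiplicative error of roughly 10%, independent of the absolute density, and it guarantees that the synthetic observations stay positive.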

3. Results

3.1. Model Convergence

We fitted all models (hierarchical model in simulation study; hierarchical model and complete pooling model for experimental chemostat data) by running 10 individual MCMC chains in parallel with an adaptation phase of 1,000 iterations and a sampling phase of 5,000 samples each, summing up to 50,000 samples of the posterior distribution. The runtime was approximately 7 days on a 2.2 GHz Intel Xeon server architecture. Visual inspection of the trace plots and density plots showed good mixing of the chains. Gelman-Rubin statistics of $\hat{R} < 1.01$ and an adequate effective sample size n_eff (i.e., the estimated number of independent samples) verified convergence (Gelman and Hill, 2007). See Supporting Information (Tables A1–A4) for a full list of parameter estimates and their statistics.
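For readers unfamiliar with the Gelman-Rubin statistic, the following sketch computes its classic (non-split) form for one parameter from several chains. Stan reports a refined split-chain variant, so values can differ slightly from this simplified version. The function names and toy chains are assumptions for illustration.

```python
def variance(xs):
    """Unbiased sample variance."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def gelman_rubin(chains):
    """Classic (non-split) potential scale reduction factor R-hat."""
    n = len(chains[0])                                   # draws per chain
    within = sum(variance(c) for c in chains) / len(chains)
    chain_means = [sum(c) / n for c in chains]
    between = n * variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Toy chains: the second pair is stuck in different regions.
agreeing = [[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]]
disagreeing = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]
print(gelman_rubin(agreeing), gelman_rubin(disagreeing))
```

The statistic compares the variance between chain means with the variance within chains. Values far above 1, as for the second pair of toy chains, indicate that the chains have not converged to the same distribution.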

3.2. Identifiability of Parameters

We used the simulation study to assess if known parameters can be recovered accurately by fitting the hierarchical model to a synthetic set of 10 steady-state and 10 cyclic time series (cf. Figures A1, A2, Supporting Information). Figure 1 shows the posterior error distributions (distributions of estimated parameter values minus true values on ln-scale) of maximum growth rates $ln (f_{A}^{(j)})$ and $ln (f_{R}^{(j)})$ and conversion factors, $ln (c_{A}^{(j)})$ and $ln (c_{R}^{(j)})$ of prey and predators, respectively. We found that all parameters of cyclic time series 11–20 were accurately identifiable. The posterior medians generally did not deviate more than 0.05 from the true parameters and all estimates featured a low uncertainty (posterior standard deviations smaller than 0.04).

Figure 1. Posterior error distributions of maximum growth rates $\ln (f_{A}^{(j)}) [d^{- 1}], \ln (f_{R}^{(j)}) [d^{- 1}]$ , and conversion factors $\ln (c_{A}^{(j)}) [cells {μmol}^{- 1}], \ln (c_{R}^{(j)}) [ind {cells}^{- 1}]$ for fitting simulated time series (j = 1, …, 20). Simulated trajectories 1–10 featured a steady-state equilibrium (green), while simulated trajectories 11–20 featured cyclic behavior (orange). Vertical lines represent medians, boxes represent 50% highest density intervals (HDIs) and horizontal lines represent 95% HDIs.

For steady-state time series 1–10, however, posterior distributions of algal maximum growth rates $ln (f_{A}^{(j)})$ showed relatively high uncertainties (posterior medians deviating up to 0.8 from true values with posterior standard deviations up to 0.34). We assume that, in combination with the lack of nitrogen data S, steady-state time series, which cover a smaller range in the state space, contain less information about the resource density-dependent growth rates of algae feeding on nitrogen $\frac{f_{A} S}{h_{A} + S}$ than cyclic time series. Estimates for rotifer maximum growth rates $ln (f_{R}^{(j)})$ , on the other hand, were highly accurate just as for cyclic time series. In contrast to f_A, the data seem to provide enough information on the growth rates of rotifers feeding on algae $\frac{f_{R} A}{h_{R} + A}$ even in steady-state time series, since observations for both involved trophic levels are available. The estimates for conversion factors $ln (c_{A}^{(j)})$ and $ln (c_{R}^{(j)})$ were also less accurate for steady-state time series 1–10 than for cyclic time series 11–20 (posterior medians deviate up to 0.47 from true parameters, with an uncertainty of posterior standard deviations up to 0.37). We assume that steady-state data is not as informative as cyclic data on the conversion factors, as we also observe a high correlation between the posterior of $ln (c_{A}^{(j)})$ and $ln (c_{R}^{(j)})$ for every steady-state time series (cf. Supporting Information, Figure A3). Note that correlation per se does not imply unidentifiability. We observed correlations between the posteriors of cyclic time series parameters as well (cf. Supporting Information, Figure A4), while these parameters are estimated with a high accuracy.

3.3. Posterior Predictions

After assessing the identifiability of the hierarchical model in a simulation study, we investigated its performance with the experimental chemostat data. We generated posterior predictions by numerical simulations of the ODE system (Equations 1–3) with all samples of the posterior distribution (Figures 2, 3). After a short transient phase, the median predicted trajectories feature either a steady-state equilibrium (time series 1–9) or cyclic behavior (time series 13–18); for time series 10–12, the posterior distribution includes parameter samples producing steady states as well as cyclic trajectories. Correspondingly, we found multimodalities in the posterior distributions of time series 10–12 (see Supporting Information, Figures A6, A7).

Figure 2. Experimental time series 1–9 of 18, data (dots) and posterior predictions of hierarchical model for nitrogen [μmol l⁻¹] (blue), algae [cells l⁻¹] (green) and rotifers [ind l⁻¹] (red). Solid lines represent median predictions, shaded areas depict 95% highest density intervals of the predictions. Predicted trajectories 1–9 featured a steady-state equilibrium.

Figure 3. Experimental time series 10–18 of 18, data (dots) and posterior predictions of fitted model for nitrogen [μmol l⁻¹] (blue), algae [cells l⁻¹] (green) and rotifers [ind l⁻¹] (red). Solid lines represent median predictions, shaded areas depict 95% highest density intervals of the predictions. Predicted trajectories 13–18 featured cyclic behavior. Time series 10–12 featured multimodalities in the posterior distribution and the predictions did not exhibit a clear tendency toward a steady state or cycles.

Interestingly, the relative uncertainty of the predictions (quantified by 95% confidence intervals) for all state variables is substantially reduced in time series 13–18 where the predictions feature cyclic behavior compared to other time series (Figures 2, 3). We measured the predictive accuracy in the univariate time series by normalized root mean-square-errors (cf. Figure A7, Supporting Information). Here we see that the error distributions are shifted to smaller values and become more narrow for cyclic dynamics, also indicating a better fit. Again, this may be explained by acknowledging that steady-state data, which covers a smaller range in the state space than cyclic data, contains less information about the process rates and hence the parameters.
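A normalized root-mean-square error can be computed as in the sketch below. The normalization used in this study is not stated in this section, so dividing by the mean observation is an assumption of the example, as are the function name and the toy numbers.

```python
def nrmse(observed, predicted):
    """Root-mean-square error divided by the mean observation.

    The normalization by the mean is an assumption; the text does not
    state which normalization was used.
    """
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return mse ** 0.5 / (sum(observed) / n)


# Toy example: every prediction is off by one unit, mean observation 2.
example = nrmse([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])   # 1 / 2 = 0.5
```

Evaluating this quantity for every posterior sample, rather than only for the median prediction, yields an error distribution per time series that can be compared across steady-state and cyclic dynamics.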

As no data constrains the predictions for nitrogen, the uncertainty is even higher here than in algae or rotifers in time series 1–9 (this was also observed for steady-state time series in the simulation study, cf. Supporting Information Figure A1). Also, we found that our ODE model is able to predict the full amplitude of cycles in algal states better than in rotifer states (Figures 2, 3). We further validated this by calculating predictive accuracies (normalized root-mean-square error, Figure A7, Supporting Information) and posterior predictive checks (comparing observations and replicated predictions drawn from the posterior ${\hat{A}}_{i}^{rep} \sim \mathrm{lognormal} ({\hat{A}}_{i}, σ_{A})$ and ${\hat{R}}_{i}^{rep} \sim \mathrm{lognormal} ({\hat{R}}_{i}, σ_{R})$ , Figures A8, A9, Supporting Information). This is likely caused by a higher regularity in the algal data, which covers a larger amplitude decreasing the relative counting error. Also, algae feature a less complex life cycle than rotifers and their dynamics should thus be less variable.

3.4. Variation Among Time Series

For assessing the variation in the parameters across the experimental chemostat time series (j = 1, …, 18), we plotted the marginal (i.e., one-dimensional projections of the multivariate) posterior probability distributions of the logarithmic maximum growth rates $ln (f_{A}^{(j)})$ and $ln (f_{R}^{(j)})$ and the logarithmic conversion factors, $ln (c_{A}^{(j)})$ and $ln (c_{R}^{(j)})$ of prey and predators, respectively (Figure 4). We also computed probabilities of pairwise contrasts $P_{j k} = P (θ^{(j)} > θ^{(k)})$ for a more detailed examination of the differences across time series (θ = f_A, f_R, c_A, c_R, Tables 2–5). Values close to one or close to zero indicate significant pairwise differences. Note that the tables are symmetrical in the sense of P_jk = 1−P_kj.

Figure 4. Marginal posterior distributions of maximum growth rates $\ln (f_{A}^{(j)}) [d^{- 1}], \ln (f_{R}^{(j)}) [d^{- 1}]$ , and conversion factors $\ln (c_{A}^{(j)}) [cells {μmol}^{- 1}], \ln (c_{R}^{(j)}) [ind {cells}^{- 1}]$ for fitting experimental time series (j = 1, …, 18). Predicted trajectories 1–9 featured a steady-state equilibrium (green), while predicted trajectories 13–18 featured cyclic behavior (orange). Time series 10–12 (purple) featured multimodalities in the posterior distribution and the predictions did not exhibit a clear tendency toward a steady state or cycles. Vertical lines represent medians, boxes represent 50% highest density intervals (HDIs) and horizontal lines represent 95% HDIs.

Table 2. Pairwise contrasts $P_{j k} = P (f_{A}^{(j)} > f_{A}^{(k)})$ of maximum growth rates f_A.

We found that time series with predicted steady states feature systematically higher values of f_A than time series with cyclic dynamics (Figure 4A; Table 2, top right and bottom left blocks). While in cyclic time series 13–18 values of f_A differ among each other (bottom right block), no evidence was found for differences among steady-state time series 1–9 (top left block). Steady-state time series also exhibit a high uncertainty in f_A estimates with confidence intervals spanning from 2.43 d⁻¹ to 40.85 d⁻¹ when transformed back to a linear scale (Figure 4A, see also Supporting Information, Table A1). This uncertainty is substantially reduced in cyclic time series 13–18 with confidence intervals spanning from 2.29 d⁻¹ to 5.31 d⁻¹. Here, the predicted parameter values are close to the published value in Fussmann et al. (2000), which, however, was reported for Chlorella instead of Monoraphidium but should be in the same range.

For the rates f_R we found pairwise differences across all time series (Figure 4B, Table 3). No systematic effect of cyclic or steady-state time series was observed (i.e., f_R estimates for cyclic time series are not systematically smaller than estimates for steady-state time series or vice versa). In contrast to the rates f_A, even steady-state data provide enough information on rates f_R (with consumer and resource data available), resulting in a low uncertainty in estimation just as for cyclic time series.

Table 3. Pairwise contrasts $P_{j k} = P (f_{R}^{(j)} > f_{R}^{(k)})$ of maximum growth rates f_R.

The conversion factors c_A and c_R did not show significant pairwise differences for steady-state time series 1–9 (Figures 4C,D; Tables 4, 5, top left blocks; with few exceptions for c_R). Some pairwise differences among cyclic time series 13–18 (bottom right blocks) and relative to steady-state time series (top right and bottom left blocks) were observed, without being as systematic as in f_A. The uncertainty in parameter estimates is slightly larger in steady-state time series than in cyclic time series, but the effect is not as prominent as in f_A estimates (see also full tables of estimates, Tables A1–A4, Supporting Information). All findings of this section regarding uncertainties in the parameter estimation were in accordance with the simulation study's results.
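
The uncertainty summaries used above (medians, 50% and 95% HDIs) can be computed directly from posterior draws. The sketch below is one simple way to do this, assuming a unimodal posterior. The helper `hdi` and the synthetic draws are hypothetical and serve only as an example; they do not reproduce the estimates of this study.

```python
import math
import random

def hdi(samples, mass=0.95):
    """Narrowest interval containing a fraction `mass` of the samples.

    Assumes a unimodal distribution; for multimodal posteriors the
    highest density region may consist of several disjoint intervals.
    """
    xs = sorted(samples)
    n = len(xs)
    n_inside = max(1, math.ceil(mass * n))  # number of draws inside the interval
    best_lo, best_hi = xs[0], xs[n_inside - 1]
    for i in range(n - n_inside + 1):
        lo, hi = xs[i], xs[i + n_inside - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi

# Synthetic stand-in draws of a log-scale parameter (hypothetical values).
rng = random.Random(7)
log_draws = [rng.gauss(1.3, 0.2) for _ in range(4000)]

lo50, hi50 = hdi(log_draws, 0.50)
lo95, hi95 = hdi(log_draws, 0.95)

# Back-transform the endpoints to the linear scale.
linear_interval_95 = (math.exp(lo95), math.exp(hi95))
```

Note that exponentiating the endpoints of a log-scale HDI yields a valid 95% credible interval on the linear scale, but not necessarily the narrowest one, because HDIs are not invariant under nonlinear transformations.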

Table 4. Pairwise contrasts $P_{j k} = P (c_{A}^{(j)} > c_{A}^{(k)})$ of conversion factors c_A.

Table 5. Pairwise contrasts $P_{j k} = P (c_{R}^{(j)} > c_{R}^{(k)})$ of conversion factors c_R.

3.5. Comparison to Complete Pooling Model Fitting

To further support the importance of variation among the parameter estimates, we also fitted the complete pooling model using a single set of parameters {f_A, f_R, c_A, c_R} for all 18 time series as a null model. We used the same priors and model fitting specifications as above. Although a formal model comparison via information criteria is generally available for Bayesian statistics (Vehtari et al., 2018), it is not applicable to our dynamical model, since predictions (and therefore residuals) are correlated along time. Hence, we compare the complete pooling and the partial pooling (hierarchical) model qualitatively and quantitatively via their predictions. In contrast to the previous hierarchical model, the complete pooling model produces transient and steady-state predictions for all time series. Cyclic dynamics cannot be reproduced (Supporting Information, Figures A10, A11). Also, the predicted equilibrium fails to reproduce the correct level of algae and rotifer states in some cases. This corresponds to a generally lower predictive accuracy when fitting algal and rotifer time series with the complete pooling model as compared to the hierarchical model (Supporting Information, Figure A7). The effect is highest for cyclic time series 13–18, but also pronounced in time series 2, 3, 8, and 10, where the equilibrium states are not reproduced correctly.

3.6. Transition From Cyclic Dynamics to Steady States

To support the observation that low maximum growth rates f_A cause cyclic dynamics while high maximum growth rates yield a steady-state equilibrium (Figure 4), we performed a simulation study. We numerically simulated the chemostat model (Equations 1–3) while systematically varying growth rates f_A between 1 d⁻¹ and 7.389 d⁻¹ (corresponding to parameter values ln(f_A) between 0 and 2). The maximum growth rate of the predator and the conversion factors were held constant at their estimated overall means exp(μ_ln(θ)) across the 18 time series (f_R = 1.419 d⁻¹, c_A = 3.865 · 10⁷ cell μmol⁻¹, c_R = 1.065 · 10⁻⁵ ind cell⁻¹). The bifurcation diagram (Figure 5) shows system states of simulations over 200 days after discarding the first 100 days and clearly indicates cycles for low growth rates and steady states for high growth rates, with the Hopf bifurcation located at f_A = 4.446 d⁻¹.
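
The general recipe behind such a simulation-based bifurcation diagram (integrate, discard the transient, record the range of states for each parameter value) can be sketched in a few lines of Python. The example below is only an illustration: it uses a dimensionless Rosenzweig-MacArthur predator-prey model as a stand-in, not the chemostat model of Equations 1–3, and the scanned parameter is the carrying capacity K rather than f_A. All function names, parameter values, and the fixed-step Runge-Kutta integrator are assumptions made for this sketch.

```python
def rk4_step(rhs, state, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(state, p)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)], p)
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)], p)
    k4 = rhs([s + dt * k for s, k in zip(state, k3)], p)
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def prey_extremes(rhs, p, state0, t_transient=100.0, t_record=200.0, dt=0.05):
    """Minimum and maximum of the first state variable after the transient."""
    state = list(state0)
    for _ in range(int(round(t_transient / dt))):  # discard transient dynamics
        state = rk4_step(rhs, state, dt, p)
    lo = hi = state[0]
    for _ in range(int(round(t_record / dt))):     # record long-term behavior
        state = rk4_step(rhs, state, dt, p)
        lo, hi = min(lo, state[0]), max(hi, state[0])
    return lo, hi

def rosenzweig_macarthur(state, K):
    """Stand-in model (hypothetical parameters e = 1, m = 0.5)."""
    x, y = state                      # prey, predator
    e, m = 1.0, 0.5
    dx = x * (1.0 - x / K) - x * y / (1.0 + x)
    dy = e * x * y / (1.0 + x) - m * y
    return [dx, dy]

# Scan the bifurcation parameter; each entry is (K, prey minimum, prey maximum).
diagram = []
for K in [2.0, 2.5, 3.5, 5.0]:
    lo, hi = prey_extremes(rosenzweig_macarthur, K, [0.5, 0.5], t_transient=300.0)
    diagram.append((K, lo, hi))
```

Where minimum and maximum coincide, the system has settled to a steady state; where they differ, it cycles. In this stand-in model cycles appear at high K, whereas in the chemostat model studied here they appear at low f_A, so only the procedure carries over, not the direction of the transition.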

Figure 5. Bifurcation diagram for simulations with varying maximum growth rate f_A, while keeping the remaining parameters constant at the fitted overall means. A Hopf bifurcation occurs at f_A = 4.446 d⁻¹.

4. Discussion

In this study we presented how a differential equations model can be fitted to observed time series data of species abundances, taking predator-prey dynamics as an example. Besides providing key model parameters in situ, this method makes it possible to decipher variability in the outcomes among replicates and to point toward probable sources of this variability.

Comparing a model that treats the time series identity as a random effect (partial pooling) with a model using only a single set of parameters (complete pooling) shows that allowing for such variability in the parameters between replicates can be crucial to see whether time series data agree with a model. Especially if a certain parameter is close to a bifurcation, as seems to be the case in our system for the maximum growth rate of the algae f_A, minor deviations in this parameter result in different predictions for the system dynamics. In such cases, in-situ parameter estimation allows the detection of parameter sets for both steady states and cyclic dynamics, which can be separated by a multi-dimensional bifurcation boundary in models with a high number of parameters. Thereby, the chosen model structure may be accepted for all replicates, even if their dynamics differ, without increasing the model complexity. Instead, more than one parameter set might be needed to cover the whole diversity of possible and observable patterns in population dynamics.

Of course, this method requires that the model structure used for fitting the data applies sufficiently well to the mechanisms acting in the system delivering the data. Also, the data quality has to be high enough. We see that the dynamics of the prey are fitted better than those of the predator, which may be explained by different mechanisms: (i) The densities of the prey in the data are much larger than those of the predator. Assuming that the relative experimental error of counting decreases with larger individual numbers, this error should be smaller for the prey than for the predator. This increases the regularity of the dynamical patterns and simplifies the fitting of the prey time series. (ii) Rotifers, as metazoan animals, possess a more complex life cycle than algae. Their dynamics may be affected by age structure, variable resource co-limitation and other factors, which, for simplicity, were not included in the model. These non-modeled processes decrease the goodness of fit. (iii) As no data were available for the resource of the prey, the parameters of the prey were constrained only by the prey data, leaving more flexibility to improve the fit to the prey's states. In contrast, the parameters of the predator were more restricted, as both prey and predator data were available, leaving less flexibility for fitting the predator's states.

Interestingly, the type of population dynamics affects the quality of the inference as well. Population cycles contain a higher degree of information on the system than steady states, as rates of change are also apparent, in addition to the biomasses of the trophic levels. This manifests in narrower parameter estimates (Figure 4) and leads to more certain predictions (Figures 2, 3). Also, steady-state fits give rather (and partially unrealistically) high estimates for algal growth rates, whereas those from cyclic population dynamics are estimated close to published values. These findings were also confirmed in a simulation study by fitting the model to synthetic data generated by known parameters (Figure 1).

Variability in key parameters suggests a heterogeneity in the traits that are encoded by these parameters. Heterogeneous traits imply intra-specific variability which may enable populations to escape perturbations and to persist in the presence of strong stressors (Reusch et al., 2005; Bell and Gonzalez, 2009; Chevin et al., 2010). Bayesian parameter inference provides uncertainty estimates on key parameters and thereby allows such variability to be detected in experiments. We propose that this technique may also help to quantify the presence of trait heterogeneity in nature.

Author Contributions

BR, UG, and MR conceived the ideas, GW and GF contributed the data, BR developed the methods, BR and MR led the writing of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.

Funding

BR gratefully acknowledges the support of the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig funded by the German Research Foundation (FZT 118). MR was supported by the German Research Foundation within the Priority Programme 1704 (DynaTrait) by grants WA 2445/11-1 and GA 401/26-1.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We thank Bernd Blasius for his contribution to the acquisition and provision of the data.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00234/full#supplementary-material

References

Abrams, P. A. (1999). Is predator-mediated coexistence possible in unstable systems? Ecology 80, 608–621. doi: 10.2307/176639