- 1 Medical Biometry and Epidemiology, Faculty of Health/School of Medicine, Witten/Herdecke University, Witten, Germany
- 2 Biostatistics and Medical Biometry, Medical School OWL, Bielefeld University, Bielefeld, Germany
During the SARS-CoV-2 pandemic, the effective reproduction number (R-eff) has frequently been used to describe the course of the pandemic, yet the analytical properties of R-eff are rarely studied. We examine analytically how and under which conditions the conventional susceptible–infected–removed (SIR) model (without infection age) serves as an approximation to the infection-age-structured SIR model. Special emphasis is given to the role of R-eff, which is an implicit parameter in the infection-age-structured SIR model and an explicit parameter in the approximation. The analytical findings are illustrated by a simulation study of a hypothetical intervention during a SARS-CoV-2 outbreak and by historical data from an influenza outbreak in Prussian army camps in the region of Arnsberg (Germany), 1918–1919.
1 Introduction
The susceptible–infected–removed (SIR) model is a frequently used model in infectious disease epidemiology dating back at least to Kermack and McKendrick (1). In the SIR model, the population is partitioned into susceptible, infected, and removed (the initial letters of which give the model’s name “SIR”). To take into account varying transmissibility during the infectious period, a generalization of the conventional SIR model, known as the infection-age-structured SIR model, is sometimes considered. Both models are described by a set of differential equations. While the conventional SIR model is easy to understand and frequently used, the infection-age-structured SIR model is slightly more complex. In this work, we seek an approximation that simplifies the differential equations of the infection-age-structured SIR model. In both models, conventional and infection-age-structured SIR, the removed state comprises people who have left the infected state by recovery or death. The numbers of people in the susceptible and the removed states at time $t$ are denoted by $S(t)$ and $R(t)$, respectively.
The conventional SIR model is described in the next section. We start with the infection-age-structured SIR model, where the function $i(t, \tau)$ denotes the density of infected people at time $t$ and duration $\tau$ since infection (i.e., the infection age). The number of infected at time $t$, denoted by $I(t)$, is

$$I(t) = \int_0^\infty i(t, \tau)\, \mathrm{d}\tau. \qquad (1)$$
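The integral over infection age in Equation 1 can be approximated numerically once a density is available on a grid. The following is a minimal sketch, assuming a trapezoidal rule on a truncated infection-age range; the function names and the exponential toy density are hypothetical and are not taken from the article.

```python
import math

def number_infected(density, tau_max, n_steps=2000):
    """Approximate I(t) = integral_0^tau_max i(t, tau) d tau (trapezoidal rule).

    `density` is a function of the infection age tau for one fixed time t.
    Truncating at tau_max assumes the density is negligible beyond it.
    """
    h = tau_max / n_steps
    total = 0.5 * (density(0.0) + density(tau_max))
    for k in range(1, n_steps):
        total += density(k * h)
    return total * h

def toy_density(tau):
    # Hypothetical density: 1000 new infections per day, removed at rate 0.2 per day.
    return 1000.0 * math.exp(-0.2 * tau)

infected_now = number_infected(toy_density, tau_max=100.0)
# Analytically, the untruncated integral equals 1000 / 0.2 = 5000.
```

The truncation mirrors the later argument that there are no infected people with infinite infection age, so a finite upper bound is sufficient.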
The transmission rate of the infected people with infection age $\tau$ (i.e., the duration since infection) is $\beta(t, \tau)$ and the removal rate from the infectious stage is $\gamma$. The rate $\gamma$ comprises mortality as well as remission. According to Inaba (2), we can formulate the model equations for the infection-age-structured SIR model as follows:
The incidence rate in Equation 2 is given by
which is usually called the force of infection (2).
Systems 2–4 are accompanied by the following initial and boundary conditions:
where is assumed to be positive and is assumed to be non-negative and integrable. Apart from non-negativity, can have any distribution. For later use, we additionally assume that . Condition 9 is called coupling equation and guarantees that Systems 2–4 is well-defined [see Chen et al. (3) for details]. Note that Systems 2–4 are a generalization of the SEIR model (2), where SEIR means a model consisting of the states susceptible, exposed, infected, and removed. Detailed discussion of Equations 2–4 with initial Conditions 6–9 can be found in Inaba (2), such that we can be brief here.
Denoting the probability of still being infected at infection age with ,
the effective reproduction number is defined by (4, Equations 22 and 23)
Note that the reproduction number defined in Equation 11 is occasionally called the instantaneous reproduction number (5).
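To make the definition concrete, the sketch below evaluates an effective reproduction number numerically. It assumes the form used by Nishiura and Chowell (4), $R_{\mathrm{eff}}(t) = S(t) \int_0^\infty \beta(t, \tau)\, \Gamma(\tau)\, \mathrm{d}\tau$, with $\Gamma(\tau) = e^{-\gamma \tau}$ for a constant removal rate $\gamma$. This form, the function names, and all parameter values are assumptions made for illustration only.

```python
import math

def effective_reproduction_number(susceptible, beta, gamma, t,
                                  tau_max=100.0, n_steps=10000):
    """Trapezoidal approximation of S(t) * integral beta(t, tau) * Gamma(tau) d tau.

    Assumes a constant removal rate gamma, so that the probability of still
    being infected at infection age tau is Gamma(tau) = exp(-gamma * tau).
    """
    h = tau_max / n_steps

    def integrand(tau):
        return beta(t, tau) * math.exp(-gamma * tau)

    total = 0.5 * (integrand(0.0) + integrand(tau_max))
    for k in range(1, n_steps):
        total += integrand(k * h)
    return susceptible * total * h

def constant_beta(t, tau):
    # Hypothetical transmission rate, independent of t and tau.
    return 0.0006

r_eff = effective_reproduction_number(1000.0, constant_beta, gamma=0.2, t=0.0)
# For constant beta and gamma the integral is S * beta / gamma = 3.0.
```

With a transmission rate that does not depend on infection age, the result reduces to the familiar ratio of transmission to removal, scaled by the number of susceptibles.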
A typical situation in infectious disease epidemiology is that the transmission rate $\beta$ and the initial Conditions 6–9 are given. Then, Systems 2–4 are solved and the effective reproduction number $R_{\mathrm{eff}}$ is calculated by Equation 11. In this sense, $R_{\mathrm{eff}}$ is an implicit (or indirect) parameter of the infection-age-structured SIR model: it is obtained via Equation 11 only after the governing Equations 2–4 have been solved with initial Conditions 6–9. In some situations, however, $R_{\mathrm{eff}}$ can be estimated more easily from population surveys than the transmission rate $\beta$. In particular, in early phases of an outbreak of a new pathogen, the function $\beta$ is frequently unknown. The question then arises whether and how the infection-age-structured SIR model can be solved if the effective reproduction number $R_{\mathrm{eff}}$ is given instead of $\beta$. In this case, we ask for a direct (or explicit) dependency of the differential equations on the parameter $R_{\mathrm{eff}}$.
2 Approximation of the age-structured SIR model by the conventional SIR model
In case the transmission rate $\beta$ depends only on calendar time $t$, i.e., $\beta(t, \tau) = \beta(t)$, the force of infection can be written as
Then, Systems 2–4 become explicitly dependent on $R_{\mathrm{eff}}$. This means that for given $R_{\mathrm{eff}}$, the system can be solved numerically, for instance, by the algorithm described in the supplement to Brinks et al. (6). This is advantageous in situations when the effective reproduction number is known while the transmission rate is unknown. Note that there are a variety of methods for estimating $R_{\mathrm{eff}}$ from a time series of numbers of incident cases; see, e.g., Fraser and Galvani (5) and Cori et al. (7). The question arises under which conditions Systems 2–4 can be approximated by the simpler conventional SIR model that explicitly depends on $R_{\mathrm{eff}}$. As shown at the beginning of this section, this is the case if the transmission rate is independent of the infection age $\tau$, i.e., $\beta(t, \tau) = \beta(t)$. For many diseases, however, there is a non-negligible dependency on the infection age $\tau$, as, for example, in SARS-CoV-2; see He et al. (8).
The conventional SIR model is a simpler model than the infection-age-structured SIR. The meaning of the variables is the same as above; however, the infection age is not considered. The governing equations of the conventional SIR model are as follows:
To show similarities between the two SIR models, we start by applying Leibniz’s integral rule to Equation 1. The temporal derivative of the number of infected in Equation 1 can then be expressed as
As (by the assumption above) and Equation 15 reads as
It is reasonable to assume that the integral in Equation 16 has a finite upper bound , because there are no infected people with infinite infection age As , the Mean Value Theorem for Definite Integrals (9) guarantees the existence of such that
Equation 17 is the same as Equation 13 with which gives an indication that Equation 3 from the infection-age-structured SIR model can indeed be approximated by Equation 13 from the conventional SIR model.
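The steps described above can be written out as a worked derivation. This is a sketch under stated assumptions: that Equation 3 has the standard McKendrick form $(\partial_t + \partial_\tau)\, i = -\gamma\, i$ with a continuous removal rate $\gamma(\tau)$, that the boundary condition is $i(t, 0) = \lambda(t) S(t)$ with force of infection $\lambda$, and that $i(t, \tau) \to 0$ as $\tau \to \infty$. The symbols $\lambda$ and $\xi$ are notation chosen here, and $\xi$ may depend on $t$.

```latex
\begin{align*}
\frac{\mathrm{d}I}{\mathrm{d}t}
  &= \int_0^\infty \frac{\partial i}{\partial t}(t,\tau)\,\mathrm{d}\tau
   = -\int_0^\infty \frac{\partial i}{\partial \tau}(t,\tau)\,\mathrm{d}\tau
     - \int_0^\infty \gamma(\tau)\, i(t,\tau)\,\mathrm{d}\tau \\
  &= i(t,0) - \lim_{\tau\to\infty} i(t,\tau)
     - \int_0^\infty \gamma(\tau)\, i(t,\tau)\,\mathrm{d}\tau \\
  &= \lambda(t)\, S(t) - \gamma(\xi)\, I(t).
\end{align*}
```

The first equality is Leibniz's integral rule with fixed limits, the second line integrates the infection-age derivative, and the last line uses the Mean Value Theorem for Definite Integrals with the non-negative weight $i(t, \tau)$.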
If it holds true that
we can reformulate Equation 13 with an explicit dependency on . To see this, we assume that Equation 18 holds true and find
With the usual smoothness assumptions, Equation 19 has the unique solution
where (note that was assumed to be integrable). Equation 19 (and equivalently Equation 20) directly relates the number of infected people to the effective reproduction number .
Now, we have to examine the conditions such that Equation 18 at least approximately holds true. As is non-negative, the Mean Value Theorem for Definite Integrals applied to the left-hand side of Equation 18 reads as
for .
On the right-hand side of Equation 18, we have
where we assumed that has a compact support and .
By comparing Equations 21 and 22, we see that and imply the desired equality . Hence, if is close to and is close to , we can expect that Equation 20 is a reasonable approximation for the number of infected in the age-structured SIR model.
Equation 19 has the important advantage of being a linear ordinary differential equation with the analytic general solution given by Equation 20. Given the remission rate $\gamma$ and the effective reproduction number $R_{\mathrm{eff}}$, the solution $I$ can be calculated at least numerically, for example, by Romberg integration (10). For a positive removal rate, Equation 19 (or equivalently Equation 20) yields a very simple justification of the epidemiological commonplace that the number of infected people is increasing over time if and only if the effective reproduction number is greater than 1.
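The following sketch illustrates how a solution of this kind can be evaluated on a time grid. It assumes that Equation 19 has the form $\mathrm{d}I/\mathrm{d}t = \gamma\,(R_{\mathrm{eff}}(t) - 1)\, I(t)$ with constant $\gamma$, which is consistent with the increase criterion stated above, so that $I(t) = I_0 \exp\left(\gamma \int_0^t (R_{\mathrm{eff}}(s) - 1)\, \mathrm{d}s\right)$. A cumulative trapezoidal rule is used here in place of Romberg integration, and the function name and all numbers are hypothetical.

```python
import math

def infected_from_reff(times, reff, gamma, i0):
    """Evaluate I(t_k) = i0 * exp(integral_0^{t_k} gamma * (reff(s) - 1) ds).

    `times` and `reff` are equally long lists; the integral is accumulated
    with the trapezoidal rule between consecutive grid points.
    """
    values = [i0]
    exponent = 0.0
    for k in range(1, len(times)):
        h = times[k] - times[k - 1]
        left = gamma * (reff[k - 1] - 1.0)
        right = gamma * (reff[k] - 1.0)
        exponent += 0.5 * h * (left + right)
        values.append(i0 * math.exp(exponent))
    return values

times = [float(day) for day in range(11)]

# R_eff constantly above 1: the number of infected grows exponentially.
growing = infected_from_reff(times, [1.5] * len(times), gamma=0.2, i0=100.0)

# R_eff constantly below 1: the number of infected declines.
declining = infected_from_reff(times, [0.5] * len(times), gamma=0.2, i0=100.0)
```

For constant $R_{\mathrm{eff}}$ the trapezoidal rule is exact, so `growing[-1]` equals $100\, e^{0.2 \cdot 0.5 \cdot 10} = 100\, e$ up to rounding, which gives a quick sanity check before using time-varying values.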
3 Simulation: lockdown during the SARS-CoV-2 pandemic
To demonstrate how well the approximation given by Equation 19 (or equivalently by Equation 20) describes the number of infected people in the infection-age-structured SIR model, we use a simulation motivated by the SARS-CoV-2 pandemic. During the pandemic, many governments decided to invoke public health interventions to control the spread of the virus. Brinks et al. (6) simulated three consecutive periods of the epidemic in a hypothetical population. A phase of increasing numbers of infections is followed by a phase in which a (strict) lockdown is implemented. During the third phase (post-lockdown), the pandemic remains controlled. The 5-day period following the start of the lockdown was chosen as the wash-in phase. The rationale for the wash-in phase is that public health interventions usually require some time before taking full effect (11). After this wash-in period, we assumed that the effect of the lockdown remains unaltered until the end of the simulation at day 60.
The specific numbers for solving Equations 2–4 with initial Conditions 6–9, including their justifications, are given in Brinks et al. (6). The supplement of Brinks et al. (6) also contains a description of the numerical solution of Equations 2–4 with initial Conditions 6–9 on an equidistant grid. Apart from the setting of the simulation and its results, Brinks et al. (6) has little overlap with the work presented here.
For the simulation done here, we calculate the incidence density on an equidistant grid of time points (in days). The transmission rate is assumed to be the product of two functions. Note that the factorization into two factors is not a necessary condition and has been chosen for ease of representation; for details, refer to Brinks et al. (6).
The removal rate $\gamma$ is assumed to be constant. Applying Romberg integration (10) to Equation 11 yields the effective reproduction number $R_{\mathrm{eff}}$ as depicted in Figure 1 [details can be found in Brinks et al. (6) and the associated supporting information S1]. We see that the lockdown quickly reduces $R_{\mathrm{eff}}$ to values below 1, which is intended to control the spread of the disease. The steep decline of $R_{\mathrm{eff}}$ is consistent with Fraser’s description of an abrupt switch from a high to a low value due to an effective intervention (5). Note that we do not consider the kind (or effectiveness) of the interventions. Here, it is only important to simulate a realistic reduction of $R_{\mathrm{eff}}$ (at least in magnitude).
Figure 1. Reproduction number (ordinate) over calendar time t (abscissa, in days) in the simulation [cf. Figure 5 in Supporting Information S1 of Brinks et al. (6)].
So far, we have only used the theory of the infection-age-structured SIR model. To see if the approximation by Equation 20 (derived from the conventional SIR model) yields reasonable results, we evaluate Equation 20 with the calculated $R_{\mathrm{eff}}$ values (as shown in Figure 1). For the number of infected people $I(t)$ over time $t$, we obtain the black graph presented in Figure 2. For comparison, the exact $I(t)$ as calculated by Equation 1 (from the infection-age-structured SIR model) is shown as a blue curve. Periods of increasing and decreasing numbers of infected people coincide quite well in both curves. The approximated numbers (the black curve) deviate from the exact values (the blue curve) by less than 10% between days 0 and 45. After day 45, when the number of infected people is already strongly decreasing, the approximated values overestimate the true values considerably (by up to 67% at day 60).
Figure 2. Number of infected people over calendar time in the simulation. The blue curve corresponds to the exact solution (Equation 1) while the black curve is the approximation via Equation 20.
4 Application: pandemic influenza in Prussian army camps around Arnsberg (Germany)
In 2007, Nishiura estimated the number of incident cases of influenza in Prussian army camps in the region of Arnsberg (Germany) from reported deaths between September 1918 and January 1919 (12). The estimated numbers of incident cases were extracted from Nishiura (12) and are published in Brinks (13). Nishiura, and later Nishiura and Chowell, then estimated the effective reproduction numbers from the case counts for a period of 140 days starting from 9 September 1918 (4, 12). The estimation method requires the length of the generation time of the virus, which is unknown. To overcome this problem, Nishiura and Chowell used three scenarios with different generation times. We confine ourselves to a coarse schematic description of the temporal course of $R_{\mathrm{eff}}$ shown in Figure 3. After the start of the outbreak, the effective reproduction number decreases from about 1.8 to 0.7 on day 90, increases to 1 on day 100, and decreases again after that. Using these values, we could reconstruct the estimated numbers of infected people by Equation 20. The result is shown in Figure 4. The black curve corresponds to the approximation according to Equation 20 with constant removal rate $\gamma$. The number of estimated cases is shown as a blue curve. As in the previous section about the simulation, the fit between the curves is reasonably good.
Figure 3. Approximate effective reproduction number after 9 September 1918 (in days) for an influenza outbreak in Prussian army camps.
Figure 4. Number of infected people (ordinate) over calendar time in Prussian army camps in the region of Arnsberg (Germany). The blue curve corresponds to estimated numbers according to Nishiura (12) and the black curve is the approximation via Equation 20.
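The schematic course of the effective reproduction number described in this section can be turned into a curve of infected people as sketched below. The sketch assumes that Equation 19 has the form $\mathrm{d}I/\mathrm{d}t = \gamma\,(R_{\mathrm{eff}}(t) - 1)\, I(t)$ with constant $\gamma$ and that the schematic is piecewise linear between the values quoted in the text. The last knot (day 140), the removal rate, and the initial number of infected are hypothetical placeholders, not values from the article or from Nishiura (12).

```python
import math

# Knots (day, R_eff): 1.8 at the start, 0.7 on day 90, 1.0 on day 100.
# The final knot is a hypothetical stand-in for "decreases after that again".
KNOTS = [(0.0, 1.8), (90.0, 0.7), (100.0, 1.0), (140.0, 0.8)]

def reff_schematic(t):
    """Piecewise-linear interpolation of the schematic R_eff between the knots."""
    for (t0, r0), (t1, r1) in zip(KNOTS, KNOTS[1:]):
        if t0 <= t <= t1:
            return r0 + (r1 - r0) * (t - t0) / (t1 - t0)
    raise ValueError("t outside the schematic time range")

GAMMA = 1.0 / 3.0  # hypothetical constant removal rate per day
I0 = 100.0         # hypothetical number of infected on day 0

days = list(range(141))
infected = [I0]
exponent = 0.0
for k in range(1, len(days)):
    left = GAMMA * (reff_schematic(days[k - 1]) - 1.0)
    right = GAMMA * (reff_schematic(days[k]) - 1.0)
    exponent += 0.5 * (left + right)  # trapezoidal rule, step size of one day
    infected.append(I0 * math.exp(exponent))

# Day on which the approximated number of infected is largest.
peak_day = max(days, key=lambda d: infected[d])
```

Under these assumptions the peak falls where the schematic crosses 1 on its first descent, independent of the chosen positive removal rate; the removal rate only changes the height of the curve.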
5 Discussion
In this article, we demonstrated that the number of infected people in the infection-age-structured SIR model without demography of the background host can be approximated by a simpler differential equation based on the conventional SIR model. While the dependency on the effective reproduction number in the infection-age-structured SIR model is implicit, the simpler differential equation makes the dependency explicit. This is advantageous for calculating the number of infected people $I$ in situations when the effective reproduction number is given, for example, by surveys or surveillance.
In a simulated example about a hypothetical lockdown during the SARS-CoV-2 pandemic, we could compare the exact number of infected people (calculated by the infection-age-structured SIR model) with the approximation during a 60-day period. Qualitatively, the two epidemic curves agree reasonably well over the whole simulated period. During the first 45 days, the relative difference between the epidemic curves is below 10%. As the number of infected people decreases, the relative difference increases rapidly and reaches about 70% at the end of the 60-day period. Of course, this is also an effect of the relatively low absolute numbers of infected people at the end of the simulation.
Finally, we applied the approximation to real-world data from an influenza outbreak in Prussian army camps at the end of the First World War (1918–1919). We could compare the estimated number of infected people with the approximated number based on the $R_{\mathrm{eff}}$ values. Again, the two epidemic curves qualitatively agree well.
In other prominent sources, the effective reproduction number is defined via a renewal equation [see, e.g., Nishiura (12)] or as the spectral radius of the next generation operator (14). However, our approach is based on calculus and does not need these more sophisticated mathematical concepts. Thus, we believe that the important concept of the effective reproduction number might be accessible and useful to a broader audience.
During epidemic situations, the effective reproduction number is frequently estimated from the number of reported cases (5). Recently, we have shown that this estimation is stable in case of incomplete case detection (6). At the moment, we do not have any indication that Equations 19 or 20 provide a better alternative for estimating $R_{\mathrm{eff}}$. The benefit of Equations 19 or 20 lies in the fact that an estimated $R_{\mathrm{eff}}$ allows calculation of the true number of infected people $I$, which might be underestimated due to incomplete case detection or under-reporting. Whether to use Equation 19 or, equivalently, Equation 20 depends on the specific application and the preference of the user. We have chosen to use Romberg integration for Equation 20, because error estimates are easier to obtain than for numerical solutions to Equation 19 (10, p. 335).
To our knowledge, this is the first time that the dependency of $I$ on the effective reproduction number is made explicit by an approximation using a differential equation in the infection-age-structured SIR model. For the conventional SIR model, which is a special case of the infection-age-structured SIR model, a similar result has been found in Bettencourt and Ribeiro (15, Equation 2). For a recent study about approximating SARS-CoV-2 with the conventional SIR model, we refer the reader to Prodanov (16). Usually, the effective reproduction number is defined in terms of the variables in Equations 2–4, as in Equation 11. In the literature, we frequently find a definition that describes how $R_{\mathrm{eff}}$ depends on the basic reproduction number and on variables defined above [see, e.g., Vynnycky and White (17)]. Conversely, Equations 19 and 20 describe how the number of infected $I$ depends on $R_{\mathrm{eff}}$ for the infection-age-structured SIR model. Hence, Equations 19 and 20 run in the opposite direction to the usual one. In our work here, we could generalize the findings of Bettencourt and Ribeiro (15) for the conventional SIR model to the infection-age-structured SIR model. We note that SIR models are not restricted to epidemiology but can also be used to examine the spread of rumors and news (18). In the context of news, the infection age refers to the time elapsed since a recipient learned of the new information.
By the differential Equation 19, and similarly by Bettencourt and Ribeiro (15, Equation 2), the common interpretation of $R_{\mathrm{eff}}$ as an indicator of whether the number of infected people increases ($R_{\mathrm{eff}} > 1$) or not is justified in a very simple and straightforward way. While the approximation works reasonably well in the examples about SARS-CoV-2 and influenza shown here, there might be diseases for which a higher accuracy is required. The approximation by Equations 19 or 20 might not work well in all cases or for other parameter constellations. To gain insight into these constellations, an extensive simulation study is necessary, which is beyond the scope of this article and is subject to future work. As long as error estimates are not known, Equation 19 (or equivalently Equation 20) should be used with careful consideration.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Ethics statement
Ethical approval was not required for the study involving humans in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and the institutional requirements.
Author contributions
RB: Conceptualization, Data curation, Formal Analysis, Methodology, Project administration, Resources, Software, Supervision, Writing – original draft, Writing – review & editing. AH: Supervision, Validation, Writing – review & editing.
Funding
The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Kermack W, McKendrick A. Contributions to the mathematical theory of epidemics. Proc R Soc A. (1927) 115A:700–21. doi: 10.1098/rspa.1927.0118
2. Inaba H. Age-Structured Population Dynamics in Demography and Epidemiology. Singapore: Springer (2017).
3. Chen Y, Zhou S, Yang J. Global analysis of an SIR epidemic model with infection age and saturated incidence. Nonlinear Anal Real World Appl. (2016) 30:16–31. doi: 10.1016/j.nonrwa.2015.11.001
4. Nishiura H, Chowell G. The effective reproduction number as a prelude to statistical estimation of time-dependent epidemic trends. In: Chowell G, Hyman JM, Bettencourt LMA, Castillo-Chavez C, editors. Mathematical and Statistical Estimation Approaches in Epidemiology. Dordrecht: Springer (2009).
5. Fraser C, Galvani A. Estimating individual and household reproduction numbers in an emerging epidemic. PLoS One. (2007) 2:e758. doi: 10.1371/journal.pone.0000758
6. Brinks R, Küchenhoff H, Timm J, Kurth T, Hoyer A. Epidemiological measures for assessing the dynamics of the SARS-COV-2-outbreak: simulation study about bias by incomplete case-detection. PLoS One. (2022) 17:1–10. doi: 10.1371/journal.pone.0276311
7. Cori A, Ferguson N, Fraser C, Cauchemez S. A new framework and software to estimate time-varying reproduction numbers during epidemics. Am J Epidemiol. (2013) 178:1505–12. doi: 10.1093/aje/kwt133
8. He X, Lau EH, Wu P, Deng X, Wang J, Hao X, et al. Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat Med. (2020) 26:672–5. doi: 10.1038/s41591-020-0869-5
11. Montcho Y, Klingler P, Lokonon BE, Tovissodé CF, Glèlè Kakaï R, Wolkewitz M. Intensity and lag-time of non-pharmaceutical interventions on COVID-19 dynamics in German hospitals. Front Public Health. (2023) 11:1087580. doi: 10.3389/fpubh.2023.1087580
12. Nishiura H. Time variations in the transmissibility of pandemic influenza in Prussia, Germany, from 1918–19. Theor Biol Med Model. (2007) 4:20. doi: 10.1186/1742-4682-4-20
13. Brinks R. Estimated numbers of incident cases of pandemic influenza in Prussian army camps around Arnsberg (Germany) 1918/19 [Dataset]. Zenodo (2023). doi: 10.5281/zenodo.10380474
14. Okuwa K, Inaba H, Kuniya T. Mathematical analysis for an age-structured SIRS epidemic model. Math Biosci Eng. (2019) 16:6071–102. doi: 10.3934/mbe.2019304
15. Bettencourt LMA, Ribeiro RM. Real time Bayesian estimation of the epidemic potential of emerging infectious diseases. PLoS One. (2008) 3:1–9. doi: 10.1371/journal.pone.0002185
16. Prodanov D. Computational aspects of the approximate analytic solutions of the SIR model: applications to modelling of COVID-19 outbreaks. Nonlinear Dyn. (2023) 111:15613–31. doi: 10.1007/s11071-023-08656-8
Keywords: effective reproduction number, net reproduction number, influenza, SARS-CoV-2, Lexis diagram, Spanish flu
Citation: Brinks R and Hoyer A (2024) Approximation of the infection-age-structured SIR model by the conventional SIR model of infectious disease epidemiology. Front. Epidemiol. 4:1429034. doi: 10.3389/fepid.2024.1429034
Received: 7 May 2024; Accepted: 2 December 2024; Published: 17 December 2024.
Edited by:
Rosie Cornish, University of Bristol, United Kingdom
Reviewed by:
Dimiter Prodanov, Interuniversity Microelectronics Centre (IMEC), Belgium
Gilberto Gonzalez-Parra, New Mexico Tech, United States
Copyright: © 2024 Brinks and Hoyer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Ralph Brinks