EDITORIAL article

Front. Physiol., 05 July 2022
Sec. Computational Physiology and Medicine
This article is part of the Research Topic Integration of Machine Learning and Computer Simulation in Solving Complex Physiological and Medical Questions

Editorial: Integration of Machine Learning and Computer Simulation in Solving Complex Physiological and Medical Questions

  • 1Department of Surgery, University of Vermont Larner College of Medicine, Burlington, VT, United States
  • 2Department of Otorhinolaryngology Head and Neck Surgery, Medical School, Division of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
  • 3School of Communication Sciences and Disorders, McGill University, Montreal, QC, Canada
  • 4Department of Otolaryngology-Head and Neck Surgery, McGill University, Montreal, QC, Canada
  • 5Department of Biomedical Engineering, McGill University, Montreal, QC, Canada

Background

This Research Topic, “Integration of Machine Learning and Computer Simulation in Solving Complex Physiological and Medical Questions”, brings together two powerful computational approaches to investigate complex disease processes: the use of high-fidelity, mechanism-based simulation models (MSMs), and the training of artificial neural networks (ANNs) via machine learning (ML) and artificial intelligence (AI). These two approaches represent distinct aspects of the scientific process: ML/AI involves correlation identification and hypothesis generation, whereas MSMs provide an in silico means for hypothesis testing and conceptual model verification. Their capabilities can complement each other and address each other’s limitations. High-fidelity MSMs can contain very large numbers of parameters, which poses challenges to effective parameterization and/or parameter space exploration, and can make simulation experiments prohibitively expensive to compute. Conversely, ML/AI approaches are notoriously data-hungry (a considerable issue when dealing with biological data sets, which are generally orders of magnitude sparser than those in other ML applications), are highly limited in terms of testing inferred causal relationships, and are often “black boxes” in terms of interpreting why the ANNs do what they do. The papers collected here integrate MSM and ML in a complementary fashion. We have organized them into the following general classes of investigation.

Applications of Integrated ML and MSM in Personalized Medicine

The ostensible goal of the practice of medicine is to treat sick individuals with the right drug at the right time, and to have such a treatment regimen available for every sick patient. MSMs can serve as “digital twins” of individual patients and provide a means of virtually forecasting their future disease course, or, with future developments, aid in personalizing potential therapies. Implicit in this process is the need to capture disease trajectories over time (i.e., integrating time series data), which challenges data-hungry pure ML approaches, but also requires tuning a simulation model to a specific person’s “parameters.” Kuruvila et al. combined convolutional neural network (CNN) and long short-term memory (LSTM) models to infer a listener’s auditory attention in noisy acoustic environments. CNNs were trained with experimental electroencephalography (EEG) data and speech spectrograms from speakers. The CNN outputs were then passed to a bidirectional LSTM, which classified the listener’s auditory attention to speakers. Their results supported the integration of listener-specific EEG signals into ML-powered hearing aids that will help listeners attend to speech signals in noisy scenarios. Schafer et al. applied physics-based network diffusion models to simulate the propagation of misfolded tau proteins in three brain regions of patients with Alzheimer’s disease. Hierarchical Bayesian inference models were used to obtain posterior probability distributions for two personalized model parameters, namely, the diffusion coefficient and production rate of tau proteins. Personalized models of tau pathology capable of predicting tau evolution and the associated cognitive function would be of great use in creating virtual patient controls for clinical trials. van Duuren et al. combined a bi-objective evolutionary algorithm (EA) with an established microsimulation model for personalized colorectal cancer screening. The EA was used to find personalized screening policies that minimize costs while maximizing the number of Quality-Adjusted Life Years gained. Their results supported the use of computer models to guide policy making and implementation of personalized colorectal cancer screening.
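To make the bi-objective idea concrete, the sketch below shows how a set of candidate screening policies can be reduced to its Pareto front, i.e., the policies for which no alternative is both cheaper and yields more Quality-Adjusted Life Years. It is only an illustration of the trade-off that a bi-objective search has to resolve, not the method used by van Duuren et al.; the `Policy` type, the policy names, and all numbers are hypothetical.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    name: str      # hypothetical policy label
    cost: float    # expected cost per person (to be minimized)
    qalys: float   # expected QALYs gained per person (to be maximized)


def dominates(a: Policy, b: Policy) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = a.cost <= b.cost and a.qalys >= b.qalys
    better = a.cost < b.cost or a.qalys > b.qalys
    return no_worse and better


def pareto_front(policies: list[Policy]) -> list[Policy]:
    """Keep only policies that no other policy dominates, sorted by cost."""
    front = [p for p in policies if not any(dominates(q, p) for q in policies)]
    return sorted(front, key=lambda p: p.cost)


# Made-up numbers for illustration only; not taken from any study.
candidates = [
    Policy("no screening", 0.0, 0.00),
    Policy("every 10 years", 400.0, 0.08),
    Policy("every 5 years", 700.0, 0.10),
    Policy("every 5 years, late start", 750.0, 0.09),  # dominated by "every 5 years"
    Policy("every 2 years", 1500.0, 0.11),
]
front = pareto_front(candidates)
```

In a full workflow, the cost and QALY values would come from running the microsimulation model for each candidate policy, and an evolutionary algorithm would propose new candidates rather than enumerating a fixed list.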

Machine Learning as Surrogate Models of Complex Mechanistic Models

Developing “lighter weight” surrogate models of complex MSMs would enhance the computational efficiency of simulation experiments. ANNs, as governed by the Universal Approximation Theorem (Hornik et al., 1989), are able to recapitulate any generative function and are therefore an appealing means of creating surrogate models. Quetzalcóatl Toledo-Marín et al. applied this principle to partial differential equation (PDE) models of biological diffusion. In this case, the surrogate ANN offers a considerable improvement in performance, which allows for both greater complexity of the MSM and more extensive exploration of possible behaviors with simulation experiments. However, some types of MSMs, primarily agent-based models (ABMs), do not have a readily accessible equation form. Larie et al. used ABM simulation data to create a surrogate ANN, but with certain caveats related to properties often found in biomedical ABMs. Firstly, in contrast to deterministic equation-based models, ANN surrogates of ABMs generate not specific trajectories but a probabilistic “cone” of future trajectories (à la hurricane path prediction). As such, any attempt to use such surrogate models needs to account for this projected uncertainty, with updating to produce a rolling forecast horizon. Secondly, the ANN of the ABM also shows the property of path non-uniqueness, which has implications for attempts to “reverse engineer” particular pathway or causal network structures from biological data.
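The “cone” of future trajectories can be illustrated with a minimal sketch. The code below does not train an ANN surrogate; it simply runs an ensemble of a toy stochastic model and summarizes the spread at each time step with percentile envelopes, which is the kind of widening forecast band described above. The model `run_toy_model`, its noise parameters, and the 5th/95th percentile choice are all assumptions made for illustration.

```python
import random
import statistics


def run_toy_model(steps: int, rng: random.Random) -> list[float]:
    """One stochastic trajectory of a toy state variable (hypothetical model)."""
    x, path = 1.0, []
    for _ in range(steps):
        # Noisy multiplicative growth; the noise stands in for ABM stochasticity.
        x *= 1.0 + rng.gauss(0.02, 0.05)
        path.append(x)
    return path


def forecast_cone(n_runs: int, steps: int, seed: int = 0):
    """Return (lower, median, upper): 5th, 50th, and 95th percentiles per time step."""
    rng = random.Random(seed)
    runs = [run_toy_model(steps, rng) for _ in range(n_runs)]
    lower, median, upper = [], [], []
    for t in range(steps):
        values = [run[t] for run in runs]
        cuts = statistics.quantiles(values, n=20)  # 19 cut points at 5% spacing
        lower.append(cuts[0])    # 5th percentile
        median.append(cuts[9])   # 50th percentile
        upper.append(cuts[18])   # 95th percentile
    return lower, median, upper


lower, median, upper = forecast_cone(n_runs=500, steps=30)
widths = [u - l for l, u in zip(lower, upper)]  # cone width at each step
```

A rolling forecast would re-initialize the ensemble from each new observation, so that the cone is repeatedly narrowed at the present and re-projected forward.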

ML-Based Parameter Space Characterization Methods for High-Fidelity MSMs

Complex medical problems require complex solutions. However, there is a tension between models that are simple enough to parameterize readily but do not capture key details necessary for clinical utility, and models that are sufficiently expressive but highly complex, with a host of parameters that may not be accessible experimentally. ML methods have thus been applied to the problem of parameter space characterization and uncertainty quantification (Granato and Li-Jessen, 2020) through Model Exploration (ME) (Ozik et al., 2018) methods, such as Random Forest (Garg et al., 2019). Alarid-Escudero et al. utilized the Extreme-scale Model Exploration with Swift (EMEWS) framework for high performance computing (HPC)-enabled ME to characterize how experimentally unidentifiable parameters affected the performance of microsimulation decision models of the natural history of colorectal cancer (CRC). Leaving these known factors out of a decision model would lead to an intuitively inferior model, and therefore this group used EMEWS to infer regions of identifiable parameter space that produced clinically relevant alterations in the decision model outputs. A different perspective is presented in the paper by Cockrell & An, which uses genetic algorithms (GA) to calibrate a complex ABM. This study introduces a formal mathematical object, the Model Rule Matrix (MRM), intended to account for the inherent “incompleteness” of any mechanism-based simulation model by treating all the possible “missing connections” as model parameters. Therefore, as opposed to “parameter fitting” that attempts to reduce experimental/clinical data variation, this approach expands the range of allowable model parameterizations given real-world observations.
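As a rough illustration of GA-based calibration, the sketch below evolves a population of parameter sets for a toy two-parameter model and then keeps every parameter set whose output falls within a tolerance of the observation, rather than a single best fit. This is a generic genetic algorithm, not the Model Rule Matrix approach of Cockrell & An; the functions `toy_model`, `fitness`, and `calibrate`, the parameter bounds, and the tolerance are all hypothetical.

```python
import random


def toy_model(params: list[float]) -> float:
    """Hypothetical stand-in for an expensive simulation; returns one summary output."""
    a, b = params
    return a * b  # many (a, b) pairs give the same output, so they are not identifiable


def fitness(params: list[float], target: float) -> float:
    """Smaller is better: absolute mismatch between model output and observation."""
    return abs(toy_model(params) - target)


def calibrate(target: float, pop_size: int = 40, generations: int = 60, seed: int = 0):
    """Evolve parameter sets in [0, 5] x [0, 5]; return the final population, best first."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 5.0), rng.uniform(0.0, 5.0)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, target))
        parents = pop[: pop_size // 2]  # truncation selection; parents survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mom, dad)]  # uniform crossover
            # Gaussian mutation, clipped to the allowed parameter range.
            child = [min(5.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in child]
            children.append(child)
        pop = parents + children
    return sorted(pop, key=lambda p: fitness(p, target))


population = calibrate(target=2.0)
best = population[0]
# Keep the whole set of acceptable parameterizations, not just the single best one.
accepted = [p for p in population if fitness(p, 2.0) < 0.1]
```

Retaining the full `accepted` set reflects the point made above: when parameters cannot be pinned down from data, the useful result is a region of plausible parameterizations rather than one fitted point.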

Conclusion

Future work will invariably continue to leverage the strengths of MSMs and ML to offset each other’s inherent limitations. Multiple open challenges remain, two of which we briefly note:

• The use of synthetic data is ubiquitous in most non-biomedical applications of ML/AI, and the need for it is even more pronounced in biomedicine given the relative sparsity of biological data. However, given the universal approximation capabilities of ANNs, care must be taken when generating synthetic biological time series data so that the ANN does not merely “learn” the generative model. Therefore, developing means to “hide” the generative model from the ANN is a crucial area of investigation and development. The paper in this Research Topic by Cockrell & An begins to address this issue.

• One main concern regarding the use of ML/AI in biomedicine is the opacity of these systems. “Explainable” or “interpretable” AI is a key research topic in the general AI community. The use of MSMs to generate synthetic data can aid in addressing the transparency issues, as the MSMs are explicitly transparent (by virtue of their programmed structure) and essentially represent the conceptual-symbolic model that many consider a necessary component of next generation AI systems (Garcez and Lamb, 2020).

We hope that the papers in this Research Topic will help spur additional developments and applications in what we consider to be an essential set of methods to better understand and treat complex medical diseases.

Author Contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

Funding

GA was sponsored in part by the National Institutes of Health Award UO1EB025825. GA is also sponsored by the Defense Advanced Research Projects Agency (DARPA) through Cooperative Agreement D20AC00002 awarded by the United States Department of the Interior (DOI), Interior Business Center. NL-J is supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-03843 and ALLRP 548623-19), Compute Canada, and a Canada Research Chair research stipend. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Garcez A. d. A., Lamb L. C. (2020). Neurosymbolic AI: The 3rd Wave. arXiv preprint arXiv:2012.05876.

Garg A., Yuen S., Seekhao N., Yu G., Karwowski J., Powell M., et al. (2019). Towards a Physiological Scale of Vocal Fold Agent-Based Models of Surgical Injury and Repair: Sensitivity Analysis, Calibration and Verification. Appl. Sci. 9 (15), 2974. doi:10.3390/app9152974

Granato B., Li-Jessen N. Y. (2020). Sensitivity Analysis for Dimensionality Reduction in Agent-Based Modeling. In ECAI 2020. IOS Press 2905–2906.

Hornik K., Stinchcombe M., White H. (1989). Multilayer Feedforward Networks Are Universal Approximators. Neural Netw. 2 (5), 359–366. doi:10.1016/0893-6080(89)90020-8

Ozik J., Collier N. T., Wozniak J. M., Macal C. M., An G. (2018). Extreme-Scale Dynamic Exploration of a Distributed Agent-Based Model with the EMEWS Framework. IEEE Trans. Comput. Soc. Syst. 5 (3), 884–895. doi:10.1109/tcss.2018.2859189

Keywords: machine learning, computer simulation, complex disease, personalized medicine, high fidelity computational method, multi-scale modeling

Citation: An G, Döllinger M and Li-Jessen NYK (2022) Editorial: Integration of Machine Learning and Computer Simulation in Solving Complex Physiological and Medical Questions. Front. Physiol. 13:949771. doi: 10.3389/fphys.2022.949771

Received: 21 May 2022; Accepted: 07 June 2022;
Published: 05 July 2022.

Edited and reviewed by:

Raimond L. Winslow, Northeastern University, United States

Copyright © 2022 An, Döllinger and Li-Jessen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Nicole Y. K. Li-Jessen, nicole.li@mcgill.ca
