AUTHOR=Chen Serena H. , Londoño-Larrea Pablo , McGough Andrew Stephen , Bible Amber N. , Gunaratne Chathika , Araujo-Granda Pablo A. , Morrell-Falvey Jennifer L. , Bhowmik Debsindhu , Fuentes-Cabrera Miguel TITLE=Application of Machine Learning Techniques to an Agent-Based Model of Pantoea JOURNAL=Frontiers in Microbiology VOLUME=12 YEAR=2021 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2021.726409 DOI=10.3389/fmicb.2021.726409 ISSN=1664-302X ABSTRACT=

Agent-based modeling (ABM) is a powerful simulation technique that describes a complex dynamic system in terms of its interacting constituent entities. While the flexibility of ABM enables broad application, the complexity of real-world models demands intensive computing resources and long run times; a metamodel, however, can be constructed to gain insight at lower computational expense. Here, we developed a model in NetLogo to describe the growth of a microbial population consisting of Pantoea. The model is defined by 13 parameters, seven of which we actively varied to observe how the population curve evolves in response. We efficiently performed more than 3,000 simulations using a Python wrapper, NL4Py. Evaluating the correlation between the active parameters and the outputs by random forest regression, we found that the parameters defining the depth of the medium and the glucose concentration affect the population curves significantly. Subsequently, we constructed a metamodel, a dense neural network, to predict the simulation outputs from the active parameters and found that it achieves high prediction accuracy, with a coefficient of determination (R²) of up to 0.92. Our approach of combining ABM with random forest regression and a neural network reduces the number of required ABM simulations. The simplified and refined metamodels may provide insights into the complex dynamic system before the transition to more sophisticated models that run on high-performance computing systems. The ultimate goal is to build a bridge between simulation and experiment, allowing model validation by comparing simulated data to experimental data in microbiology.
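
The workflow summarized above (rank the active parameters with random forest regression, then fit a dense neural network as a metamodel and score it with R²) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, the function `toy_simulation` is a hypothetical stand-in for a NetLogo run, and scikit-learn's `RandomForestRegressor` and `MLPRegressor` are assumed here as convenient implementations of the two techniques. The toy response is built so that the first two of seven inputs dominate by construction.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)


def toy_simulation(params, rng):
    """Hypothetical stand-in for one ABM run (NOT the Pantoea model).

    Maps seven active parameters in [0, 1] to a single scalar output.
    Only the first two parameters matter, by construction.
    """
    return 3.0 * params[0] + 2.0 * params[1] ** 2 + 0.1 * rng.normal()


# Sample the seven active parameters and "run" the toy simulation.
n_runs, n_params = 1000, 7
X = rng.uniform(0.0, 1.0, size=(n_runs, n_params))
y = np.array([toy_simulation(row, rng) for row in X])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: random forest regression to rank parameter influence.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
importances = forest.feature_importances_

# Step 2: a small dense neural network as the metamodel.
metamodel = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
metamodel.fit(X_train, y_train)

# Step 3: score the metamodel on held-out runs with R^2.
r2 = r2_score(y_test, metamodel.predict(X_test))

print("parameter importances:", np.round(importances, 3))
print("metamodel R^2 on held-out runs:", round(r2, 3))
```

Once trained, a metamodel of this kind can be queried in place of further simulation runs, which is the sense in which it lowers the number of ABM simulations needed; the network size and sample count used here are illustrative choices only.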