AUTHOR=Sisimayi Chenjerai, Harley Charis, Nyabadza Farai, Visaya Maria Vivien TITLE=AI-enabled case detection model for infectious disease outbreaks in resource-limited settings JOURNAL=Frontiers in Applied Mathematics and Statistics VOLUME=9 YEAR=2023 URL=https://www.frontiersin.org/journals/applied-mathematics-and-statistics/articles/10.3389/fams.2023.1133349 DOI=10.3389/fams.2023.1133349 ISSN=2297-4687 ABSTRACT=Introduction

The utility of non-contact technologies for screening infectious diseases such as COVID-19 can be enhanced by improving the underlying Artificial Intelligence (AI) models and integrating them into data visualization frameworks. Hybrid AI models that fuse several Machine Learning (ML) models, and so combine the strengths of each, have the potential to detect infectious diseases such as COVID-19 more accurately. Furthermore, integrating other patient data, such as clinical, socio-demographic, economic and environmental variables, with the image data (e.g., chest X-rays) can enhance the detection capacity of these models.

Methods

In this study, we explore the use of chest X-ray data to train an optimized hybrid AI model for screening patients for COVID-19, using a real-world dataset with a limited sample size. We develop a hybrid Convolutional Neural Network (CNN) and Random Forest (RF) model in which image features are extracted through a CNN and an EfficientNet B0 transfer learning model and then passed to an RF classifier. Our approach includes an intermediate step that uses the Boruta algorithm, a wrapper method built around the RF, to select important features and further reduce their number before the RF model is fitted.
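The feature-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy and scikit-learn are available, uses synthetic vectors in place of CNN/EfficientNet B0 image features, and replaces Boruta's statistical test with a simple majority vote over rounds. The function name `shadow_feature_selection` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def shadow_feature_selection(X, y, n_rounds=10, seed=0):
    """Simplified Boruta-style selection (assumption: majority vote, no binomial test).

    A feature is kept if its RF importance beats the best shuffled "shadow"
    feature in more than half of the rounds.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    hits = np.zeros(n_features, dtype=int)
    for r in range(n_rounds):
        # Shadow features: each column shuffled on its own, so any link to y is destroyed.
        shadows = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(n_features)]
        )
        forest = RandomForestClassifier(n_estimators=50, random_state=seed + r)
        forest.fit(np.hstack([X, shadows]), y)
        importances = forest.feature_importances_
        real, shadow = importances[:n_features], importances[n_features:]
        hits += real > shadow.max()
    return hits > n_rounds / 2


# Synthetic stand-in for extracted image features (hypothetical data):
# 200 samples, 20 features, only the first 3 carry signal about the label.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 20))
X[:, :3] += 1.5 * y[:, None]

# Select features on the training portion only, to avoid leaking test information.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]
selected = shadow_feature_selection(X_train, y_train)

# Final RF classifier fitted on the reduced feature set.
final_rf = RandomForestClassifier(n_estimators=100, random_state=0)
final_rf.fit(X_train[:, selected], y_train)
accuracy = final_rf.score(X_test[:, selected], y_test)
print("selected feature indices:", np.flatnonzero(selected))
print("held-out accuracy:", accuracy)
```

In a real pipeline, `X` would hold the feature vectors produced by the image models, and a dedicated Boruta implementation would replace the majority-vote rule used here.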

Results and discussion

The new model achieved an accuracy of 96% and a recall of 96%, and outperformed the base CNN model and four other experimental models that combined transfer learning with alternative options for dimensionality reduction. Its performance is comparable to that of similar models developed previously, which were trained on large datasets drawn from different country contexts. It is also very close to that of the “gold standard” PCR tests, which demonstrates the potential of this approach to efficiently scale up surveillance and screening capacities in resource-limited settings.
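For readers less familiar with the two reported metrics, the short sketch below shows how accuracy and recall are computed from confusion-matrix counts. The counts are hypothetical, chosen only so that both metrics come out at 96%; they are not the study's confusion matrix, and the function name `accuracy_and_recall` is an assumption for illustration.

```python
def accuracy_and_recall(tp, fp, tn, fn):
    """Return (accuracy, recall) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total   # share of all cases classified correctly
    recall = tp / (tp + fn)        # share of true positive cases that were detected
    return accuracy, recall


# Hypothetical counts for 100 screened patients (50 positive, 50 negative).
acc, rec = accuracy_and_recall(tp=48, fp=2, tn=48, fn=2)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")
```

Recall matters most in a screening setting because a missed positive case (a false negative) can continue to spread the infection.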