AUTHOR=Alam Md. Zahangir , Simonetti Albino , Brillantino Raffaele , Tayler Nick , Grainge Chris , Siribaddana Pandula , Nouraei S. A. Reza , Batchelor James , Rahman M. Sohel , Mancuzo Eliane V. , Holloway John W. , Holloway Judith A. , Rezwan Faisal I. TITLE=Predicting Pulmonary Function From the Analysis of Voice: A Machine Learning Approach JOURNAL=Frontiers in Digital Health VOLUME=4 YEAR=2022 URL=https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2022.750226 DOI=10.3389/fdgth.2022.750226 ISSN=2673-253X ABSTRACT=Introduction

Existing methods for self-monitoring asthma symptoms (e.g. peak flow meter, smart spirometer) require special equipment and are not always used by patients. Voice recording has the potential to generate surrogate measures of lung function. This study aims to apply machine learning approaches to predict lung function, and the severity of abnormal lung function, from the recorded voice of asthma patients.

Methods

A threshold-based mechanism was designed to separate speech and breathing from 323 recordings. Features extracted from these were combined with biological factors to predict lung function. Three types of predictive model were developed using Random Forest (RF), Support Vector Machine (SVM), and linear regression algorithms: (a) regression models to predict lung function, (b) multi-class classification models to predict severity of lung function abnormality, and (c) binary classification models to predict lung function abnormality. Training and test samples were separated (70%:30%, using balanced partitioning), features were normalised, 10-fold cross-validation was used, and model performance was evaluated on the test samples.
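The abstract does not say which quantity the threshold was applied to, so the following is only a minimal sketch of one common way to do threshold-based segmentation: it assumes short-time energy over non-overlapping frames. The function names (`frame_energies`, `label_frames`), the frame length, the threshold value, and the synthetic signal are all hypothetical and are not taken from the study.

```python
import math

def frame_energies(samples, frame_len):
    """Return the mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def label_frames(samples, frame_len, threshold):
    """Label a frame 'speech' if its energy exceeds the threshold, else 'breath'.

    Assumption: louder frames are speech and quieter frames are breathing.
    The study's actual criterion is not given in the abstract.
    """
    return ["speech" if e > threshold else "breath"
            for e in frame_energies(samples, frame_len)]

# Synthetic stand-in for a recording: a loud segment followed by a quiet one.
loud = [0.8 * math.sin(2 * math.pi * 5 * n / 100) for n in range(200)]
quiet = [0.05 * math.sin(2 * math.pi * 5 * n / 100) for n in range(200)]
labels = label_frames(loud + quiet, frame_len=100, threshold=0.01)
print(labels)
```

In practice the threshold would need tuning per recording setup, and a real pipeline would likely smooth the frame labels before extracting features from each segment.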

Results

The RF-based regression model performed best, with the lowest root mean square error of 10.86. To predict severity of lung function impairment, the SVM-based model performed best in multi-class classification (accuracy = 73.20%), whereas the RF-based model performed best in binary classification of abnormal lung function (accuracy = 85%).
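For readers unfamiliar with the two metrics reported above, the sketch below shows how root mean square error (for the regression models) and accuracy (for the classification models) are conventionally computed. The helper names and all numeric values and labels are made up for illustration; they are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Illustrative values only (hypothetical lung function scores and labels).
observed = [80.0, 95.0, 60.0, 72.0]
predicted = [83.0, 91.0, 60.0, 72.0]
true_labels = ["normal", "abnormal", "abnormal", "normal"]
pred_labels = ["normal", "abnormal", "normal", "normal"]

print(rmse(observed, predicted))
print(accuracy(true_labels, pred_labels))
```

A lower RMSE indicates predictions closer to the measured values, whereas a higher accuracy indicates more test samples assigned to the correct class.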

Conclusion

Our machine learning approaches can predict lung function from recorded voice files better than published approaches. This technique could be used to develop future telehealth solutions, including smartphone-based applications, that have the potential to aid decision making and self-monitoring in asthma.