AUTHOR=Beverin Luka, Topalovic Marko, Halilovic Armin, Desbordes Paul, Janssens Wim, De Vos Maarten TITLE=Predicting total lung capacity from spirometry: a machine learning approach JOURNAL=Frontiers in Medicine VOLUME=10 YEAR=2023 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2023.1174631 DOI=10.3389/fmed.2023.1174631 ISSN=2296-858X ABSTRACT=Background and objective

Spirometry patterns can suggest that a patient has a restrictive ventilatory impairment; however, lung volume measurements such as total lung capacity (TLC) are required to confirm the diagnosis. The aim of the study was to train a supervised machine learning model that can accurately estimate TLC values from spirometry and subsequently identify which patients would most benefit from undergoing a complete pulmonary function test.

Methods

We trained three tree-based machine learning models on 51,761 spirometry data points with corresponding TLC measurements. We then compared model performance using an independent test set consisting of 1,402 patients. The best-performing model was used to retrospectively identify restrictive ventilatory impairment in the same test set. The algorithm was compared against different spirometry patterns commonly used to predict restriction.
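The comparison step described above can be illustrated with a minimal sketch that ranks candidate models by mean squared error on a held-out test set. The model names and all TLC values here are hypothetical placeholders, not the study's models or data. Note that MSE is expressed in squared units (mL²), while its square root (RMSE) is in mL, the same unit as TLC.

```python
from math import sqrt


def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured TLC values in mL (illustration only, not study data).
measured_tlc = [5200.0, 6100.0, 4300.0, 7000.0]

# Hypothetical test-set predictions from three candidate models.
predictions = {
    "model_a": [5000.0, 6500.0, 4000.0, 7300.0],
    "model_b": [5300.0, 6000.0, 4500.0, 6900.0],
    "model_c": [5600.0, 5700.0, 4900.0, 6400.0],
}

# Score each model on the same held-out set and keep the lowest error.
scores = {name: mse(measured_tlc, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)
rmse_best = sqrt(scores[best])  # back in mL

print(best, scores[best], round(rmse_best, 1))
```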

Results

The prevalence of restrictive ventilatory impairment in the test set was 16.7% (234/1,402). CatBoost was the best-performing machine learning model, predicting TLC with a mean squared error (MSE) of 560.1 mL. The sensitivity, specificity, and F1-score of the optimal algorithm for predicting restrictive ventilatory impairment were 83%, 92%, and 75%, respectively.
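As a rough consistency check, the reported classification metrics can be related to one another through a confusion matrix. The sketch below reconstructs approximate counts from the reported prevalence, sensitivity, and specificity. These counts are an assumption derived by rounding, not figures reported in the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, F1) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on restrictive cases
    specificity = tn / (tn + fp)   # recall on non-restrictive cases
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1


# Reported in the abstract: 234 restrictive cases among 1,402 test patients.
n_total = 1402
n_restrictive = 234
n_non_restrictive = n_total - n_restrictive

# Assumed counts, back-calculated from the rounded 83% and 92% figures.
tp = round(0.83 * n_restrictive)
fn = n_restrictive - tp
tn = round(0.92 * n_non_restrictive)
fp = n_non_restrictive - tn

sensitivity, specificity, f1 = classification_metrics(tp, fp, tn, fn)
print(f"prevalence={n_restrictive / n_total:.3f}")
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} f1={f1:.3f}")
```

Under these assumed counts the F1-score comes out at roughly 0.74, close to the reported 75% given the rounding in the inputs. Because restriction is present in only about one in six patients, the F1-score is noticeably lower than either sensitivity or specificity.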

Conclusion

A machine learning model trained on spirometry data can estimate TLC to a high degree of accuracy. This approach could be used to develop future smart home-based spirometry solutions, which could aid decision-making and self-monitoring in patients with restrictive lung diseases.