AUTHOR=Huang Guanghua, Liu Lei, Wang Luyi, Li Shanqing TITLE=Prediction of postoperative cardiopulmonary complications after lung resection in a Chinese population: A machine learning-based study JOURNAL=Frontiers in Oncology VOLUME=12 YEAR=2022 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.1003722 DOI=10.3389/fonc.2022.1003722 ISSN=2234-943X ABSTRACT=Background

Approximately 20% of patients with lung cancer experience postoperative cardiopulmonary complications after anatomic lung resection. Existing prediction models for postoperative complications are not well suited to Chinese patients. This study aimed to develop and validate novel prediction models based on machine learning algorithms in a Chinese population.

Methods

Patients with lung cancer who underwent anatomic lung resection without neoadjuvant therapy between September 1, 2018 and August 31, 2019 were enrolled. The dataset was split into a derivation cohort and a validation cohort at a 7:3 ratio. Logistic regression, random forest, and extreme gradient boosting models were constructed in the derivation cohort with 5-fold cross-validation. Model performance was assessed in the validation cohort. Discrimination was measured by the area under the curve, and calibration was evaluated with the Spiegelhalter z test.
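The two performance measures named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that the area under the curve refers to the receiver operating characteristic curve, that outcomes are coded 0/1, and that the Spiegelhalter z statistic is referred to a standard normal distribution for a two-sided p-value. The function names and the toy data are hypothetical.

```python
import math


def roc_auc(y_true, y_prob):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p_event in pos:
        for p_non_event in neg:
            if p_event > p_non_event:
                wins += 1.0
            elif p_event == p_non_event:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter z statistic and two-sided p-value for calibration."""
    numerator = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    if variance == 0:
        raise ValueError("statistic is undefined when every prediction is 0, 0.5, or 1")
    z = numerator / math.sqrt(variance)
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical toy data (not from the study): 1 = complication, 0 = none.
toy_outcomes = [0, 0, 1, 0, 1, 0, 0, 1]
toy_probs = [0.1, 0.2, 0.7, 0.3, 0.6, 0.4, 0.2, 0.3]

toy_auc = roc_auc(toy_outcomes, toy_probs)
toy_z, toy_p = spiegelhalter_z(toy_outcomes, toy_probs)
print(f"AUC={toy_auc:.3f}  z={toy_z:.3f}  p={toy_p:.3f}")
```

Under this test, a p-value above 0.05 means that the hypothesis of adequate calibration is not rejected, which is how "good calibration" is usually read.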

Results

A total of 1085 patients were included, of whom 760 were assigned to the derivation cohort. Postoperative cardiopulmonary complications occurred in 8.4% and 8.0% of patients in the two cohorts. All baseline characteristics were balanced between the cohorts. The areas under the curve were 0.728, 0.721, and 0.767 for the logistic regression, random forest, and extreme gradient boosting models, respectively, with no significant differences among them. All three models showed good calibration (p > 0.05). The logistic model included male sex, arrhythmia, cerebrovascular disease, the percentage of predicted postoperative forced expiratory volume in one second, and the ratio of forced expiratory volume in one second to forced vital capacity. The last two variables, the percentage of forced vital capacity, and age ranked among the top five most important variables in the machine learning models. A nomogram was plotted for the logistic model.
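The cohort sizes implied by these figures can be checked with simple arithmetic. The sketch below uses only numbers stated in the abstract. The event counts are rough back-calculations from rounded percentages, and they assume that 8.4% refers to the derivation cohort and 8.0% to the validation cohort. That ordering is not stated explicitly, and all variable names are hypothetical.

```python
total_n = 1085       # patients included (from the abstract)
derivation_n = 760   # derivation cohort (from the abstract)

# The validation cohort is whatever remains after the 7:3 split.
validation_n = total_n - derivation_n
derivation_share = derivation_n / total_n

# Approximate event counts, ASSUMING the reported rates are listed
# in the order derivation, validation.
approx_events_derivation = round(0.084 * derivation_n)
approx_events_validation = round(0.080 * validation_n)

print(validation_n, round(derivation_share, 3))
print(approx_events_derivation, approx_events_validation)
```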

Conclusion

Three models were developed and validated for predicting postoperative cardiopulmonary complications among Chinese patients with lung cancer. All three showed good discrimination and calibration. The percentage of predicted postoperative forced expiratory volume in one second and the ratio of forced expiratory volume in one second to forced vital capacity might be the most important variables. Further validation in different settings is still warranted.