AUTHOR=Zhuan Bing , Ma Hong-Hong , Zhang Bo-Chao , Li Ping , Wang Xi , Yuan Qun , Yang Zhao , Xie Jun TITLE=Identification of non-small cell lung cancer with chronic obstructive pulmonary disease using clinical symptoms and routine examination: a retrospective study JOURNAL=Frontiers in Oncology VOLUME=13 YEAR=2023 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2023.1158948 DOI=10.3389/fonc.2023.1158948 ISSN=2234-943X ABSTRACT=Background

Patients with non-small cell lung cancer (NSCLC) and patients with NSCLC combined with chronic obstructive pulmonary disease (COPD) have similar physiological conditions in the early stages, but the latter have shorter survival times and higher mortality rates. The purpose of this study was to develop and compare machine learning models that predict a future diagnosis of NSCLC combined with COPD from patients' disease information and routine clinical data.

Methods

Data were obtained from 237 patients, with either NSCLC combined with COPD or NSCLC alone, admitted to Ningxia Hui Autonomous Region People’s Hospital from October 2013 to July 2022. Six machine learning algorithms (K-nearest neighbor, logistic regression, eXtreme gradient boosting, support vector machine, naïve Bayes, and artificial neural network) were used to develop prediction models for NSCLC combined with COPD. Model performance was evaluated using sensitivity, specificity, positive predictive value, negative predictive value, accuracy, F1 score, Matthews correlation coefficient (MCC), Kappa, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC).
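
As an illustration of how the threshold-based indicators listed above relate to one another, the following is a minimal sketch that derives them from the four cells of a binary confusion matrix. It is not the authors' code: the function name `binary_metrics` and the example counts are hypothetical, the sketch assumes every denominator is non-zero, and it treats Kappa as Cohen's kappa. AUROC and AUPRC are not covered because they require continuous model scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts.

    Hypothetical helper for illustration; assumes all denominators are non-zero.
    """
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / n

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Made-up counts, not data from the study.
example = binary_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Reporting MCC and Kappa alongside accuracy is useful when the two classes are unequal in size, since both account for agreement that would be expected by chance.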

Results

A total of 135 patients with NSCLC combined with COPD and 102 patients with NSCLC alone were included in the study. The results showed that pulmonary function and emphysema were important risk factors, and that the support vector machine-based identification model performed best, with accuracy 0.946, recall (sensitivity) 0.940, specificity 0.955, precision (positive predictive value) 0.972, negative predictive value 0.920, F1 score 0.954, MCC 0.893, Kappa 0.888, AUROC 0.975, and AUPRC 0.987.

Conclusion

The use of machine learning tools combining clinical symptoms and routine examination data features is suitable for identifying the risk of concurrent NSCLC in COPD patients.