AUTHOR=Hao Peng, Deng Bo-Yu, Huang Chan-Tao, Xu Jun, Zhou Fang, Liu Zhe-Xing, Zhou Wu, Xu Yi-Kai TITLE=Predicting anaplastic lymphoma kinase rearrangement status in patients with non-small cell lung cancer using a machine learning algorithm that combines clinical features and CT images JOURNAL=Frontiers in Oncology VOLUME=12 YEAR=2022 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.994285 DOI=10.3389/fonc.2022.994285 ISSN=2234-943X ABSTRACT=Purpose

To develop an appropriate machine learning model for predicting anaplastic lymphoma kinase (ALK) rearrangement status in non-small cell lung cancer (NSCLC) patients using computed tomography (CT) images and clinical features.

Method and materials

This study included 193 patients with NSCLC (154 in the training cohort, 39 in the validation cohort), of whom 68 tested positive and 125 tested negative for ALK rearrangements. A total of 157 radiomic features were extracted from the nonenhanced CT scans, and 8 clinical features were collected. Five machine learning (ML) models were compared to identify the best classifier for predicting ALK rearrangement status. A radiomic signature was developed using the least absolute shrinkage and selection operator (LASSO) algorithm. The predictive performance of the models based on radiomic features, clinical features, and their combination was assessed using receiver operating characteristic (ROC) curves.
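To illustrate how LASSO can reduce a large radiomic feature set to a compact signature, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a squared-error LASSO solved by cyclic coordinate descent, synthetic data with 10 features (only the first two carry signal), and a penalty `alpha` chosen by hand. All function names are hypothetical; only the cohort size of 154 is taken from the text.

```python
import random

def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha; values inside [-alpha, alpha] become 0."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0

def standardize(columns):
    """Scale each feature column to zero mean and unit variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
        scaled.append([(v - mean) / std for v in col])
    return scaled

def lasso_coordinate_descent(columns, y, alpha, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    columns: standardized feature columns; y: centered targets.
    """
    n = len(y)
    w = [0.0] * len(columns)
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            norm = sum(x * x for x in col) / n
            if norm == 0.0:
                continue  # constant feature, nothing to fit
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(x * (r + x * w[j]) for x, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha) / norm
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - x * delta for x, r in zip(col, residual)]
                w[j] = new_w
    return w

# Synthetic stand-in for a training cohort (assumption: 154 patients, 10 features).
rng = random.Random(0)
n_patients, n_features = 154, 10
labels = [1 if rng.random() < 0.35 else 0 for _ in range(n_patients)]
raw_columns = [
    [rng.gauss(1.5 * lab if j < 2 else 0.0, 1.0) for lab in labels]
    for j in range(n_features)
]

scaled_columns = standardize(raw_columns)
label_mean = sum(labels) / n_patients
centered_labels = [lab - label_mean for lab in labels]

weights = lasso_coordinate_descent(scaled_columns, centered_labels, alpha=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("features kept by LASSO:", selected)
```

Features whose coefficient is driven to exactly zero drop out of the signature. In practice the penalty would normally be tuned by cross-validation rather than fixed by hand.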

Results

Among the five models, the support vector machine (SVM) achieved the highest AUC for classification (0.914). The clinical features model achieved an AUC of 0.805 (95% CI 0.731–0.877) in the training cohort and 0.735 (95% CI 0.566–0.863) in the validation cohort. The CT image-based ML model achieved an AUC of 0.953 (95% CI 0.913–1.0) in the training cohort and 0.890 (95% CI 0.778–0.971) in the validation cohort. The ML model combining CT images and clinical features predicted ALK rearrangement status better than the models based on clinical features or CT images alone, with an AUC of 0.965 (95% CI 0.826–0.882) in the training cohort and 0.914 (95% CI 0.804–0.893) in the validation cohort.
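For readers unfamiliar with how an AUC and its 95% confidence interval can be obtained, the following is a minimal sketch in plain Python. It assumes a percentile bootstrap, which may differ from the method the authors used, and it runs on synthetic scores. The function names, the 14/25 class split, and the score distributions are hypothetical; only the cohort size of 39 is taken from the text.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (resampling patients with replacement)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        # Skip resamples that contain only one class, where AUC is undefined.
        if 0 < sum(resampled_labels) < n:
            aucs.append(roc_auc(resampled_labels, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((1 - level) / 2 * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lower, upper

# Synthetic stand-in for a validation cohort (assumption: 14 positives, 25 negatives).
rng = random.Random(1)
val_labels = [1] * 14 + [0] * 25
val_scores = [rng.gauss(1.5 if lab == 1 else 0.0, 1.0) for lab in val_labels]

auc = roc_auc(val_labels, val_scores)
ci_low, ci_high = bootstrap_auc_ci(val_labels, val_scores)
print(f"AUC = {auc:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

With a cohort this small the interval is usually wide, which is why validation AUCs are best read together with their confidence intervals.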

Conclusion

Our findings indicate that ALK rearrangement status can be accurately predicted by an ML classification model that combines CT images and clinical data.