With the development of imaging technology, an increasing number of pulmonary nodules have been found. Some pulmonary nodules may gradually grow and develop into lung cancer, while others may remain stable for many years. Accurately predicting the growth of pulmonary nodules in advance is of great clinical significance for early treatment. The purpose of this study was to establish a predictive model using radiomics and to study its value in predicting the growth of pulmonary nodules.
According to the inclusion and exclusion criteria, 228 pulmonary nodules in 228 subjects were included in the study. During the one-year follow-up, 69 nodules grew and 159 remained stable. The nodules were randomly divided into a training group and a validation group at a ratio of 7:3. In the training data set, the t test, chi-square test and Fisher exact test were used to compare sex, age and nodule location between the growth group and the stable group. Two radiologists independently delineated the regions of interest (ROIs) of the nodules, and radiomic features were extracted with PyRadiomics. After dimension reduction with the least absolute shrinkage and selection operator (LASSO) algorithm, logistic regression analysis was performed on age and the ten selected radiomic features, and the resulting prediction model was tested in the validation group. Support vector machine (SVM), random forest (RF), multilayer perceptron (MLP) and AdaBoost models were also established, and the predictive performance of all models was evaluated by receiver operating characteristic (ROC) analysis.
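The 7:3 division described above can be illustrated with a minimal sketch. It assumes the split was stratified by outcome (growth vs. stable), which the text does not state; the function name `stratified_split`, the seed and the label coding are hypothetical and chosen only for illustration.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, valid_idx), splitting each class separately.

    Assumption: stratification by outcome. The study only says the
    nodules were "randomly divided" at a ratio of 7:3.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, valid_idx = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train_idx.extend(shuffled[:cut])
        valid_idx.extend(shuffled[cut:])
    return sorted(train_idx), sorted(valid_idx)


# 69 growing nodules (label 1) and 159 stable nodules (label 0), as reported.
labels = [1] * 69 + [0] * 159
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

Under this assumption the training group holds 159 nodules and the validation group 69; an unstratified random split would give similar totals but could shift the proportion of growing nodules between the two groups.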
There was a significant difference in age between the growth group and the stable group (P < 0.05), but there was no significant difference in sex or nodule location (P > 0.05). The intraclass correlation coefficients between the two observers were > 0.75. After dimension reduction by the LASSO algorithm, ten radiomic features were selected: two shape-based features, one first-order feature, one gray-level co-occurrence matrix (GLCM) feature, one gray-level run-length matrix (GLRLM) feature, three gray-level dependence matrix (GLDM) features and two gray-level size-zone matrix (GLSZM) features. The logistic regression model combining age and radiomic features achieved an AUC of 0.87 and an accuracy of 0.82 in the training group and an AUC of 0.82 and an accuracy of 0.84 in the validation group for the prediction of nodule growth. For the nonlinear models, the AUCs of the SVM, RF, MLP and AdaBoost models in the training group were 0.95, 1.0, 1.0 and 1.0, respectively. In the validation group, the AUCs of the SVM, RF, MLP and AdaBoost models were 0.81, 0.77, 0.81 and 0.71, respectively.
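One way to read these numbers is to compare the drop in AUC from the training group to the validation group for each model. The sketch below uses only the AUCs reported above; treating the training-validation gap as an informal indicator of overfitting is an interpretive assumption, not an analysis described in the study.

```python
# (training AUC, validation AUC) as reported in the results above.
reported_auc = {
    "LR": (0.87, 0.82),
    "SVM": (0.95, 0.81),
    "RF": (1.0, 0.77),
    "MLP": (1.0, 0.81),
    "AdaBoost": (1.0, 0.71),
}

# Gap between training and validation AUC, rounded to two decimals.
gaps = {
    name: round(train - valid, 2)
    for name, (train, valid) in reported_auc.items()
}

# Models ordered from smallest to largest gap.
ranking = sorted(gaps, key=gaps.get)

for name in ranking:
    print(f"{name}: gap = {gaps[name]:.2f}")
```

The logistic regression model has the smallest gap (0.05) and AdaBoost the largest (0.29), which is consistent with the nonlinear models fitting the training group almost perfectly while performing worse on the validation group.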
In this study, we established several machine learning models that can predict the growth of pulmonary nodules within one year. The logistic regression model combining age and radiomic features showed the best performance and generalization in the validation group. This model may support early treatment decisions for pulmonary nodules and is therefore of clinical value.