To develop and validate a deep learning radiomics (DLR) model that uses X-ray images to predict the classification of osteoporotic vertebral fractures (OVFs).
The study encompassed a cohort of 942 patients, involving examinations of 1076 vertebrae through X-ray, CT, and MRI across three distinct hospitals. The OVFs were categorized as class 0, 1, or 2 based on the Assessment System of Thoracolumbar Osteoporotic Fracture. The dataset was divided randomly into four distinct subsets: a training set comprising 712 samples, an internal validation set with 178 samples, an external validation set containing 111 samples, and a prospective validation set consisting of 75 samples. The ResNet-50 architectural model was used to implement deep transfer learning (DTL), undergoing -pre-training separately on the RadImageNet and ImageNet datasets. Features from DTL and radiomics were extracted and integrated using X-ray images. The optimal fusion feature model was identified through least absolute shrinkage and selection operator logistic regression. Evaluation of the predictive capabilities for OVFs classification involved eight machine learning models, assessed through receiver operating characteristic curves employing the “One-vs-Rest” strategy. The Delong test was applied to compare the predictive performance of the superior RadImageNet model against the ImageNet model.
Following pre-training separately on RadImageNet and ImageNet datasets, feature selection and fusion yielded 17 and 12 fusion features, respectively. Logistic regression emerged as the optimal machine learning algorithm for both DLR models. Across the training set, internal validation set, external validation set, and prospective validation set, the macro-average Area Under the Curve (AUC) based on the RadImageNet dataset surpassed those based on the ImageNet dataset, with statistically significant differences observed (P<0.05). Utilizing the binary “One-vs-Rest” strategy, the model based on the RadImageNet dataset demonstrated superior efficacy in predicting Class 0, achieving an AUC of 0.969 and accuracy of 0.863. Predicting Class 1 yielded an AUC of 0.945 and accuracy of 0.875, while for Class 2, the AUC and accuracy were 0.809 and 0.692, respectively.
The DLR model, based on the RadImageNet dataset, outperformed the ImageNet model in predicting the classification of OVFs, with generalizability confirmed in the prospective validation set.