- 1 Department of Hematology, Nanjing Drum Tower Hospital, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
- 2 The Key Laboratory of Broadband Wireless Communication and Sensor Network Technology (Ministry of Education), Nanjing University of Posts and Telecommunications, Nanjing, China
- 3 Department of Nuclear Medicine, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- 4 Department of Nuclear Medicine, West China Hospital, Sichuan University, Chengdu, China
- 5 Department of Hematology, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
Objectives: This study aimed to develop a 7×7 machine-learning cross-combination method for selecting and classifying radiomic features, and to use it to construct a Radiomics Score (RadScore) for predicting mid-term efficacy and prognosis in high-risk patients with diffuse large B-cell lymphoma (DLBCL).
Methods: We retrospectively recruited 177 high-risk DLBCL patients from two medical centers between October 2012 and September 2022 and randomly divided them into a training cohort (n=123) and a validation cohort (n=54). We extracted 110 radiomic features along with SUVmax, MTV, and TLG from the baseline PET. The 49 feature selection-classification pairs were compared to obtain the optimal LASSO-LASSO model, which retained 11 key radiomic features for RadScore. Logistic regression was employed to identify independent RadScore, clinical, and PET predictors. The models were evaluated using receiver operating characteristic (ROC) curves and calibration curves. Decision curve analysis (DCA) was conducted to assess the net clinical benefit of the models. The prognostic power of RadScore was assessed using Cox regression (COX) and Kaplan–Meier plots (KM).
Results: A total of 177 patients (mean age, 63 ± 13 years; 129 men) were evaluated. Multivariate analyses showed that gender (OR, 2.760; 95% CI: 1.196, 6.368; p=0.017), B symptoms (OR, 4.065; 95% CI: 1.837, 8.955; p=0.001), SUVmax (OR, 2.619; 95% CI: 1.107, 6.194; p=0.028), and RadScore (OR, 7.167; 95% CI: 2.815, 18.248; p<0.001) were independent risk factors for mid-term outcome. The AUC values of the combined model in the training and validation groups were 0.846 and 0.724, respectively, outperforming the clinical model (0.714; 0.556), PET-based model (0.664; 0.589), NCCN-IPI model (0.523; 0.406), and IPI model (0.510; 0.412) in predicting mid-term treatment outcome. DCA showed that the combined model incorporating RadScore, clinical risk factors, and PET metabolic metrics had the optimal net clinical benefit. COX indicated that the high RadScore group had worse progression-free survival (PFS) (HR, 2.1737; 95% CI: 1.2983, 3.6392) and overall survival (OS) (HR, 2.1356; 95% CI: 1.2561, 3.6309) than the low RadScore group. KM survival analysis gave the same prognostic prediction as the Cox results.
Conclusion: The combined model incorporating RadScore, sex, B symptoms and SUVmax demonstrates a significant enhancement in predicting medium-term efficacy and prognosis in high-risk DLBCL patients. RadScore using 7×7 machine learning cross-combinatorial methods for selection and classification holds promise as a potential method for evaluating medium-term treatment outcome and prognosis in high-risk DLBCL patients.
Introduction
Diffuse large B-cell lymphoma (DLBCL) is a highly heterogeneous and aggressive B-cell lymphoma, accounting for 30%-40% of newly diagnosed non-Hodgkin’s lymphomas (NHL) (1). The first-line immunochemotherapy is R-CHOP (rituximab, cyclophosphamide, doxorubicin, vincristine and prednisone) or an R-CHOP-like regimen (2, 3). Clinically, 30%-40% of patients undergoing this therapy develop relapsed or refractory disease (4, 5). This could be attributed to tumor heterogeneity, which leads to reduced sensitivity to chemotherapy (6, 7). Patients classified as high-risk have poorer survival (8). Gene expression profiling of DLBCL defined three primary subtypes based on “cell of origin” (COO): germinal center B cell-like (GCB), activated B cell-like (ABC), and not otherwise specified (NOS). This molecular subclassification could account for some of the heterogeneity in the clinical outcomes of DLBCL (9). Numerous prognostic tools have been identified through large-scale retrospective studies. The International Prognostic Index (IPI), proposed in 1993, incorporates five risk factors: age, lactate dehydrogenase (LDH), Eastern Cooperative Oncology Group (ECOG) Performance Status (PS), Ann Arbor stage, and extranodal involvement (10). The National Comprehensive Cancer Network IPI (NCCN-IPI), proposed in 2014, forms four risk groups based on scores ranging from 0 to 8 and identifies intermediate-high (scores 4–5) and high-risk (scores 6–8) DLBCL patients more accurately (11). However, because both the IPI and the NCCN-IPI focus on clinical and biological indicators, they cannot comprehensively assess the tumor heterogeneity of DLBCL (12, 13).
18F-fluorodeoxyglucose (FDG) positron emission tomography/computed tomography (PET/CT) is widely utilized for early DLBCL diagnosis, staging, and assessment of chemotherapy response (14). The maximum standardized uptake value (SUVmax), metabolic tumor volume (MTV), and total lesion glycolysis (TLG) are commonly used PET metrics. These metabolic indicators reflect tumor malignancy and are valuable for baseline assessment and for improving response prediction (15). In previous research, SUVmax has been the most widely used index (16). MTV and TLG are associated with tumor burden as well as progression-free survival (PFS) and overall survival (OS) (17). Vercellino et al. found that integrating baseline total metabolic tumor volume (TMTV) with parameters of tumor load distribution has the potential to enhance the accuracy of risk stratification for DLBCL patients (18). Nevertheless, these indicators are limited in describing tumor heterogeneity. Radiomics has been used to assess tumor heterogeneity and to assist in the prediction of clinical outcomes. PET radiomic features are promising biomarkers for predicting treatment outcome and prognosis in DLBCL (19).
Machine learning is commonly used for radiomic feature identification and classification (20). Several studies have investigated the value of PET radiomics for risk stratification and efficacy prediction. Lue et al. used the least absolute shrinkage and selection operator (LASSO) regression method and found that the baseline 18F-FDG PET radiomic feature RLNGLRLM is an independent prognostic factor for survival outcomes (21). However, these studies used a limited number of machine learning methods (22, 23). Other studies reported the outcome and prognostic value of radiomic features using cross-combination methods (24), but these methods have not yet been applied to high-risk DLBCL patients. In this paper, we therefore employed a cross-combination of seven machine learning methods to select and classify PET radiomic features associated with intratumoral heterogeneity. Furthermore, we established an early prognostic biomarker tool that predicts mid-term treatment outcome and prognosis and identifies high-risk DLBCL patients who are unresponsive to the R-CHOP regimen.
Materials and methods
Patient data collection
This study followed the principles outlined in the Declaration of Helsinki. Ethical approval for this retrospective analysis was obtained from the Ethics Committees of the two medical centers. Written consent was not required for this study. A total of 177 patients with DLBCL classified as intermediate-high/high-risk according to an NCCN-IPI score of 4–8 were enrolled between October 2012 and September 2022. Among them, 125 patients were from Nanjing Drum Tower Hospital of Nanjing University Medical School, and 52 patients were from West China Hospital of Sichuan University. The patients were randomly divided into a training cohort (n=123) and a validation cohort (n=54) at a 7:3 ratio. The inclusion criteria were: (I) confirmed DLBCL with NCCN-IPI ≥4, (II) an [18F]-FDG PET/CT scan performed at baseline before treatment, (III) treatment with an R-CHOP-like regimen, and (IV) age ≥18 years at the time of diagnosis. The exclusion criteria were: (I) primary central nervous system lymphoma, (II) a history of other tumors, (III) incomplete clinical data, (IV) previous treatment such as chemotherapy, radiotherapy, or surgery, and (V) loss to follow-up.
The datasets included patient clinical data such as gender, age, B symptoms, ECOG PS, IPI, NCCN-IPI, LDH, Ann Arbor stage, extranodal involvement, and bone marrow involvement. Patient follow-up data were collected through electronic medical records or telephone interviews. The mid-term PET scans, scored on the Deauville 5-point scale, were used as the study endpoint for mid-term efficacy and prognosis in DLBCL patients. A score of 1–3 was defined as complete metabolic remission (CMR), and a score of 4–5 was defined as partial metabolic remission (PMR), stable disease (SD), or progressive disease (PD) (25). Accordingly, the patients were divided into a CR group and a non-CR group. Figure 1 illustrates the baseline and mid-term PET of non-CR and CR patients.
Figure 1 Baseline and mid-term 18F-FDG PET/CT of the patients. Baseline (A) and mid-term (B) images of a patient without complete remission (non-CR), and baseline (C) and mid-term (D) images of a patient with complete remission (CR).
PET/CT scanning protocol
All patients fasted for more than 6 hours before the PET/CT scan, and their fasting blood glucose levels were below 8.7 mmol/L. Patients were injected with 18F-FDG (3.70–5.18 MBq/kg; Fludeoxyglucose[18F] Injection; AMS Limited) via a superficial forearm vein and rested quietly for 60 minutes before PET/CT. CT scanning conditions included a tube voltage of 120 kV, a tube current of 100 mA, and a slice thickness of 2 mm (Philips). PET acquisition covered 7–10 bed positions at 2 minutes per bed position (Philips2). At the end of acquisition, line-of-response image reconstruction was used to obtain transverse, coronal, and sagittal PET and CT images, which were then corrected for attenuation. Image reconstruction was performed using voxels of 4 × 4 × 4 mm³ with three iterations and 33 subsets.
VOI drawing and radiomics processing
The PET images were processed using LIFEx (Local Image Feature Extraction) software (version 7.3.0) (26). (I) The volume of interest (VOI) was outlined with a semi-automatic segmentation method using a voxel boundary threshold of 41% of SUVmax (15). (II) Non-lymphoma 18F-FDG uptake was manually excluded; in case of disagreement, a senior nuclear medicine physician was consulted to jointly determine the VOI. (III) The metabolic metrics SUVmax, MTV, and TLG were determined for each lesion. SUVmax was the maximum standardized uptake value within the tumor lesion. MTV was the volume of the tumor lesion for a single VOI, and TLG was calculated as the product of SUVmean and MTV for the lesion (TLG = SUVmean × MTV). Lesions with an MTV smaller than 10 cm³ were not included. All radiomics features complied with the benchmarks of the Image Biomarker Standardization Initiative (IBSI) (27).
PET radiomic features were extracted from baseline PET images with the open-source software package LIFEx (www.lifexsoft.org). (I) Wavelet and Laplacian of Gaussian (LoG) transforms were applied to the original PET image to obtain the corresponding Wavelet and LoG images. (II) Three types of features were then extracted: first-order statistical features (e.g., maximum, minimum), shape features (e.g., roundness, extensibility), and texture features. Figure 2 illustrates the workflow of the radiomic analysis.
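As an illustration of the metabolic metrics defined above, the following minimal Python sketch applies a fixed 41% SUVmax threshold to the voxels of one lesion and derives SUVmax, MTV, and TLG. It is not the LIFEx implementation: the function name `metabolic_metrics`, its parameters, and the toy voxel values are assumptions introduced for illustration, and the 0.064 mL voxel volume simply corresponds to a 4 × 4 × 4 mm³ reconstruction voxel.

```python
from typing import Dict, List


def metabolic_metrics(voxel_suv: List[float], voxel_volume_ml: float,
                      threshold_fraction: float = 0.41) -> Dict[str, float]:
    """Compute SUVmax, MTV and TLG for one lesion (hypothetical helper).

    voxel_suv holds the SUV of every candidate voxel in the lesion region;
    voxels at or above threshold_fraction * SUVmax are kept in the VOI.
    """
    if not voxel_suv:
        raise ValueError("lesion has no voxels")
    suv_max = max(voxel_suv)
    cutoff = threshold_fraction * suv_max
    kept = [s for s in voxel_suv if s >= cutoff]
    mtv = len(kept) * voxel_volume_ml      # metabolic tumor volume in mL (cm^3)
    suv_mean = sum(kept) / len(kept)       # mean SUV inside the thresholded VOI
    tlg = suv_mean * mtv                   # TLG = SUVmean x MTV
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv, "TLG": tlg}


# Toy lesion: five voxels of 4 x 4 x 4 mm^3 (0.064 mL each); values are made up.
metrics = metabolic_metrics([10.0, 8.0, 5.0, 4.0, 2.0], voxel_volume_ml=0.064)
print(metrics)
```

In this toy case the cut-off is 4.1, so three voxels remain in the VOI. A real lesion would contain far more voxels, and a lesion this small would fall below the 10 cm³ inclusion limit stated above.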
Figure 2 Analysis workflow in this study. SVM, support vector machine; GBDT, gradient boosting decision tree; RF, random forest; ET, extra-trees; LASSO, least absolute shrinkage and selection operator; LR, logistic regression; AdaBoost, adaptive boosting.
Radiomics feature selection and RadScore construction
The extracted radiomic features were screened and classified using a cross-combination of seven machine learning methods: Gradient Boosted Decision Tree (GBDT), Extreme Tree (ET), Random Forest (RF), Adaptive Boosting (AdaBoost), Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machines (SVM), and Logistic Regression (LR). GBDT (28) uses decision trees as its base learner and sums the predictions from a series of trees. RF (29) is an ensemble of decision trees whose results are voted upon or averaged to obtain the final prediction. ET (30) served as the model underlying the recursive feature elimination algorithm, which fits the dataset, obtains a weight for each feature, and then sequentially removes the features with the smallest absolute weights from the feature set. AdaBoost (31) adapts to different datasets by adjusting the weights of the training samples and combines multiple classifiers linearly to enhance their performance. LASSO (32) is a classical regression method that shrinks regression coefficients and retains only the variables with non-zero coefficients in the model. SVM (33) builds a classifier by establishing a decision boundary between two categories, enabling label prediction based on feature vectors. LR (34) is a generalized linear model used for classification tasks that quantifies the effect of each independent variable on the classification result.
This paper evaluated feature selection-classification pairs from the 7×7 possible combinations, such as LASSO-LASSO, SVM-SVM, and SVM-LASSO. Seven machine learning methods were used to select features, and the same seven methods were used to classify them. The optimal candidate pair was then used to build the Radiomic Score (RadScore), defined as the sum of the products of the selected radiomic features and their corresponding feature weights. The identification of the best candidate model involved six steps utilizing fivefold cross-validation: (I) The patient data were randomly divided into training (n=123) and validation (n=54) cohorts. (II) For the training cohort, we applied the seven feature selection models to the 110 PET radiomics features and obtained the corresponding feature weights after dimensionality reduction. Based on these feature weights, we trained the feature selection models by recursively considering subsets of radiomic features. The feature selection model with the largest area under the curve (AUC) value was identified as the most important one. (III) Fivefold cross-validation was then applied to the reduced training cohort, dividing it into five approximately equal-sized groups, with four groups used for training and one group for testing. (IV) The seven feature classification models were each developed on the four training groups. The feature classification model with the largest AUC value was identified as the most important one. (V) We calculated the AUC of each feature selection-classification model and output the average AUC. The model with the largest average AUC was selected as the optimal candidate model. (VI) Finally, we validated the optimal model in the test group.
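The cross-combination logic described above can be sketched in a few lines of plain Python. This is a hedged illustration, not the study's pipeline: the two selectors and two classifiers are toy stand-ins for the seven methods, the data are synthetic, and all names (`cross_combine`, `select_top2`, `fit_gap_weighted`, and so on) are hypothetical. The sketch also re-runs feature selection inside each training fold, which is a design choice of this example rather than something stated in the text.

```python
import random
from itertools import product
from statistics import mean


def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k, seed=0):
    """Split sample indices into k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def class_mean_gaps(X, y):
    """Per-feature difference between the class-1 mean and the class-0 mean."""
    gaps = []
    for j in range(len(X[0])):
        pos = [row[j] for row, lab in zip(X, y) if lab == 1]
        neg = [row[j] for row, lab in zip(X, y) if lab == 0]
        gaps.append(mean(pos) - mean(neg))
    return gaps


# Toy stand-ins (NOT the seven methods of the study): a selector returns column
# indices; a classifier factory returns a function mapping one row to a score.
def select_all(X, y):
    return list(range(len(X[0])))


def select_top2(X, y):
    gaps = class_mean_gaps(X, y)
    return sorted(range(len(gaps)), key=lambda j: abs(gaps[j]), reverse=True)[:2]


def fit_gap_weighted(X, y):
    w = class_mean_gaps(X, y)
    return lambda row: sum(wi * xi for wi, xi in zip(w, row))


def fit_first_column(X, y):
    sign = 1.0 if class_mean_gaps(X, y)[0] >= 0 else -1.0
    return lambda row: sign * row[0]


def cross_combine(X, y, selectors, classifiers, k=5, seed=0):
    """Mean k-fold AUC for every (selector, classifier) pair."""
    folds = stratified_folds(y, k, seed)
    grid = {}
    for (s_name, select), (c_name, fit) in product(selectors.items(), classifiers.items()):
        fold_aucs = []
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            X_tr = [X[i] for i in train_idx]
            y_tr = [y[i] for i in train_idx]
            cols = select(X_tr, y_tr)  # selection sees the training folds only
            predict = fit([[row[c] for c in cols] for row in X_tr], y_tr)
            scores = [predict([X[i][c] for c in cols]) for i in test_idx]
            fold_aucs.append(auc([y[i] for i in test_idx], scores))
        grid[(s_name, c_name)] = mean(fold_aucs)
    return grid


# Synthetic data: 40 cases, 4 features; only the first two carry signal.
rng = random.Random(42)
y = [i % 2 for i in range(40)]
X = [[rng.gauss(1.5 * lab, 1), rng.gauss(0.5 * lab, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
     for lab in y]

selectors = {"all": select_all, "top2": select_top2}
classifiers = {"gap_weighted": fit_gap_weighted, "first_column": fit_first_column}
grid = cross_combine(X, y, selectors, classifiers)
best = max(grid, key=grid.get)  # pair with the largest mean AUC
print(best, round(grid[best], 3))
```

With seven selectors and seven classifiers in the two dictionaries, the same loop would fill a 7×7 grid of mean AUC values, from which the pair with the largest value is taken as the optimal candidate.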
Development and validation of the models
Univariate and multivariate logistic regression were used to identify potential independent risk factors in the training group and to construct a predictive model for the mid-term treatment outcome. Clinical and PET factors that were statistically significant in the univariate analysis were entered separately into the multivariate analysis. Independent clinical predictors were used to develop the clinical model, and independent PET predictors were used to create the PET-based model. Subsequently, all independent clinical predictors, PET predictors, and RadScore were assembled into a combined model. NCCN-IPI and IPI models were also developed.
Clinical benefit analysis based on the models
All models were assessed in both the training and validation groups using receiver operating characteristic (ROC) curves and calibration curves. Additionally, decision curve analysis (DCA) was employed to evaluate the net clinical benefits of these models.
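For readers unfamiliar with DCA, the net benefit at a risk threshold p_t is conventionally computed as TP/N − (FP/N) × p_t/(1 − p_t), and is compared against the "treat all" and "treat none" (net benefit 0) strategies. The sketch below shows this standard formula under that assumption; the function names and the toy labels and probabilities are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: everyone is classified as positive."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Toy example (made-up outcomes and predicted risks).
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.05]
for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve is obtained by evaluating these functions over a range of thresholds for each model and plotting the results together.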
Statistical analysis
All data were analyzed using SPSS 25.0 (IBM Corp, Armonk, NY, USA) and R statistical software (version 4.2.2). A P value less than 0.05 was considered statistically significant. The χ2 test was used to compare clinical characteristics and PET metabolic metrics between the training and validation groups. A nomogram was used to show the score of each independent risk factor. ROC curves were used to determine the optimal thresholds for SUVmax, MTV, TLG, and RadScore in predicting mid-term efficacy, PFS, and OS. Logistic regression analyses were employed to identify independent predictors. Calibration curves, ROC curves, and DCA were calculated for the models in both the training and validation cohorts. Survival analysis was conducted by Cox regression and Kaplan–Meier (KM) analysis.
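The Kaplan–Meier analysis mentioned above rests on the product-limit estimator, in which the survival probability is multiplied by (1 − d/n) at each event time, with d events among n patients still at risk. The following sketch is a minimal stand-alone version under the usual convention that patients censored at an event time are still counted as at risk at that time. The analyses in this study were run in SPSS and R; the function names and follow-up times below are illustrative assumptions only.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event was observed at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First event time at which survival drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy follow-up in months (made-up values): 1 = progression/death, 0 = censored.
curve = kaplan_meier([6, 12, 12, 18, 24, 30], [1, 1, 0, 1, 0, 1])
print(curve, median_survival(curve))
```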
Results
Patient characteristics
A total of 177 patients (mean age, 63 ± 13 years; 129 men) were included. Table 1 summarizes the baseline characteristics of patients in the training and validation cohorts. The χ2 test revealed no statistically significant difference (P>0.05) between the two groups. The median follow-up time for the training and validation cohorts was 30.5 and 30.8 months, respectively. In the training cohort, 62 individuals experienced disease relapse or progression, and 42 died. The 1-year, 3-year, and 5-year PFS rates were 89.6%, 72.2%, and 56.2%, and the corresponding OS rates were 89.3%, 67.9%, and 63.4%. In the validation cohort, disease relapse or progression occurred in 24 individuals, and 13 died. The 1-year, 3-year, and 5-year PFS rates were 79.7%, 55.3%, and 38.0%, and the corresponding OS rates were 90.9%, 75.8%, and 70.0%.
Radiomics feature selection and RadScore construction
Among the 49 machine learning feature selection-classification pairs applied to the 110 radiomics features, the LASSO-LASSO model was optimal (AUC=0.74) (Figure 3). The LASSO-LASSO model screened out 11 key radiomics features for constructing RadScore (Table 2). We used ROC curves to identify the optimal cut-off for dichotomizing SUVmax, MTV, TLG, and RadScore, which corresponds to the point with the maximum Youden index (sensitivity + specificity − 1). Table 3 shows that RadScore cut-off thresholds of 2.0, 2.2, and 2.2 were optimal for predicting mid-term efficacy, PFS, and OS, respectively.
Figure 3 Heatmap showing the AUC performance of the cross-combinations of the feature selection methods (columns) and classification models (rows) in predicting mid-term response (A). Histogram showing the selected features (IBSI names) and weights used to build the optimal candidate model (B).
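To make the two calculations in this subsection concrete, the sketch below computes a RadScore as a weighted sum of selected features and then picks the cut-off that maximizes the Youden index J = sensitivity + specificity − 1. The feature names, weights, scores, and labels are hypothetical placeholders, not the values of Table 2 or Table 3, and the helper names are assumptions made for this example.

```python
from typing import Dict, Sequence, Tuple


def rad_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the selected features (feature value x feature weight)."""
    return sum(weights[name] * features[name] for name in weights)


def youden_cutoff(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Return (cutoff, J) maximising J, calling a case positive when score >= cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if s >= c and y == 1)
        tn = sum(1 for y, s in zip(labels, scores) if s < c and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0   # sensitivity + specificity - 1
        if j > best_j:                      # ties keep the lowest cut-off
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Hypothetical feature values and weights for one patient.
example_score = rad_score({"feature_a": 2.0, "feature_b": -1.0},
                          {"feature_a": 0.5, "feature_b": 0.25})

# Hypothetical scores and outcomes (1 = non-CR) for eight patients.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1]
cutoff, j = youden_cutoff(labels, scores)
print(example_score, cutoff, j)
```

When several cut-offs share the same Youden index, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.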
Table 2 The 110 radiomic features extracted from PET and the 11 key features* for constructing RadScore in this study.
Table 3 Optimal cut-off thresholds of SUVmax, MTV, TLG and RadScore area under the curve (AUC) of mid-term outcome, progression-free survival and overall survival in the training and validation cohorts.
Univariate and multivariate analysis results
Table 4 shows the between-group differences in clinical characteristics, PET metabolic indices regarding mid-term efficacy.
Table 4 Univariate and multivariate analyses of factors predictive of mid-term treatment outcome in the training cohort.
For the clinical variables, we found that gender (OR=2.760 (95%CI:1.196–6.368), P=0.017) and B symptoms (OR=4.065 (95%CI:1.837–8.955), P=0.001) were independent risk factors for mid-term outcomes, as shown in Table 4.
Regarding the PET variables, RadScore (OR=7.167(95%CI:2.815–18.248), P=0.001) and SUVmax (OR=2.619 (95%CI:1.107–6.194), P=0.028) were independent risk factors influencing mid-term outcomes. These results were presented in Table 4.
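The odds ratios above follow from logistic regression coefficients as OR = exp(β), with a 95% confidence interval of exp(β ± 1.96 × SE) if Wald intervals are assumed. The sketch below uses that assumption to back-calculate β and SE from a reported odds ratio and to check that the reported interval is internally consistent. The helper names are hypothetical, and the only study values used are the gender odds ratio and interval quoted in the text.

```python
import math
from typing import Tuple


def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence limits from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_or(or_value: float, lower: float, upper: float,
                    z: float = 1.96) -> Tuple[float, float]:
    """Recover the coefficient and its standard error from a reported OR and CI."""
    beta = math.log(or_value)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported values for gender: OR 2.760, 95% CI 1.196 to 6.368.
beta, se = beta_se_from_or(2.760, 1.196, 6.368)
or_value, lower, upper = odds_ratio_ci(beta, se)
print(round(or_value, 3), round(lower, 3), round(upper, 3))
```

If the round trip reproduces the reported limits to within rounding error, the interval is symmetric on the log scale, as expected for a Wald interval.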
Assessment and validation of models built for predicting mid-term efficacy
To predict mid-term efficacy, we developed a combined model comprising the independent clinical predictors (gender, B symptoms), the PET predictor (SUVmax), and RadScore (Figure 4; Table 5). We also built a clinical model, a PET-based model, an IPI model, and an NCCN-IPI model (Table 5).
Figure 4 Nomogram to predict the patient mid-term efficacy risk (A). Calibration curves of the model for predicting mid-term response in the training (B) and validation (C) cohorts.
The nomogram visualized the score of each risk factor for mid-term efficacy. The calibration curves, obtained after 1000 bootstrap repetitions for each model, showed satisfactory agreement between the estimated and observed values in both the training and validation groups for the combined model (Figure 4).
The ROC curves of the models for predicting mid-term response in the training (A) and validation (B) cohorts showed that the AUC values of the combined model of clinical factors, PET metabolic parameters, and RadScore (0.846; 0.724) were better than those of the clinical model (0.714; 0.556), PET-based model (0.664; 0.589), NCCN-IPI model (0.523; 0.406), and IPI model (0.510; 0.412) (Figure 5).
Figure 5 Receiver operating characteristic curve of the models for predicting mid-term response in the training (A) and validation (B) cohorts.
Performance analysis of the combined models in clinical use
The DCA results are shown in Figure 6. These analyses demonstrated that the combined model consistently outperformed the clinical model, PET-based model, IPI model, and NCCN-IPI model in terms of overall net benefit for most risk thresholds in both the training and validation cohorts.
Survival analysis in the training and validation cohorts
To confirm the added prognostic value of RadScore, we compared the low and high RadScore groups. The low- and high-risk groups identified using the RadScore cut-off threshold had distinct PFS and OS outcomes in both the training and validation cohorts (Figure 7; Table 6). The prognosis of the low RadScore group was better than that of the high RadScore group.
Figure 7 Kaplan–Meier plots according to RadScore for patients’ progression-free survival and overall survival in the training (A) and validation cohorts (B).
Table 6 Cox regression of RadScore predictive of progression-free survival and overall survival in the training and validation cohorts.
Cox regression analysis showed that the risk of adverse prognostic events was higher in the high RadScore group than in the low RadScore group. Patients with a low RadScore (n=128) had a better PFS (median, 50 months; 95% CI: 33.533, 77.000) than those with a high RadScore (n=49) (median, 17 months; 95% CI: 9.567, 39.000). The hazard ratio of the high RadScore group (n=49) relative to the low RadScore group (n=128) was 2.1737 (95% CI: 1.2983, 3.6392; P=0.003). Patients with a low RadScore (n=80) had a better OS (mean, 85.965 months; 95% CI: 72.684, 99.246) than those with a high RadScore (n=97) (mean, 62.365 months; 95% CI: 51.938, 72.792). The hazard ratio of the high RadScore group (n=97) relative to the low RadScore group (n=80) was 2.1356 (95% CI: 1.2561, 3.6309; P=0.005) (Table 6).
Kaplan–Meier analysis gave results consistent with the Cox regression in both the training and validation cohorts: for both PFS and OS, the probability of adverse prognostic events was lower in the low RadScore group than in the high RadScore group (Figure 7).
Discussion
In this retrospective study utilizing real-world data, we found that a combined model incorporating RadScore outperformed the clinical, PET, and NCCN-IPI models in predicting the mid-term efficacy and prognosis of DLBCL patients. This combined model can serve as a valuable tool for individualized outcome prediction and for guiding treatment decisions for early-stage, high-risk DLBCL patients.
Accurately predicting the mid-term outcomes of DLBCL patients is crucial for optimizing treatment strategies. Numerous studies have evaluated the predictive value of PET radiomics features for DLBCL. Santiago et al. (35) demonstrated that a radiomics-based model accurately predicted refractory DLBCL. Their study employed RF as a classifier, randomly assigning patients to training (70%) and independent test (30%) cohorts; the AUCs of the two cohorts were 0.83 and 0.79, respectively. Coskun et al. found that texture features extracted from baseline PET predicted insensitivity to R-CHOP chemotherapy in DLBCL patients with an accuracy of 0.87 (AUC=0.81). Notably, SUVmax and the differences in grey-scale covariance matrix played crucial roles in predicting chemotherapy insensitivity (36). Consistent with prior studies, we found that the RadScore based on PET radiomic features was independently associated with mid-term outcomes in high-risk DLBCL patients (OR=7.167 (95%CI:2.815–18.248), P=0.001). The RadScore built on 11 key radiomic features obtained from PET was valuable in predicting the mid-term efficacy of high-risk DLBCL patients. This is likely attributable to the close association between radiomics features and tumor heterogeneity (37, 38), which is a prognostic determinant of patient survival (39, 40).
Machine learning techniques are increasingly used to extract and classify image features. LASSO is a selection method that shrinks and regresses a large pool of potentially multicollinear variables down to a set of relevant predictors (32). Many studies have employed LASSO to identify and classify data features. However, existing studies often employ a single machine learning method for radiomic feature selection and model construction. In clinical practice, a machine learning approach that combines feature classification and cross-validation can enhance the accuracy and generalization of the predicted results (41). In this study, we cross-combined seven machine learning methods to generate 49 selection-classification pairs and determined the optimal pair based on the maximum AUC to obtain the final RadScore. This approach made RadScore more robust and reproducible than in studies using a single machine learning method. Figure 3 illustrates that the ET-LASSO model had a poor AUC (0.370), whereas the AUC of the LASSO-LASSO model for predicting mid-term efficacy was 0.74. Additionally, the best LASSO-LASSO model selected radiomic features from shape features (Surface Area), first-order features (Global Intensity Peak, etc.), and texture features (GLCM), indicating that first-order and texture features in particular have good ability to discriminate high-risk patients.
[18F]-FDG PET/CT can provide information about tumor biology by measuring cellular glucose metabolism. Our study demonstrated that SUVmax was an independent predictor of medium-term efficacy (OR=2.619 (95% CI: 1.107–6.194), P=0.028), consistent with previous studies (42). We developed a user-friendly model that integrated RadScore, PET metabolic factors, and clinical risk factors and compared it with other models (e.g., the clinical, PET-based, IPI, and NCCN-IPI models). The ROC curves and DCA results demonstrated that the combined model outperformed the other models, whereas the performance of the IPI and NCCN-IPI models was unsatisfactory in both the training and validation cohorts. Additionally, the combined model showed good calibration and a clear advantage in terms of AUC. These results indicate that the combined model is more suitable and practical for predicting the medium-term outcome of DLBCL. Consistent with previous studies (43, 44), our results suggest that the IPI and NCCN-IPI may require improvement in identifying intermediate-high/high-risk DLBCL patients who would benefit from non-first-line treatment. Furthermore, our results support combining the RadScore of radiomic features (shape, first-order, and GLCM) with SUVmax and clinical predictors, in line with the findings of Jiang et al. (24, 45), to accurately identify intermediate-high/high-risk DLBCL patients.
One limitation of our study was its retrospective design. We collected patient data from two medical centers, but future studies should include data from additional centers to ensure clinical generalizability. In DLBCL, the distribution of nodal and/or extranodal lesions is highly variable and heterogeneous, and the morphological and textural features of the lesions are highly sensitive to the tumor segmentation method. We therefore employed the 41% SUVmax tumor segmentation method recommended by the European Association of Nuclear Medicine, which may be more practical and straightforward to implement in clinical practice. Additionally, the use of mid-term PET as the study endpoint may lead to false-positive interpretations. To justify therapeutic decisions, complementary studies utilizing end-of-treatment PET should be conducted in the future. A major strength of our study lies in the homogeneity of the included patients, as they all had newly diagnosed DLBCL histology and received R-CHOP-like regimens as standard treatment. The methodology employed also supports the general applicability of our model.
Conclusion
The RadScore, obtained by the 7×7 feature selection-classification cross-combination of machine learning methods and comprising shape, first-order, and texture (GLCM) features, can serve as a predictor of both mid-term efficacy and prognosis in DLBCL patients. In addition, the combined model, which integrates the RadScore, the PET metabolic indicator (SUVmax), and clinical risk factors (sex, B symptoms), can aid in rational risk stratification and facilitate the selection of appropriate treatment regimens for intermediate-high/high-risk DLBCL patients in the early stages.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by Ethics Committee of Nanjing Drum Tower Hospital of Nanjing University Medical School and West China Hospital of Sichuan University. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.
Author contributions
MC: Investigation, Visualization, Writing – original draft. JR: Methodology, Writing – review & editing. JZ: Formal Analysis, Supervision, Writing – review & editing. YT: Validation, Writing – review & editing. CJ: Conceptualization, Writing – review & editing. JC: Software, Writing – review & editing. JX: Funding acquisition, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was partially supported by funding for Clinical Trials from the Affiliated Drum Tower Hospital, Medical School of Nanjing University (No. 2021-LCYJ-MS-04; 2022-LCYJ-PY-44). This work was also partially supported by funding for the Key Project of Medical Science and Technology of Nanjing (No. ZKX21011).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Abbreviations
RadScore, Radiomics score; DLBCL, Diffuse large B-cell lymphoma; IPI, International Prognostic Index; NCCN-IPI, National Comprehensive Cancer Network International Prognostic Index; LDH, Lactate dehydrogenase; SUVmax, Maximum standardized uptake value; MTV, Metabolic tumor volume; TLG, Total lesion glycolysis; VOI, Volume of interest; AdaBoost, Adaptive Boosting; ET, Extreme Tree; GBDT, Gradient Boosted Decision Tree; LASSO, Least Absolute Shrinkage and Selection Operator; LR, Logistic Regression; RF, Random Forest; SVM, Support Vector Machines; PFS, Progression-free survival; OS, Overall survival; DCA, Decision curve analysis; ROC, Receiver operating characteristic; AUC, Area under the ROC curve; KM, Kaplan–Meier plots; HR, Hazard ratio; OR, Odds ratio.
Keywords: [18F]-FDG PET/CT, diffuse large B-cell lymphoma, machine learning, interim, treatment outcome, prognosis
Citation: Chen M, Rong J, Zhao J, Teng Y, Jiang C, Chen J and Xu J (2024) PET-based radiomic feature based on the cross-combination method for predicting the mid-term efficacy and prognosis in high-risk diffuse large B-cell lymphoma patients. Front. Oncol. 14:1394450. doi: 10.3389/fonc.2024.1394450
Received: 01 March 2024; Accepted: 22 May 2024;
Published: 05 June 2024.
Edited by:
Nadia Gisella Di Muzio, Vita-Salute San Raffaele University, Italy
Reviewed by:
Saveria Mazzara, Bocconi University, Italy
Kezheng Wang, Harbin Medical University Cancer Hospital, China
Copyright © 2024 Chen, Rong, Zhao, Teng, Jiang, Chen and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Chong Jiang, jiang_nju@163.com; Jianxin Chen, chenjx@njupt.edu.cn; Jingyan Xu, xujingyan@nju.edu.cn
†These authors have contributed equally to this work and share first authorship
‡These authors have contributed equally to this work and share last authorship