- 1 Department of Nuclear Medicine, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- 2 Clinical and Translational Center in Shanghai Chest Hospital, Shanghai Key Laboratory for Molecular Imaging, Shanghai University of Medicine and Health Sciences, Shanghai, China
- 3 Statistical Center, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- 4 Department of Radiology, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- 5 Department of Radiology, Second Affiliated Hospital of Soochow University, Suzhou, China
- 6 Pharmaceutical Diagnostic Department, GE Healthcare China, Shanghai, China
- 7 Institute for Medical Imaging Technology, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- 8 Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- 9 Department of Ultrasound, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
Objectives: Testing for anaplastic lymphoma kinase (ALK) rearrangement is widely used in the clinic to identify non-small cell lung cancer (NSCLC) patients who can be treated with targeted ALK inhibitors. This study aimed to non-invasively predict ALK rearrangement status in lung adenocarcinomas by developing a machine learning model that combines PET/CT radiomic features and clinical characteristics.
Methods: Five hundred twenty-six patients with lung adenocarcinoma who underwent PET/CT examination between February 2016 and March 2019 were enrolled, including 109 patients positive and 417 patients negative for ALK rearrangement. The Artificial Intelligence Kit software was used to extract radiomic features from the PET/CT images. Maximum relevance minimum redundancy (mRMR) and least absolute shrinkage and selection operator (LASSO) logistic regression were then employed to select the most discriminative radiomic features for constructing predictive models. mRMR is a feature selection method that selects features highly correlated with the pathological results (maximum relevance) while keeping the correlation among the retained features low (minimum redundancy). LASSO is a regression method that performs both feature selection and regularization: it shrinks the regression coefficients, reducing some of them to zero, and the features with non-zero coefficients after shrinkage are retained in the model. Receiver operating characteristic (ROC) analysis was used to evaluate model performance, and the performance of different models was compared with the DeLong test.
Results: A total of 22 radiomic features were selected from the PET/CT images to construct the PET/CT radiomic model. Most of these were CT features (20 of 22); only 2 PET features were included (PET percentile 10 and PET difference entropy). In addition, three clinical features associated with ALK mutation (age, burr and pleural effusion) were used to construct a combined PET/CT-clinical model. The combined PET/CT-clinical model predicted ALK mutation status significantly better in the training group (AUC = 0.87) and the testing group (AUC = 0.88) than the clinical model alone (training group, AUC = 0.76; testing group, AUC = 0.74). However, there was no significant difference between the combined model and the PET/CT radiomic model.
Conclusions: This study demonstrated that a PET/CT radiomics-based machine learning model has the potential to serve as a non-invasive diagnostic method to help determine ALK mutation status in lung adenocarcinoma patients in the clinic.
Introduction
Lung cancer is the most common cause of cancer mortality worldwide, and non-small cell lung cancer (NSCLC) accounts for approximately 85% of all lung cancers (1). Treatment options for NSCLC have expanded greatly in recent decades with advances in targeted therapies against mutated genes, such as epidermal growth factor receptor (EGFR), anaplastic lymphoma kinase (ALK), ROS proto-oncogene 1 (ROS-1) and v-raf murine sarcoma viral oncogene homolog B (BRAF) (2–7). All of these activating mutated genes can be targeted with FDA-approved drugs. Identifying the patient subsets that carry these specific mutated genes requires reliable biomarker testing to distinguish the genetic subtypes of lung cancer. The frequency of ALK mutation in NSCLC patients is about 5% in Western populations and about 4.9% in Asian populations, and it is higher in lung adenocarcinoma patients (6.0%) (8). ALK mutation detection has been widely used in the clinic for NSCLC patients (8).
Currently, several techniques can be used to identify ALK-rearranged lung cancers, such as immunohistochemistry and fluorescence in situ hybridization (9, 10). However, these techniques have several limitations in the detection of ALK mutation. First, they rely on surgical specimens or biopsies, which excludes patients who are not suitable for surgery or biopsy. Second, because of the heterogeneity of tumor tissue (11, 12), most sites within the tumor cannot be examined, which greatly affects the accuracy of conventional ALK mutation examination. Therefore, a non-invasive and more reliable tool for ALK mutation examination is urgently needed.
Recently, radiomic analysis based on data derived from clinical medical images has been used to characterize tumors, including tumor heterogeneity, gene mutation status, and response to treatment (13, 14). Conventional imaging evaluation of tumor lesions typically includes only lesion size, location, and enhancement characteristics. By contrast, radiomic analysis extracts highly detailed features of tumor lesions from clinical images, including tumor texture, shape and intensity (15). Radiomic analysis has thus become an alternative method to evaluate tumors and to predict gene mutation status in lung cancer patients. A large number of studies have shown that radiomic analysis can be used to predict the mutation status of several oncogenes (16, 17). Currently, most studies in lung cancer have been performed on primary tumors using computed tomography (CT) images (18–22). For example, Gevaert et al. used a CT image-based signature of primary lung tumors to predict EGFR mutation status (23). Liu et al. used a set of five CT-based features to predict EGFR mutation status (16). Arbour et al. showed that the CT imaging features of ALK-rearranged NSCLC primary tumors differ from those of EGFR-mutated or wild-type NSCLC (3). Recently, Song et al. developed a machine learning model based on CT radiomic features to predict ALK rearrangement status in lung adenocarcinoma patients (24).
However, positron emission tomography/computed tomography (PET/CT) radiomic features of lung adenocarcinoma have not been well studied. In our previous studies, we demonstrated on PET/CT images that lung adenocarcinoma tumors with micropapillary or solid components have a higher maximum standardized uptake value (SUVmax), which correlates with lymph node metastasis (25). We also found that the SUVmax of 18F-FDG PET/CT can be used to predict the histological grade of lung adenocarcinoma (26). In addition, we demonstrated that a combination of 18F-FDG PET/CT metabolic parameters and clinical parameters can be used to predict ALK and ROS-1 mutation in NSCLC patients (27).
To the best of our knowledge, this is the first study using PET/CT radiomic approaches and a machine learning model to predict ALK mutation status in lung cancer primary tumors. We collected PET/CT images of lung adenocarcinoma patients, segmented the images, extracted radiomic features, and used machine learning algorithms to classify the mutation status. Here, we propose that a novel machine learning model based on radiomic features of PET/CT images and clinical characteristics can be used to predict ALK mutation status in lung adenocarcinoma patients.
Materials and Methods
Patients Selection
We retrospectively identified 631 lung adenocarcinoma patients treated at our hospital between February 2016 and March 2019 who underwent a PET/CT scan as well as surgery or biopsy and were tested for ALK mutation in the primary tumor. Histological tumor slides were reviewed by two pathologists with extensive experience in the examination of lung tumors. The inclusion criteria were: (1) the patient was examined on a Siemens PET/CT machine under the same acquisition conditions; (2) pathological results from surgery or biopsy specimens were available, ALK genetic testing was performed, and surgery was completed within 2 weeks after the PET/CT examination; (3) the medical history and the image collection were complete. The exclusion criteria were: (1) radiotherapy, chemotherapy, or targeted drug therapy for lung adenocarcinoma before the PET/CT examination (38 cases); (2) multiple tumor nodules in the lung or multiple tumors in other parts of the body (15 cases); (3) tumor lesions close to the center that could not be separated from the adjacent hilar anatomy (10 cases); (4) PET/CT images of poor quality or with artifacts that affected the diagnosis (42 cases). After these 105 exclusions, 526 patients remained. According to the final pathological results, the included cases were divided into an ALK-positive group and an ALK-negative group. The detailed process of screening and grouping of lung adenocarcinoma cases is shown in Figure 1. This retrospective study followed a protocol approved by the Institutional Review Board at Shanghai Chest Hospital, and the need for informed patient consent was waived.
Positron Emission Tomography/Computed Tomography Scan Procedures
All 631 patients were examined on a Biograph mCT-S PET/CT scanner (Siemens) with consistent scanning conditions and parameters. The imaging agent 18F-FDG was produced by Shanghai Atom Kexin Pharmaceutical Co., Ltd, with a pH of about 7.0 and radiochemical purity > 95%. Patients were examined with blood glucose below 7.8 mmol/L. The injected dose of imaging agent was 0.10–0.15 mCi/kg, according to the standard. For the CT scan, the tube voltage was 120 kV, the milliampere-seconds were adjusted automatically by CARE Dose, and images were acquired with a slice thickness of 5 mm and reconstructed to a 512 × 512 matrix (voxel size: 0.98 × 0.98 × 3.0 mm³). The CT scan was performed first, followed by the PET scan. The PET scan used 5 bed positions of about 120 s each, with a slice thickness of 5 mm. After the whole-body PET/CT scan, a thin-section high-resolution CT (HRCT) scan with a slice thickness of 1.0 mm was performed. The matrix size of all PET reconstructions was 200 × 200, with an anisotropic voxel size of 4.07 × 4.07 × 3.0 mm³. The PET images were attenuation-corrected using the CT data and reconstructed with TrueX + TOF. Finally, the reconstructed PET and CT images were fused and transmitted to the post-processing platform.
Processing and Analysis of Positron Emission Tomography/Computed Tomography Images
The ITK-SNAP 3.8.0 software (www.itksnap.org) was used to obtain the volume of interest (VOI). First, PET images with 5 mm slice thickness and HRCT images with 1 mm slice thickness were exported from the workstation in DICOM format and imported into ITK-SNAP, where the primary lung cancer lesions were drawn in multi-planar mode (cross-sectional, sagittal, and coronal planes). During preprocessing, the images were resampled to 1 × 1 × 1 mm³ and gray-level discretization was performed with a fixed bin number of 8. Lung cancer lesions were drawn on the CT images with 1 mm slice thickness or on the PET images in the three-dimensional interface. The region of interest (ROI) was sketched by two nuclear radiologists with more than 10 years of diagnostic experience who were blinded to the pathological results. For delineation on CT images, the lesion was viewed with a window width of 1,600 HU and a window level of −600 HU. The boundary of the lung cancer was then drawn semi-automatically and refined manually. For delineation on PET images, the CT boundary was used as a reference, the SUV threshold for the VOI was set to 40% following the standard values in the TrueD tool suite of the Siemens MMWP workstation, and the three-dimensional ROI of the lung cancer was sketched manually with the “adaptive brush” semi-automatic tool in ITK-SNAP. When the lesion was close to the hilar blood vessels, the CT boundary was delineated with reference to PET. To capture the heterogeneity of the lung cancer, necrosis, bleeding, calcification and burrs were included in the ROI. Inflammatory lesions surrounding the tumor were excluded.
Image Pre-Processing and Feature Extraction
In PET/CT imaging, the PET images display the molecular metabolic information of lung adenocarcinoma lesions, while the CT images display their morphological features. For every lung adenocarcinoma, the original PET images with 5 mm slice thickness and the breath-hold thin-section CT images with 1 mm slice thickness (DICOM format), together with the outlined lesions, were imported into the Artificial Intelligence Kit software (A.K. software; GE Healthcare, China). Two pre-processing techniques were used to improve the recognition of image textures. First, all images were resampled to 1 × 1 × 1 mm³ voxels via linear interpolation. Second, the images were normalized to a standardized intensity range by z-score transformation, giving a mean of 0 and a standard deviation of 1. A total of 402 features were extracted, including 42 histogram features, 11 grey level size zone matrix (GLSZM) features, 15 form factor features (shape characteristics, such as the sphericity and density of the VOI), 154 gray level co-occurrence matrix (GLCM) features and 180 run length matrix (RLM) features. All features were extracted with the AK software, whose algorithms are IBSI compliant (28). The consistency of lesion segmentation between the two nuclear medicine doctors was evaluated by calculating the intra- and inter-observer intraclass correlation coefficients (ICCs) of the PET/CT radiomic features extracted from 50 cases picked at random from the 526 enrolled patients. The first nuclear medicine doctor delineated the lesions twice in two months, and the radiomic features extracted from both delineations were used to assess the intra-observer ICCs; the second nuclear medicine doctor delineated the lesions once, and the resulting features were used to assess the inter-observer ICCs between the two doctors. ICCs > 0.75 indicated good consistency (29), and the first doctor completed the remaining segmentations (Supplementary Table S1). The final selected features used to construct the PET/CT model are described in Supplementary Table S2.
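The z-score transformation used in the second pre-processing step can be illustrated with a minimal sketch. The voxel values below are hypothetical, and the use of the population standard deviation is an assumption; the source does not state which estimator the A.K. software applies.

```python
from statistics import fmean, pstdev

def zscore_normalize(intensities):
    """Rescale intensities so that their mean is 0 and their SD is 1."""
    mean = fmean(intensities)
    std = pstdev(intensities)  # population SD: an assumption, not from the paper
    if std == 0:
        raise ValueError("constant intensities: z-score is undefined")
    return [(value - mean) / std for value in intensities]

# Hypothetical voxel intensities inside one VOI (not data from the study).
voxels = [-820.0, -640.0, -310.0, 15.0, 42.0, 60.0]
normalized = zscore_normalize(voxels)
```

After this step, intensities from different scans share a common scale, which is the stated purpose of the normalization.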
Feature Extraction, Subsampling, Radiomics Signature Construction, and Model Validation
All patients were randomly divided into a training group and a test group at a ratio of 7:3. In the training group, the synthetic minority oversampling technique (SMOTE) was applied to balance the data distribution by generating synthetic samples of the minority group from the joint weighting of optimal features. Minimum redundancy and maximum correlation (mRMR) was then used to select features that were highly correlated with ALK mutation and not redundant with one another. Least absolute shrinkage and selection operator (LASSO) logistic regression was performed to select the most discriminative feature subset in the training group. The workflow of the radiomic analysis is shown in Figure 2. Three radiomic models based on features from PET and CT images (a PET + CT radiomic model, a CT radiomic model and a PET radiomic model) were established to predict ALK mutation in lung adenocarcinoma (Figure 3, Supplementary Figures S1 and S2). The radiomic score (radscore) of each patient was calculated by summing the selected features weighted by their coefficients; the formula is provided in the Supplementary Methods. The ROC curve was used to evaluate diagnostic efficacy, and the cutoff value was taken where the Youden index was at its maximum (30). The sensitivity, specificity, and accuracy of predicting ALK mutation in lung adenocarcinoma were calculated, and the model was verified in the test group. The DeLong test was performed to determine whether the area under the curve (AUC) values of the three radiomic models for predicting ALK mutation differed significantly. Finally, 100-fold leave-group-out cross-validation (LGOCV) was performed to verify the reliability of our results in the PET/CT groups.
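The oversampling step can be sketched as follows. This is the generic SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), written under stated assumptions: the function name `smote`, the neighbour count `k`, the seed and the sample values are all hypothetical, and the sketch does not reproduce the authors' implementation or its feature weighting.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` within the minority class, excluding itself.
        others = [s for s in minority if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical two-feature samples from the minority (ALK-positive) class.
minority = [[0.2, 1.1], [0.4, 0.9], [0.3, 1.4], [0.6, 1.0]]
new_samples = smote(minority, n_new=5)
```

Because every synthetic sample lies on a segment between two real minority samples, oversampling is applied to the training group only, as in the text, so that the test group contains no synthetic cases.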
Figure 2 The workflow of radiomic analysis. Feature extraction: AK software (402), where 402 is the total number of features extracted by the AK software. ROI, region of interest; GLCM, gray level co-occurrence matrix; GLSZM, grey level size zone matrix; RLM, run length matrix; mRMR, minimum redundancy and maximum correlation; LASSO, least absolute shrinkage and selection operator; ROC, receiver operating characteristic.
Figure 3 Construction of a PET/CT radiomic model based on PET/CT images. (A) Selection of the tuning parameter (λ) in the LASSO model via 10-fold cross-validation based on minimum criteria. Binomial deviances of the LASSO regression cross-validation model are plotted as a function of ln (λ). The y-axis shows the binomial deviance and the lower x-axis shows ln (λ). The numbers along the upper x-axis indicate the number of features retained as λ changes. (B) The final retained features selected by mRMR; the y-axis shows the retained features and the x-axis shows their corresponding LASSO regression coefficients. The fitted coefficients of the features are plotted vs. ln (λ). (C) Representative results of the PET/CT radiomic model for predicting ALK rearrangement in the training (left) and testing (right) groups of lung adenocarcinoma patients. 0, negative ALK rearrangement; 1, positive ALK rearrangement. (D) Cross-validation analysis showed that the PET/CT radiomic model has good reliability for predicting ALK rearrangement in the training (left) and testing (right) groups of lung adenocarcinoma patients.
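The radscore and the Youden-index cutoff described in this section can be sketched as follows. The coefficient values, the intercept term and the scores are hypothetical (the actual formula is given only in the Supplementary Methods); the two feature names are those reported for the PET features in this study.

```python
def radscore(features, coefficients, intercept=0.0):
    """Weighted sum of the selected feature values.
    The intercept is an assumption; the paper's formula is in its supplement."""
    return intercept + sum(coefficients[name] * value
                           for name, value in features.items())

def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1,
    classifying a case as positive when its score >= cutoff."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical coefficients and normalized feature values for one patient.
coefficients = {"PET_Percentile10": 0.5, "PET_differenceEntropy": -0.25}
patient = {"PET_Percentile10": 1.0, "PET_differenceEntropy": 2.0}
score = radscore(patient, coefficients, intercept=0.25)

# Hypothetical radscores with ALK status (1 = positive, 0 = negative).
scores = [-1.2, -0.7, -0.3, 0.1, 0.4, 0.9]
labels = [0, 0, 0, 1, 0, 1]
cutoff, j_index = youden_cutoff(scores, labels)
```

In this toy example the cutoff 0.1 captures both positives while misclassifying one negative, which gives the largest Youden index among the candidate thresholds.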
Collection of Clinical Characteristics for Lung Adenocarcinoma
Two nuclear medicine doctors with more than 10 years of experience in chest diagnosis evaluated the PET/CT images. A total of 16 clinical factors in lung adenocarcinoma were collected (Table 1): lobulation, burr, calcification, air bronchial sign, vacuolar sign, ground glass composition, pleural effusion, pleural traction, maximum length, location, SUVmax of the primary tumor, age, sex, pre-treatment carcinoembryonic antigen (CEA), smoking history and clinical stage. Smoking status was treated as a binary variable in this study: current smoker (1) versus non-smoker (0). Current smokers included smokers (patients who were still smoking) and ex-smokers (patients who had stopped smoking for > 15 years but had a smoking history of > 10 pack-years), while non-smokers included patients who had never smoked or had smoked < 100 cigarettes in their lifetime. Tumor location followed the lobar distribution of the lung: right upper lobe, right middle lobe, right lower lobe, left upper lobe, and left lower lobe. CEA (ng/ml) was recorded as the value measured by immunoassay.
Clinical stage was divided into stages I, II, III, and IV. The CT evaluation indexes of lesions were defined as follows: burr refers to spiky, radial protrusions around the lung tumor lesion on the HRCT lung window; lobulation refers to a tumor edge that is uneven, not smooth, and protrudes outward; pleural adhesion refers to the pleura or visceral pleura being stretched and retracted toward the lung cancer; air bronchial sign refers to a bronchial shadow that can be shown on the HRCT lung window combined with multi-planar reconstruction; vacuolar sign refers to a gas shadow seen within the tumor lesion, generally less than 5 mm; calcification refers to a high-density shadow observed on the mediastinal window; ground glass composition refers to a cloud-like or ground glass opacity on the HRCT lung window, within which vessels or bronchial shadows may still be visible; maximum length (cm) refers to the longest diameter of the primary lesion measured on the lung window. The metabolic indicator measured on the PET images was SUVmax, the maximum standardized uptake value of the lesion.
Construction of the Individualized Prediction Model
The chi-square test, Student t-test and Mann-Whitney U test were applied to the clinical features. Variables with a p-value < 0.1 were included in the univariate logistic regression to calculate the odds ratio (OR) and p-value of each clinical feature. By combining radiomic features with clinical features, we further constructed an integrated model (PET/CT radiomics + clinical). The clinical model was constructed from clinical features to predict ALK mutation status, using the chi-square test or Wilcoxon test and univariate logistic regression. Clinical variables contributing significantly to the model were incorporated, together with the radiomics score, into a multivariate logistic regression to establish the nomogram. Meanwhile, the variance inflation factor (VIF) was used for collinearity analysis, and factors with VIF > 10 were removed. The independent predictive risk factors were used to construct the nomogram.
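The collinearity screen can be illustrated with a small sketch. It relies on the identity that the VIF of predictor j, 1 / (1 - R_j^2), equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The predictor names and values are hypothetical, and the helper functions are written only for illustration; the study itself used SPSS and R.

```python
from statistics import fmean

def correlation_matrix(columns):
    """Pearson correlation matrix of equally long predictor columns."""
    centered = []
    for col in columns:
        mean = fmean(col)
        centered.append([v - mean for v in col])
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical predictors (not data from the study); their correlation is 0.8.
names = ["predictor_a", "predictor_b"]
columns = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 1.0, 4.0, 3.0, 5.0]]
vifs = vif(columns)
flagged = [name for name, v in zip(names, vifs) if v > 10]  # VIF > 10 rule
```

With a correlation of 0.8 both VIF values are about 2.78, so neither predictor would be removed under the VIF > 10 rule.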
Statistical Analysis
IBM SPSS 25.0 (http://www.ibm.com) and R software (version 3.5.1, http://www.R-project.org) were used for statistical analysis. The optimal cutoff value was the point on the ROC curve with the largest positive likelihood ratio in the training dataset, and the same cutoff was applied to the validation dataset. A calibration curve was used to assess the consistency between the radiomics nomogram and the observed values, and the Hosmer-Lemeshow test was applied to evaluate the difference. Decision curve analysis was used in the test group to evaluate the clinical utility of the integrated model for predicting ALK mutation in lung adenocarcinoma.
Results
Patient Enrollment
A total of 526 patients with invasive lung adenocarcinoma were selected. Postoperative pathology confirmed 109 ALK-positive cases (20.7% of the total) and 417 ALK-negative cases (79.3%) (Figure 1). All patients were randomly assigned to the training cohort (7/10) or the testing cohort (3/10).
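A random 7:3 assignment of this kind can be sketched as follows. The seed, the integer patient identifiers and the rounding rule are assumptions, and the source does not say whether the split was stratified by ALK status, so the cohort sizes produced here need not match those of the study.

```python
import random

def split_train_test(patient_ids, train_fraction=0.7, seed=42):
    """Shuffle patient identifiers and split them into training and test lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 526 hypothetical patient identifiers, matching the enrolled cohort size.
train_ids, test_ids = split_train_test(range(526))
```

Fixing the seed keeps the two cohorts identical across runs, which matters because all later model comparisons depend on the same partition.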
Extraction and Selection of Features Derived From Positron Emission Tomography/Computed Tomography Images
A total of 402 radiomic features were extracted. The inter- and intra-observer ICCs showed that 256 and 314 of the 402 radiomic features had good reproducibility (ICC > 0.75) in the CT group and the PET group, respectively. mRMR was then used to select the most discriminative features, and 30 features were retained after mRMR analysis (Figure 3A). Subsequently, a total of 22 PET/CT radiomic features were identified as robust by LASSO logistic regression and used to construct the model (Figure 3B).
The radscore distributions of ALK-negative and ALK-positive patients in the training group and the test group are shown in Figure 3C and Supplementary Figures S1C and S2C. All 3 radiomic models could predict ALK mutation status in lung adenocarcinoma patients (Table 2). We further used cross-validation analysis to investigate the reliability of the PET/CT model (Figure 3D).
Radiomic Models: Performance and Validation
We used ROC analysis to evaluate the performance of the 3 models and found that each model could predict ALK mutation status (Figure 4). The AUC of the PET/CT radiomic model was 0.85 (95% CI: 0.80–0.90) in the training cohort and 0.86 (95% CI: 0.78–0.94) in the test cohort; the AUC of the CT radiomic model was 0.84 (95% CI: 0.81–0.88) in the training cohort and 0.80 (95% CI: 0.70–0.89) in the test cohort; and the AUC of the PET radiomic model was 0.84 (95% CI: 0.81–0.87) in the training cohort and 0.82 (95% CI: 0.73–0.91) in the test cohort. Although the AUC of the PET/CT radiomic model was higher than those of the other two models, there was no significant difference between any two models (Supplementary Table S3, DeLong test).
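As a reminder of what the AUC values above measure, the following minimal sketch computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores and labels are hypothetical; confidence intervals and the DeLong comparison are not reproduced here.

```python
def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly
    (ties count as half a correct pair)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical radscores with ALK status (1 = positive, 0 = negative).
scores = [-1.2, -0.7, -0.3, 0.1, 0.4, 0.9]
labels = [0, 0, 0, 1, 0, 1]
example_auc = auc(scores, labels)
```

Here 7 of the 8 positive-negative pairs are ordered correctly, so the AUC is 0.875.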
Figure 4 ROC curve analysis of three radiomics models, PET/CT, CT, and PET in training group (A) and testing group (B), respectively.
Integrated Clinical and Radiomic Model: Performance and Validation
After clinical model screening, univariate logistic analysis in the training cohort showed that age, burr, pleural adhesion, maximum length, pleural effusion, calcification, ground glass opacity and tumor grade were associated with ALK mutation status (Supplementary Table S4). We further analyzed these 8 clinical features using multivariate logistic regression and found 3 clinical variables with a significant influence on the model (age, burr and pleural effusion), of which 2 (age and pleural effusion) were independent predictors of ALK mutation status (Supplementary Table S5). The ROC curves of the three models are shown in Figure 5A. The performances of the integrated model and the PET/CT radiomic model were very close in both the training cohort and the test cohort (Table 2). In both cohorts, the integrated model achieved the best performance, with AUC = 0.87 in the training cohort and AUC = 0.88 in the test cohort (Table 2). In the training cohort, the DeLong test showed a statistically significant difference in AUC between the integrated model and the clinical model (p < 0.001), and also between the PET/CT radiomic model and the clinical model (p = 0.023) (Supplementary Table S6, DeLong test). However, there was no significant difference between the integrated PET/CT + clinical model and the PET model or CT model alone.
Figure 5 Evaluation of the performance of the integrated PET/CT radiomics-clinical model. (A) Receiver operating characteristic (ROC) curves showing the predictive performance of different methods in the training cohort (left) and test cohort (right). The curves of 3 colors represent different models: red, PET/CT radiomics + clinical model; blue, PET/CT radiomics model; green, clinical model. AUC, area under the curve. (B) Nomogram for ALK mutation prediction. The nomogram was developed by integrating the radiomic score with 3 significant clinical features (age, burr and pleural effusion). The value of each predictor can be converted into the “points” scale at the top of the nomogram. By summing the points for each predictor and locating the sum on the “Total points” scale, the probability of ALK mutation can be read from the “Risk” scale. (C) Calibration curve with Hosmer-Lemeshow test of the nomogram in the training cohort (left panel) and test cohort (right panel). The calibration curve shows the calibration of the model in terms of consistency between the predicted risk of ALK mutation and the observed ALK mutation status. The x-axis represents the predicted risk of ALK mutation and the y-axis represents the observed ALK mutation status. (D) Decision curve analysis for the nomograms. The y-axis measures the standardized net benefit. The net benefit is calculated by adding up the true positive results and subtracting the false positive results, weighting the latter by a factor relevant to the relative harm of an undetected cancer compared with the harm of unnecessary treatment. The red line represents the PET/CT radiomics and clinical features model, the green line represents the PET/CT clinical features model, the gray line represents the assumption that all patients are negative for ALK mutation and the blue line represents the assumption that all patients are positive for ALK mutation.
Further, we built a nomogram to predict the presence of ALK mutation (Figure 5B). The calibration curves of the nomogram are shown in Figure 5C. These curves showed good calibration of the nomogram in terms of the agreement between the estimated and the observed ALK mutation status in the training cohort (p = 0.142) and the test cohort (p = 0.254). Finally, we evaluated the clinical usefulness of these models by decision curve analysis. As shown in Figure 5D, the benefit of the integrated model based on radiomics and clinical features was higher than that of the model based on clinical features alone, especially at high-risk thresholds between 20% and 80%.
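The net benefit underlying the decision curve can be written as TP/N - (FP/N) × pt/(1 - pt), where pt is the risk threshold. The sketch below evaluates this unstandardized form for hypothetical predicted probabilities; the standardized net benefit plotted in Figure 5D is commonly obtained by dividing by the prevalence, which is stated here as an assumption about the plotting convention.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit at a risk threshold: TP/N - FP/N * threshold / (1 - threshold).
    Cases with predicted probability >= threshold are treated as positive."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Hypothetical predicted risks and ALK status (not data from the study).
probabilities = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]
model_nb = net_benefit(probabilities, labels, threshold=0.5)

# "All patients positive" reference line: everyone is above any threshold.
treat_all_nb = net_benefit([1.0] * len(labels), labels, threshold=0.5)
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, which in this toy example means exceeding 0 at a threshold of 0.5.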
Discussion
The application of tyrosine kinase inhibitors against specific gene targets (EGFR, ALK and ROS1) has revolutionized the treatment of lung adenocarcinoma (31). ALK inhibitors, such as crizotinib and ceritinib, have been widely used to treat cancers with ALK mutations, especially non-small cell lung cancers (32, 33). For example, a small subset of lung cancer patients with rearrangements of the ALK or ROS1 genes are sensitive to ALK inhibitors (34, 35). Therefore, screening patients for ALK mutation has become a routine test in NSCLC treatment. Currently, four primary tools are used in the clinic to detect ALK rearrangement: fluorescence in situ hybridization, immunohistochemical staining, reverse transcription-polymerase chain reaction (RT-PCR) and next-generation sequencing (36). Each of these techniques has its advantages and limitations (37). For example, ALK rearrangements can involve distinct breakpoints and multiple fusion partners (38). In addition, all of these examinations require biopsy or surgical tumor specimens. Accordingly, these traditional ALK tools usually present a significant technical challenge. To identify patients with ALK mutations non-invasively, this study set out to develop a predictive radiomic model based on PET/CT images.
Recently, several machine learning models based on CT images and clinical features have been developed to predict ALK rearrangement in lung adenocarcinoma (24, 39). The aim of the current study was to construct a machine learning model that can non-invasively and automatically detect ALK mutation based on PET/CT images of tumor lesions and the clinical characteristics of lung adenocarcinoma patients. First, we constructed 3 models using PET/CT, CT and PET radiomic features, respectively. The PET/CT radiomic model was slightly better than the other two models at predicting ALK mutation, with the highest AUC value (0.86) in the test cohort, although there was no significant difference between any two of the models. Two PET radiomic features were selected to construct the PET/CT model. The first, PET_Percentile10, is the value below which 10% of the observations in a group fall. The second, PET_differenceEntropy, measures the randomness or variability in neighborhood intensity value differences. After mRMR and LASSO selection, the features retained in our model included more CT radiomic features than PET radiomic features, which may be because the images used for delineation in this study were 5 mm PET images and 1 mm thin-section high-resolution CT images. The 1 mm CT images have higher resolution than the 5 mm PET images, which also suggests that adding a thin-section 1 mm CT scan to conventional PET/CT scans can help to extract more features.
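The meaning of a 10th-percentile feature can be made concrete with a short sketch. It uses the nearest-rank definition and hypothetical uptake values; the A.K. software may use a different (for example, interpolating) percentile definition, so this is an illustration of the concept only.

```python
import math

def percentile_nearest_rank(values, percent):
    """Smallest observed value such that at least `percent` % of the
    observations are less than or equal to it (nearest-rank definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical voxel uptake values inside one VOI (not data from the study).
uptake = [1.2, 0.8, 3.5, 2.1, 4.4, 2.9, 1.7, 5.0, 3.8, 2.4]
p10 = percentile_nearest_rank(uptake, 10)
```

A low 10th percentile indicates that the lesion contains a sizeable fraction of low-uptake voxels, which is why this first-order feature is sensitive to the lower tail of the intensity distribution.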
We further took steps to build an integrated model by combining PET/CT radiomic features with clinical characteristics and found that this integrated model has the advantage to predict ALK mutation with the highest AUC value (0.87) in our training cohort, which is slightly higher than the AUC value (0.85) in the training cohort from PET/CT radiomic model but there is no significant difference between these two models. Notably, the integrated model has a significant advantage to predict ALK mutation status compared to the clinical model (AUC = 0.76).
There are several limitations in the current study. First, the model was constructed from images that were acquired and processed in the same way. Parameter consistency is both our weakness and our strength, because data acquired with different parameters would require the batch effect to be eliminated (40). We will collect more data acquired with different parameters to validate the generalization of this model in our next study. Second, this predictive model was constructed from PET/CT scans of a selected population of lung adenocarcinoma patients in a single medical center, so results derived from this model cannot represent the ALK mutation status of the general lung adenocarcinoma population. Therefore, the predictive effect of this model needs to be validated in independent multi-center cohorts. Third, ALK rearrangements are almost always mutually exclusive with other driver mutations, such as EGFR and KRAS mutations, in lung adenocarcinoma. Therefore, mutations of other frequently mutated genes in lung adenocarcinoma need to be taken into account in future studies. Last, only two PET features were used to build this model, compared with 20 CT features, and there was no significant difference between the PET/CT radiomic model and the CT radiomic model, which means that this model was built mostly on CT-based structural features rather than PET-based metabolic features. Therefore, more PET features should be extracted and selected to develop a more powerful model in the future.
As several other studies have pointed out, there is no “one fits all” approach, although several machine learning algorithms have been employed for feature selection in radiomics model development (41–45). Nevertheless, the integrated model developed in the current study may serve as a preliminary model to support future prospective studies using machine learning algorithms to identify ALK mutation status in lung adenocarcinoma patients. Future studies should be performed with larger sample sizes and external cohorts to validate our results.
Conclusions
In conclusion, this study highlights the feasibility of non-invasively detecting ALK genetic status in lung adenocarcinomas using a machine learning model based on combined PET/CT radiomic features and clinical characteristics. The detection of ALK mutation status using this approach might be useful for informing treatment strategies for lung adenocarcinoma patients.
Data Availability Statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.
Ethics Statement
The studies involving human participants were reviewed and approved by the Institutional Review Board of Shanghai Jiao Tong University-affiliated Shanghai Chest Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.
Author Contributions
CC, XS, JC, and WX conceived and designed the study. HYu, WZ, and RW conducted the literature research. XS, BL, LW, LL, MR, HYa, and CL acquired the data. CC, XS, and XQ analyzed and interpreted the data. CC and XS evaluated the conventional thin-slice CT and PET images. YG, SD, and GW performed the statistical analysis. CC and XS drafted the manuscript. All authors revised the manuscript for important intellectual content, edited the manuscript, had full access to all of the data in the study, and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors contributed to the article and approved the submitted version.
Funding
This work was supported by the special project of the integrated traditional Chinese and Western medicine in the general hospital of Shanghai Health Committee (Grant Number ZHYY-ZXYJHZX-202023), the Natural Science Foundation of Shanghai (Grant Number 18ZR1435200), the Shanghai Sailing Program (Grant Number 20YF1444500), the Youth Medical Talents–Medical Imaging Practitioner Program (Grant Number SHWRS(2020)_087), the National Natural Science Foundation of China (Grant Numbers 81602415, 81871353, 81773007, and 81671679), and the Scientific Research project of Shanghai Municipal Commission of Health and Family Planning (Grant Number 20174Y0077).
Conflict of Interest
Authors YG and SD were employed by the company GE Healthcare China.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2021.603882/full#supplementary-material
References
1. Duma N, Santana-Davila R, Molina JR. Non-Small Cell Lung Cancer: Epidemiology, Screening, Diagnosis, and Treatment. Mayo Clin Proc (2019) 94:1623–40. doi: 10.1016/j.mayocp.2019.01.013
2. Herbst RS, Morgensztern D, Boshoff C. The biology and management of non-small cell lung cancer. Nature (2018) 553:446–54. doi: 10.1038/nature25183
3. Arbour KC, Riely GJ. Systemic Therapy for Locally Advanced and Metastatic Non-Small Cell Lung Cancer: A Review. JAMA (2019) 322:764–74. doi: 10.1001/jama.2019.11058
4. McLoughlin EM, Gentzler RD. Epidermal Growth Factor Receptor Mutations. Thorac Surg Clin (2020) 30:127–36. doi: 10.1016/j.thorsurg.2020.01.008
5. Serritella AV, Bestvina CM. Anaplastic Lymphoma Kinase Mutation-Positive Non-Small Cell Lung Cancer. Thorac Surg Clin (2020) 30:137–46. doi: 10.1016/j.thorsurg.2019.12.001
6. Giustini NP, Bazhenova L. ROS1-rearranged Non-small Cell Lung Cancer. Thorac Surg Clin (2020) 30:147–56. doi: 10.1016/j.thorsurg.2020.01.007
7. O’Leary CG, Andelkovic V, Ladwa R, Pavlakis N, Zhou C, Hirsch F, et al. Targeting BRAF mutations in non-small cell lung cancer. Transl Lung Cancer Res (2019) 8:1119–24. doi: 10.21037/tlcr.2019.10.22
8. Du X, Shao Y, Qin HF, Tai YH, Gao HJ. ALK-rearrangement in non-small-cell lung cancer (NSCLC). Thorac Cancer (2018) 9:423–30. doi: 10.1111/1759-7714.12613
9. To KF, Tong JH, Yeung KS, Lung RW, Law PP, Chau SL, et al. Detection of ALK rearrangement by immunohistochemistry in lung adenocarcinoma and the identification of a novel EML4-ALK variant. J Thorac Oncol (2013) 8:883–91. doi: 10.1097/JTO.0b013e3182904e22
10. Kim H, Yoo SB, Choe JY, Paik JH, Xu X, Nitta H, et al. Detection of ALK gene rearrangement in non-small cell lung cancer: a comparison of fluorescence in situ hybridization and chromogenic in situ hybridization with correlation of ALK protein expression. J Thorac Oncol (2011) 6:1359–66. doi: 10.1097/JTO.0b013e31821cfc73
11. Zito Marino F, Liguori G, Aquino G, La Mantia E, Bosari S, Ferrero S, et al. Intratumor Heterogeneity of ALK-Rearrangements and Homogeneity of EGFR-Mutations in Mixed Lung Adenocarcinoma. PLoS One (2015) 10:e0139264. doi: 10.1371/journal.pone.0139264
12. Cai W, Lin D, Wu C, Li X, Zhao C, Zheng L, et al. Intratumoral Heterogeneity of ALK-Rearranged and ALK/EGFR Coaltered Lung Adenocarcinoma. J Clin Oncol (2015) 33:3701–9. doi: 10.1200/JCO.2014.58.8293
13. Hu F, Li C, Xu J, Guo J, Shen Y, Nie W, et al. Additional local consolidative therapy has survival benefit over EGFR tyrosine kinase inhibitors alone in bone oligometastatic lung adenocarcinoma patients. Lung Cancer (2019) 135:138–44. doi: 10.1016/j.lungcan.2019.07.024
14. Tagliafico AS, Piana M, Schenone D, Lai R, Massone AM, Houssami N. Overview of radiomics in breast cancer diagnosis and prognostication. Breast (2020) 49:74–80. doi: 10.1016/j.breast.2019.10.018
15. Shakir H, Deng Y, Rasheed H, Khan TMR. Radiomics based likelihood functions for cancer diagnosis. Sci Rep (2019) 9:9501. doi: 10.1038/s41598-019-45053-x
16. Liu Y, Kim J, Balagurunathan Y, Li Q, Garcia AL, Stringfield O, et al. Radiomic Features Are Associated With EGFR Mutation Status in Lung Adenocarcinomas. Clin Lung Cancer (2016) 17:441–8.e6. doi: 10.1016/j.cllc.2016.02.001
17. Smith CP, Czarniecki M, Mehralivand S, Stoyanova R, Choyke PL, Harmon S, et al. Radiomics and radiogenomics of prostate cancer. Abdom Radiol (NY) (2019) 44:2021–9. doi: 10.1007/s00261-018-1660-7
18. Thawani R, McLane M, Beig N, Ghose S, Prasanna P, Velcheti V, et al. Radiomics and radiogenomics in lung cancer: A review for the clinician. Lung Cancer (2018) 115:34–41. doi: 10.1016/j.lungcan.2017.10.015
19. Li H, Zhang R, Wang S, Fang M, Zhu Y, Hu Z, et al. CT-Based Radiomic Signature as a Prognostic Factor in Stage IV ALK-Positive Non-small-cell Lung Cancer Treated With TKI Crizotinib: A Proof-of-Concept Study. Front Oncol (2020) 10:57. doi: 10.3389/fonc.2020.00057
20. Mei D, Luo Y, Wang Y, Gong J. CT texture analysis of lung adenocarcinoma: can Radiomic features be surrogate biomarkers for EGFR mutation statuses. Cancer Imaging (2018) 18:52. doi: 10.1186/s40644-018-0184-2
21. Shen TX, Liu L, Li WH, Fu P, Xu K, Jiang YQ, et al. CT imaging-based histogram features for prediction of EGFR mutation status of bone metastases in patients with primary lung adenocarcinoma. Cancer Imaging (2019) 19:34. doi: 10.1186/s40644-019-0221-9
22. Yoon J, Suh YJ, Han K, Cho H, Lee HJ, Hur J, et al. Utility of CT radiomics for prediction of PD-L1 expression in advanced lung adenocarcinomas. Thorac Cancer (2020) 11:993–1004. doi: 10.1111/1759-7714.13352
23. Gevaert O, Echegaray S, Khuong A, Hoang CD, Shrager JB, Jensen KC, et al. Predictive radiogenomics modeling of EGFR mutation status in lung cancer. Sci Rep (2017) 7:41674. doi: 10.1038/srep41674
24. Song L, Zhu Z, Mao L, Li X, Han W, Du H, et al. Clinical, Conventional CT and Radiomic Feature-Based Machine Learning Models for Predicting ALK Rearrangement Status in Lung Adenocarcinoma Patients. Front Oncol (2020) 10:369. doi: 10.3389/fonc.2020.00369
25. Chang C, Sun XY, Zhao WL, Wang R, Qian XH, Lei B, et al. Minor components of micropapillary and solid subtypes in lung invasive adenocarcinoma (≤3cm): PET/CT findings and correlations with lymph node metastasis. Radiol Med (2020) 125:257–64. doi: 10.1007/s11547-019-01112-x
26. Sun XY, Chen TX, Chang C, Teng HH, Xie C, Ruan MM, et al. SUVmax of (18)FDG PET/CT Predicts Histological Grade of Lung Adenocarcinoma. Acad Radiol (2021) 28(1):49–57. doi: 10.1016/j.acra.2020.01.030
27. Ruan MM, Liu L, Wang LH, Lei B, Sun XY, Chang C, et al. Correlation between combining (18)F-FDG PET/CT metabolic parameters and other clinical features and ALK or ROS1 fusion in patients with non-small-cell lung cancer. Eur J Nucl Med Mol Imaging (2020) 47:1183–97. doi: 10.1007/s00259-019-04652-6
28. Liu HH, Zhang CY, Wang LJ, Luo R, Li JN, Zheng H, et al. MRI radiomics analysis for predicting preoperative synchronous distant metastasis in patients with rectal cancer. Eur Radiol (2019) 29(8):4418–26. doi: 10.1007/s00330-018-5802-7
29. Kim S, Shin J, Kim D, Choi G, Kim M, Choi J. Radiomics on Gadoxetic Acid–Enhanced Magnetic Resonance Imaging for Prediction of Postoperative Early and Late Recurrence of Single Hepatocellular Carcinoma. Clin Cancer Res (2019) 13:3847–55. doi: 10.1158/1078-0432.CCR-18-2861
30. Fluss R, Faraggi D, Reiser B. Estimation of the Youden Index and its Associated Cutoff Point. Biom J (2005) 47(4):458–72. doi: 10.1002/bimj.200410135
31. Harada G, Gongora ABL, da Costa CM, Santini FC. TRK Inhibitors in Non-Small Cell Lung Cancer. Curr Treat Options Oncol (2020) 21:39. doi: 10.1007/s11864-020-00741-z
32. Shaw AT, Bauer TM, de Marinis F, Felip E, Goto Y, Liu G, et al. First-Line Lorlatinib or Crizotinib in Advanced ALK-Positive Lung Cancer. N Engl J Med (2020) 383(21):2018–29. doi: 10.1056/NEJMoa2027187
33. Elliott J, Bai Z, Hsieh SC, Kelly SE, Chen L, Skidmore B, et al. ALK inhibitors for non-small cell lung cancer: A systematic review and network meta-analysis. PLoS One (2020) 15(2):e0229179. doi: 10.1371/journal.pone.0229179
34. Mileham KF, Ahmad MN, Kim ES. Targeted Therapy for Mutation-Driven Metastatic Non-Small-Cell Lung Cancer: Considerations for ALK-Rearranged Tumors. Oncol (Williston Park) (2019) 33:152–5.
35. Rosell R, Karachaliou N, Wolf J, Ou SH. ALK and ROS1 non-small-cell lung cancer: two molecular subgroups sensitive to targeted therapy. Lancet Respir Med (2014) 2:966–8. doi: 10.1016/S2213-2600(14)70259-0
36. Letovanec I, Finn S, Zygoura P, Smyth P, Soltermann A, Bubendorf L, et al. Evaluation of NGS and RT-PCR Methods for ALK Rearrangement in European NSCLC Patients: Results from the European Thoracic Oncology Platform Lungscape Project. J Thorac Oncol (2018) 13:413–25. doi: 10.1016/j.jtho.2017.11.117
37. Wu YC, Chang IC, Wang CL, Chen TD, Chen YT, Liu HP, et al. Comparison of IHC, FISH and RT-PCR methods for detection of ALK rearrangements in 312 non-small cell lung cancer patients in Taiwan. PLoS One (2013) 8:e70839. doi: 10.1371/journal.pone.0070839
38. Rosenbaum JN, Bloom R, Forys JT, Hiken J, Armstrong JR, Branson J, et al. Genomic heterogeneity of ALK fusion breakpoints in non-small-cell lung cancer. Mod Pathol (2018) 31:791–808. doi: 10.1038/modpathol.2017.181
39. Yamamoto S, Korn RL, Oklu R, Migdal C, Gotway MB, Weiss GJ, et al. ALK molecular phenotype in non-small cell lung cancer: CT radiogenomic characterization. Radiology (2014) 272:568–76. doi: 10.1148/radiol.14140789
40. Orlhac F, Frouin F, Nioche C, Ayache N, Buvat I. Validation of A Method to Compensate Multicenter Effects Affecting CT Radiomics. Radiology (2019) 291(1):53–9. doi: 10.1148/radiol.2019182023
41. Shiri I, Maleki H, Hajianfar G, Abdollahi H, Ashrafinia S, Hatt M, et al. Next-Generation Radiogenomics Sequencing for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients Using Multimodal Imaging and Machine Learning Algorithms. Mol Imaging Biol (2020) 22(4):1132–48. doi: 10.1007/s11307-020-01487-8
42. Parmar C, Grossmann P, Rietveld D, Rietbergen MM, Lambin P, Aerts HJWL. Radiomic machine-learning classifiers for prognostic biomarkers of head and neck cancer. Front Oncol (2015) 5:272. doi: 10.3389/fonc.2015.00272
43. Hajianfar G, Shiri I, Maleki H, Oveisi N, Haghparast A, Abdollahi H, et al. Noninvasive O6 methylguanine-DNA methyltransferase status prediction in glioblastoma multiforme cancer using magnetic resonance imaging radiomics features: univariate and multivariate radiogenomics analysis. World Neurosurg (2019) 132:e140–61. doi: 10.1016/j.wneu.2019.08.232
44. Leger S, Zwanenburg A, Pilz K, Lohaus F, Linge A, Zöphel K, et al. A comparative study of machine learning methods for time-to-event survival data for radiomics risk modelling. Sci Rep (2017) 7(1):13206. doi: 10.1038/s41598-017-13448-3
Keywords: positron emission tomography/computed tomography (PET/CT), machine learning, radiomics, anaplastic lymphoma kinase (ALK) rearrangement, lung adenocarcinoma
Citation: Chang C, Sun X, Wang G, Yu H, Zhao W, Ge Y, Duan S, Qian X, Wang R, Lei B, Wang L, Liu L, Ruan M, Yan H, Liu C, Chen J and Xie W (2021) A Machine Learning Model Based on PET/CT Radiomics and Clinical Characteristics Predicts ALK Rearrangement Status in Lung Adenocarcinoma. Front. Oncol. 11:603882. doi: 10.3389/fonc.2021.603882
Received: 08 September 2020; Accepted: 08 January 2021;
Published: 02 March 2021.
Edited by:
Hong Huang, Chongqing University, China
Reviewed by:
Maria Luisa Belli, Romagnolo Scientific Institute for the Study and Treatment of Tumors (IRCCS), Italy
Isaac Shiri, Geneva University Hospitals (HUG), Switzerland
Copyright © 2021 Chang, Sun, Wang, Yu, Zhao, Ge, Duan, Qian, Wang, Lei, Wang, Liu, Ruan, Yan, Liu, Chen and Xie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Wenhui Xie, xknuclear@163.com; Jie Chen, snoopy_cj@126.com
†These authors share first authorship
‡These authors have contributed equally to this work