- Department of Oncology, The Affiliated Hospital of Southwest Medical University, Luzhou, China
Purpose: By using a radiomics-based approach, multiple radiomics features can be extracted from regions of interest in computed tomography (CT) images, which may be applied to automatically classify kidney tumors and normal kidney tissues. This study proposes a CT radiomics-based method that uses the extracted features to automatically classify kidney tumors and normal kidney tissues and to establish an automatic classification model.
Methods: CT data were retrieved from the 2019 Kidney and Kidney Tumor Segmentation Challenge (KiTS19) in The Cancer Imaging Archive (TCIA) open access database. Arterial phase-enhanced CT images from 210 cases were used to establish an automatic classification model. These CT images of patients were randomly divided into training (168 cases) and test (42 cases) sets. Furthermore, the radiomics features of gross tumor volume (GTV) and normal kidney tissues in the training set were extracted and screened, and a binary logistic regression model was established. For the test set, the radiomic features and cutoff value of P were consistent with the training set.
Results: Three radiomics features were selected to establish the binary logistic regression model. The accuracy (ACC), sensitivity (SENS), specificity (SPEC), area under the curve (AUC), and Youden index of the training and test sets based on the CT radiomics classification model were all higher than 0.85.
Conclusion: The automatic classification model of kidney tumors and normal kidney tissues based on CT radiomics exhibited good classification ability. Kidney tumors could be distinguished from normal kidney tissues. This study may complement automated tumor delineation techniques and warrants further research.
Introduction
Radiomics refers to the high-throughput extraction of a large number of quantitative features from medical images, such as computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), in order to establish models. In general, visual image information is converted into digital feature variables for quantitative research. Extensive research and analysis of massive image data can assist doctors in providing a more accurate diagnosis of a patient’s condition (1–3). Compared to biopsy, radiomics has the technical advantage of relying on non-invasive and repeatable radiological images, thus providing a safer way of conducting patient follow-ups and prognosis prediction. Radiomics methods can also be used for pathological tumor classification and grading (4, 5). Traditional radiomics, which involves the extraction and screening of high-throughput features of regions of interest from medical images, is used primarily for the diagnosis of benign and malignant diseases, prognosis evaluation, and survival prediction (6–13). Radiomics combined with machine learning and deep learning performs well for differentiating, grading and staging kidney tumors (14–16).
Traditional classification methods using CT images mainly rely on the pixel values of the image and less on other parameters; this makes it difficult to accurately distinguish the tumor area from the surrounding normal organs (17–22). However, radiomics-based methods can be used to extract more than 800 radiomics features from regions of interest in CT images. In the present study, we propose a radiomics-based automatic classification method that has the potential to supplement deep learning-based automatic delineation. To our knowledge, this study is the first to report such a method. Therefore, the purpose of this study is to establish a preliminary classification model based on CT radiomics to automatically classify kidney tumors and normal kidney tissues.
Materials and methods
Data collection
The study was reviewed and approved by the Ethics Committee of the Affiliated Hospital of Southwest Medical University (18 January 2017, KY2021023). CT data were retrieved from the 2019 Kidney and Kidney Tumor Segmentation Challenge (KiTS19) (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=61081171) of The Cancer Imaging Archive (TCIA) open access database. Arterial phase-enhanced images of 210 patients from the database were used to establish an automatic classification model. The kidneys and tumors had already been manually segmented. These CT images were randomly divided into a training set (168 cases) and a test set (42 cases).
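The random 168/42 division described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the helper name `split_cases`, the fixed seed, and the case-identifier format are all assumptions introduced here.

```python
import random


def split_cases(case_ids, n_test, seed=0):
    """Randomly partition case IDs into (train, test) lists.

    The seed is an assumption; the paper does not report one.
    """
    ids = list(case_ids)
    rng = random.Random(seed)  # local generator keeps the split reproducible
    rng.shuffle(ids)
    test = sorted(ids[:n_test])
    train = sorted(ids[n_test:])
    return train, test


# Hypothetical identifiers for the 210 cases used in the study.
case_ids = [f"case_{i:05d}" for i in range(210)]
train_ids, test_ids = split_cases(case_ids, n_test=42)
print(len(train_ids), len(test_ids))
```

Splitting at the patient level, as here, keeps the tumor and normal-kidney regions of one patient in the same set.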
Radiomics feature extraction
The CT images were preprocessed using wavelet-based methods. Before feature extraction, all images were resampled to a voxel size of 1 × 1 × 1 mm³. The gross tumor volume (GTV) and normal kidney tissue were regarded as the regions of interest (ROIs). Feature extraction was based on the 3D Slicer platform and performed using the pyradiomics package; the package is available at http://PyRadiomics.readthedocs.io/en/latest/ (last accessed on June 30, 2019). The values of all radiomic features were processed using z-score standardization. Figure 1 shows a CT sectional view of a patient in the training set. The CT radiomic feature variables of the GTV and normal kidney tissue were extracted. A total of 837 radiomics features were extracted, including first-order statistics, gray level co-occurrence matrix (GLCM), gray level dependence matrix (GLDM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), and neighboring gray tone difference matrix (NGTDM) features. Shape features were removed in this study. First-order features describe the distribution of individual pixel or voxel intensities within the ROI. GLCM describes how often different combinations of gray levels occur together in an image area. GLDM quantifies the gray level dependencies in an image. GLRLM provides information about runs of consecutive pixels with the same gray level. GLSZM quantifies gray level zones in an image. Wavelet-based features were obtained by computing the above feature classes on wavelet-transformed images.
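The z-score standardization step can be illustrated with a short sketch. It assumes that the mean and standard deviation are estimated on the training set and reused for the test set, and that the population standard deviation is used; the paper states neither detail, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore_fit(column):
    """Return (mean, standard deviation) of one feature column."""
    mu = mean(column)
    sd = pstdev(column)  # population SD; an assumption, not stated in the paper
    # Guard against constant features, which would otherwise divide by zero.
    return mu, (sd if sd > 0 else 1.0)


def zscore_apply(column, mu, sd):
    """Standardize a column with previously fitted statistics."""
    return [(value - mu) / sd for value in column]


# Toy values standing in for one radiomics feature.
train_values = [1.0, 2.0, 3.0, 4.0, 5.0]
mu, sd = zscore_fit(train_values)
train_z = zscore_apply(train_values, mu, sd)
test_z = zscore_apply([2.5, 6.0], mu, sd)  # test data reuse the training statistics
```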
Figure 1 CT section of one patient in the training set. The white arrow indicates normal kidney tissue, and the red arrow indicates gross tumor volume.
Screening of radiomics features of the training set
(A) Univariate feature screening
The radiomics features of the GTV and normal kidney tissue were taken as independent variables, and the tissue class (GTV or normal kidney tissue) was taken as the binary outcome. For each radiomics feature, the area under the receiver operating characteristic (ROC) curve (AUC) was calculated. The ROC curve takes the false positive rate (FPR) as the abscissa and the true positive rate (TPR) as the ordinate and is mostly used for the evaluation of binary classification problems. The AUC ranged from 0.5 to 1. The closer the value is to 1.0, the better the feature discriminates between the two classes, whereas an AUC of 0.5 indicates that the feature has no discriminative value in the study. Radiomics features with an AUC less than 0.7 were excluded after the univariate screening.
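A minimal sketch of the univariate screening follows. It computes the AUC from pairwise comparisons (the Mann-Whitney formulation) and, so that the AUC lies between 0.5 and 1 as stated, takes the larger of AUC and 1 − AUC. That orientation step and the function names are assumptions, and the toy data are not from the study.

```python
def auc(pos, neg):
    """AUC as the probability that a positive value exceeds a negative one."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def screen_features(features, labels, threshold=0.7):
    """Keep features whose direction-independent AUC is at least `threshold`."""
    kept = {}
    for name, values in features.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        a = auc(pos, neg)
        a = max(a, 1.0 - a)  # assumption: orient each feature so AUC >= 0.5
        if a >= threshold:
            kept[name] = a
    return kept


labels = [1, 1, 1, 0, 0, 0]  # 1 = GTV, 0 = normal kidney tissue
features = {
    "f_good": [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],
    "f_noise": [0.5, 0.1, 0.9, 0.5, 0.1, 0.9],
}
kept = screen_features(features, labels)
```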
(B) LASSO logistic regression feature screening
The principle of least absolute shrinkage and selection operator (LASSO) regression is to shrink the original feature coefficients; more specifically, it shrinks small coefficients exactly to 0 and treats the feature variables corresponding to these coefficients as nonsignificant variables. Such nonsignificant variables have little or no impact on the final classification results; thus, they can be directly discarded, which results in variable screening. LASSO logistic regression with five-fold cross-validation was used to select the radiomics features. In LASSO regression analysis, an L1 regularization term is added to the least-squares fit to improve the accuracy of the linear regression model. Its penalty is the sum of the absolute values of the regression coefficients, which allows some coefficient estimates to be exactly zero. Thus, it is helpful for feature selection. This study is a binary classification problem, and logistic regression is a generalized linear model commonly used in binary or one-vs-rest classification. It maps the output of the linear predictor to the interval between zero and one. Therefore, the linear regression in the LASSO regression model can be replaced by logistic regression to select the features. The objective function of LASSO logistic regression optimization is as follows:
$$\min_{\omega,\, b}\; -\frac{1}{n}\sum_{i=1}^{n}\left[\, y_i\left(\omega^{T}X_i + b\right) - \log\left(1 + e^{\omega^{T}X_i + b}\right)\right] + \lambda \lVert \omega \rVert_{1}$$
where n is the total number of samples; X_i is the feature vector of the i-th sample (each sample has m feature values); y_i is the corresponding response value of each sample; ω is the vector of regression coefficients; b is the intercept; and λ is a nonnegative regularization parameter used to control the sparsity of the regression coefficients.
The extracted radiomics features were input into the LASSO logistic regression model; subsequently, the lambda (λ) value with the smallest model deviance was determined and the radiomics features were screened.
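To make the shrink-to-zero behavior concrete, the sketch below fits an L1-penalized logistic regression by proximal gradient descent with soft-thresholding. It is a from-scratch illustration on toy data, not the authors' R implementation. The solver, learning rate, and iteration count are assumptions, the intercept is left unpenalized, and the five-fold cross-validation used to choose λ is omitted.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_logistic(X, y, lam, lr=0.1, n_iter=2000):
    """Minimize mean logistic loss + lam * ||w||_1 (intercept unpenalized)."""
    n, m = len(X), len(X[0])
    w, b = [0.0] * m, 0.0
    for _ in range(n_iter):
        # Residuals of the current fit: predicted probability minus label.
        resid = [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
                 for xi, yi in zip(X, y)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * xi[j] for r, xi in zip(resid, X)) / n
                  for j in range(m)]
        b -= lr * grad_b
        # Gradient step followed by soft-thresholding yields exact zeros.
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Toy data: column 0 separates the classes, column 1 is uninformative.
X = [[-2.0, 0.3], [-1.0, -0.4], [-1.5, 0.1],
     [1.0, -0.2], [2.0, 0.4], [1.5, -0.1]]
y = [0, 0, 0, 1, 1, 1]
w, b = lasso_logistic(X, y, lam=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
```

With a sufficiently large λ every coefficient is driven to zero, which is why λ must be tuned, for example by cross-validated deviance as in the study.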
(C) Model collinearity detection
The variance inflation factor (VIF) of each independent variable of the logistic regression model was calculated after screening the variables using the LASSO logistic regression. The VIF measures the severity of multicollinearity in multiple regression models. It represents the ratio of the variance of the estimator of a regression coefficient to the variance that would be obtained if the independent variables were not linearly correlated.
The VIF can be calculated as follows:
$$\mathrm{VIF}_i = \frac{1}{1 - R_i^{2}}$$
where R_i is the multiple correlation coefficient obtained by regressing the i-th independent variable on the other independent variables. The larger the VIF, the greater the possibility of collinearity between the independent variables. Generally, multicollinearity is assumed when the VIF value is greater than five; thus, removing the radiomics features with a VIF value greater than five is necessary.
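The collinearity check can be sketched with the identity that the VIF of each variable equals the corresponding diagonal element of the inverse of the correlation matrix of the independent variables. The helper names and toy columns below are assumptions for illustration, and the small Gauss-Jordan inversion is only suitable for a handful of features.

```python
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long feature columns."""
    centered = []
    for col in columns:
        mu = mean(col)
        centered.append([v - mu for v in col])
    norms = [sum(v * v for v in col) ** 0.5 for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[i][i] for i in range(len(columns))]


# Two toy features with correlation 0.8, so VIF = 1 / (1 - 0.64).
values = vif([[1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]])
keep = [i for i, v in enumerate(values) if v <= 5.0]  # threshold from the text
```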
Model establishment
The final binary logistic regression model was established based on the final radiomics features (X):
$$P = \frac{1}{1 + e^{-\left(\omega^{T}X + b\right)}}$$
where P is the predicted probability that a region is GTV (the positive class), and ω and b are the fitted regression coefficients and intercept. The ROC curve of the model was plotted and its AUC value was calculated. The sensitivity (SENS) and specificity (SPEC) corresponding to each point on the ROC curve were used to find the point that maximized SENS + SPEC; the value of P at this point was taken as the cutoff value.
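The cutoff search can be sketched as below. It assumes that candidate cutoffs are the midpoints between consecutive distinct predicted probabilities and that a region is called GTV when P exceeds the cutoff; the function names and toy values are hypothetical.

```python
import math


def predict_probability(x, coefs, intercept):
    """Logistic model: P = 1 / (1 + exp(-(intercept + coefs . x)))."""
    z = intercept + sum(c * v for c, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def youden_cutoff(probs, labels):
    """Return (cutoff, SENS + SPEC - 1) for the cutoff maximizing SENS + SPEC."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    levels = sorted(set(probs))
    # Assumption: candidates are midpoints between neighboring probabilities.
    candidates = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
    best_cut, best_j = None, -1.0
    for cut in candidates:
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p > cut)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p <= cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Toy predicted probabilities; label 1 = GTV, 0 = normal kidney tissue.
probs = [0.1, 0.2, 0.35, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1]
cutoff, youden = youden_cutoff(probs, labels)
```

Maximizing SENS + SPEC is equivalent to maximizing the Youden index, since the two differ only by the constant 1.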
Model diagnosis
The accuracy (ACC), SENS, SPEC, AUC, and the Youden index were used to evaluate the effectiveness of the model. The Youden index, also known as Youden's J statistic, summarizes the validity of a screening test in a single value. It is calculated as Youden index = SENS + SPEC - 1.
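These diagnostic measures follow directly from the confusion matrix at a given cutoff. The sketch below uses toy values and treats P greater than the cutoff as positive; the function name and data are assumptions for illustration.

```python
def diagnostic_metrics(probs, labels, cutoff):
    """ACC, SENS, SPEC and Youden index at a fixed probability cutoff."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted_positive = p > cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    sens = tp / (tp + fn)  # true positive rate
    spec = tn / (tn + fp)  # true negative rate
    acc = (tp + tn) / len(labels)
    return {"ACC": acc, "SENS": sens, "SPEC": spec, "Youden": sens + spec - 1.0}


# Toy example: 3 GTV regions (label 1) and 4 normal regions (label 0).
metrics = diagnostic_metrics(
    probs=[0.1, 0.2, 0.35, 0.4, 0.6, 0.8, 0.9],
    labels=[0, 0, 0, 1, 0, 1, 1],
    cutoff=0.5,
)
```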
Model validation
For the test set, the radiomics features extracted from the GTV and normal kidney tissue were the same as those selected in the training set, and the cutoff value of P was the one determined on the training set. With GTV as the positive class, the ROC curve was drawn and the AUC value was calculated. Thereafter, the ACC, SENS, SPEC, and Youden index of the model on the test set were calculated. The flowchart of the proposed method is shown in Figure 2.
All statistical analyses were performed using R software, version 4.1.2 (R Foundation for Statistical Computing, Vienna, Austria).
Results
Altogether, 837 radiomics features were extracted. After the univariate screening, 217 radiomic features remained. In the LASSO logistic regression used for variable screening, the model deviance was smallest at λ = 0.1715; Figure 3 shows the model deviance as a function of λ. Finally, three radiomics features were selected: dependence entropy of the GLDM of the original image (Feature 1), zone entropy of the GLSZM of the wavelet-HLL image (H = high-frequency band, L = low-frequency band; Feature 2), and gray level non-uniformity of the GLSZM of the wavelet-LLL image (Feature 3). Table 1 lists the VIF values corresponding to these features. The VIFs were all less than five, which indicates that no multicollinearity existed among the three radiomics features.
Figure 3 Features selected by the LASSO regression model. (A) Selection of the tuning parameter (λ) in the LASSO model via five-fold cross-validation. The optimal λ values are indicated by the dotted vertical lines, and a value of 0.1715 was selected. (B) LASSO coefficient profiles of the 217 radiomics features. The coefficient profile plot was generated against log λ, with λ selected using five-fold cross-validation. Three radiomics features with non-zero coefficients were selected.
The final binary logistic regression model was established based on the final radiomics features as follows.
where P is the probability that GTV is positive. The cutoff value in this study was 0.4851. It was used as a radiomics marker to determine the tumor area and was equivalent to the critical point. If the detected value is greater than the cutoff value, it is the GTV; if the detected value is less than the cutoff value, it is the normal kidney tissue.
Table 2 lists the diagnostic parameters of the training set model, and Figure 4 shows the ROC curve of the training set model.
Table 3 lists the model diagnostic parameters of the test set, and Figure 5 shows the ROC curve of the test set model.
Discussion
Delineating the GTV is a challenge when treating a tumor. It demands patience and proficiency in the relevant clinical knowledge. Furthermore, tumors of different shapes are difficult to delineate, and a clinician needs systematic training and continuous practice to become competent in delineation. In this study, based on the CT radiomics method, the extracted CT radiomics features were used to automatically distinguish kidney tumor areas from normal kidney tissues.
The cutoff value, which is the critical point, was used as a radiomics marker for determining the GTV. If the detection value was greater than the cutoff value, it was considered positive; if the detection value was less than the cutoff value, it was considered negative. In this study, a positive detection value represented the kidney tumor, and a negative value represented normal tissues and organs. The AUC values of the training and test sets obtained from the ROC curve were 0.9798 and 0.9841, respectively, which were substantially greater than 0.7. The ACC, SENS, SPEC, and Youden indices for the training and test sets based on the CT radiomics classification model were all greater than 0.85. This indicates that automatic classification based on radiomics achieved good results. Because only the selected radiomic features are needed by the model, further studies can extract these features alone rather than all radiomic features to improve efficiency.
In recent years, artificial intelligence technologies such as machine learning and neural networks have been widely used in the field of automatic tumor and organ delineation in radiotherapy, such as automatic segmentation based on convolutional neural networks and automatic classification based on the U-Net model. These have greatly reduced clinician workload and increased productivity (23–26). Artificial intelligence models have achieved great success in the automatic delineation of organs, but the accuracy of the automatic delineation of tumor regions is still a problem. For future research, we can limit the region of interest to the entire kidney region, divide it into a grid, and extract only the radiomics features selected on the training set to improve efficiency. A binary logistic regression model can be established based on the final radiomics features extracted from the training set, and the cutoff value calculated from the training set can be used as a radiomics marker for classifying tumor regions. When the P value of a grid cell is greater than the cutoff value, the cell is considered to be a tumor region; when the P value is less than the cutoff value, the cell is considered to be normal organ tissue. After clustering the tumor region or substituting P values for pixel values, it may be possible to automatically delineate the tumor region, which may supplement deep learning-based automatic delineation.
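The grid-based idea outlined above can be sketched as a simple thresholding of per-cell probabilities. The cutoff of 0.4851 is the value reported in this study; the grid values, the function name, and the omission of the clustering step are assumptions made for illustration.

```python
CUTOFF = 0.4851  # cutoff value of P reported for the training set


def tumor_mask(prob_grid, cutoff=CUTOFF):
    """Mark each grid cell as tumor (1) when its P exceeds the cutoff, else 0."""
    return [[1 if p > cutoff else 0 for p in row] for row in prob_grid]


# Hypothetical per-cell probabilities from the logistic model on a 2 x 3 grid.
prob_grid = [
    [0.10, 0.62, 0.91],
    [0.05, 0.4851, 0.77],
]
mask = tumor_mask(prob_grid)
```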
In conclusion, automatic classification based on radiomics can be feasibly applied to distinguish between GTV and normal kidney tissue in patients. Nevertheless, our study has a few limitations. First, this study used a limited number of samples. Second, the establishment and optimization of the model were affected by the quality of the CT images and the accuracy of manual tumor segmentation. Third, different clinicians may interpret GTV boundaries differently, and some CT images still have problems such as a fuzzy boundary of the tumor target area. Fourth, pre-malignant states and benign kidney lesions should also be analyzed and differentiated. Thus, the data may have biases. For future clinical application, we plan to train and test on CT image data of more patients with different tumors to obtain better automatic classification results. If the technology matures further, it may reduce the risk of inaccurate delineation due to a lack of experience, save valuable processing time, and benefit doctors, patients, and radiotherapy technology.
Radiomics methods are generally used in the diagnosis of benign and malignant diseases, prognosis evaluation, and survival prediction. The preliminary radiomics-based automatic classification method proposed in this study aims to enrich research on radiomics and may supplement deep learning-based automatic delineation to realize more accurate GTV determination. Future research should focus on different diseases and increase the number of samples to further improve the accuracy of this automatic classification technology.
Conclusion
The automatic classification model of kidney tumors and normal kidney tissue based on CT radiomics exhibited good classification ability. Tumorous kidney tissues could be distinguished from normal kidneys, with these observations worthy of further study.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.
Author contributions
HP and SL conceived the proposed concept and verified the underlying data. HP, YL, XG and XT took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis, and manuscript. All authors contributed to the article and approved the submitted version.
Funding
We acknowledge funding from the Gulin County People’s Hospital, Southwest Medical University Affiliated Hospital Science and Technology Strategic Cooperation Project (project number 2022GLXNNYDFY05). We acknowledge funding from the Sichuan Medical Association Scientific Research Project (project number S21005).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Avanzo M, Stancanello J, Pirrone G, Sartor G. Radiomics and deep learning in lung cancer. Strahlenther Onkol (2020) 196(10):879–87. doi: 10.1007/s00066-020-01625-9
2. Tagliafico AS, Piana M, Schenone D, Lai R, Massone AM, Houssami N. Overview of radiomics in breast cancer diagnosis and prognostication. Breast (2020) 49:74–80. doi: 10.1016/j.breast.2019.10.018
3. Zhou J, Zhang Y, Chang KT, Lee KE, Wang O, Li J, et al. Diagnosis of benign and malignant breast lesions on dce-mri by using radiomics and deep learning with consideration of peritumor tissue. J Magn Reson Imaging (2020) 51(3):798–809. doi: 10.1002/jmri.26981
4. E L, Lu L, Li L, Yang H, Schwartz LH, Zhao B. Radiomics for classification of lung cancer histological subtypes based on nonenhanced computed tomography. Acad Radiol (2019) 26(9):1245–52. doi: 10.1016/j.acra.2018.10.013
5. Wang S, Wei Y, Li Z, Xu J, Zhou Y. Development and validation of an mri radiomics-based signature to predict histological grade in patients with invasive breast cancer. Breast Cancer (Dove Med Press) (2022) 14:335–42. doi: 10.2147/bctt.S380651
6. Hao W, Gong J, Wang S, Zhu H, Zhao B, Peng W. Application of mri radiomics-based machine learning model to improve contralateral bi-rads 4 lesion assessment. Front Oncol (2020) 10:531476. doi: 10.3389/fonc.2020.531476
7. Zheng X, Yao Z, Huang Y, Yu Y, Wang Y, Liu Y, et al. Deep learning radiomics can predict axillary lymph node status in early-stage breast cancer. Nat Commun (2020) 11(1):1236. doi: 10.1038/s41467-020-15027-z
8. Wu K, Shui Y, Sun W, Lin S, Pang H. Utility of radiomics for predicting patient survival in hepatocellular carcinoma with portal vein tumor thrombosis treated with stereotactic body radiotherapy. Front Oncol (2020) 10:569435. doi: 10.3389/fonc.2020.569435
9. Wu Q, Wang S, Chen X, Wang Y, Dong L, Liu Z, et al. Radiomics analysis of magnetic resonance imaging improves diagnostic performance of lymph node metastasis in patients with cervical cancer. Radiother Oncol (2019) 138:141–8. doi: 10.1016/j.radonc.2019.04.035
10. Wang T, Gao T, Yang J, Yan X, Wang Y, Zhou X, et al. Preoperative prediction of pelvic lymph nodes metastasis in early-stage cervical cancer using radiomics nomogram developed based on T2-weighted mri and diffusion-weighted imaging. Eur J Radiol (2019) 114:128–35. doi: 10.1016/j.ejrad.2019.01.003
11. Dercle L, Lu L, Schwartz LH, Qian M, Tejpar S, Eggleton P, et al. Radiomics response signature for identification of metastatic colorectal cancer sensitive to therapies targeting egfr pathway. J Natl Cancer Inst (2020) 112(9):902–12. doi: 10.1093/jnci/djaa017
12. Chang N, Cui L, Luo Y, Chang Z, Yu B, Liu Z. Development and multicenter validation of a ct-based radiomics signature for discriminating histological grades of pancreatic ductal adenocarcinoma. Quant Imaging Med Surg (2020) 10(3):692–702. doi: 10.21037/qims.2020.02.21
13. Liza H, Anais S, Julia S, Florent C, Isabelle B-A, Jean Marc C, et al. Radiomics method for the differential diagnosis of radionecrosis versus progression after fractionated stereotactic body radiotherapy for brain oligometastasis. Radiat Res (2020) 193(5):471–80. doi: 10.1667/RR15517.1
14. Weszka JS. A survey of threshold selection techniques. In: Computer graphics and image processing, vol. 7. (1978). p. 259–65. doi: 10.1016/0146-664X(78)90116-8
15. Al-Amri SS, Kalyankar N, Khamitkar S. Image segmentation by using edge detection. Int J Comput Sci Eng (2010) 2(3):804–7. doi: 10.1016/j.ijleo.2013.10.049
16. Lu X, Wu J, Ren X, Zhang B, Li Y. The study and application of the improved region growing algorithm for liver segmentation. Optik (2014) 125(9):2142–7. doi: 10.1016/j.ijleo.2013.10.049
17. Uhlig J, Leha A, Delonge LM, Haack AM, Shuch B, Kim HS, et al. Radiomic features and machine learning for the discrimination of renal tumor histological subtypes: A pragmatic study using clinical-routine computed tomography. Cancers (Basel) (2020) 12(10):3010. doi: 10.3390/cancers12103010
18. Hussain MA, Hamarneh G, Garbi R. Learnable image histograms-based deep radiomics for renal cell carcinoma grading and staging. Comput Med Imaging Graph (2021) 90:101924. doi: 10.1016/j.compmedimag.2021.101924
19. Zhao Y, Chang M, Wang R, Xi IL, Chang K, Huang RY, et al. Deep learning based on mri for differentiation of low- and high-grade in low-stage renal cell carcinoma. J Magn Reson Imaging (2020) 52(5):1542–9. doi: 10.1002/jmri.27153
20. Cabezas M, Oliver A, Llado X, Freixenet J, Cuadra MB. A review of atlas-based segmentation for magnetic resonance brain images. Comput Methods Programs BioMed (2011) 104(3):e158–77. doi: 10.1016/j.cmpb.2011.07.015
21. Aljabar P, Heckemann RA, Hammers A, Hajnal JV, Rueckert D. Multi-atlas based segmentation of brain images: Atlas selection and its effect on accuracy. Neuroimage (2009) 46(3):726–38. doi: 10.1016/j.neuroimage.2009.02.018
22. Sharp G, Fritscher KD, Pekar V, Peroni M, Shusharina N, Veeraraghavan H, et al. Vision 20/20: Perspectives on automated image segmentation for radiotherapy. Med Phys (2014) 41(5):050902. doi: 10.1118/1.4871620
23. Mohammadi R, Shokatian I, Salehi M, Arabi H, Shiri I, Zaidi H. Deep learning-based auto-segmentation of organs at risk in high-dose rate brachytherapy of cervical cancer. Radiother Oncol (2021) 159:231–40. doi: 10.1016/j.radonc.2021.03.030
24. Ahn SH, Yeo AU, Kim KH, Kim C, Goh Y, Cho S, et al. Comparative clinical evaluation of atlas and deep-Learning-Based auto-segmentation of organ structures in liver cancer. Radiat Oncol (2019) 14(1):213. doi: 10.1186/s13014-019-1392-z
25. Vrtovec T, Močnik D, Strojan P, Pernuš F, Ibragimov B. Auto-segmentation of organs at risk for head and neck radiotherapy planning: From atlas-based to deep learning methods. Med Phys (2020) 47(9):e929–e50. doi: 10.1002/mp.14320
Keywords: computed tomography (CT), radiomics, kidney, kidney tumor, automatic classification
Citation: Li Y, Gao X, Tang X, Lin S and Pang H (2023) Research on automatic classification technology of kidney tumor and normal kidney tissue based on computed tomography radiomics. Front. Oncol. 13:1013085. doi: 10.3389/fonc.2023.1013085
Received: 06 August 2022; Accepted: 13 February 2023;
Published: 24 February 2023.
Edited by:
Chintan Parmar, Dana–Farber Cancer Institute, United States
Reviewed by:
Nguyen Minh Duc, Pham Ngoc Thach University of Medicine, Vietnam
Sikandar Shaikh, Shadan Hospital and Institute of Medical Sciences, India
Copyright © 2023 Li, Gao, Tang, Lin and Pang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Sheng Lin, linsheng@swmu.edu.cn; Haowen Pang, haowenpang@foxmail.com
†These authors have contributed equally to this work and share first authorship