ORIGINAL RESEARCH article

Front. Oncol., 06 January 2021
Sec. Genitourinary Oncology

Liq_ccRCC: Identification of Clear Cell Renal Cell Carcinoma Based on the Integration of Clinical Liquid Indices

  • 1Department of Radiology, Lanzhou University Second Hospital, Lanzhou, China
  • 2Department of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, China
  • 3Department of Pathology, Lanzhou University Second Hospital, Lanzhou, China
  • 4Institute of Urology, Lanzhou University Second Hospital, Key Laboratory of Gansu Province for Urological Diseases, Clinical Center of Gansu Province for Nephrourology, Lanzhou, China

Currently, preoperative diagnosis of clear cell renal cell carcinoma and its differentiation from other subtypes remain a serious challenge for doctors. Liquid biopsy techniques and artificial intelligence have inspired the pursuit of distinguishing clear cell renal cell carcinoma using clinically available test data. In this work, a method called liq_ccRCC, based on the integration of clinical blood and urine indices through machine learning approaches, was designed to achieve this goal. Clinically available biochemical blood data and urine indices were collected from 306 patients with renal cell carcinoma. The integration of 18 top-ranked clinical liquid indices (13 blood indices and 5 urine indices) was shown to distinguish clear cell renal cell carcinoma from other subtypes of renal carcinoma in cross-validation with an accuracy of 0.9372 and an AUC of 0.9728. The successful introduction of this identification method suggests that subtype differentiation of renal cell carcinoma can be accomplished based on clinical liquid test data, which is noninvasive and easy to perform. It has considerable potential to be developed into a strategy for preoperative subtype differentiation of renal cell carcinoma, with the advantages of convenience and real-time testing. liq_ccRCC is freely available online for readers to test at http://lishuyan.lzu.edu.cn/liq_ccRCC.

Introduction

Renal cell carcinoma (RCC) is the predominant malignant renal tumor, ranking sixth globally in tumor-related deaths; among urinary system tumors, it is the second leading cause of death after bladder cancer (1). RCC has several subtypes, whose growth rate, growth pattern, and metastasis rate vary greatly. Among these subtypes, clear cell renal cell carcinoma (ccRCC) is the most prevalent, accounting for about 75%–80% of all diagnosed cases (2) and 90% of cases that metastasize (3). Approximately 25%–30% of patients with ccRCC present with metastatic disease at the time of diagnosis (4). Consequently, the surgical approach and prognosis for ccRCC differ greatly from those of other subtypes, and accurate preoperative identification of ccRCC would contribute significantly to surgical success and patient survival. However, contrast-enhanced CT and MR, currently the dominant imaging methods for diagnosing renal cancers, frequently fail to differentiate between RCC subtypes; this affects the treatment scheme, surgical approach, and prognosis of patients. Therefore, finding new methods for distinguishing ccRCC from other subtypes simply and quickly remains a serious challenge for doctors.

Many researchers have contributed to the identification of ccRCC. For example, from the medical imaging standpoint, Dong et al. investigated the contrast-enhanced ultrasound (CEUS) method (5), and Wei et al. analyzed the dual energy spectral CT method (6) with this goal in mind; both achieved good identification performance. Young et al. provided novel evidence that multiphasic multidetector CT may assist in the discrimination of ccRCC from oncocytoma, papillary RCC, and chromophobe RCC (7). However, these methods are radioactive, time-consuming, or complex, and they have therefore not been widely applied in clinical practice. Wang et al. utilized a 44-gene expression signature from microarray analysis to discriminate ccRCC from other subtypes with an overall accuracy of 95.7% based on 5-fold cross-validation, which was beneficial for accelerating the development of gene expression profiling (8). Recently, as the concept of liquid biopsy has continued to evolve, various biomarkers have rapidly emerged for the diagnosis, prediction, and prognosis of ccRCC. For example, McKiernan et al. applied an enhanced RT-PCR technique to test for MN/CA9 mRNA expressed in the peripheral blood of patients with renal cancer. The results showed that 86% of ccRCC cases were positive for MN/CA9 mRNA expression, whereas no patient with a benign renal tumor exhibited MN/CA9 expression (9). Zhao et al. used real-time PCR to measure microRNA miR-210 in serum and found that the average level of miR-210 was significantly higher in ccRCC patients than in controls (p<0.001), with an area under the curve (AUC) of 0.874 (10). There is mounting evidence that serum-circulating long noncoding RNAs (lncRNAs) have great potential as practical biomarkers for clinical diagnosis. Wu et al. investigated the levels of a 5-lncRNA signature to build a risk model that could distinguish ccRCC samples from healthy controls with an AUC value of 0.9000 (11). In addition, serum histidine and plasma tryptophan were employed to correctly classify 85.5% of control and 84.7% of case samples with a logistic regression model (12). However, such approaches require extremely sensitive detection technology because of the low levels at which these biomarkers are released into the blood.

Therefore, we set out to construct a simple, effective, and noninvasive method to differentiate ccRCC from other RCC subtypes. In this work, inspired by multi-analyte blood tests, which can reveal correlations within complex associations (13), and by the success of machine learning in decision support systems (14, 15), we used machine learning approaches to seek a link between ccRCC and clinical liquid data, including blood and urine indices that are easy and cost-effective to measure.

Materials and Methods

Source of Materials

A total of 306 samples collected at the Lanzhou University Second Hospital were used to build the model, of which 269 samples were used to train the model and 37 samples were used to test its performance. Of these samples, 240 ccRCC samples were classified as positive, and the remainder, including papillary, chromophobe, and other rare renal tumors, were classified as negative. All samples were collected during routine blood and urine testing when a patient was first diagnosed with a malignant kidney tumor on the basis of clinical characteristics and hematological, radiological, and histopathological examinations by at least two experienced experts. Each sample consisted of 26 routine blood indices (measured with a Sysmex XN9000), 22 blood biochemical indices (measured with a Roche COBAS 8000), and 16 numerical routine urine indices (measured with a Sysmex UF-1000i). Detailed allocation information about the data sets is shown in Table 1, and general information about the indices is listed in Supplementary Table S1. The study was approved by the ethics committee of Lanzhou University Second Hospital. Written informed consent was obtained from all participants.

Table 1 Detailed division number and general information of the data set of RCC.

Machine Learning Method

The random forest (RF) method is a flexible and practical classifier among many supervised machine learning algorithms and has been widely used in scientific research and practical applications. The most prominent advantages of the RF algorithm are random sampling and random feature selection, which can ensure the accuracy and stability of a model. In addition, it can reduce the dimension of high-dimensional data and has a strong generalization ability for data sets about which little is known. Moreover, it can monitor the error, strength, and correlation of an out-of-bag set; it can also present the importance of a set’s features through permutation. Generally speaking, the RF method mainly contains two parameters to be adjusted, namely the number of trees (ntree) and the number of randomly selected features to be split at each node (mtry). Because of the unique advantages of RF, this algorithm was used for training and predicting samples in this study.

The research process can be roughly divided into three successive stages. First, all blood and urine indices were used to construct a classification model. The main purpose of this step was to obtain an importance ranking of the indices for high-performance prediction. Second, with the ntree and mtry parameters adjusted, models were built by adding the most important indices one at a time and were evaluated by 10-fold cross-validation. The value of ntree was increased from 500 to 2000 with a step size of 100, whereas the value of mtry was increased from 2 to 15 with a step size of 1. Third, according to the principle that an appropriate number of top-ranking indices can achieve prediction performance comparable to that of all the indices, the final model was built from the 18 most relevant indices, with ntree and mtry set to 1400 and 2, respectively. RF was implemented with the randomForest package (v4.6-7) in R.
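
The parameter search described above can be sketched in a few lines. The authors worked in R with the randomForest package; the Python below is only an illustrative enumeration of the search space, not their code. The names `NTREE_VALUES`, `MTRY_VALUES`, and `candidate_settings` are hypothetical, and the rule that skips settings where mtry exceeds the number of indices in the model is an assumption of this sketch.

```python
from itertools import product

# Search ranges as reported in the text.
NTREE_VALUES = list(range(500, 2001, 100))  # 500, 600, ..., 2000
MTRY_VALUES = list(range(2, 16))            # 2, 3, ..., 15


def candidate_settings(n_ranked_indices):
    """Yield (n_top, ntree, mtry) triples for the incremental search.

    n_top is the number of top-ranked indices kept in the model.
    Settings in which mtry exceeds n_top are skipped, because a node
    cannot sample more features than the model contains (an assumption
    of this sketch, not a rule stated in the paper).
    """
    for n_top in range(1, n_ranked_indices + 1):
        for ntree, mtry in product(NTREE_VALUES, MTRY_VALUES):
            if mtry <= n_top:
                yield n_top, ntree, mtry


# Full (ntree, mtry) grid for a fixed set of indices.
grid = list(product(NTREE_VALUES, MTRY_VALUES))
print(len(NTREE_VALUES), len(MTRY_VALUES), len(grid))  # 16 14 224
```

Each triple would be scored by 10-fold cross-validation; the setting reported for the final model corresponds to the triple `(18, 1400, 2)`.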

Validation Method

Two complementary methods were used to evaluate the model and obtain a robust estimate of its performance in identifying ccRCC: 10-fold cross-validation on the training set and external verification on the testing set. In 10-fold cross-validation, the training set was divided into 10 nonoverlapping parts; one part was used for internal verification, and the remaining nine were used for internal model training. After this process had been repeated 10 times, each sample had been used to test the model exactly once. Therefore, 10-fold cross-validation is a powerful and persuasive method for verifying the prediction ability of a model.
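
As a minimal sketch of the partitioning step, the following Python splits 269 sample positions into 10 nonoverlapping folds. The function names and the fixed random seed are hypothetical, and the split is a plain shuffle rather than a class-stratified one; the paper does not say which variant was used.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k non-overlapping, shuffled folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    # Fold i takes every k-th element starting at i,
    # so fold sizes differ by at most one.
    return [order[i::k] for i in range(k)]


def train_test_splits(n_samples, k=10, seed=0):
    """Yield one (train_idx, test_idx) pair per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(train_test_splits(269, k=10))
fold_sizes = sorted(len(test) for _, test in splits)
print(fold_sizes)  # one fold of 26 samples and nine folds of 27
```

Every sample lands in exactly one test fold, which is the property the paragraph above relies on.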

By contrast, the testing set was used only for external verification of model performance; unlike the training set used in 10-fold cross-validation, it did not contribute to the training process of the model.

Results

After training based on 10-fold cross-validation, a model composed of 18 top-ranking indices selected by the RF method displayed relatively good performance in identifying ccRCC with high sensitivity, specificity, accuracy (ACC), and associated AUC values of 0.9456, 0.9097, 0.9372, and 0.9728, respectively, as shown in Figure 1. The specific information of these indices is listed in Table 2. Apart from the ability to discriminate ccRCC from various other types of malignant kidney tumors in the training set, the model had satisfactory prediction outcomes for the testing set with an ACC of 0.8375 and AUC of 0.8780 as shown in Figure 2. These results indicate that the model formed by the complex combination of 18 routine blood and urine indices exhibited good performance in identifying ccRCC; thus, the model could help patients predict the severity of their disease in advance and avoid unnecessary histopathological examinations.
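
For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, accuracy, and the Matthews correlation coefficient follow from the four cells of a confusion matrix, with ccRCC as the positive class. The function name is hypothetical, and the counts in the example are invented for illustration; they are not data from this study.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC from confusion counts."""
    sensitivity = tp / (tp + fn)   # fraction of positives correctly identified
    specificity = tn / (tn + fp)   # fraction of negatives correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = binary_metrics(tp=90, fn=10, tn=40, fp=10)
print(example)
```

With an imbalanced data set, accuracy is dominated by the majority class, so reporting specificity and MCC alongside it gives a fuller picture.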

Figure 1 Performance of different models with incremented number of top-ranking indices of the training set.

Table 2 Top-ranking indices selected by the RF algorithm listed in order of decreasing importance.

Figure 2 Results of the external verification of the testing set.

An interactive web server for liq_ccRCC was developed so that users can test and explore this method. Use is straightforward: users need only enter each value into the corresponding text box as required by the interface. After the "Submit" button is clicked, the prediction for the sample is calculated and presented in the results interface. The main page of this website for ccRCC discrimination is shown in Figure 3. This method has the potential to be developed into a promising tool for discriminating ccRCC from other types of kidney cancer.

Figure 3 Web server of liq_ccRCC method.

Discussion

During the training process, choosing an appropriate number of top-ranking indices was important, given the need for efficient and easy-to-perform models. Thus, the 18 top-ranking indices were selected by the RF method based on the principle that this small number of indices can attain prediction ability comparable to that of the entire set of indices. Although the AUC value of the model was not at its maximum with the combination of 18 indices, both the Matthews correlation coefficient (MCC) and ACC reached their peaks, and the difference between the AUC at that point and the neighboring AUC values was very small. After comprehensive consideration, 18 indices were judged the best choice for identifying ccRCC without compromising performance. Although ccRCC cannot be diagnosed by symptoms alone, this prewarning method can alert patients to the need for more in-depth examinations. In particular, when the pattern of tumor cells is abnormal or the availability of lesion samples is limited, this auxiliary method may offer a future avenue for diagnosing some special cases of ccRCC. The results on the training set and the external testing set indicate that combining key indices through the RF algorithm is stable and reliable.

To further assess the generalization ability of this method, another external testing data set of 79 RCC samples, including 61 ccRCC and 18 non-ccRCC samples, was collected from the First Hospital of Lanzhou University. Routine blood indices were measured with a Mindray BC 6800, blood biochemical indices with a BECKMAN analyzer, and routine urine indices with a Sysmex UF-1000i. Detailed information about this new data set is listed in Supplementary Table S2. The study was approved by the ethics committee of the First Hospital of Lanzhou University. With the same method described earlier, the new samples with 18 blood and urine indices gave sensitivity, specificity, accuracy, and AUC values of 0.8229, 0.8000, 0.8222, and 0.8507, respectively. These results are comparable to those for the first testing set, although the new data were collected with different testing equipment than the training data set. This suggests that the method generalizes well and can tolerate, to some extent, systematic differences between testing instruments. Therefore, this technology looks promising for differentiating ccRCC from other subtypes of RCC preoperatively in the future.

To explore the data further, all 18 indices selected by the RF algorithm, comprising 4 routine blood indices, 9 blood biochemical indices, and 5 numerical routine urine indices, were compared between ccRCC and non-ccRCC samples by Mann–Whitney U tests. Six differentially expressed blood indices were identified that may provide novel pathological insights and potential clinical applications, as shown in Figure 4. Immature granulocytes (IG#) and immature granulocyte ratio (IG%) are not commonly discussed in terms of clinical relevance for this disease, although IG% has been reported to reflect the severity of sepsis (16). In Figures 4A, B, the levels of these two indices in ccRCC patients were significantly lower than those in non-ccRCC patients; therefore, they may be useful biomarkers for reflecting the progress of renal tumors. Previous studies report that platelets (PLT) are related to tumor angiogenesis; they promote metastasis and contain diverse angiogenic factors that are associated with various stages of tumor development (17). Thus, PLT (Figure 4C) has potential for improving the identification of ccRCC. The role of magnesium in genetic instability and the promotion of tumorigenesis has received some attention (18), and it is notable that magnesium levels differed significantly between ccRCC and non-ccRCC samples (Figure 4D). As a main serum protein, globulin (GLO) is a common and important marker with a significant impact on the inflammation process (19). Elevated GLO can be regarded as an important risk factor for ccRCC, as shown in Figure 4E. Urea is a nitrogen-containing metabolite produced during protein metabolism (20). In this study, we found that the urea level of patients with ccRCC was significantly lower than that of the negative samples (Figure 4F), which suggests that it may be an independent predictor of ccRCC risk.
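
The group comparison used here can be illustrated with a small, self-contained implementation of the Mann-Whitney U test. This is a sketch under stated assumptions: it uses the normal approximation with a tie correction and no continuity correction, which may differ from the software the authors used, and the function names and sample values are hypothetical rather than study data.

```python
import math
from collections import Counter


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Variance of U under the null hypothesis, corrected for ties.
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # every observation identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical index values for two groups (not data from this study).
group_a = [0.01, 0.02, 0.02, 0.03, 0.01, 0.02]
group_b = [0.03, 0.04, 0.05, 0.03, 0.06, 0.04]
u, p = mann_whitney_u(group_a, group_b)
print(u, p)
```

Dividing U by the product of the two group sizes gives the probability that a randomly chosen value from the first group exceeds one from the second, which is the same quantity as the AUC of that single index used as a classifier.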

Figure 4 Differentially expressed blood indices between ccRCC and non-ccRCC. (A) Immature granulocytes (IG#); (B) Immature granulocyte ratio (IG%); (C) Platelet (PLT); (D) Magnesium (Mg); (E) Globulin (GLO); (F) Blood urea nitrogen (BUN).

Identifying these differentially expressed blood indices helps to promote a better understanding of the ccRCC process. The random forest method can not only make use of these differentially expressed indices but also integrate conventional blood and urine indices into a prewarning system for identifying ccRCC. Although none of the urine indices showed significantly different levels between the groups, these parameters play an indispensable part in the nodes of the individual trees and help to distinguish ccRCC accurately from other types of renal tumors. It is the combination of all selected indices that leads to a high-performance ccRCC discrimination system, which implies that no single indicator is fully capable of identifying ccRCC among complex diseases.

In order to further evaluate the performance of our method, the identification results of ccRCC were compared between this work and recently published works pursuing the same goal. Table 3 shows that our method can satisfy the enormous demands of discovering ccRCC with increased sensitivity and specificity compared with recently published methods. At the same time, several important blood and urine indices have been developed and applied in the RF algorithm to identify the relationship between ccRCC and other types of renal tumors and to facilitate the rapid and accurate diagnosis of ccRCC.

Table 3 Performance comparison of different methods for identifying ccRCC.

Collectively, our results suggest that there are strong associations between the type of renal carcinoma and specific hematological characteristics and urine indicators. The diagnostic method liq_ccRCC, which was built on these key indices from routine clinical settings, appears able to differentiate ccRCC accurately, especially in cases with atypical histological presentation. The strong discriminatory performance between ccRCC and other subtypes of RCC suggests that this method has considerable potential to be extended to the early warning of other malignant diseases after sufficient training with machine learning. More clinical trials are needed to test the reliability and stability of this method. Nonetheless, our results indicate that a new class of ccRCC diagnostic methods may provide significant future value to patients.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding authors.

Author Contributions

ZW and SL conceived and designed this work. JZ and JWu constructed the identification model, built the web server and wrote the manuscript. JWe collected the data of blood indices. XS collected the data of urine indices. YC did the validation of the built model and helped to revise the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by the health industry scientific research project of Gansu province (GSWSKY-2015-54).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2020.605769/full#supplementary-material

References

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA-Cancer J Clin (2020) 70:7–30. doi: 10.3322/caac.21590

2. Moch H, Cubilla AL, Humphrey PA, Reuter VE, Ulbright TM. The 2016 WHO Classification of Tumours of the Urinary System and Male Genital Organs—Part A: Renal, Penile, and Testicular Tumours. Eur Urol (2016) 70:93–105. doi: 10.1016/j.eururo.2016.02.029

3. Reuter VE, Tickoo SK. Differential diagnosis of renal tumours with clear cell histology. Pathology (2010) 42:374–83. doi: 10.3109/00313021003785746

4. Karakiewicz PI, Briganti A, Chun FK-H, Trinh Q-D, Perrotte P, Ficarra V, et al. Multi-Institutional Validation of a New Renal Cancer–Specific Survival Nomogram. J Clin Oncol (2007) 25:1316–22. doi: 10.1200/jco.2006.06.1218

5. Dong XQ, Shen Y, Xu LW, Xu CM, Bi W, Wang XM. Contrast-enhanced ultrasound for detection and diagnosis of renal clear cell carcinoma. Chin Med J-Peking (2009) 122:1179–83. doi: 10.3760/cma.j.issn.0366-6999.2009.10.012

6. Wei JY, Zhao JH, Zhang XL, Wang D, Zhang WJ, Wang ZP, et al. Analysis of dual energy spectral CT and pathological grading of clear cell renal cell carcinoma (ccRCC). PLoS One (2018) 13:e0195699. doi: 10.1371/journal.pone.0195699

7. Young JR, Margolis D, Sauk S, Pantuck AJ, Sayre J, Raman SS. Clear cell renal cell carcinoma discrimination from other renal cell carcinoma subtypes and oncocytoma at multiphasic multidetector CT. Radiology (2013) 267:444–53. doi: 10.1148/radiol.13112617

8. Wang Q, Gan H, Chen C, Sun Y, Chen J, Xu M, et al. Identification and validation of a 44-gene expression signature for the classification of renal cell carcinomas. J Exp Clin Cancer Res (2017) 36:176. doi: 10.1186/s13046-017-0651-9

9. McKiernan JM, Buttyan R, Bander NH, de la Taille A, Stifelman MD, Emanuel ER, et al. The detection of renal carcinoma cells in the peripheral blood with an enhanced reverse transcriptase-polymerase chain reaction assay for MN/CA9. Cancer (1999) 86:492–7. doi: 10.1002/(sici)1097-0142(19990801)86:3<492::aid-cncr18>3.0.co;2-r

10. Zhao A, Li GR, Peoc’h M, Genin C, Gigante M. Serum miR-210 as a novel biomarker for molecular diagnosis of clear cell renal cell carcinoma. Exp Mol Pathol (2013) 94:115–20. doi: 10.1016/j.yexmp.2012.10.005

11. Wu Y, Wang YQ, Weng WW, Zhang QY, Yang XQ, Gan HL, et al. A serum-circulating long noncoding RNA signature can discriminate between patients with clear cell renal cell carcinoma and healthy controls. Oncogenesis (2016) 5:e192. doi: 10.1038/oncsis.2015.48

12. Lee HO, Uzzo RG, Kister D, Kruger WD. Combination of serum histidine and plasma tryptophan as a potential biomarker to detect clear cell renal cell carcinoma. J Transl Med (2017) 15:72. doi: 10.1186/s12967-017-1178-8

13. Cohen JD, Li L, Wang Y, Thoburn C, Afsari B, Danilova L, et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science (2018) 359:926–30. doi: 10.1126/science.aar3247

14. Wu J, Bai J, Wang W, Xi L, Zhang P, Lan J, et al. ATBdiscrimination: An in Silico Tool for Identification of Active Tuberculosis Disease Based on Routine Blood Test and T-SPOT.TB Detection Results. J Chem Infor Model (2019) 59:4561–8. doi: 10.1021/acs.jcim.9b00678

15. Wu J, Zan X, Gao L, Zhao J, Fan J, Shi H, et al. A Machine Learning Method for Identifying Lung Cancer Based on Routine Blood Indices: Qualitative Feasibility Study. JMIR Med Inf (2019) 7:e13476. doi: 10.2196/13476

16. Ha SO, Park SH, Park SH, Park JS, Huh JW, Lim C-M, et al. Fraction of immature granulocytes reflects severity but not mortality in sepsis. Scand J Clin Lab Inv (2015) 75:36–43. doi: 10.3109/00365513.2014.965736

17. Wojtukiewicz MZ, Sierko E, Hempel D, Tucker SC, Honn KV. Platelets and cancer angiogenesis nexus. Cancer Metast Rev (2017) 36:249–62. doi: 10.1007/s10555-017-9673-1

18. Trapani V, Wolf FI, Scaldaferri F. Dietary magnesium: the magic mineral that protects from colon cancer? Magnes Res (2015) 28:108–11. doi: 10.1684/mrh.2015.0390

19. Azab B, Kedia S, Shah N, Vonfrolio S, Lu W, Naboush A, et al. The value of the pretreatment albumin/globulin ratio in predicting the long-term survival in colorectal cancer. Int J Colorectal Dis (2013) 28:1629–36. doi: 10.1007/s00384-013-1748-z

20. Renugadevi J, Prabu SM. Naringenin protects against cadmium-induced oxidative renal dysfunction in rats. Toxicology (2009) 256:128–34. doi: 10.1016/j.tox.2008.11.012

Keywords: Liq_ccRCC, clear cell renal cell carcinoma, subtype differentiation, liquid indices, machine learning

Citation: Zhao J, Wu J, Wei J, Su X, Chai Y, Li S and Wang Z (2021) Liq_ccRCC: Identification of Clear Cell Renal Cell Carcinoma Based on the Integration of Clinical Liquid Indices. Front. Oncol. 10:605769. doi: 10.3389/fonc.2020.605769

Received: 13 September 2020; Accepted: 09 November 2020;
Published: 06 January 2021.

Edited by:

Adam R. Metwalli, Howard University Hospital, United States

Reviewed by:

Ilan Layman, Howard University, United States
Curtis J. Frederick, Howard University Hospital, United States
Joshua Cabral, Howard University, United States

Copyright © 2021 Zhao, Wu, Wei, Su, Chai, Li and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zhiping Wang, wzplzu@163.com; Shuyan Li, lishuyan@lzu.edu.cn

These authors have contributed equally to this work

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.