- 1 M&S Decisions LLC, Moscow, Russia
- 2 I.M. Sechenov First Moscow State Medical University, Moscow, Russia
Objectives: Blood-based tests have been shown to be an effective strategy for colorectal cancer (CRC) detection in screening programs. This study aimed to test the performance of 20 blood markers, including tumor antigens, inflammatory markers, and apolipoproteins, as well as their combinations.
Methods: In total, 203 healthy volunteers and 102 patients with CRC were enrolled in the study. Differences between healthy and cancer subjects were evaluated using the Wilcoxon rank-sum test. Several multivariate classification algorithms were trained on different combinations of biomarkers altered in CRC patients, together with the age and gender of the subjects; random sub-sampling cross-validation was performed to guard against overfitting. Diagnostic performance of single biomarkers and multivariate classification models was evaluated by receiver operating characteristic (ROC) analysis.
Results: Of 20 biomarkers, 16 were significantly different between the groups (p-value ≤ 0.001); ApoA1, ApoA2 and ApoA4 levels were decreased, whereas levels of tumor antigens (e.g., carcinoembryonic antigen) and inflammatory markers (e.g., C-reactive protein) were increased in CRC patients vs. healthy subjects. Combinatorial markers including information about all 16 significant analytes, age and gender of patients demonstrated better performance than single biomarkers, with average accuracy on test datasets ≥95% and area under the ROC curve (AUROC) ≥98%.
Conclusions: The combinatorial approach was shown to be a valid strategy to improve the performance of blood-based CRC diagnostics. Further evaluation of the proposed models in screening programs will be performed to gain a better understanding of their diagnostic value.
Introduction
Colorectal cancer (CRC) is the third most commonly diagnosed malignancy worldwide, with the highest prevalence in developed countries (1). In 2018, the predicted total mortality rates in the Russian Federation were 158.5/100,000 men and 84.1/100,000 women (2). Early diagnosis of cancer represents an effective way to reduce mortality rates; however, since clinical symptoms are often minor and non-specific until advanced disease stages, dedicated screening programs are required (3).
Several instrumental methods are currently used to diagnose CRC, including colonoscopy, computed tomography (CT), and flexible sigmoidoscopy (4, 5). While these methods are required to confirm diagnosis, their usage in screening programs is limited due to invasiveness, labor intensiveness, risk of complications and the need for specific equipment. Additionally, several non-invasive methods, such as the fecal immunochemical test (FIT) and fecal occult blood testing (FOBT), can be used (4, 6); however, high false positive rates are an important disadvantage of these tests (7, 8). DNA-based methods represent another strategy of CRC detection, but despite the diagnostic advantage over FOBT these systems cannot be used in screening programs due to their expense (9).
Blood-based tests would be the most suitable option for mass screening programs, since they can be easily combined with other biochemical assays. Several blood-based biomarkers, including carcinoembryonic antigen (CEA) and carbohydrate antigen (CA) 19-9, are well established in clinical practice; however, low specificity and sensitivity are key limitations of these tests (10). Recent advances in -omics technologies have enabled the discovery of new potential biomarkers, including different proteins (11), circulating tumor DNA (12, 13) or microRNA (14) and circulating tumor cells (15), as well as numerous metabolites (16, 17) and transcriptional biomarkers (18). Although many of these biomarkers demonstrated high diagnostic potential in retrospective proof-of-concept studies, further research is required to determine their clinical validity and utility (11). Other challenges that currently limit extensive use of these biomarkers in routine practice are their cost and lack of reproducibility (11).
An alternative strategy for screening optimization is to exploit multifactorial approaches, that is, to develop multivariate classification models that calculate the probability of having the disease from measurements of several biomarkers (10, 19). Such combinatorial biomarkers may demonstrate higher diagnostic performance than single analytes because they reflect more comprehensively the complex and diverse mechanisms of carcinogenesis and the multiple metabolic, genetic and structural alterations in cancer cells (10). The current work aims to assess the diagnostic potential of multiple biomarkers, including oncofetal proteins, inflammation and vascularization markers, adhesion molecules, and their combinations, to evaluate CRC risk.
Materials and Methods
Patients, Sampling and Measurements
The study was approved by the Local Ethics Committee of I.M. Sechenov First Moscow State Medical University. All patients gave informed consent to participate in the study. In total, 102 patients with histologically confirmed CRC (16 patients with T1-2, 86 patients with T3-4) and 203 healthy subjects were included in the analysis. Serum samples were collected at Sechenov University Hospital after overnight fasting and sent to the hospital laboratory. Samples were stored at −70°C in liquid nitrogen until analyzed.
In total, 20 biomarkers were measured, including apolipoproteins A1, A2, A4, and B (ApoA1, ApoA2, ApoA4, ApoB), alpha-fetoprotein (AFP), beta 2 microglobulin (B2M), carbohydrate antigen 19-9 (CA 19-9), cancer antigens 15-3 and 125 (CA 15-3, CA 125), carcinoembryonic antigen (CEA), cytokeratin 19 fragments (CYFRA 21-1), human epididymis protein 4 (HE4), high-sensitivity C-reactive protein (hsCRP), D-dimer, leucine-rich alpha-2-glycoprotein 1 (LRG 1), total prostate-specific antigen (PSA), regulated on activation, normal T cell expressed and secreted (RANTES), soluble vascular cell adhesion molecule 1 (sVCAM 1), transthyretin (TTR), and vascular endothelial growth factor receptor 1 (VEGFR 1). Biomarker levels were measured in all 305 samples, except total PSA, which was only analyzed in serum samples obtained from men.
Sandwich enzyme-linked immunosorbent assay (ELISA) was used to analyze RANTES, sVCAM-1, VEGFR-1, ApoA4, and LRG-1 (Quantikine® kits, R&D Systems, US) with a Biochrom Anthos 2020 microplate reader (Biochrom, UK); AFP, CA15-3, CA19-9, CA125, HE4, CEA, CYFRA21-1, and total PSA were measured using the Elecsys® sandwich electrochemiluminescent assay (ECLIA) on the Cobas e411 analyzer (Roche Diagnostics, Germany); hsCRP, ApoA1, ApoB, and TTR were measured on an Advia 1800 auto-analyzer by the immunoturbidimetric method (Siemens Healthcare, Germany); B2M and D-dimer were measured by sandwich chemiluminescent assay (CLIA) on an Immulite 2000 auto-analyzer (Siemens Medical Solutions, USA); ApoA2 was measured using an enzymatic colorimetric method (Randox Laboratories, UK).
Statistical Methods
All data processing, statistical and visualization procedures were performed using R statistical software (v.3.5.1) (20). R-based packages randomForest (v.4.6-14), MASS (v.7.3-50), e1071 (v.1.7-2), stats (v.3.5.1) and caret (v.6.0-84) were used for development of combinatorial biomarkers; sensitivity analysis of the developed biomarkers was done using R-based mmpf package (v.0.0.5); R-based pROC package was used to perform ROC analysis (v.1.15.3).
Biomarker values were log-transformed prior to analysis. First, the significance of single biomarkers was evaluated using the Mann-Whitney U-test, and the diagnostic value of each biomarker was assessed via receiver operating characteristic (ROC) analysis; sensitivity, specificity, and diagnostic accuracy at optimal cut-off values, as well as the area under the ROC curve (AUROC), were calculated. The influence of subject characteristics (gender and age) on biomarker levels in the healthy and CRC groups was evaluated via analysis of covariance (ANCOVA) using generalized linear models.
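The single-marker ROC step can be illustrated with a minimal Python sketch (the study itself used R and the pROC package). The sketch relies on the standard identity that the AUROC equals the Mann-Whitney U statistic divided by the product of the group sizes. The choice of Youden's J as the criterion for the "optimal" cut-off is an assumption, since the text does not name the criterion, and the function names and marker values are hypothetical.

```python
from math import log

def auroc(neg, pos):
    """AUROC as the share of (positive, negative) pairs in which the
    positive value is higher; ties count as 1/2 (equals U / (n_pos * n_neg))."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def best_cutoff(neg, pos):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.
    A value >= cutoff is called positive. Youden's J is an assumed criterion."""
    best = None
    for c in sorted(set(neg) | set(pos)):
        sens = sum(v >= c for v in pos) / len(pos)
        spec = sum(v < c for v in neg) / len(neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1:]

# Hypothetical marker concentrations (not study data), log-transformed
healthy = [log(v) for v in [1.0, 1.5, 2.0, 2.5, 3.0]]
crc = [log(v) for v in [2.2, 3.5, 4.0, 5.0]]

print(auroc(healthy, crc))
print(best_cutoff(healthy, crc))
```

For markers that decrease in CRC (for example, the apolipoproteins), the values can be negated before calling these functions so that higher scores still indicate disease.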
Second, classification models were assembled based on the measurements of biomarkers that were significantly different between healthy subjects and CRC patients (p-value < 0.05) and demonstrated discriminative ability (AUROC > 0.6). Patient characteristics (age and gender) were also tested as predictors. Several classification algorithms, including random forest (RF), support vector machine (SVM), linear discriminant analysis (LDA), and naïve Bayes classifier (NBC), as well as multiple logistic regression (MLR) (21), were trained using the whole dataset, and their discriminative ability was assessed via ROC analysis, as for single biomarkers. The accuracy of model-predicted probabilities of having the disease was evaluated using the Brier score. To detect overfitting of the classification models, a 100-times repeated random 5-fold sub-sampling cross-validation was performed. The sensitivity of the model predictions to changes in the values of single biomarkers and patient characteristics was evaluated using a model-agnostic permutation importance method (22).
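Two of the quantities above can be sketched in a few lines of Python. The Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome. For the validation scheme, the sketch assumes that "5-fold" means each of the 100 repeats holds out a random one-fifth of the subjects for testing; this reading, the function names, and the absence of stratification by group are assumptions made for illustration only.

```python
import random

def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def random_subsampling_splits(n_subjects, n_repeats=100, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs; every repeat draws a fresh random
    hold-out set (assumed reading of 'random 5-fold sub-sampling')."""
    rng = random.Random(seed)
    n_test = round(n_subjects * test_fraction)
    for _ in range(n_repeats):
        idx = list(range(n_subjects))
        rng.shuffle(idx)
        yield sorted(idx[n_test:]), sorted(idx[:n_test])

# 305 subjects in total (203 healthy + 102 CRC), as in the study
splits = list(random_subsampling_splits(305))
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))

# Hypothetical predicted probabilities for three subjects
print(brier_score([0.9, 0.2, 0.6], [1, 0, 1]))
```

A model would be refitted on each training index set and scored on the matching test set; the spread of the resulting metrics across repeats indicates how stable the performance is.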
Finally, all possible classification models, exploiting information about one to five biomarkers and patient characteristics, were trained and their diagnostic performance was assessed.
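The exhaustive search over small panels amounts to enumerating every subset of one to five markers from the candidate pool. A minimal Python sketch is shown here; the marker names are placeholders and the pool size of 15 is used only as an illustration.

```python
from itertools import combinations
from math import comb

def candidate_panels(markers, max_size=5):
    """Yield every marker subset containing 1..max_size markers."""
    for k in range(1, max_size + 1):
        for panel in combinations(markers, k):
            yield panel

# Placeholder names standing in for the candidate biomarkers
markers = [f"marker_{i}" for i in range(1, 16)]
panels = list(candidate_panels(markers))

# 15 + 105 + 455 + 1365 + 3003 = 4943 subsets for a pool of 15
print(len(panels), sum(comb(15, k) for k in range(1, 6)))
```

Each panel would then be fitted once per classifier, with and without age and gender, so the total number of fitted models is a multiple of the panel count.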
Results
Diagnostic Accuracy of Single Biomarkers
Comparison of the biomarker levels in healthy subjects and CRC patients is presented in Figure 1 and Table 1. Among the considered analytes, AFP, ApoB, CA 15-3, and VEGFR 1 were not significantly different between the two groups; ApoA1 and ApoA2 levels were lower in the CRC group compared to healthy subjects; levels of the remaining biomarkers were higher in the CRC vs. healthy group (Table 1). When the disease was stratified into early (T1-T2) and advanced (T3-T4) stages, levels of ApoA2, ApoA4, D-dimer, HE4, and LRG 1 were found to be significantly changed in both early and advanced CRC stages (Figure 1). As can be seen from Table 1, the mean age of CRC patients was higher than that of healthy subjects (63 ± 12.4 and 48 ± 6.33 years, respectively, p-value < 0.001); according to the ANCOVA results, significant differences in biomarker levels persisted after age and gender adjustment (Table S1).
Figure 1. Comparison of biomarker levels between healthy subjects and patients with early and advanced CRC stages. Dots indicate individual patient data; differences between healthy subjects and CRC patients with stages T1-T2 or T3-T4 were evaluated using Wilcoxon test with Bonferroni correction for multiple testing.
Diagnostic accuracy of single biomarkers was assessed using the data collected from all CRC patients together (Table 1) as well as separately from patients with early and advanced CRC stages (Tables S2, S3, Figure S1). The highest diagnostic performance was demonstrated for ApoA4, LRG 1, and ApoA2, with AUROC of 0.9, 0.89, and 0.87, respectively (Table 1, Figure 2), which can be explained by their good performance in patients with both early and advanced stages; as expected, CRC-specific biomarkers such as CEA and CA 19-9 demonstrated good performance only in CRC patients with advanced stages.
Figure 2. ROC curves for the (A) single-biomarker based tests and (B) multivariate classification models. Different models are shown by color. Numbers denote AUROC values; 90% confidence intervals for validation are shown in brackets.
Diagnostic performance of AFP, ApoB, CA 15-3, and VEGFR 1 was poor (AUROC = 0.55, 0.55, 0.52, and 0.52, respectively, Table 1) and, hence, these biomarkers were excluded from further analysis. Total PSA measurements were not used for classification models, since the information was not available for all patients.
Diagnostic Accuracy of Multivariate Classification Models
Measurements of the 15 biomarkers selected in the previous step were used to train classification models. Diagnostic performance of the classification models and the results of cross-validation are reported in Table 2; ROC curves are summarized in Figure 2. All multivariate classification models demonstrated better performance than single-marker-based tests when the whole dataset was used (AUROC ≥ 0.99, specificity and sensitivity ≥95%). In cross-validation, MLR, NBC, and RF demonstrated higher variability in diagnostic performance compared to SVM and LDA.
ROC analysis performed separately on data collected from patients with early and advanced disease stages indicated higher performance of the MLR, NBC and LDA classifiers for the latter group (Figure S2, Table S4). To further investigate the diagnostic performance of the models for each cancer stage, individual probabilities of having the disease were calculated using the models, grouped by stage and visualized (Figure 3). All models correctly identified most patients with T2-T4 stages, but patients with T1 were correctly classified only by the RF model; this model also demonstrated the highest predictive accuracy (Brier score = 0.006).
Figure 3. Predicted individual probabilities of having the disease stratified by CRC stage. Different stages are shown by color.
Sensitivity analysis revealed differences in feature importance across the developed models (Figure 4). Among the tested classifiers, RF was less sensitive to feature permutations. Probabilities calculated using the MLR, LDA, and SVM classifiers were sensitive to permutations in ApoA4 and ApoA2 levels; age was found to be an important patient characteristic for most of the tested algorithms.
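The idea behind model-agnostic permutation importance can be sketched as follows: shuffle one predictor across subjects, recompute the prediction error, and record how much it worsens. This Python sketch is not the mmpf package interface used in the study; the toy model, the feature names, and the use of the Brier score as the loss are assumptions for illustration.

```python
import random

def permutation_importance(predict, rows, labels, feature, n_repeats=20, seed=0):
    """Mean increase in Brier score after shuffling one feature column.
    `predict` maps a row (dict) to a probability of disease."""
    rng = random.Random(seed)

    def brier(data):
        return sum((predict(r) - y) ** 2 for r, y in zip(data, labels)) / len(labels)

    baseline = brier(rows)
    increases = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Copy each row with only the chosen feature replaced
        shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
        increases.append(brier(shuffled) - baseline)
    return sum(increases) / n_repeats

def toy_model(row):
    """Hypothetical classifier that looks only at 'apoa4' and ignores 'noise'."""
    return 1.0 if row["apoa4"] < 1.0 else 0.0

# Hypothetical subjects (not study data): low apoa4 marks the CRC cases
rows = [
    {"apoa4": 0.5, "noise": 3},
    {"apoa4": 0.7, "noise": 1},
    {"apoa4": 1.4, "noise": 2},
    {"apoa4": 1.8, "noise": 4},
]
labels = [1, 1, 0, 0]

print(permutation_importance(toy_model, rows, labels, "apoa4"))
print(permutation_importance(toy_model, rows, labels, "noise"))
```

A feature the model ignores yields an importance of zero, while a feature the predictions depend on yields a positive value, which is the pattern summarized per classifier in the sensitivity analysis.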
Testing Alternative Multivariate Classification Models
We next asked whether comparable diagnostic performance can be achieved using information from a smaller number of biomarkers. To test this hypothesis, we selected the SVM and LDA classifiers and trained them using measurements of 1–5 biomarkers extracted from the whole dataset; the effect of including patient characteristics in the models was additionally evaluated. In total, 6,340 models were tested, and AUROC, sensitivity, and specificity were calculated for each.
Including information from a larger number of biomarkers increased AUROC, sensitivity and specificity. Taking patient age and gender into account improved the diagnostic performance of all combinations, mostly by increasing test sensitivity. This improvement was more pronounced for the SVM than for the LDA algorithm; as a result, when patient characteristics were accounted for, SVM performance was higher than that of LDA (Figure 5). In terms of discriminative ability, models that jointly considered information about both tumor antigens (e.g., CEA) and metabolic or inflammatory markers (e.g., ApoA2) demonstrated the highest diagnostic potential (Table 3).
Figure 5. Comparison of alternative classification models, stratified by number of biomarkers and grouped by inclusion of age and gender.
Table 3. Diagnostic performance of 2-5-biomarker models for CRC diagnosis with highest AUROC values.
Since, among the 15 analytes, levels of ApoA2, ApoA4, D-dimer, HE4, and LRG 1 were found to be altered in patients with both early and advanced CRC stages (Figure 1), the diagnostic performance of the combination of these 5 biomarkers was additionally evaluated and was shown to be comparable to that of the full 15-biomarker models (Table S5).
Discussion
A multivariate approach represents a promising strategy to improve the performance of diagnostic tools for cancer risk evaluation, and several tests have already been approved by the FDA, including OVA1®, intended for ovarian cancer detection based on plasma measurements of 5 biomarkers (23), and the multitarget stool DNA-based test Cologuard® for colorectal cancer screening (24). At the same time, identification of new biomarkers in genome and proteome studies could further enhance the potential of cancer diagnostics (25, 26), whereas the increase in computational power, followed by the dissemination of machine learning techniques, has enabled more efficient use of routinely collected patient data to improve different aspects of CRC screening. Hence, algorithms that identify subjects with high CRC risk based on age, gender and full blood count information can be applied to optimize screening programs (27–29), while deep learning methods could be used for computer-assisted colonoscopy image analysis (30). However, the development of multiple-biomarker tests still seems to be key to machine learning application in cancer diagnostics. A systematic review by Bhardwaj et al. identified 36 studies that evaluated the diagnostic performance of multiple-biomarker tests for CRC detection (10). Variability in the diagnostic performance of both single biomarkers and multiplex biomarker panels across the studies was reported, which was hypothesized to result from between-population differences as well as study design features (e.g., stage and histology of the tumors), thus underlining the importance of developing or validating diagnostic platforms using data obtained from the population intended for screening. In the current study, we report the results of the cancer screening program “OncoPro,” aimed at improving early CRC detection in the Russian Federation.
Well-known biomarkers associated with CRC diagnosis, such as CEA and CA 19-9 (31), demonstrated limited sensitivity in the present analysis and were not significantly increased in patients with early T1-T2 stages. This is in line with previous findings and limits their usage in screening programs (32). Moreover, other proteins associated with CRC diagnosis, such as CYFRA 21-1, HE4, and LRG 1, were also tested and found to be altered in CRC patients, as previously reported (33–35). An interesting finding from the current study was the difference in PSA levels between healthy subjects and patients with CRC (1.13 ± 0.97 vs. 1.9 ± 1.61, p-value = 0.003), although the PSA level was outside the reference range in only two patients. One possible explanation could be the cross-reactivity of the PSA antibody with other serine proteases produced by colon cancer (36). Interestingly, in contrast to the results of the Hou, Luo, and Zhang meta-analysis (37), we found no AFP abnormalities in cancer subjects, which may suggest the need for screening tests adjusted to different populations. While the diagnostic potential of various antigens for CRC screening has been investigated, to our knowledge the current study is the first to demonstrate alterations of the metabolic markers ApoA1, ApoA2, and ApoA4 in CRC patients. Currently, ApoA1 is included in the FDA-approved OVA1 test, used for ovarian cancer screening, and was shown to be decreased in pancreatic cancer (38). These observations may point to antitumor ApoA1 activity (38) and support the link between metabolic disorders and cancer risk, previously hypothesized and investigated in the epidemiological Malmo Diet and Cancer Study (39).
The next step of our research was to evaluate multivariate classification models. To this end, we tested several classification algorithms trained on different combinations of the aforementioned biomarkers as well as patient characteristics. As expected, the diagnostic performance of multivariate models was higher than that of single biomarkers, and the number of considered biomarkers and patient characteristics was positively associated with the diagnostic accuracy of the tests. Classification models exploiting information about all 15 biomarkers and the age and gender of patients demonstrated high performance (AUROC > 0.95), in line with previous studies, where similar biomarker panels enabled accurate identification of subjects with breast and lung cancer (40, 41). We hypothesized that such good agreement between the model predictions and actual data could be a consequence of overfitting, which negatively affects model predictive power and is common for genomic and proteomic tests exploiting information about thousands of predictors (42). However, a relatively small number of predictors was considered in the proposed models (15 biomarkers, age, and gender of patients), and cross-validation did not indicate this problem. An alternative explanation of the good diagnostic performance of the models could be the large proportion of patients with advanced cancer stages, characterized by more pronounced alterations in biomarker levels. To evaluate this hypothesis, we investigated the diagnostic performance of the models for early and advanced stages separately and compared posterior probabilities of disease presence by stage. Higher probabilities were predicted for patients with advanced cancer stages using all classifiers, but only RF enabled accurate identification of patients with T1 stage.
A possible explanation could be that this algorithm has a more flexible structure compared to linear classifiers, such as MLR or LDA (43); however, the performance of the algorithms may depend significantly on the tuning parameters (e.g., the number of trees for RF or the type of kernel function for SVM) and the characteristics of the training dataset.
Whereas numerous multi-marker diagnostic tests with good performance have already been developed, they are not suitable for screening programs because of their cost. Cost-effectiveness analysis did not demonstrate an advantage of the ~$500 Cologuard® test over current screening strategies (44). The estimated cost of the 15-biomarker analysis is ~$100, which is much cheaper than recently proposed multivariate diagnostic systems. To investigate the possibility of further cost reduction, we evaluated models considering a smaller number of analytes and identified several promising candidates with good diagnostic performance.
As the current study was a pilot to evaluate the multiple-biomarker approach for CRC screening in the Russian Federation, further research is still required to better understand the potential of the proposed classification models. This includes: (1) additional enrollment of patients with T1-T2 CRC stages, since this group was relatively small in the current analysis; (2) inclusion of patients with benign tumors and colon diseases to evaluate the ability of the tests to discriminate between CRC and other pathologies. Finally, prospective randomized clinical trials are required to demonstrate the clinical value of the proposed approach (42).
In conclusion, combinatorial biomarkers ensure more accurate discrimination between healthy subjects and CRC patients than univariate biomarkers and could be used as a decision-support tool for screening programs; however, further large-scale studies are necessary to confirm the clinical utility of the developed diagnostic platform.
Data Availability Statement
The datasets generated for this study are available on request to the corresponding author.
Ethics Statement
The studies involving human participants were reviewed and approved by Local Ethics Committee of I.M. Sechenov First Moscow State Medical University. The patients/participants provided their written informed consent to participate in this study.
Author Contributions
MS, PG, AS, EP, PT, AE, EG, and AR developed study concept and design. VV performed statistical data analysis and modeling and prepared a manuscript draft. All authors performed manuscript revision and made a substantial contribution to the research.
Funding
This work was funded by the Russian Academic Excellence Project 5-100 program.
Conflict of Interest
MS, PG, AS, EP, PT, AE, EG, and AR are currently applying for a patent relating to the models reported in the manuscript. VV is employed by M&S Decisions LLC and received research funding from AstraZeneca.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
The authors would like to acknowledge the valuable input from M&S Decisions LLC colleagues (Kirill Peskov and Yuri Kosinsky) and biomarker scientists from the Sechenov Hospital laboratory (Elena Popova and Alla Gindis) on methodological aspects of the research, and wish to thank all patients who participated in the study.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2020.00832/full#supplementary-material
References
1. Arnold M, Sierra MS, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global patterns and trends in colorectal cancer incidence and mortality. Gut. (2017) 66:683–91. doi: 10.1136/gutjnl-2015-310912
2. Carioli G, Malvezzi M, Bertuccio P, Levi F, Boffetta P, Negri E, et al. Cancer mortality and predictions for 2018 in selected Australasian countries and Russia. Ann Oncol. (2019) 30:132–42. doi: 10.1093/annonc/mdy489
3. Moore JS, Aulet TH. Colorectal cancer screening. Surg Clin N Am. (2017) 97:487–502. doi: 10.1016/j.suc.2017.01.001
4. Rex DK, Boland RC, Dominitz JA, Giardiello FM, Johnson DA, Kaltenbach T, et al. Colorectal cancer screening: recommendations for physicians and patients from the U.S. Multi-Society Task Force on Colorectal Cancer. Am J Gastroenterol. (2017) 112:1016–30. doi: 10.1038/ajg.2017.174
5. Sovich JL, Sartor Z, Misra S. Developments in screening tests and strategies for colorectal cancer. BioMed Res Int. (2015) 2015:1–11. doi: 10.1155/2015/326728
6. Shapiro JA, Bobo JK, Church TR, Rex DK, Chovnick G, Thompson TD, et al. A comparison of fecal immunochemical and high-sensitivity guaiac tests for colorectal cancer screening. Am J Gastroenterol. (2017) 112:1728–35. doi: 10.1038/ajg.2017.285
7. American College of Physicians. Suggested technique for fecal occult blood testing and interpretation in colorectal cancer screening. Ann Intern Med. (1997) 126:808–10.
8. Niv Y, Sperber AD. Sensitivity, specificity, and predictive value of fecal occult blood testing (Hemoccult II) for colorectal neoplasia in symptomatic patients: a prospective study with total colonoscopy. Am J Gastroenterol. (1995) 90:1974–7.
9. Song L-L, Li Y-M. Current noninvasive tests for colorectal cancer screening: An overview of colorectal cancer screening tests. WJGO. (2016) 8:793. doi: 10.4251/wjgo.v8.i11.793
10. Bhardwaj M, Gies A, Werner S, Schrotz-King P, Brenner H. Blood-based protein signatures for early detection of colorectal cancer: a systematic review. Clin Trans Gastroenterol. (2017) 8:e128. doi: 10.1038/ctg.2017.53
11. Alves Martins BA, de Bulhões GF, Cavalcanti IN, Martins MM, de Oliveira PG, Martins AMA. Biomarkers in colorectal cancer: the role of translational proteomics research. Front Oncol. (2019) 9:1284. doi: 10.3389/fonc.2019.01284
12. Petit J, Carroll G, Gould T, Pockney P, Dun M, Scott RJ. Cell-free DNA as a diagnostic blood-based biomarker for colorectal cancer: a systematic review. J Surg Res. (2019) 236:184–97. doi: 10.1016/j.jss.2018.11.029
13. Merker JD, Oxnard GR, Compton C, Diehn M, Hurley P, Lazar AJ, et al. Circulating tumor DNA analysis in patients with cancer: American Society of Clinical Oncology and College of American Pathologists joint review. JCO. (2018) 36:1631–41. doi: 10.1200/JCO.2017.76.8671
14. Marcuello M, Vymetalkova V, Neves RPL, Duran-Sanchon S, Vedeld HM, Tham E, et al. Circulating biomarkers for early detection and clinical management of colorectal cancer. Mol Aspects Med. (2019) 69:107–22. doi: 10.1016/j.mam.2019.06.002
15. Eliasova P, Pinkas M, Kolostova K, Gurlich R, Bobek V. Circulating tumor cells in different stages of colorectal cancer. Folia Histochem Cytobiol. (2017) 55:1–5. doi: 10.5603/FHC.a2017.0005
16. Erben V, Bhardwaj M, Schrotz-King P, Brenner H. Metabolomics biomarkers for detection of colorectal neoplasms: a systematic review. Cancers. (2018) 10:246. doi: 10.3390/cancers10080246
17. Amir Hashim N, Ab-Rahim S, Suddin L, Ahmad Saman M, Mazlan M. Global serum metabolomics profiling of colorectal cancer (Review). Mol Clin Onc. (2019) 11:3–14. doi: 10.3892/mco.2019.1853
18. McNamara MG, Jacobs T, Frizziero M, Pihlak R, Lamarca A, Hubner R, Valle JW, Amir E. 89PD Prognostic and predictive impact of high tumor mutation burden (TMB) in solid tumors: A systematic review and meta-analysis. Ann Oncol. (2019) 30:mdz239. doi: 10.1093/annonc/mdz239
19. Yang X, Zhong J, Ji Y, Li J, Jian Y, Zhang J, et al. The expression and clinical significance of microRNAs in colorectal cancer detecting. Tumor Biol. (2015) 36:2675–84. doi: 10.1007/s13277-014-2890-0
20. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing (2008). Available online at: http://www.R-project.org.
21. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning: with Applications in R. Springer Texts in Statistics. New York, NY: Springer-Verlag (2017).
22. Jones MZ, Linder JF. Edarf: exploratory data analysis using random forests. JOSS. (2016) 1:92. doi: 10.21105/joss.00092
23. Zhang Z. An in vitro diagnostic multivariate index assay (IVDMIA) for ovarian cancer: harvesting the power of multiple biomarkers. Rev Obstet Gynecol. (2012) 5:35–41. doi: 10.3909/riog0182
24. Imperiale TF, Ransohoff DF, Itzkowitz SH, Levin TR, Lavin P, Lidgard GP, et al. Multitarget stool DNA testing for colorectal-cancer screening. N Engl J Med. (2014) 370:1287–97. doi: 10.1056/NEJMoa1311194
25. Srivastava S, Verma M, Henson DE. Biomarkers for early detection of colon cancer. Clin Cancer Res. (2001) 7:1118–26.
26. García-Bilbao A, Armañanzas R, Ispizua Z, Calvo B, Alonso-Varona A, Inza I, et al. Identification of a biomarker panel for colorectal cancer diagnosis. BMC Cancer. (2012) 12:43. doi: 10.1186/1471-2407-12-43
27. Birks J, Bankhead C, Holt TA, Fuller A, Patnick J. Evaluation of a prediction model for colorectal cancer: retrospective analysis of 2.5 million patient records. Cancer Med. (2017) 6:2453–60. doi: 10.1002/cam4.1183
28. Hornbrook MC, Goshen R, Choman E, O'Keeffe-Rosetti M, Kinar Y, Liles EG, et al. Early colorectal cancer detected by machine learning model using gender, age, and complete blood count data. Dig Dis Sci. (2017) 62:2719–27. doi: 10.1007/s10620-017-4722-8
29. Kinar Y, Kalkstein N, Akiva P, Levin B, Half EE, Goldshtein I, et al. Development and validation of a predictive model for detection of colorectal cancer in primary care by analysis of complete blood counts: a binational retrospective study. J Am Med Inform Assoc. (2016) 23:879–90. doi: 10.1093/jamia/ocv195
30. Urban G, Tripathi P, Alkayali T, Mittal M, Jalali F, Karnes W, et al. Deep learning localizes and identifies polyps in real time with 96% accuracy in screening colonoscopy. Gastroenterology. (2018) 155:1069–78.e8. doi: 10.1053/j.gastro.2018.06.037
31. Vukobrat-Bijedic Z, Husic-Selimovic A, Sofic A, Bijedic N, Bjelogrlic I, Gogov B, et al. Cancer antigens (CEA and CA 19-9) as markers of advanced stage of colorectal carcinoma. Med Arh. (2013) 67:397. doi: 10.5455/medarh.2013.67.397-401
32. Thomas DS, Fourkala E-O, Apostolidou S, Gunu R, Ryan A, Jacobs I, et al. Evaluation of serum CEA, CYFRA21-1 and CA125 for the early detection of colorectal cancer using longitudinal preclinical samples. Br J Cancer. (2015) 113:268–74. doi: 10.1038/bjc.2015.202
33. Lee JH. Clinical usefulness of serum CYFRA 21–1 in patients with colorectal cancer. Nucl Med Mol Imaging. (2013) 47:181–7. doi: 10.1007/s13139-013-0218-4
34. Kemal Y, Demirag G, Bedir A, Tomak L, Derebey M, Erdem D, et al. Serum human epididymis protein 4 levels in colorectal cancer patients. Mol Clin Oncol. (2017) 7:481–5. doi: 10.3892/mco.2017.1332
35. Zhang Q, Huang R, Tang Q, Yu Y, Huang Q, Chen Y, et al. Leucine-rich alpha-2-glycoprotein-1 is up-regulated in colorectal cancer and is a tumor promoter. OncoTargets Ther. (2018) 11:2745–52. doi: 10.2147/OTT.S153375
36. Yamamoto M, Hibi H, Miyake K. Raised prostate-specific antigen in adenocarcinoma of the colon. Int Urol Nephrol. (1997) 29:221–5.
37. Hou P, Luo J, Zhang J. A study of optimized model of serum tumor markers of colorectal cancer based on intelligent algorithms. Int J Clin Exp Med. (2016) 9:17153–64.
38. Zamanian-Daryoush M, DiDonato JA. Apolipoprotein A-I and cancer. Front Pharmacol. (2015) 6:265. doi: 10.3389/fphar.2015.00265
39. Borgquist S, Butt T, Almgren P, Shiffman D, Stocks T, Orho-Melander M, et al. Apolipoproteins, lipids and risk of cancer. Int J Cancer. (2016) 138:2648–56. doi: 10.1002/ijc.30013
40. Yoon HI, Kwon O-R, Kang KN, Shin YS, Shin HS, Yeon EH, et al. Diagnostic value of combining tumor and inflammatory markers in lung cancer. J Cancer Prev. (2016) 21:187–93. doi: 10.15430/JCP.2016.21.3.187
41. Kim BK, Lee JW, Park PJ, Shin YS, Lee WY, Lee KA, et al. The multiplex bead array approach to identifying serum biomarkers associated with breast cancer. Breast Cancer Res. (2009) 11:2247. doi: 10.1186/bcr2247
42. Duffy MJ, Sturgeon CM, Soletormos G, Barak V, Molina R, Hayes DF, et al. Validation of new cancer biomarkers: a position statement from the European Group on Tumor Markers. Clin Chem. (2015) 61:809–20. doi: 10.1373/clinchem.2015.239863
43. Churpek MM, Yuen TC, Winslow C, Meltzer DO, Kattan MW, Edelson DP. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med. (2016) 44:368–74. doi: 10.1097/CCM.0000000000001571
Keywords: diagnostics, biomarkers, colorectal cancer, machine learning, carcinoembryonic antigen, apolipoproteins
Citation: Voronova V, Glybochko P, Svistunov A, Fomin V, Kopylov P, Tzarkov P, Egorov A, Gitel E, Ragimov A, Boroda A, Poddubskaya E and Sekacheva M (2020) Diagnostic Value of Combinatorial Markers in Colorectal Carcinoma. Front. Oncol. 10:832. doi: 10.3389/fonc.2020.00832
Received: 22 August 2019; Accepted: 28 April 2020;
Published: 22 May 2020.
Edited by:
Arndt Vogel, Hannover Medical School, Germany
Reviewed by:
Razelle Kurzrock, University of California, San Diego, United States
Vladimir Lazar, Worldwide Innovative Networking in Cancer Personalized Medicine, France
Copyright © 2020 Voronova, Glybochko, Svistunov, Fomin, Kopylov, Tzarkov, Egorov, Gitel, Ragimov, Boroda, Poddubskaya and Sekacheva. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Veronika Voronova, veronika.voronova@msdecisions.ru