Paclitaxel Response Can Be Predicted With Interpretable Multi-Variate Classifiers Exploiting DNA-Methylation and miRNA Data

Bomane, Alexandra; Gonçalves, Anthony; Ballester, Pedro J.

doi:10.3389/fgene.2019.01041

ORIGINAL RESEARCH article

Front. Genet., 25 October 2019

Sec. Computational Genomics

Volume 10 - 2019 | https://doi.org/10.3389/fgene.2019.01041

Paclitaxel Response Can Be Predicted With Interpretable Multi-Variate Classifiers Exploiting DNA-Methylation and miRNA Data

Cancer Research Center of Marseille, CRCM, INSERM, Institut Paoli-Calmettes, Aix-Marseille Univ, CNRS, Paris, France

Abstract

To address the problem of resistance to paclitaxel treatment, we have investigated to which extent is possible to predict Breast Cancer (BC) patient response to this drug. We carried out a large-scale tumor-based prediction analysis using data from the US National Cancer Institute’s Genomic Data Commons. These data sets comprise the responses of BC patients to paclitaxel along with six molecular profiles of their tumors. We assessed 10 Machine Learning (ML) algorithms on each of these profiles and evaluated the resulting 60 classifiers on the same BC patients. DNA methylation and miRNA profiles were the most informative overall. In combination with these two profiles, ML algorithms selecting the smallest subset of molecular features generated the most predictive classifiers: a complexity-optimized XGBoost classifier based on CpG island methylation extracted a subset of molecular factors relevant to predict paclitaxel response (AUC = 0.74). A CpG site methylation-based Decision Tree (DT) combining only 2 of the 22,941 considered CpG sites (AUC = 0.89) and a miRNA expression-based DT employing just 4 of the 337 analyzed mature miRNAs (AUC = 0.72) reveal the molecular types associated to paclitaxel-sensitive and resistant BC tumors. A literature review shows that features selected by these three classifiers have been individually linked to the cytotoxic-drug sensitivities and prognosis of BC patients. Our work leads to several molecular signatures, unearthed from methylome and miRNome, able to anticipate to some extent which BC tumors respond or not to paclitaxel. These results may provide insights to optimize paclitaxel-therapies in clinical practice.

Introduction

Breast cancer (BC) is the most common type of cancer in women worldwide resulting in half a million deaths annually (Golubnitschaja et al., 2016). BC is a disease presenting substantial inter-tumor heterogeneity (Russnes et al., 2011). Cytotoxic drugs are used to eradicate tumor cells, to complement surgery or radiotherapy as well as to alleviate cancer symptoms. Paclitaxel is a BC-approved cytotoxic drug from the taxane family, which acts by interfering with the normal function of microtubules during cell division (Perez, 1998). As with other cancer drugs (Brown and Böger-Brown, 1999; Cardoso et al., 2002; Ribeiro et al., 2012; Housman et al., 2014), resistance to paclitaxel have been regularly observed in BC patients (Flint et al., 2009; Ajabnoor et al., 2012).

Precision oncology requires predictors to guide the optimization of drug therapies for patients (Peck, 2016; Schwartzberg et al., 2017). Indeed, it is now well-established that gene polymorphisms and other genomic alterations play important roles in the observed heterogeneous response to drugs (Wang et al., 2011;Harper and Topol, 2012; Kadra et al., 2012). This has led to the identification of clinical biomarkers of drug response from molecular profiles of the patients’ tumors (Huang et al., 2014). These predictive biomarkers now guide patient-specific treatment selection during clinical trials and are also used in clinical practice (Mandrekar and Sargent, 2009; Biankin et al., 2015). Most commonly, single-gene markers are used to discriminate between therapy responders and non-responders (Prahallad et al., 2012; Rodríguez-Antona and Taron, 2015), typically consisting of an actionable mutation (e.g. single-nucleotide variant) of a specific gene in the tumor sample.

Single-gene markers that are able to predict the efficacy of cytotoxic drugs are rare (Felip and Martinez, 2012), especially for taxanes (Murray et al., 2012; Bartlett et al., 2015; Norimura et al., 2018). For instance, Marsh et al. (2007) have proposed that the point mutation CYP1B1*3 could be an important factor that helps to differentiate between sensitive and resistant BC patients to paclitaxel. However, Gehrmann et al. (2008) have raised doubts about the association between this alteration and paclitaxel-treated patient prognosis, concluding that CYP1B1 alone is not sufficient to predict tumor response to paclitaxel, and that it could interact with still unknown factors involved in paclitaxel sensitivity. This is an example of inter-patient variability in drug response not being fully captured by the mutational status of single gene, as it has also been seen in vitro in a range of drugs (Naulaerts et al., 2017).

Machine Learning (ML) can be used to build in silico models able to predict tumor response to a given drug by combining multiple tumor features in an optimal manner (Libbrecht and Noble, 2015; Ali and Aittokallio, 2018). The scarcity of suitable clinical data to build such predictors has been a major roadblock, which has made predictors based on cancer cell line data thrive (Costello et al., 2014; Ding et al., 2016). Fortunately, response data from paclitaxel-treated BC patients along with comprehensive molecular profiles of their tumors are increasingly available. Such datasets represent an opportunity to improve our ability to anticipate which BC patients will respond to paclitaxel. We obtained them from the recently created Genomic Data Commons (GDC) of the US National Cancer Institute (NCI) (Jensen et al., 2017). The GDC provides a unified data repository enabling data sharing across cancer genomic studies in support of precision medicine. The GDC feeds from several cancer genome programs at the NCI Center for Cancer Genomics, notably The Cancer Genome Atlas (TCGA) (Weinstein et al., 2013), and offers a range of information-rich genomic, transcriptomic and epigenomic profiles, as well as clinical drug response data.

These datasets, however, pose the challenge of being high-dimensional. Each profile typically contains between hundreds and many thousands of features, but only tens of profiled tumors of the same cancer type and treated with the same drug. For example, a community challenge intended to predict drug response employed 53 BC cell lines (Costello et al., 2014), while thousands of features from DNA copy-number variation, transcript expression, mutations, DNA methylation, and protein abundance profiles were considered. In another study (Tripathi et al., 2016), predictive models of response to cytotoxic drugs were achieved using 60 pancancer cell lines and gene variants as features. A further example of predictive drug-sensitivity models is a study employing 60 diverse cell lines and protein abundances as features (Ma et al., 2006). Small sample sizes are not only typical of preclinical studies, but also of clinical studies addressing the same problem. For instance, gene expression signatures were identified and evaluated using 81 melanoma patients to predict their response to PD-1 checkpoint inhibitors (Ayers et al., 2017).

In this study, we will investigate whether it is possible to anticipate the response of BC patients to paclitaxel using GDC data. We also aim at discovering the molecular factors that, collectively, best discriminate between paclitaxel-resistant and paclitaxel-sensitive BC patients. High-dimensional data promotes model overfitting, which in turn results in poorer predictions. As predictive performance differences between ML algorithms are strongly problem-dependent (Tan and Gilbert, 2003; Fernández-Delgado et al., 2014), considering a range of algorithms to identify those that are most suitable for paclitaxel-treated BC patients is appealing. To this end, we apply 10 ML methods to build predictive models in combination with each available molecular profile. Some of the resulting multi-variate predictors are highly interpretable in that they can answer questions such as why this particular patient is non-responsive. This information should permit formulating hypothesis about the molecular mechanisms of BC patient resistance to paclitaxel.

Material and Methods

GDC Data

GDC molecular profiling and clinical data from the TCGA Breast Invasive Carcinoma or BRCA (https://portal.gdc.cancer.gov/projects/TCGA-BRCA) provide the basis for this study. Molecular profiles and clinical data come from release version 4.0, except for miRNA and miRNA isoform (isomiR) expressions coming from release version 8.0 (Release Notes- GDC Docs).

TCGA-BRCA project gathers data from 1,098 patients, resulting in almost 13,000 files (around 130 GB). These datasets were retrieved and downloaded using the GDC Application Programming Interface (API). Table S1 reports information about files collected from the GDC that have been used to generate datasets.

Processing Clinical Data for Modelling

Patient population included primary or secondary advanced breast cancer receiving single-agent paclitaxel. For some patients it was observed that different drugs have the same or very close treatment start and end time. These entries may form part of a drug combination. However, available drug response annotations do not allow to check this information. Therefore, possible effects due to drug combinations are ignored in this study when identifying paclitaxel-treated patients. Patients with missing paclitaxel response were not retained. To only consider baseline tumors’ molecular profiles, patient records were only retained if no treatment was administered before resection and the time of sample procurement is indicated. We assumed that a baseline molecular profile can explain drug responses observed in a given patient even if paclitaxel was administered at any time after sample resection (Geeleher et al., 2014). After these curation steps, 61 paclitaxel-treated BC patients with valid records remained (Table S2 reports information about treatments and biospecimens). Annotated patient responses are divided into four categories based on the RECIST standard (Therasse et al., 2000): “Complete Response” (CR), “Partial Response” (PR), “Stable Disease” (SD), and “Clinical Progressive Disease” (CPD). We further classified clinical responses into two categories, namely “responder” (CR or PR) and “non-responder” (SD or CPD).

Processing Molecular Profiles for Modelling

The GDC works on harmonization of raw genomic data developing specific workflows to provide consistent and up-to-date molecular profiles (GDC Reference Files| NCI Genomic Data Commons, Genomic Data Harmonization| NCI Genomic Data Commons). Available profiles comprise copy-number variation (CNV), DNA methylation, mRNA, miRNA and isomiR (Ameres and Zamore, 2013) expressions. In order to produce suitable inputs for ML algorithms and/or extract some specific information, we processed some of them. More details are available in the homonym section of Supplementary Methods. All the datasets produced from these molecular profiles are made of real-valued features.

Predicting Drug Response Using ML Algorithms With Embedded Feature Selection

Most classifiers were built with ML algorithms capable of embedded feature selection to mitigate the impact of high-dimensional data on their generalization on unseen data. Implementations of Classification And Regression Tree (CART) (Breiman et al., 1984) and Random Forest (RF) (Breiman, 2001) were taken from the python library Scikit-learn version 0.19.1, while the ones of XGBoost (XGB) (Chen and Guestrin, 2016) version 0.6 and LightGBM (LGBM) (Ke et al., 2017) version 2.0.10 were respectively downloaded from https://github.com/dmlc/xgboost and https://github.com/Microsoft/LightGBM. We also applied a Deep Neural Network (DNN) algorithm (Bengio, 2009) implemented with the python library Keras version 2.2.4 using the TensorFlow backend. In addition to these nonlinear models, linear models were generated with Logistic Regression (LR) (Ranstam et al., 2016), which is also implemented in Scikit-learn. The visualization of Decision Trees (DTs) was done with the python package dtreeviz version 0.2.2. The homonym section in Supplementary Methods provides more details.

Predicting Drug Response Using the Optimal Model Complexity (OMC)

OMC is a strategy to build ML models employing only the most relevant features (Nguyen et al., 2018). More details are available in the homonym section of Supplementary Methods.

Measuring the Predictive Performance of a Classifier

This is a binary classification problem, as each patient belongs to one of two classes, responder and non-responder, with the responder considered as the positive class. As it is customary with problems with a small number of data instances (Table S3), we are using LOO (Leave-One-Out) CV (Cross-Validation) to evaluate the developed classifiers. Several types of LOOCV will be used: standard for “all-features models”, nested for “OMC models”, and permutated for “permutation models”. As with any other CV (Kohavi, 1995), each data instance (patient here) is exactly once in the test set. Thus, CV performance of a model is exclusively based on the prediction of test instances that were not used in any way to train or select the model (any feature selection is hence carried out on training folds only). Employing nested CV on algorithms requiring model selection (those employing OMC) provides an unbiased estimate of model performance, as it has been shown elsewhere (Cawley and Talbot, 2010; Varma and Simon, 2006).

Once known and predicted classes are compared for all held-out samples, true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) are counted among BC patients. These counts are used for calculating classification metrics that summarize the predictive performance reached by a classifier: precision, recall, F1-scores (Van Rijsbergen, 1979), and Matthews Correlation Coefficient (MCC) (Matthews, 1975; Boughorbel et al., 2017). More details about these metrics can be found in the homonym section of Supplementary Methods. Classification scores and contingency matrices obtained from all produced classifiers are stored in Tables S4 and S5, respectively.

Results

Benchmarking of All-Features Models (RF, XGB, LGBM, DNN, LR) Reveals Some Informative Molecular Profiles, But the Resulting Classifiers Perform Marginally Better Than Random

Figure 1 shows that most of the all-features ML classifiers perform worse than random classification reaching slightly negative median MCCs (from -0.19 to -0.05). Poor performance was also obtained when using linear models: LR models perform randomly at best (median MCCs range from -0.17 to 0.1 depending on the profile). Poor predictive performance is primarily caused by FPs (i.e. misclassification of non-responders). This problem was particularly noticeable in RF models, which misclassified every non-responsive patient regardless of the employed profile, leading to undefined MCCs. The latter shows that all-features RF handles class imbalance poorly on these particular problem instances.

Figure 1

DNA methylation-based XGB, LGBM, and DNN models achieve median MCCs slightly higher than 0.0, and they perform hardly better than permutation models. On the one hand, CpG site methylation-based XGB, LGBM, and DNN models obtain a median MCC of 0.08 (p-value from two-sided paired Student’s t-test obtained by class-permutation test = 1.05·10^-1), 0.04 (p-value = 1.41·10^-1), and 0.14 (p-value = 4.08·10^-1), respectively. On the other hand, CpG island (CGI) methylation-based XGB and LGBM models achieve a median MCC of 0.08 (p-value = 2.33·10^-1) and 0.09 (p-value = 8.04·10^-2), respectively. Moreover, miRNA and mRNA expression-based DNN models had a median MCC of 0.088 (p-value = 4.84·10^-2) and 0.015 (p-value = 4.76·10^-1), respectively.

Complexity-Optimized ML Models (RF-OMC, XGB-OMC, and LGBM-OMC) Provide Better Prediction and Extract Relevant Factors for Paclitaxel Response From CGI Methylation Data

Using OMC allows both to reduce considerably the number of features considered during model training and to adjust the operating threshold for assigning class labels to data instances. This leads to some OMC models that perform better than those considering all features from dataset (Table S12 in Supplementary Results). This is especially the case for some methylation-based models that have been improved using OMC (Figure 1), unlike for models based on other profiles (Figure S5). The improvement of OMC over the all-features approach is ML algorithm-dependent.

CGI methylation-based OMC models have obtained improved predictive performance, using either RF or XGB. For instance, XGB-OMC models obtain a median MCC of 0.25, which is significantly better than both permutation and all-features models (p-values equal 9.30 ×·10^-4 and 2.16 ×·10^-2, respectively). In order to extract a robust subset of molecular factors potentially involved in paclitaxel response, the most informative features selected by these models were investigated (Table S13 and S14). It results in 7 out of the 11,644 CGI coordinates encoded as CGI_ID.24217, CGI_ID.15915, CGI_ID.6919, CGI_ID.5276, CGI_ID.5459, CGI_ID.16043, and CGI_ID.11903. Moreover, we notice that 5 of them are common to the features used by the RF-OMC models. Consulting indices provided in Tables S7 and S8 (more details in Supplementary Methods), we found that these coordinates are related to the following 16 genes: CYP2D6, NDUFA6-AS1, RP4-669P10.19 (or C6orf108 pseudogene), MBTPS2, YY2, C2orf40 (or ECRG4), UXS1, IKZF1, APOBEC4, RGL1, ARPC5, NCF2, SMG7, C1orf177 (or LEXM), RP11-631M21.6 (or FAM166A pseudogene 7), and TUBB8 (Table S14).

Transparent ML Models (CART) Capture CpG Methylation Sites and Mature MIRNAS Relevant for the Sensitivity to Paclitaxel and Show How They Are Combined to Explain Drug Response

Most of the available profiles led to poor classification of test set patients when modelled with CART (Figure 1). By contrast, CART classifiers based on miRNA expression and CpG site methylation data provided high to very high predictive performance in the context of this problem (in 10 LOOCV runs, median MCCs of 0.43 and 0.54 were obtained, respectively) and performed significantly better than random models (p-values from two-sided paired Student’s t-test obtained by class-permutation test equal 4.57·× 10^-6 and 2.86·× 10^-4, respectively; see Figure 1 and Table 1). For each case, the best model is defined as that obtaining the highest MCC in 10 standard LOOCVs of the full dataset (i.e. all data instances and all available features). Figure S6 shows that the performance of these models is robust to different sizes of both training set and test set.

Table 1

Tumor profiling data	Number of considered features	Number of selected features	Median MCC(CART trained on original data)	Median MCC(CART trained on class-permutated data)	p-value(original vs permutated)
miRNA	337	4	0.43	0.09	4.57·10^-6
methy_CpG	22,941	2	0.54	0.23	2.86·10^-4

Best CART models.

The predictive performance of CART models was presented in Figure 1. Here we summarize the characteristics of the two best models (i.e. those exploiting miRNA expression and CpG methylation profiles). A median MCC was calculated with the 10 MCCs coming from LOOCV experiments (each with a different random seed). This five additional LOOCV runs with respect to those presented in Figure 3 were carried out to better characterize the performance of the best models found in our study. The small difference found in median MCC (0.52 in Figure 3 versus 0.54 here) suggests that this performance metric is quite robust to the number of LOOCV runs for CART. The training sets were also class-permutated during cross-validation as explained in the Methods section and CART trained on the resulting data to provide a second set of 10 MCCs per profile. The p-value (two-sided paired Student’s t-test) of this class-permutated test shows how likely are the MCCs of the CART models to arise by chance. The first model was trained on miRNA expression: 4 out of 337 mature miRNAs were retained to build this model reaching a median MCC of 0.43 and performing significantly better than models based on permutated data (p-value = 4.57·10^-6). The second model is obtained processing CpG site methylation (shorten as ‘methy_CpG’): 2 out of 22,941 CpG sites were retained to build this model achieving a median MCC of 0.54 and performing significantly better than permutation models (p-value = 2.86·10^-4).

As observed in Figure 2 CART models strongly reduce the number of features involved in the predictions. The miRNA expression-based model found that 4 out of 337 mature miRNAs were the most informative features (MIMAT0004985 or miR-942-5p, MIMAT0000084 or miR-27a-3p, MIMAT0000274 or miR-217, and MIMAT0004657 or miR-200c-5p), while the CpG-site methylation model identified 2 out of 22,941 CpG sites as the most informative features (cg09691574, which is related to the protein coding genes MRGPRX4 and SAA2-SAA4, and to the lincRNA RP11-113D6.6 also called antisense to MRGPRX4; and cg12542281, which is related to the protein coding gene N4BP2L2). The DTs represented in Figure 2 show directly the interactions between independent features leading to the predictions. They also reveal the molecular types associated to paclitaxel-sensitive and paclitaxel-resistant BC tumors (the CpG site index is provided in Table S7).

Figure 2

Lastly, integrating different molecular profiles has sometimes been found to provide small predictive accuracy gains, e.g. see Figure 4 in this study (Costello et al., 2014). Thus, since both miRNA and methy_CpG profiles led to the most predictive models, it makes sense to integrate these data sets and train CART models on the features of the resulting hybrid profile. Using the same 10 random seeds as the methy_CpG-based CART models (median LOOCV MCC of 0.54), the hybrid CART models obtained slightly worse accuracy (median LOOCV MCC of 0.52). The resulting CART tree is identical to that in Figure 2, suggesting that miRNA features were overshadowed by methy_CpG features during CART induction.

Discussion

Owing to the wealth of curated data offered by the GDC, we could evaluate six profiles. The exhaustive evaluation of the 60 predictive models obtained, employing 10 ML algorithms with each profile, reveal strong variability in predictive performance (Figure 3). These results show the importance of considering multiple profiles and ML algorithms, the latter being always possible. For example, we could have carried out this study using the standard all-features versions of tree-ensemble, LR and DNN algorithms. However, this would have only resulted in models with near-random predictability despite using six profiles and thus, we could have concluded that precision oncology is not possible for paclitaxel-treated BC patients. Instead, we also tested algorithms generating models requiring only a handful of features (OMC-based and CART), which in addition, provided the best performance on these problem instances. Note that the most predictive of these models achieved an over 10,000-fold reduction in the number of features (Table 1).

Figure 3

Identifying a concise list of predictive molecular features is indeed beneficial for interpretability. The CGI methylation-based XGB-OMC model employs a dramatically reduced number of features (11 of the considered 11,644). The increased predictive performance comparing to all-features model (Figure 1) shows that the selected subset of features contains the information relevant for predictions (Figure S2). Therefore, applying OMC not only offers better predictivity, but also better interpretability of response to paclitaxel, as it revealed a molecular signature able to discriminate sensitive and resistant BC tumors from high-dimensional data. The best CART models reached the highest predictive performance among the generated predictors (Figure 1). Moreover, these models allow going further in the interpretation of response to paclitaxel (Figure 2). For example, the CpG-methylation DT unveils two rules employing only two features to predict which are the paclitaxel-sensitive BC tumors (Figure 2). The other example is the miRNA DT, which carries out these predictions using four induced rules based on only four features (Figure 2). Thus, the application of these rules to forthcoming tumors should improve paclitaxel treatment for BC patients. To facilitate such application, we are providing two python scripts in the supplementary materials, each implementing the rules for one of these predictive profiles.

Our best classifier obtained a median MCC of 0.54 in 10 LOOCV runs (an average MCC of 0.62, with MCC ranging from 0.48 to 0.87 in these runs as it can be seen in Figure 1). To put these predictive accuracies in the context of what is typically achieved when predicting tumor response to a drug from omics profiles, we have looked at other test set performances reported at the literature for this problem. One study (Kim et al., 2016) applied a range of ML algorithms to predict pancancer cell line response from transcriptomic profiles and obtained MCCs below 0.6 in all cases (see Figure 1 in that paper). Maximum MCCs slightly above 0.5 and 0.3 were also obtained using RF with transcriptomic profiles (Nguyen et al., 2017) and genomic profiles (Naulaerts et al., 2017), respectively. Another study (Xu et al., 2019) also predicted drug response using many hundreds of pancancer cell lines using several ML algorithms from various omics profiles (gene expression, copy-number alterations, single-nucleotide mutations). Depending on the considered data resource, average MCCs across drug and profiles range from 0.15 to 0.31 or from 0.22 to 0.45 (see Tables 2 and 3 in that paper). Yet another example is by (Tripathi et al., 2016) using gene variants as features, where MCCs range across drugs from 0.32 to 0.56 or from 0.30 to 0.44 depending on data resource (see Tables 1 and 2 in that paper). Lastly, single-gene drug response markers identified by MANOVA and Chi-Square tests on pancancer cell lines obtained maximum MCCs of 0.30 and 0.31, respectively (Dang et al., 2018).

The alteration of gene expression due to epigenetic modifications triggers the development of cancers, including BC. DNA methylation changes, occu rring both within and around CGIs, can impact transcriptional activity of genes or transcription factors involved in malignant phenotypes (Esteller, 2002; Irizarry et al., 2009; Levenson, 2010; Deaton and Bird, 2011; Manjegowda et al., 2017; Stirzaker et al., 2017). It has been shown that biomarkers for prognosis and treatment can be unearthed from DNA methylation profiles (Xiang et al., 2013; Mikeska and Craig, 2014; Stirzaker et al., 2014; Li et al., 2015; Pouliot et al., 2015). Furthermore, it has been found that DNA methylation can interfere in chemo-resistance to paclitaxel (Wang et al., 2012; Ignatov et al., 2014; Yun et al., 2015; He et al., 2016; Zhang et al., 2018). Our DNA-methylation-based predictors selected CpG sites and CGIs related to genes previously found individually involved in cancer development and with transcriptional activity regulated by methylation (e.g. MBTPS2, YY2, ECRG4, IKZF1). Selected features by these models are also related to genes associated to response to cytotoxic drugs such as N4BP2L2 (paclitaxel), CYP2D6 (tamoxifen), APOBEC4 (tamoxifen, doxorubicin, and etoposide), and TUBB8 (paclitaxel) (Table S15).

miRNAs also play a key role in cancer development by acting as tumor suppressors or oncogenes. These molecules can be used as biomarkers, and modulation of their specific activities provides insight for therapeutic investigations (Hayes et al., 2014; Peng and Croce, 2016). Furthermore, the expression of some miRNAs has been associated to the sensitivity to paclitaxel (Zhou et al., 2010; Chen et al., 2014; He et al., 2016; Lu et al., 2017). The miRNA expression-based CART model combines miR-27a-3p, miR-217, miR-200c-5p, and miR-942-5p to predict which BC tumors are paclitaxel-responsive with high accuracy (Figures 1 and 2A). Individually, each of these miRNAs have been linked to paclitaxel response and BC prognosis: the first three are related to paclitaxel resistance, whereas the last one is associated to shorter survival of BC patients (Table S15).

Our study has some limitations to be pointed out. First, for a given patient, molecular profiles were obtained from the primary tumor, while clinical response was registered later following tumor evolution. Both tumors may present some differences at the molecular level, due to temporal or spatial tumor heterogeneity, as well the possible impact of the treatment administered after tumor resection. Second, while we reported predictive accuracy on BC tumors not used in any way to build or select the model, an additional independent evaluation on a second cohort would shed further light into how general these models are. The latter is currently not possible due to the scarcity of paclitaxel-treated BC patients with DNA methylation or miRNA profiles of their tumors.

Yet, our work provides very predictive (in the context of the considered problem), robust (Figure S4 and Figure 4), and even interpretable models to identify primary BC tumors sensitive to paclitaxel. These results also suggest that tumor methylomes and miRNomes can be a source of multi-variate models to predict prognosis and treatment response. Indeed, our predictive models reveal molecular features that can collectively anticipate which BC tumors are sensitive or resistant to paclitaxel. Previous studies have experimentally validated the involvement in BC development, and even in the resistance to paclitaxel, of these molecular factors individually, which further supports the applicability of these classifiers. Furthermore, our results also suggest novel predictive factors such as the antisense to MRGPRX4; the pseudogenes (Poliseno et al., 2015; Xiao-Jie et al., 2015) C6orf108 and FAM166A; and the coding genes NDUFA6-AS1, UXS1, RGL1, and LEXM.

Figure 4

Funding

This work was supported by the Institut Paoli-Calmettes (grant number 305/2016 to PB).

Statements

Data availability statement

The datasets generated for this study can be found in the https://portal.gdc.cancer.gov/.

Author contributions

PB conceived the study and designed the experiments. AB and PB wrote the manuscript with the assistance of AG. AB carried out the numerical experiments. All authors analyzed the results and contributed to their discussion.

Acknowledgments

Computing resources for this study were partly provided by the computing facilities DISC (Datacenter IT and Scientific Computing) of the Centre de Recherche en Cancérologie de Marseille.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.01041/full#supplementary-material

References

1
AjabnoorG.CrookT.ColeyH. (2012). Paclitaxel resistance is associated with switch from apoptotic to autophagic cell death in MCF-7 breast cancer cells. Cell Death Dis.3, e260. doi: 10.1038/cddis.2011.139
- CrossRef
- Google Scholar
2
AliM.AittokallioT. (2018). Machine learning and feature selection for drug response prediction in precision oncology applications. Biophys. Rev.11, 1–9. doi: 10.1007/s12551-018-0446-z
- CrossRef
- Google Scholar
3
AmeresS. L.ZamoreP. D. (2013). Diversifying microRNA sequence and function. Nat. Rev. Mol. Cell Biol.14, 475–488. doi: 10.1038/nrm3611
- CrossRef
- Google Scholar
4
AyersM.LuncefordJ.NebozhynM.MurphyE.LobodaA.KaufmanD. R.et al. (2017). IFN-γ–related mRNA profile predicts clinical response to PD-1 blockade. J. Clin. Invest.127, 2930–2940. doi: 10.1172/JCI91190
- CrossRef
- Google Scholar
5
BartlettJ. M. S.NielsenT. O.GaoD.GelmonK. A.QuintayoM. A.StarczynskiJ.et al.. (2015). TLE3 is not a predictive biomarker for taxane sensitivity in the NCIC CTG MA.21 clinical trial. Br. J. Cancer.113, 722–728. doi: 10.1038/bjc.2015.271
- CrossRef
- Google Scholar
6
BengioY. (2009). “Learning Deep Architectures for AI,” in Found. Trends® Mach. Learn.Hanover, MA, USA. doi: 10.1561/2200000006
- CrossRef
- Google Scholar
7
BiankinA. V.PiantadosiS.HollingsworthS. J. (2015). Patient-centric trials for therapeutic development in precision oncology. Nature.526, 361–370. doi: 10.1038/nature15819
- CrossRef
- Google Scholar
8
BoughorbelS.JarrayF.El-AnbariM. (2017). Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric. PLoS One.12, e0177678. doi: 10.1371/journal.pone.0177678
- CrossRef
- Google Scholar
9
BreimanL. (2001). Random Forests. Mach. Learn. Boca Raton, FL, USA. 45, 5–32. doi: 10.1023/A:1010933404324
- CrossRef
- Google Scholar
10
BreimanL.FriedmanJ.StoneC. J.OlshenR. A. (1984). Classification and Regression Trees.Chapman and Hall/CRC.
- Google Scholar
11
BrownR.Böger-BrownU., (1999). Cytotoxic Drug Resistance Mechanisms.New Jersey: Humana Press. doi: 10.1385/1592596878
- CrossRef
- Google Scholar
12
CardosoF.Di LeoA.LohrischC.BernardC.FerreiraF.PiccartM. J. (2002). Second and subsequent lines of chemotherapy for metastatic breast cancer: what did we learn in the last two decades? Ann. Oncol.13, 197–207. doi: 10.1093/annonc/mdf101
- CrossRef
- Google Scholar
13
CawleyG. C.TalbotN. L. C. (2010). On over-fitting in model selection and subsequent selection bias in performance evaluation. J. Mach. Learn. Res.11, 2079–2107.
- Google Scholar
14
ChenN.ChonH. S.XiongY.MarchionD. C.JudsonP. L.HakamA.et al. (2014). Human cancer cell line microRNAs associated with in vitro sensitivity to paclitaxel. Oncol. Rep.31, 376–383. doi: 10.3892/or.2013.2847
- CrossRef
- Google Scholar
15
ChenT.GuestrinC. (2016). “XGBoost,” in Reliable Large-scale Tree Boosting System.ACMNew York, NY, USA. doi: 10.1145/2939672.2939785
- CrossRef
- Google Scholar
16
CostelloJ. C.HeiserL. M.GeorgiiE.GönenM.MendenM. P.WangN. J.et al. (2014). A community effort to assess and improve drug sensitivity prediction algorithms. Nat. Biotechnol.32, 1202–1212. doi: 10.1038/nbt.2877
- CrossRef
- Google Scholar
17
DangC. C.PeónA.BallesterP. J. (2018). Unearthing new genomic markers of drug response by improved measurement of discriminative power. BMC Med. Genomics11, 10. doi: 10.1186/s12920-018-0336-z
- CrossRef
- Google Scholar
18
DeatonA. M.BirdA. (2011). CpG islands and the regulation of transcription. Genes Dev.25, 1010–1022. doi: 10.1101/gad.2037511
- CrossRef
- Google Scholar
19
DingZ.ZuS.GuJ. (2016). Evaluating the molecule-based prediction of clinical drug responses in cancer. Bioinformatics32, 2891–2895. doi: 10.1093/bioinformatics/btw344
- CrossRef
- Google Scholar
20
EstellerM. (2002). CpG island hypermethylation and tumor suppressor genes: a booming present, a brighter future. Oncogene.21, 5427–5440. doi: 10.1038/sj.onc.1205600
- CrossRef
- Google Scholar
21
FelipE.MartinezP. (2012). Can sensitivity to cytotoxic chemotherapy be predicted by biomarkers? Ann. Oncol.23Suppl 1, x189–x192. doi: 10.1093/annonc/mds309
- CrossRef
- Google Scholar
22
Fernández-DelgadoM.CernadasE.BarroS.AmorimD. (2014). Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res.15, 3133–3181.
- Google Scholar
23
FlintM. S.KimG.HoodB. L.BatemanN. W.StewartN. A.ConradsT. P. (2009). Stress hormones mediate drug resistance to paclitaxel in human breast cancer cells through a CDK-1-dependent pathway. Psychoneuroendocrinology.34, 1533–1541. doi: 10.1016/j.psyneuen.2009.05.008
- CrossRef
- Google Scholar
24
GDC Reference Files | NCI Genomic Data Commons.
25
GeeleherP.CoxN. J.HuangR. S. (2014). Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. Genome Biol.15, R47. doi: 10.1186/gb-2014-15-3-r47
- CrossRef
- Google Scholar
26
GehrmannM.SchmidtM.BraseJ. C.RoosP.HengstlerJ. G. (2008). Prediction of paclitaxel resistance in breast cancer: is CYP1B1*3 a new factor of influence? Pharmacogenomics.9, 969–974. doi: 10.2217/14622416.9.7.969
- CrossRef
- Google Scholar
27
Genomic Data Harmonization | NCI Genomic Data Commons.
28
GolubnitschajaO.DebaldM.YeghiazaryanK.KuhnW.PeštaM.CostigliolaV.et al. (2016). Breast cancer epidemic in the early twenty-first century: evaluation of risk factors, cumulative questionnaires and recommendations for preventive measures. Tumor Biol.37, 12941–12957. doi: 10.1007/s13277-016-5168-x
- CrossRef
- Google Scholar
29
HarperA. R.TopolE. J. (2012). Pharmacogenomics in clinical practice and drug development. Nat. Biotechnol.30, 1117–1124. doi: 10.1038/nbt.2424
- CrossRef
- Google Scholar
30
HayesJ.PeruzziP. P.LawlerS. (2014). MicroRNAs in cancer: biomarkers, functions and therapy. Trends Mol. Med.20, 460–469 doi: 10.1016/j.molmed.2014.06.005
- CrossRef
- Google Scholar
31
HeD. X.GuF.GaoF.HaoJ. J.GongD.GuX. T.et al. (2016). Genome-wide profiles of methylation, microRNAs, and gene expression in chemoresistant breast cancer. Sci. Rep.6, 24706. doi: 10.1038/srep24706
- CrossRef
- Google Scholar
32
HousmanG.BylerS.HeerbothS.LapinskaK.LongacreM.SnyderN.et al. (2014). Drug resistance in cancer: an overview. Cancers (Basel).6, 1769–1792. doi: 10.3390/cancers6031769
- CrossRef
- Google Scholar
33
HuangM.ShenA.DingJ.GengM. (2014). Molecularly targeted cancer therapy: some lessons from the past decade. Trends Pharmacol. Sci.35, 41–50. doi: 10.1016/j.tips.2013.11.004
- CrossRef
- Google Scholar
34
IgnatovT.EggemannH.CostaS. D.RoessnerA.KalinskiT.IgnatovA. (2014). BRCA1 promoter methylation is a marker of better response to platinum-taxane-based therapy in sporadic epithelial ovarian cancer. J. Cancer Res. Clin. Oncol.140, 1457–1463. doi: 10.1007/s00432-014-1704-5
- CrossRef
- Google Scholar
35
IrizarryR. A.Ladd-AcostaC.WenB.WuZ.MontanoC.OnyangoP.et al.. (2009). The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat. Genet.41, 178–186. doi: 10.1038/ng.298
- CrossRef
- Google Scholar
36
JensenM. A.FerrettiV.GrossmanR. L.StaudtL. M. (2017). The NCI genomic data commons as an engine for precision medicine. Blood.130, 453–459. doi: 10.1182/blood-2017-03-735654
- CrossRef
- Google Scholar
37
KadraG.FinettiP.ToironY.ViensP.BirnbaumD.BorgJ.-P.et al. (2012). Gene expression profiling of breast tumor cell lines to predict for therapeutic response to microtubule-stabilizing agents. Breast Cancer Res. Treat.132, 1035–1047. doi: 10.1007/s10549-011-1687-8
- CrossRef
- Google Scholar
38
KeG.MengQ.WangT.ChenW.MaW.LiuT.-Y.et al. (2017). LightGBM: a highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst.30, 3147–3155.
- Google Scholar
39
KimS.SundaresanV.ZhouL.KahveciT. (2016). Integrating domain specific knowledge and network analysis to predict drug sensitivity of cancer cell lines. PLoS One11, e0162173. doi: 10.1371/journal.pone.0162173
- CrossRef
- Google Scholar
40
KohaviR. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Proc. 14th Int. Jt. Conf. Artif. Intell.2, 1137–1143. doi: 10.1067/mod.2000.109031
- CrossRef
- Google Scholar
41
LevensonV. V. (2010). DNA methylation as a universal biomarker. Expert Rev. Mol. Diagn.10, 481–488. doi: 10.1586/erm.10.17
- CrossRef
- Google Scholar
42
LiY.MelnikovA. A.LevensonV.GuerraE.SimeoneP.AlbertiS.et al.. (2015). A seven-gene CpG-island methylation panel predicts breast cancer progression. BMC Cancer.15, 417. doi: 10.1186/s12885-015-1412-9
- CrossRef
- Google Scholar
43
LibbrechtM. W.NobleW. S. (2015). Machine learning applications in genetics and genomics. Nat. Rev. Genet.16, 321–332. doi: 10.1038/nrg3920
- CrossRef
- Google Scholar
44
LuC.XieZ.PengQ. (2017). MiRNA-107 enhances chemosensitivity to paclitaxel by targeting antiapoptotic factor Bcl-w in non small cell lung cancer. Am. J. Cancer Res.7, 1863–1873.
- Google Scholar
45
MaY.DingZ.QianY.ShiX.CastranovaV.HarnerE. J.et al.. (2006). Predicting cancer drug response by proteomic profiling. Clin. Cancer Res.12, 4583–4589. doi: 10.1158/1078-0432.CCR-06-0290
- CrossRef
- Google Scholar
46
MandrekarS. J.SargentD. J. (2009). Clinical trial designs for predictive biomarker validation: theoretical considerations and practical challenges. J. Clin. Oncol.27, 4027–4034. doi: 10.1200/JCO.2009.22.3701
- CrossRef
- Google Scholar
47
ManjegowdaM. C.GuptaP. S.LimayeA. M. (2017). Hyper-methylation of the upstream CpG island shore is a likely mechanism of GPER1 silencing in breast cancer cells. Gene.614, 65–73. doi: 10.1016/j.gene.2017.03.006
- CrossRef
- Google Scholar
48
MarshS.SomloG.LiX.FrankelP.KingC. R.ShannonW. D.et al.. (2007). Pharmacogenetic analysis of paclitaxel transport and metabolism genes in breast cancer. Pharmacogenomics J.7, 362–365. doi: 10.1038/sj.tpj.6500434
- CrossRef
- Google Scholar
49
MatthewsB. W. (1975). Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta. Protein Struct.405, 442–451. doi: 10.1016/0005-2795(75)90109-9
- CrossRef
- Google Scholar
50
MikeskaT.CraigJ. M. (2014). DNA methylation biomarkers: cancer and beyond. Genes (Basel).5, 821–864. doi: 10.3390/genes5030821
- CrossRef
- Google Scholar
51
MurrayS.BriasoulisE.LinardouH.BafaloukosD.PapadimitriouC. (2012). Taxane resistance in breast cancer: mechanisms, predictive biomarkers and circumvention strategies. Cancer Treat. Rev.38, 890–903. doi: 10.1016/j.ctrv.2012.02.011
- CrossRef
- Google Scholar
52
NaulaertsS.DangC.BallesterP. J. (2017). Precision and recall oncology: combining multiple gene mutations for improved identification of drug-sensitive tumours. Oncotarget8, 97025–97040. doi: 10.18632/oncotarget.20923
- CrossRef
- Google Scholar
53
NguyenL.DangC. C.BallesterP. J. (2017). Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data. Research5, 2927. doi: 10.12688/f1000research.10529.2
- CrossRef
- Google Scholar
54
NguyenL.NaulaertsS.BomaneA.BrunaA.GhislatG.BallesterP. (2018). Machine learning models to predict in vivo drug response via optimal dimensionality reduction of tumour molecular profiles. bioRxiv277772, 1–34. doi: 10.1101/277772
- CrossRef
- Google Scholar
55
NorimuraS.KontaniK.KuboT.HashimotoS.-I.MurazawaC.KenzakiK.et al. (2018). Candidate biomarkers predictive of anthracycline and taxane efficacy against breast cancer. J. Cancer Res. Ther.14, 409–415. doi: 10.4103/jcrt.JCRT_1053_16
- CrossRef
- Google Scholar
56
PeckR. W. (2016). The right dose for every patient: a key step for precision medicine. Nat. Rev. Drug Discovery.15, 145–146. doi: 10.1038/nrd.2015.22
- CrossRef
- Google Scholar
57
PengY.CroceC. M. (2016). The role of microRNAs in human cancer. Signal Transduct. Target. Ther.1, 15004. doi: 10.1038/sigtrans.2015.4
- CrossRef
- Google Scholar
58
PerezE. A. (1998). Paclitaxel in Breast Cancer. Oncologist3, 373–389.
- Google Scholar
59
PolisenoL.MarranciA.PandolfiP. P. (2015). Pseudogenes in human cancer. Front. Med.2, 68. doi: 10.3389/fmed.2015.00068
- CrossRef
- Google Scholar
60
PouliotM. C.LabrieY.DiorioC.DurocherF. (2015). The role of methylation in breast cancer susceptibility and treatment. Anticancer Res.35, 4569–4574. doi: 10.1007/s13566-015-0216-5
- CrossRef
- Google Scholar
61
PrahalladA.SunC.HuangS.Di NicolantonioF.SalazarR.ZecchinD.et al. (2012). Unresponsiveness of colon cancer to BRAF(V600E) inhibition through feedback activation of EGFR. Nature483, 100–103. doi: 10.1038/nature10868
- CrossRef
- Google Scholar
62
RanstamJ.CookJ. A.CollinsG. S. (2016). Clinical prediction models. Br. J. Surg.103, 1886. doi: 10.1002/bjs.10242
- CrossRef
- Google Scholar
63
Release Notes – GDC Docs Available at: https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/ [Accessed March 11, 2019].
64
RibeiroJ. T.MacedoL. T.CuriglianoG.FumagalliL.LocatelliM.DaltonM.et al. (2012). Cytotoxic drugs for patients with breast cancer in the era of targeted treatment: back to the future? Ann. Oncol.23, 547–555. doi: 10.1093/annonc/mdr382
- CrossRef
- Google Scholar
65
Van RijsbergenC. J. (1979) Information Retrieval.Butterworths, London, UK. doi: 10.1016/j.pestbp.2006.07.008
- CrossRef
- Google Scholar
66
Rodríguez-AntonaC.TaronM. (2015). Pharmacogenomic biomarkers for personalized cancer treatment. J. Int. Med.277, 201–217. doi: 10.1111/joim.12321
- CrossRef
- Google Scholar
67
RussnesH. G.NavinN.HicksJ.Borresen-DaleA. L. (2011). Insight into the heterogeneity of breast cancer through next-generation sequencing. J. Clin. Invest.121, 3810–3818. doi: 10.1172/JCI57088
- CrossRef
- Google Scholar
68
SchwartzbergL.KimE. S.LiuD.SchragD. (2017). Precision oncology: who, how, what, when, and when not? Am. Soc. Clin. Oncol. Educ. B.37, 160–169. doi: 10.14694/EDBK_174176
- CrossRef
- Google Scholar
69
StirzakerC.SongJ. Z.NgW.DuQ.ArmstrongN. J.LockeW. J.et al. (2017). Methyl-CpG-binding protein MBD2 plays a key role in maintenance and spread of DNA methylation at CpG islands and shores in cancer. Oncogene.36, 1328–1338. doi: 10.1038/onc.2016.297
- CrossRef
- Google Scholar
70
StirzakerC.TaberlayP. C.StathamA. L.ClarkS. J. (2014). Mining cancer methylomes: prospects and challenges. Trends Genet.30, 75–84. doi: 10.1016/j.tig.2013.11.004
- CrossRef
- Google Scholar
71
TanA. C.GilbertD. (2003). An empirical comparison of supervised machine learning techniques in bioinformatics. Proc. First Asia-Pacific Bioinforma. Conf. Bioinforma.19, 219–222.
- Google Scholar
72
TherasseP.ArbuckS. G.EisenhauerE. A.WandersJ.KaplanR. S.RubinsteinL.et al. (2000). New guidelines to evaluate the response to treatment in solid tumors. J. Natl. Cancer Inst.92, 205–216. doi: 10.1093/jnci/92.3.205
- CrossRef
- Google Scholar
73
TripathiS.BelkacemiL.CheungM. S.BoseR. N. (2016). Correlation between gene variants, signaling pathways, and efficacy of chemotherapy drugs against colon cancers. Cancer Inform.15, 1–13. doi: 10.4137/CIN.S34506
- CrossRef
- Google Scholar
74
VarmaS.SimonR. (2006). Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics7, 91. doi: 10.1186/1471-2105-7-91
- CrossRef
- Google Scholar
75
WangL.McLeodH. L.WeinshilboumR. M. (2011). Genomics and drug response. N. Engl. J. Med.364, 1144–1153. doi: 10.1056/NEJMra1010600
- CrossRef
- Google Scholar
76
WangN.ZhangH.YaoQ.WangY.DaiS.YangX. (2012). TGFBI promoter hypermethylation correlating with paclitaxel chemoresistance in ovarian cancer. J. Exp. Clin. Cancer Res.31, 6. doi: 10.1186/1756-9966-31-6
- CrossRef
- Google Scholar
77
WeinsteinJ. N.CollissonE. A.MillsG. B.ShawK. M.OzenbergerB. A.EllrottK.et al. (2013). The cancer genome atlas pan-cancer analysis project. Nat. Genet.45, 1113. doi: 10.1038/ng.2764
- CrossRef
- Google Scholar
78
XiangT. X.YuanY.LiL. L.WangZ. H.DanL. Y.ChenY.et al. (2013). Aberrant promoter CpG methylation and its translational applications in breast cancer. Chin. J. Cancer.32, 12–20. doi: 10.5732/cjc.011.10344
- CrossRef
- Google Scholar
79
Xiao-JieL.Ai-MeiG.Li-JuanJ.JiangX. (2015). Pseudogene in cancer: Real functions and promising signature. J. Med. Genet.52, 17–24. doi: 10.1136/jmedgenet-2014-102785
- CrossRef
- Google Scholar
80
XuX.GuH.WangY.WangJ.QinP. (2019). Autoencoder based feature selection method for classification of anticancer drug response. Front. Genet.10, 233. doi: 10.3389/fgene.2019.00233
- CrossRef
- Google Scholar
81
YunT.LiuY.GaoD.LinghuE.BrockM. V.YinD.et al. (2015). Methylation of CHFR sensitizes esophageal squamous cell cancer to docetaxel and paclitaxel. Genes Cancer.6, 38–48. doi: 10.18632/genesandcancer.46
- CrossRef
- Google Scholar
82
ZhangJ.ZhangJ.XuS.ZhangX.WangP.WuH.et al. (2018). Hypoxia-Induced TPM2 methylation is associated with chemoresistance and poor prognosis in breast cancer. Cell. Physiol. Biochem.45, 692–705. doi: 10.1159/000487162
- CrossRef
- Google Scholar
83
ZhouM.LiuZ.ZhaoY.DingY.LiuH.XiY.et al. (2010). MicroRNA-125b confers the resistance of breast cancer cells to paclitaxel through suppression of pro-apoptotic Bcl-2 antagonist killer 1 (Bak1) expression. J. Biol. Chem.285, 21496–21507. doi: 10.1074/jbc.M109.083337
- CrossRef
- Google Scholar

Summary

Keywords

biomarker discovery, machine learning, artificial intelligence, precision oncology, tumor profiling

Citation

Bomane A, Gonçalves A and Ballester PJ (2019) Paclitaxel Response Can Be Predicted With Interpretable Multi-Variate Classifiers Exploiting DNA-Methylation and miRNA Data. Front. Genet. 10:1041. doi: 10.3389/fgene.2019.01041

Received

18 June 2019

Accepted

30 September 2019

Published

25 October 2019

Volume

10 - 2019

Edited by

Rakesh Kaundal, Utah State University, United States

Reviewed by

Deepak Singla, Punjab Agricultural University, India; Haiquan Li, University of Arizona, United States

Updates

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Pedro J. Ballester, pedro.ballester@inserm.fr

This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Genetics

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Computational Genomics

ORIGINAL RESEARCH article

Paclitaxel Response Can Be Predicted With Interpretable Multi-Variate Classifiers Exploiting DNA-Methylation and miRNA Data

Abstract

Introduction