AUTHOR=Yuan Shinsheng, Chen Yen-Chou, Tsai Chi-Hsuan, Chen Huei-Wen, Shieh Grace S. TITLE=Feature selection translates drug response predictors from cell lines to patients JOURNAL=Frontiers in Genetics VOLUME=Volume 14 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2023.1217414 DOI=10.3389/fgene.2023.1217414 ISSN=1664-8021 ABSTRACT=Targeted therapies and chemotherapies are prevalent in cancer treatment. Identifying predictive markers to stratify cancer patients who will respond to these therapies remains challenging because patient drug response data are limited. Because large drug response datasets have been generated from cell lines, methods that efficiently translate cell-line-trained predictors to human tumors will be useful in clinical practice. Here, we propose versatile feature selection procedures that can be combined with any classifier. For demonstration, we combined the feature selection procedures with a (linear) logit model and a (nonlinear) K-nearest neighbor classifier and trained these on cell lines, yielding LogitDA and KNNDA, respectively. We show that LogitDA/KNNDA significantly outperform existing methods, e.g., a logistic model and a deep learning method trained on thousands of genes, in prediction AUC (0.70-1.00 for seven of the ten drugs tested) and are interpretable. This may be because sample sizes are often limited in drug response prediction. We further derive a novel adjustment of the prediction cutoff for LogitDA, yielding prediction accuracies of 0.70-0.93 for seven drugs, including Erlotinib and Cetuximab, for which pathways relevant to anti-cancer therapies are also uncovered. These results indicate that our methods can efficiently translate cell-line-trained predictors to tumors. This is a provisional file, not the final typeset article.
Identifying the characteristics of cancer patients who will respond to chemotherapies or targeted therapies from their molecular profiles is important for precision medicine.
Given that patient drug response data are limited relative to cell-line data, obtaining this information is challenging. However, large-scale drug sensitivity screens of cell lines have identified clinically meaningful gene-drug interactions(