AUTHOR=Choudhury Shubham, Bajiya Nisha, Patiyal Sumeet, Raghava Gajendra P. S. TITLE=MRSLpred—a hybrid approach for predicting multi-label subcellular localization of mRNA at the genome scale JOURNAL=Frontiers in Bioinformatics VOLUME=4 YEAR=2024 URL=https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2024.1341479 DOI=10.3389/fbinf.2024.1341479 ISSN=2673-7647 ABSTRACT=

In the past, several methods have been developed for predicting the single-label subcellular localization of messenger RNA (mRNA). However, only a few methods are designed to predict the multi-label subcellular localization of mRNA. Furthermore, existing methods are slow and cannot be applied at transcriptome scale. In this study, we developed a fast and reliable method for predicting the multi-label subcellular localization of mRNA that can be applied at genome scale. We built machine learning-based models using mRNA sequence composition, where the XGBoost-based classifier achieved an average area under the receiver operating characteristic curve (AUROC) of 0.709 (0.668–0.732). In addition to alignment-free methods, we developed alignment-based methods using motif search techniques. Finally, we developed a hybrid technique that combines the XGBoost model with the motif-based approach, achieving an average AUROC of 0.742 (0.708–0.816). Our method, MRSLpred, outperforms the existing state-of-the-art classifier in both predictive performance and computational efficiency. A publicly accessible webserver and a standalone tool are available to support researchers (webserver: https://webs.iiitd.edu.in/raghava/mrslpred/).
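
To make the hybrid idea concrete, the following is a minimal sketch and not the MRSLpred implementation. The abstract does not state which composition features, motif-search tool, or combination rule are used, so everything here is an assumption for illustration: the k-mer size, the function names (`kmer_composition`, `hybrid_score`, `predict_labels`), the fixed motif bonus, the 0.5 threshold, and the example motif and location labels are all hypothetical. The probability passed in stands in for the output of a trained classifier such as XGBoost.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def kmer_composition(seq, k=2):
    """Return the normalized frequency of every k-mer over ACGT (alignment-free features)."""
    seq = seq.upper().replace("U", "T")  # treat RNA and cDNA input alike
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return {km: (c / total if total else 0.0) for km, c in counts.items()}


def hybrid_score(ml_prob, seq, motifs, bonus=0.5):
    """Add a fixed bonus to the model probability when any motif occurs in the sequence.

    The additive bonus is an assumed combination rule, chosen only for illustration.
    """
    seq = seq.upper().replace("U", "T")
    hit = any(m in seq for m in motifs)
    return ml_prob + bonus if hit else ml_prob


def predict_labels(ml_probs, seq, motifs_by_label, threshold=0.5):
    """Multi-label decision: each location is scored and thresholded independently."""
    return {
        label: hybrid_score(p, seq, motifs_by_label.get(label, [])) >= threshold
        for label, p in ml_probs.items()
    }
```

In this sketch, a call such as `predict_labels({"nucleus": 0.3, "cytosol": 0.6}, "AAACGU", {"nucleus": ["ACG"]})` marks both locations as positive: the hypothetical motif lifts the nucleus score above the threshold, while the cytosol score passes on the model probability alone. Scoring each location independently is what allows a single transcript to receive more than one label.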