BRIEF RESEARCH REPORT article

Front. RNA Res.
Sec. Non-coding RNA
Volume 2 - 2024 | doi: 10.3389/frnar.2024.1473293

Predicting Conserved Functional Interactions for Long Noncoding RNAs via Deep Learning

Provisionally accepted
  • 1 Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
  • 2 School of Data Science and Society, University of North Carolina at Chapel Hill, Chapel Hill, United States

The final, formatted version of the article will be published soon.

    Long noncoding RNA (lncRNA) genes outnumber protein-coding genes in the human genome, and the majority remain uncharacterized. A major difficulty in generalizing our understanding of lncRNA function is the dearth of gross sequence conservation, both for lncRNAs across species and for lncRNAs that perform similar functions within a species. Machine learning-based methods that harness vast amounts of information on RNAs are increasingly used to impute certain biological characteristics, including interactions with proteins that are important mediators of RNA function. Such imputation enables the generation of knowledge in contexts for which experimental data are lacking. Here, we applied a natural language-based machine learning approach that enabled us to identify RNA binding protein interactions in lncRNA transcripts using only RNA sequence as input. We found that this predictive method is a powerful approach for inferring conserved binding across species as distant as human and opossum, even in the absence of sequence conservation, thus informing sequence-function relationships for these poorly understood RNAs.

    Keywords: long noncoding RNA, RNA binding protein, machine learning, deep learning, natural language processing, CLIP-Seq

    Received: 30 Jul 2024; Accepted: 09 Sep 2024.

    Copyright: © 2024 Kratz and Smith. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Keriayn N. Smith, School of Data Science and Society, University of North Carolina at Chapel Hill, Chapel Hill, United States

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.