ORIGINAL RESEARCH article

Front. Cell Dev. Biol.
Sec. Molecular and Cellular Pathology
Volume 12 - 2024 | doi: 10.3389/fcell.2024.1456728
This article is part of the Research Topic Advancements in Proteomics and PTMomics: Unveiling Mechanistic Insights and Targeted Therapies for Metabolic Diseases

DeepO-GlcNAc: A Web Server for Prediction of Protein O-GlcNAcylation Sites Using Deep Learning combined with attention mechanism

Provisionally accepted
  • 1 School of Pharmacy, Binzhou Medical University, Yantai, Shandong Province, China
  • 2 Xiamen University, Xiamen, Fujian Province, China
  • 3 Shandong University, Weihai, Weihai, Shandong, China
  • 4 Yale University, New Haven, Connecticut, United States

The final, formatted version of the article will be published soon.

    Protein O-GlcNAcylation is a dynamic post-translational modification involved in major cellular processes and associated with many human diseases. Bioinformatic prediction of O-GlcNAc sites before experimental validation is a challenging task in O-GlcNAc research. Recent advances in deep learning algorithms and the availability of O-GlcNAc proteomics data present an opportunity to improve O-GlcNAc site prediction, and this study aims to develop a deep learning-based tool for that purpose. We construct an annotated, unbalanced O-GlcNAcylation dataset and propose a new deep learning framework, DeepO-GlcNAc, which combines Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) layers with an attention mechanism. An ablation study confirms that the additional model components in DeepO-GlcNAc, such as the attention mechanism and the LSTM, improve prediction performance. Our model remains robust across five cross-species datasets that exclude human data. We also compare our model with three external predictors on an independent dataset. DeepO-GlcNAc outperforms the external predictors, achieving an accuracy of 92%, an average precision of 72%, a Matthews correlation coefficient (MCC) of 0.60, and an area under the ROC curve (AUC) of 92%. Moreover, we have implemented DeepO-GlcNAc as a web server to facilitate further investigation and use by the scientific community. Our work demonstrates the feasibility of using deep learning for O-GlcNAc site prediction and provides a novel tool for O-GlcNAc investigation.
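The abstract reports both accuracy and MCC on an unbalanced dataset. As a minimal sketch of why the two can diverge, the snippet below computes both from confusion-matrix counts. The counts are hypothetical illustration values, not data from this study, and the function names are assumptions introduced here.

```python
import math

def accuracy(tp, fp, tn, fn):
    """Fraction of all sites that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical unbalanced set: 100 modified sites, 900 unmodified sites.
# A trivial predictor that labels every site as unmodified:
trivial = dict(tp=0, fp=0, tn=900, fn=100)
# A hypothetical predictor that recovers some of the modified sites:
model = dict(tp=60, fp=60, tn=840, fn=40)

for name, counts in (("trivial", trivial), ("model", model)):
    print(f"{name}: accuracy={accuracy(**counts):.2f}, MCC={mcc(**counts):.2f}")
```

With these illustrative counts both predictors reach the same accuracy, yet only the second has a nonzero MCC, which is one reason MCC is commonly reported alongside accuracy when classes are unbalanced.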

    Keywords: Deep learning, O-GlcNAc, CNN, Prediction, Attention

    Received: 29 Jun 2024; Accepted: 26 Sep 2024.

    Copyright: © 2024 Zhang, Pan, Deng, Zhang, Zhang, Yang, Yang, Tian and Mi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Jia Mi, School of Pharmacy, Binzhou Medical University, Yantai, 264003, Shandong Province, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.