ORIGINAL RESEARCH article

Front. Genet.
Sec. Epigenomics and Epigenetics
Volume 15 - 2024 | doi: 10.3389/fgene.2024.1464976
This article is part of the Research Topic Protein Modifications in Epigenetic Dysfunctional Diseases: Mechanisms and Potential Therapeutic Strategies, Volume II

DLBWE-Cys: A Deep-Learning-Based Tool for Identifying Cysteine S-Carboxyethylation Sites Using Binary-Weight Encoding

Provisionally accepted
Zhengtao Luo 1*, Qingyong Wang 1, Yingchun Xia 1*, Xiaolei Zhu 1*, Shuai Yang 1*, Zhaochun Xu 2*, Lichuan Gu 1*
  • 1 Anhui Agricultural University, Hefei, China
  • 2 Harbin Medical University, Harbin, Heilongjiang, China

The final, formatted version of the article will be published soon.

    Cysteine S-carboxyethylation, a novel post-translational modification (PTM), plays a critical role in the pathogenesis of autoimmune diseases, particularly ankylosing spondylitis. Accurate identification of S-carboxyethylation sites is essential for elucidating their functional mechanisms. However, no computational tool can currently predict these sites accurately, which poses a significant challenge to this area of research. In this study, we developed a new deep learning model, DLBWE-Cys, for the identification of cysteine S-carboxyethylation sites. The model takes as input a Binary-Weight encoding designed specifically for this task and integrates a CNN, a BiLSTM, a Bahdanau attention mechanism, and a fully connected neural network (FNN). Our experimental results show that this architecture outperforms other machine learning and deep learning models in both 5-fold cross-validation and independent testing. Feature comparison experiments showed that the proposed Binary-Weight encoding outperforms other encoding techniques, and t-SNE visualization further supported the model's ability to separate the two classes. Additionally, we found that the distribution of positional weights in the Binary-Weight encoding resembles the allocation of weights in the attention mechanism, and further experiments confirmed the effectiveness of the encoding. This model thus paves the way for predicting cysteine S-carboxyethylation sites in protein sequences. The source code of DLBWE-Cys and the experimental data are available at: https://github.com/ztLuo-bioinfo/DLBWE-Cys.
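
To make the attention component named in the abstract more concrete, the following is a minimal NumPy sketch of Bahdanau-style (additive) attention pooling over a sequence of hidden states, such as those a BiLSTM might produce. It is not the DLBWE-Cys implementation: the function name `additive_attention_pool`, the parameter names `W`, `b`, `v`, the window length of 31 positions, and all dimensions are hypothetical choices made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def additive_attention_pool(H, W, b, v):
    """Additive (Bahdanau-style) attention pooling.

    H: (T, d) hidden states, one row per sequence position.
    W: (d, a) projection matrix, b: (a,) bias, v: (a,) scoring vector.
    Returns the (d,) context vector and the (T,) attention weights.
    """
    # Score each position: v^T tanh(W^T h_t + b)
    scores = np.tanh(H @ W + b) @ v
    # Softmax over positions (shifted by the max for numerical stability)
    scores = scores - scores.max()
    alpha = np.exp(scores) / np.exp(scores).sum()
    # Context vector: attention-weighted sum of the hidden states
    context = alpha @ H
    return context, alpha


# Illustrative dimensions only (assumptions, not values from the article)
rng = np.random.default_rng(0)
T, d, a = 31, 8, 4
H = rng.normal(size=(T, d))   # stand-in for BiLSTM outputs
W = rng.normal(size=(d, a))
b = np.zeros(a)
v = rng.normal(size=a)

context, alpha = additive_attention_pool(H, W, b, v)
```

Here `alpha` holds one non-negative weight per sequence position and sums to 1. Per-position weights of this kind are what the abstract compares against the positional weights of the Binary-Weight encoding.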

    Keywords: S-carboxyethylation, post-translational modification, Bahdanau attention mechanism, Binary-Weight encoding, deep learning

    Received: 15 Jul 2024; Accepted: 23 Dec 2024.

    Copyright: © 2024 Luo, Wang, Xia, Zhu, Yang, Xu and Gu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Zhengtao Luo, Anhui Agricultural University, Hefei, China
    Yingchun Xia, Anhui Agricultural University, Hefei, China
    Xiaolei Zhu, Anhui Agricultural University, Hefei, China
    Shuai Yang, Anhui Agricultural University, Hefei, China
    Zhaochun Xu, Harbin Medical University, Harbin, 130012, Heilongjiang, China
    Lichuan Gu, Anhui Agricultural University, Hefei, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.