REVIEW article

Front. Bioeng. Biotechnol.
Sec. Synthetic Biology
Volume 13 - 2025 | doi: 10.3389/fbioe.2025.1506508

Evaluating the Advancements in Protein Language Models for Encoding Strategies in Protein Function Prediction: A Comprehensive Review

Provisionally accepted
Jia-Ying Chen 1,2,3*, Jingfu Wang 1,2,3, Yue Hu 1,2,3, Xin-Hui Li 1,2,3, Yu-Rong Qian 2,3,4, Chao-Lin Song 1,2,3
  • 1 School of Software, Xinjiang University, Urumqi, Xinjiang Uyghur Region, China
  • 2 Key Laboratory of Software Engineering, Xinjiang University, Urumqi, China
  • 3 Key Laboratory of Signal Detection and Processing in Xinjiang Uygur Autonomous Region, Urumqi, China
  • 4 School of Computer Science and Technology, Xinjiang University, Urumqi, China

The final, formatted version of the article will be published soon.

    Protein function prediction is crucial in several key areas, such as bioinformatics and drug design. With the rapid progress of deep learning, the application of protein language models has become a research focus. These models exploit the growing volume of large-scale protein sequence data to mine its intrinsic semantic information, which can effectively improve the accuracy of protein function prediction. This review comprehensively surveys the current status of the latest protein language models as applied to protein function prediction and provides an exhaustive performance comparison with traditional prediction methods. Through in-depth analysis of experimental results, it demonstrates the significant advantages of protein language models in improving the accuracy and depth of protein function prediction.

    Keywords: protein function prediction, protein language model, deep learning, deep multi-label classification, Gene Ontology (GO)

    Received: 05 Oct 2024; Accepted: 02 Jan 2025.

    Copyright: © 2025 Chen, Wang, Hu, Li, Qian and Song. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Jia-Ying Chen, School of Software, Xinjiang University, Urumqi, 830046, Xinjiang Uyghur Region, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.