ORIGINAL RESEARCH article

Front. Bioeng. Biotechnol.
Sec. Bioprocess Engineering
Volume 12 - 2024 | doi: 10.3389/fbioe.2024.1495267

Enhancing the Reverse Transcriptase Function in Taq Polymerase via AI-driven Multiparametric Rational Design

Provisionally accepted
Yulia E Tomilova 1; Nikolay E Russkikh 2; Igor M Yi 2; Elizaveta V Shaburova 3; Viktor N Tomilov 4; Galina B Pyrinova 1; Svetlana O Brezhneva 1; Olga S Tikhonyuk 1; Nadezhda S Gololobova 1; Dmitriy V Popichenko 1; Maxim O Arkhipov 1; Leonid O Bryzgalov 1; Evgeny V Brenner 1; Anastasia A Artyukh 1; Dmitry N Shtokalo 2,3; Denis V Antonets 3*; Mikhail K Ivanov 2,5
  • 1 AO Vector-Best, Novosibirsk, Russia
  • 2 AcademGene LLC, Novosibirsk, Russia
  • 3 MSU Institute for Artificial Intelligence, Lomonosov Moscow State University, Moscow, Russia
  • 4 SibEnzyme Ltd, Novosibirsk, Russia
  • 5 Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russia

The final, formatted version of the article will be published soon.

    Modification of natural enzymes to introduce new properties and enhance existing ones is a central challenge in bioengineering. This study focuses on the development of Taq polymerase mutants that show enhanced reverse transcriptase (RTase) activity while retaining other desirable properties such as fidelity, 5'-3' exonuclease activity, effective deoxyuracil incorporation, and tolerance to locked nucleic acid (LNA)-containing substrates. Our objective was to use AI-driven rational design combined with multiparametric wet-lab analysis to identify and validate Taq polymerase mutants with an optimal combination of these properties. The experimental procedure was conducted in several stages: 1) 18 candidate mutations across six sites were selected from the literature for experimental evaluation, along with the wild-type enzyme, to form the initial training dataset; 2) using embeddings of Taq polymerase variants obtained from a protein language model, we trained a Ridge regression model to predict multiple enzyme properties and selected 14 more candidates for experimental evaluation, expanding the dataset for further refinement; 3) to better manage risk by assessing the confidence intervals of predictions, we transitioned to Gaussian process regression on the expanded dataset of 33 data points; 4) with this enhanced model, we conducted in silico screening of over 18 million mutations and narrowed the field to 16 top candidates for comprehensive wet-lab evaluation. This iterative, data-driven strategy ultimately led to the identification of 18 enzyme variants that exhibited markedly improved RTase activity while maintaining a favorable balance of other key properties. These enhancements were generally accompanied by lower Kd, moderately reduced fidelity, and greater tolerance to noncanonical substrates, illustrating a strong interdependence among these traits.
    Several enzymes validated via this procedure were effective in single-enzyme real-time reverse-transcription PCR setups, implying their utility for developing new real-time reverse-transcription PCR tools for applications such as pathogen RNA detection and gene expression analysis. This study illustrates how AI can be integrated with experimental bioengineering to enhance enzyme functionality systematically. Our approach offers a framework for designing enzyme mutants tailored to specific biotechnological applications. The results of our biological activity predictions for mutated Taq polymerases can be accessed at https://huggingface.co/datasets/nerusskikh/taqpol_insilico_dms.
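
    The uncertainty-aware step described in stage 3 can be illustrated with a minimal, dependency-free sketch of Gaussian process regression. It is not the authors' implementation: the squared-exponential kernel, the noise level and length scale, the two-dimensional toy "embeddings", the candidate names, and the lower-confidence-bound ranking rule are all assumptions made here for illustration. The abstract states only that Gaussian process regression was used to assess the confidence intervals of predictions.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two embedding vectors (assumed kernel)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def cholesky(A):
    """Lower-triangular L with A = L L^T for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_fit(X, y, noise=1e-2, length_scale=1.0):
    """Fit a GP with a constant mean (the average of y) and Gaussian noise."""
    y_mean = sum(y) / len(y)
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [v - y_mean for v in y]))
    return {"X": X, "L": L, "alpha": alpha, "ls": length_scale, "y_mean": y_mean}


def gp_predict(model, x):
    """Posterior mean and standard deviation of the latent function at x."""
    k = [rbf(x, xi, model["ls"]) for xi in model["X"]]
    mean = model["y_mean"] + sum(ki * ai for ki, ai in zip(k, model["alpha"]))
    v = solve_lower(model["L"], k)
    var = max(rbf(x, x, model["ls"]) - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(var)


def rank_by_lower_bound(model, candidates, z=1.96, top=3):
    """Rank candidates by mean - z * std (an assumed risk-averse rule)."""
    scored = []
    for name, emb in candidates.items():
        mean, std = gp_predict(model, emb)
        scored.append((mean - z * std, name))
    return [name for _, name in sorted(scored, reverse=True)[:top]]


# Toy data: hypothetical 2-D embeddings and hypothetical activity scores.
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_train = [0.1, 0.9, 0.4, 1.2]
model = gp_fit(X_train, y_train)

candidates = {"cand_near": [0.9, 0.1], "cand_far": [5.0, 5.0]}
for name, emb in candidates.items():
    mean, std = gp_predict(model, emb)
    print(f"{name}: mean={mean:.3f}, std={std:.3f}")
print(rank_by_lower_bound(model, candidates, top=1))
```

    In this toy setting, a candidate far from all measured variants reverts to the prior (the training mean, with a large standard deviation), so a lower-bound rule penalizes it relative to a candidate close to the measured data. In practice a library implementation such as scikit-learn's `GaussianProcessRegressor` would normally replace the hand-written linear algebra.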

    Keywords: bioengineering, function enhancement, reverse transcription, machine learning, protein language model, rational design

    Received: 12 Sep 2024; Accepted: 19 Nov 2024.

    Copyright: © 2024 Tomilova, Russkikh, Yi, Shaburova, Tomilov, Pyrinova, Brezhneva, Tikhonyuk, Gololobova, Popichenko, Arkhipov, Bryzgalov, Brenner, Artyukh, Shtokalo, Antonets and Ivanov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Denis V Antonets, MSU Institute for Artificial Intelligence, Lomonosov Moscow State University, Moscow, Russia

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.