
ORIGINAL RESEARCH article
Front. Immunol.
Sec. Cancer Immunity and Immunotherapy
Volume 16 - 2025 | doi: 10.3389/fimmu.2025.1556165
This article is part of the Research Topic: Advancing Immunotherapy: Machine Learning and AI in Tumor Microenvironment Analysis
The final, formatted version of the article will be published soon.
A cohort study of 143 tumor patients (96 with colorectal cancer [CRC], 47 with gastric cancer [GC]) was conducted. High-throughput T-cell receptor (TCR) sequencing was performed to capture TCR beta (TRB), delta (TRD), and gamma (TRG) chain data. Tissue-specific patterns in TCR repertoire features, such as V-J gene recombination, complementarity-determining region 3 (CDR3) sequences, and motif distributions, were analyzed. Multi-layer machine learning-based diagnostic models were developed by leveraging motif-based features and deep learning-based features extracted with ProteinBERT from the 100 most abundant CDR3 sequences per sample. These models were used to differentiate CRC from GC, to distinguish primary from metastatic CRC lesions, and to predict disease stage in CRC.

Results: TCR repertoires differed between CRC and GC, between primary and metastatic CRC lesions, and across CRC disease stages. Distinct V-J gene recombination patterns were identified: CRC showed enrichment in TRBV*-TRBJ* combinations, while GC exhibited higher levels of γδ T-cell-related recombination. Primary and metastatic lesions of CRC patients displayed distinct V-J recombination preferences (e.g., TRBV7-9/TRBJ2-1 higher in metastatic lesions; TRBV20-1/TRBJ1-2 higher in primary lesions) and CDR3 sequence differences, with metastatic lesions having shorter TRG CDR3 lengths (p = 0.019). Across CRC stages, later stages (III-IV) showed higher clonal diversity (p < 0.05) and stage-specific V-J patterns, alongside distinct CDR3 amino acid preferences at N-terminal positions (1-2) and central positions (5-12). The machine learning models performed strongly on all three classification tasks. For distinguishing CRC from GC, the model achieved an accuracy of 97.9% and an area under the curve (AUC) of 0.996. For differentiating primary from metastatic CRC, the model achieved 100% accuracy with an AUC of 1.000. For predicting CRC disease stage, the model attained an accuracy of 96.9% and an AUC of 0.993. Validation on simulated and publicly available datasets showed consistent performance across datasets and experimental conditions, supporting the robustness and reliability of the models.

Our investigation provides novel insights into TCR repertoire variation in digestive system tumors and highlights the potential of immune repertoire features as diagnostic tools for understanding cancer progression and potentially improving clinical decision-making.
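
To make the idea of motif-based features over the most abundant CDR3 sequences concrete, the following is a minimal sketch, not the authors' pipeline. It assumes that a "motif" is a contiguous amino acid k-mer, that each k-mer is weighted by the clone count of the sequence it comes from, and that a sample is given as a list of (CDR3 sequence, clone count) pairs. The function names `top_cdr3` and `motif_frequencies`, the choice k = 3, and the toy sequences are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Clone = Tuple[str, int]  # (CDR3 amino acid sequence, clone count)


def top_cdr3(clones: List[Clone], n: int = 100) -> List[Clone]:
    """Return the n most abundant clones (ties broken alphabetically)."""
    return sorted(clones, key=lambda c: (-c[1], c[0]))[:n]


def motif_frequencies(clones: List[Clone], k: int = 3,
                      n_top: int = 100) -> Dict[str, float]:
    """Abundance-weighted k-mer frequencies over the top n_top clones.

    Assumption: motifs are contiguous k-mers; the article does not
    specify its motif definition in the abstract.
    """
    counts: Counter = Counter()
    for seq, abundance in top_cdr3(clones, n_top):
        # Slide a window of width k along the CDR3 sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += abundance
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize so the feature values of one sample sum to 1.
    return {motif: c / total for motif, c in counts.items()}


# Toy sample (made-up sequences); only the top 2 clones are used here.
clones = [("CASSLGQ", 5), ("CASSPGT", 3), ("CAWD", 2)]
features = motif_frequencies(clones, k=3, n_top=2)
```

Each sample then yields a sparse frequency vector that can be aligned across samples on a shared motif vocabulary and passed to any standard classifier. Normalizing per sample keeps samples with different sequencing depth comparable, which is one reasonable design choice among several.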
Keywords: T-cell receptor (TCR) repertoire, colorectal cancer (CRC), gastric cancer (GC), multi-layer machine learning, diagnostic model
Received: 06 Jan 2025; Accepted: 20 Mar 2025.
Copyright: © 2025 Yuan, Wang, Wang, Wang, Li and Zhen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Ya'nan Zhen, Department of Gastrointestinal Surgery, Shandong Provincial Third Hospital, Shandong University, Jinan, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.