AUTHOR=Qian Xinyang, Yang Guang, Li Fan, Zhang Xuanping, Zhu Xiaoyan, Lai Xin, Xiao Xiao, Wang Tao, Wang Jiayin TITLE=DeepLION2: deep multi-instance contrastive learning framework enhancing the prediction of cancer-associated T cell receptors by attention strategy on motifs JOURNAL=Frontiers in Immunology VOLUME=Volume 15 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2024.1345586 DOI=10.3389/fimmu.2024.1345586 ISSN=1664-3224 ABSTRACT=T cell receptor (TCR) repertoires provide valuable insights into complex human diseases, including cancers. Recent advancements in immune sequencing technology have significantly improved our understanding of TCR repertoires. Some computational methods have been devised to identify cancer-associated TCRs and enable cancer detection using TCR sequencing data. However, the existing methods are often limited by their inadequate consideration of the correlations among TCRs within a repertoire, hindering the identification of crucial TCRs. Additionally, the sparsity of cancer-associated TCR distribution presents a challenge in accurate prediction. To address these issues, we presented DeepLION2, an innovative deep multi-instance contrastive learning framework specifically designed to enhance cancer-associated TCR prediction. DeepLION2 leveraged content-based sparse self-attention, focusing on the top k related TCRs for each TCR, to effectively model inter-TCR correlations. Furthermore, it adopted a contrastive learning strategy for bootstrapping parameter updates of the attention matrix, preventing the model from fixating on non-cancer-associated TCRs. Extensive experimentation on diverse patient cohorts, encompassing over ten cancer types, demonstrated that DeepLION2 significantly outperformed current state-of-the-art