AUTHOR=Xie Ping , Batur Jesur , An Xin , Yasen Musha , Fu Xuefeng , Jia Lin , Luo Yun TITLE=Novel, alternative splicing signature to detect lymph node metastasis in prostate adenocarcinoma with machine learning JOURNAL=Frontiers in Oncology VOLUME=12 YEAR=2023 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.1084403 DOI=10.3389/fonc.2022.1084403 ISSN=2234-943X ABSTRACT=Background

The presence of lymph node metastasis (LNM) leads to a poor prognosis in prostate cancer (Pca). Recently, many studies have indicated that gene signatures may be able to predict lymph node status. The purpose of this study was to develop and validate a new tool for predicting LNM based on alternative splicing (AS).

Methods

Gene expression profiles and clinical information for the prostate adenocarcinoma cohort were retrieved from The Cancer Genome Atlas (TCGA) database, and the corresponding RNA-seq splicing event profiles were obtained from TCGA SpliceSeq. The limma package was used to identify differentially expressed alternative splicing (DEAS) events between the LNM and non-LNM groups. Eight machine learning classifiers were trained with stratified five-fold cross-validation. SHAP values were used to explain the model.
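
As a minimal sketch of the stratified five-fold split mentioned above, and not the authors' code, the snippet below shuffles each class separately and deals its samples round-robin into folds, so that every fold keeps roughly the same ratio of LNM to non-LNM samples. The helper name `stratified_kfold` and the toy labels are hypothetical; the abstract does not specify the tooling actually used.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Return k lists of sample indices, each preserving class proportions.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal round-robin: per-class counts differ by at most one across folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy labels (made up): 1 = LNM, 0 = non-LNM.
toy_labels = [1] * 10 + [0] * 40
folds = stratified_kfold(toy_labels, k=5)
for i, test_idx in enumerate(folds):
    # Train on the other four folds, evaluate on the held-out fold.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    n_pos = sum(toy_labels[j] for j in test_idx)
    print(f"fold {i}: test={len(test_idx)} (LNM={n_pos}), train={len(train_idx)}")
```

Stratification matters when one class is much rarer than the other, because a plain random split can leave a fold with very few positive samples and make its performance estimate unstable.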

Results

A total of 333 differentially expressed alternative splicing (DEAS) events were identified. Using a correlation filter and the least absolute shrinkage and selection operator (LASSO) method, a 96-AS signature was identified that showed favorable discrimination in the training set and was confirmed in the validation set. Linear discriminant analysis (LDA) was the best-performing classifier after 100 iterations of training. The LDA classifier was able to distinguish between LNM and non-LNM with an area under the receiver operating characteristic curve of 0.962 ± 0.026 in the training set (D1 = 351) and 0.953 in the validation set (D2 = 62). The decision curve analysis supported the clinical applicability of the AS-based model.
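
To illustrate how an area under the ROC curve (AUC) and a mean ± SD summary of this kind can be computed, the sketch below uses the rank interpretation of AUC: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and all labels and scores are made up for illustration. The abstract does not state how the reported spread was obtained, so summarising per-fold AUCs is an assumption here.

```python
import statistics


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up per-fold labels (1 = LNM) and classifier scores.
fold_results = [
    ([1, 0, 0, 1, 0], [0.9, 0.2, 0.4, 0.7, 0.1]),
    ([1, 0, 1, 0, 0], [0.6, 0.3, 0.8, 0.7, 0.2]),
    ([0, 1, 0, 0, 1], [0.1, 0.5, 0.5, 0.3, 0.9]),
]
fold_aucs = [roc_auc(y, s) for y, s in fold_results]
mean_auc = statistics.mean(fold_aucs)
sd_auc = statistics.stdev(fold_aucs)  # sample standard deviation across folds
print(f"AUC = {mean_auc:.3f} ± {sd_auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for cohorts of a few hundred; a rank-based formulation would be preferable for much larger data.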

Conclusion

Machine learning combined with AS data could robustly distinguish between LNM and non-LNM in Pca.