AUTHOR=Feng Yun, Wang Yilin TITLE=Comparison of the classifiers based on mRNA, microRNA and lncRNA expression and DNA methylation profiles for the tumor origin detection JOURNAL=Frontiers in Genetics VOLUME=15 YEAR=2024 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2024.1383852 DOI=10.3389/fgene.2024.1383852 ISSN=1664-8021 ABSTRACT=Background

Detecting the tissue of origin of a tumor is of great importance in determining the appropriate course of treatment for cancer patients. Classifiers based on gene expression and DNA methylation profiles have been shown to predict the primary tumor site feasibly and reliably. However, few studies have compared the performance of classifiers built on these different profiles.

Methods

Using gene expression and DNA methylation profiles from The Cancer Genome Atlas (TCGA) project, we applied eight machine learning methods to tumor tissue origin detection. We then compared the predictive performance obtained with DNA methylation, mRNA, microRNA (miRNA) and long non-coding RNA (lncRNA) expression profiles. A statistical method was introduced to select the most informative CpG sites.
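The abstract does not name the statistical method used to rank CpG sites. As a minimal sketch of what such a selection step might look like, the example below assumes a one-way ANOVA F statistic computed per site across tumor types and keeps the top-scoring sites. The function names (`f_statistic`, `top_k_sites`), the site identifiers and the beta values are all hypothetical toy inputs, not data or code from the study.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F statistic for a single CpG site across tumor types.

    Assumes at least two groups and more samples than groups.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of group means around the grand mean
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of samples around their own group mean
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def top_k_sites(beta, labels, k):
    """Return the k site names with the largest F statistic (assumed criterion)."""
    scores = {site: f_statistic(vals, labels) for site, vals in beta.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy methylation beta values: six samples from two hypothetical tumor types
labels = ["BRCA", "BRCA", "BRCA", "LUAD", "LUAD", "LUAD"]
beta = {
    "cg_a": [0.10, 0.12, 0.11, 0.85, 0.88, 0.90],  # strongly type-specific
    "cg_b": [0.50, 0.48, 0.52, 0.51, 0.49, 0.50],  # uninformative
    "cg_c": [0.30, 0.35, 0.32, 0.60, 0.55, 0.58],  # moderately type-specific
}
selected = top_k_sites(beta, labels, k=2)
print(selected)
```

In this toy input the uninformative site `cg_b` scores near zero and is dropped. In a real analysis the retained sites would then be passed to a classifier such as LASSO; that step is omitted here.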

Results

We found that LASSO was the most predictive model across the various profiles. Further analyses indicated that the results derived from DNA methylation (overall accuracy: 97.77%) were better than those derived from mRNA expression (overall accuracy: 88.01%), microRNA expression (overall accuracy: 91.03%) and lncRNA expression (overall accuracy: 95.7%). Our results also suggest that an overall accuracy >90% can be achieved using only 1,000 methylated CpG sites for prediction.
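The abstract reports overall accuracy without defining it. Assuming the usual definition (the fraction of samples whose predicted tumor type matches the true one, pooled over all tumor types), it could be computed as in the sketch below. The labels are hypothetical toy values, and `overall_accuracy` is an illustrative helper rather than code from the study.

```python
def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted tumor type equals the true type."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy example: five samples, one BRCA sample misclassified as LUAD
y_true = ["BRCA", "BRCA", "LUAD", "LUAD", "COAD"]
y_pred = ["BRCA", "LUAD", "LUAD", "LUAD", "COAD"]
accuracy = overall_accuracy(y_true, y_pred)
print(f"overall accuracy: {accuracy:.2%}")
```

Because this measure pools all samples, tumor types with many samples weigh more heavily than rare ones, which is worth keeping in mind when comparing figures across profiles.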

Conclusion

In this work, we comprehensively evaluated the performance of classifiers based on different profiles for tumor origin detection. Our findings demonstrate the effectiveness of DNA methylation as a biomarker for tracing tumor tissue origin using LASSO and neural network classifiers.