AUTHOR=Shen Junjie , Li Huijun , Yu Xinghao , Bai Lu , Dong Yongfei , Cao Jianping , Lu Ke , Tang Zaixiang TITLE=Efficient feature extraction from highly sparse binary genotype data for cancer prognosis prediction using an auto-encoder JOURNAL=Frontiers in Oncology VOLUME=12 YEAR=2023 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.1091767 DOI=10.3389/fonc.2022.1091767 ISSN=2234-943X ABSTRACT=
The genome, involving tens of thousands of genes, is a complex system that determines phenotype. An interesting and vital issue is how to integrate highly sparse genomic data with many minor effects into a prediction model to improve predictive power. We find that deep learning can extract features effectively by transforming highly sparse dichotomous data into lower-dimensional continuous data in a non-linear way. This may benefit risk prediction based on genotype data. We developed a multi-stage strategy to extract information from highly sparse binary genotype data and applied it to cancer prognosis. Specifically, we first reduced the size of binary biomarkers