AUTHOR=Sun Rui , Li Simin , Zhao Ke , Diao Mei , Li Long 

TITLE=Identification of Ten Core Hub Genes as Potential Biomarkers and Treatment Target for Hepatoblastoma

JOURNAL=Frontiers in Oncology

VOLUME=Volume 11 - 2021

YEAR=2021

URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2021.591507

DOI=10.3389/fonc.2021.591507

ISSN=2234-943X

ABSTRACT=Background: This study aimed to systematically investigate gene signatures for hepatoblastoma (HB) and identify potential biomarkers for its diagnosis and treatment.
Materials and Methods: GSE131329 and GSE81928 were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) between hepatoblastoma and normal samples were identified using the Limma package in R. Then, the similarity of network traits between two sets of genes was analyzed by weighted gene correlation network analysis (WGCNA). Cytoscape was used to visualize and select hub genes. GO enrichment and KEGG pathway analyses of hub genes were carried out using ClueGO. The random forest classifier was constructed based on the hub genes using the GSE131329 dataset as the training set, and its reliability was validated using the GSE81928 dataset. The resulting core hub genes were combined with the InnateDB database to identify the innate core genes. 
Results: A total of 4244 DEGs in HB were identified. WGCNA identified four modules that were significantly correlated with the disease status. A total of 114 hub genes were obtained within the top 20 genes of each node rank. GO enrichment and KEGG pathway analyses of hub genes were focused on MAPK, cell cycle, p53, and other crucial pathways involved in HB. A random forest classifier was constructed using the 114 hub genes as feature genes, resulting in a 95.5% true positive rate when classifying HB and normal samples. A total of 35 core hub genes were obtained through the mean decrease in accuracy and mean decrease Gini of the random forest model. The classification efficiency of the random forest model was 81.4%. Finally, CDK1, TOP2A, ADRA1A, FANCI, XRCC1, TPX2, CCNB2, CDK4, GLYATL1, and CFHR3 were identified by cross-comparison with the InnateDB database.
Conclusion: Our study established a random forest classifier that identified 10 core genes in HB. These findings may be beneficial for the diagnosis, prediction, and targeted therapy of HB.