AUTHOR=Luo Zheng , Huang Cong , Chen Jilan , Chen Yunhui , Yang Hongya , Wu Qiaofeng , Lu Fating , Zhang Tian E. TITLE=Potential diagnostic markers and therapeutic targets for non-alcoholic fatty liver disease and ulcerative colitis based on bioinformatics analysis and machine learning JOURNAL=Frontiers in Medicine VOLUME=11 YEAR=2024 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2024.1323859 DOI=10.3389/fmed.2024.1323859 ISSN=2296-858X ABSTRACT=Background

Non-alcoholic fatty liver disease (NAFLD) and ulcerative colitis (UC) are two common health issues that have gained significant global attention. Previous studies have suggested a possible connection between NAFLD and UC, but the underlying pathophysiology remains unclear. This study aims to identify genes common to NAFLD and UC, clarify their shared pathogenic mechanisms, find diagnostic markers applicable to both conditions, and explore potential therapeutic targets shared by the two diseases.

Methods

We obtained NAFLD and UC datasets from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were identified in the NAFLD dataset GSE89632 and the UC dataset GSE87466. Weighted gene co-expression network analysis (WGCNA), which identifies modules of highly correlated genes, was applied to both datasets. The DEGs and the module genes of NAFLD and UC were then intersected to obtain shared genes, and functional enrichment analysis was conducted on these shared genes. Next, we used the STRING database to build a protein-protein interaction (PPI) network, visualized it in Cytoscape, and identified hub genes with the cytoHubba plugin. Candidate diagnostic biomarkers were screened using least absolute shrinkage and selection operator (LASSO) regression and support vector machine (SVM) methods. Receiver operating characteristic (ROC) curve analysis was used to assess the diagnostic value of the candidate genes in both training and validation sets for NAFLD and UC. A nomogram was also developed to evaluate diagnostic efficacy. Additionally, we used the CIBERSORT algorithm to explore immune infiltration patterns in NAFLD and UC samples. Finally, we investigated the correlation between hub gene expression, diagnostic gene expression, and immune infiltration levels.
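
The intersection step can be illustrated with a minimal sketch. It assumes that a shared gene must appear in the DEG list and in the WGCNA module gene list of both diseases (a four-way intersection); the abstract does not spell out the exact scheme, and the gene names below are hypothetical placeholders, not the study's data.

```python
def shared_genes(degs_a, module_a, degs_b, module_b):
    """Return the genes present in all four collections, sorted by name.

    Assumption: "shared" means the gene is both a DEG and a module gene
    in each of the two diseases.
    """
    return sorted(set(degs_a) & set(module_a) & set(degs_b) & set(module_b))


# Hypothetical placeholder gene lists for illustration only.
nafld_degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
nafld_module = {"GENE_A", "GENE_B", "GENE_C", "GENE_E"}
uc_degs = {"GENE_A", "GENE_B", "GENE_F", "GENE_D"}
uc_module = {"GENE_A", "GENE_B", "GENE_F", "GENE_E"}

shared = shared_genes(nafld_degs, nafld_module, uc_degs, uc_module)
print(shared)  # ['GENE_A', 'GENE_B']
```

Only genes that pass every filter survive, which is why an intersection of this kind tends to yield a short candidate list for downstream enrichment and network analysis.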

Results

We identified 34 genes shared between NAFLD and UC. Enrichment analysis of these genes revealed significant enrichment in several pathways, including the IL-17 signaling pathway, rheumatoid arthritis, and Chagas disease. LASSO regression and SVM selected one optimal candidate gene, CCL2. ROC curve analysis supported the diagnostic value of CCL2 in the NAFLD and UC training sets and in the validation sets. This finding was further validated using a nomogram in the validation set. Additionally, CCL2 expression in both NAFLD and UC correlated significantly with immune cell infiltration.
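
As a rough illustration of how ROC analysis scores a single-gene marker, the area under the ROC curve (AUC) equals the probability that a randomly chosen disease sample has a higher expression value than a randomly chosen control, with ties counted as one half. The sketch below computes that quantity directly; the expression values are made up for illustration and are not taken from the study, and the function name is hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via pairwise comparison of positive and negative scores.

    Each (positive, negative) pair contributes 1 if the positive score is
    higher, 0.5 on a tie, and 0 otherwise; the AUC is the mean contribution.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up expression values for one gene (illustration only).
disease = [5.1, 6.3, 4.8, 7.0]
control = [3.2, 4.8, 2.9, 4.0]

auc = roc_auc(disease, control)
print(auc)  # 0.96875
```

An AUC near 1 means the marker separates the two groups almost perfectly, while 0.5 corresponds to chance-level discrimination.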

Conclusion

This study identified CCL2 as a biomarker for both NAFLD and UC that may actively participate in their progression. This finding has implications for understanding how these diseases progress and for developing more effective diagnostic and treatment strategies.