AUTHOR=Li Bao, Shen Yang, Liu Songbo, Yuan Hong, Liu Ming, Li Haokun, Zhang Tonghe, Du Shuyuan, Liu Xinwei TITLE=Identification of immune microenvironment subtypes and clinical risk biomarkers for osteoarthritis based on a machine learning model JOURNAL=Frontiers in Molecular Biosciences VOLUME=11 YEAR=2024 URL=https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2024.1376793 DOI=10.3389/fmolb.2024.1376793 ISSN=2296-889X ABSTRACT=Background

Osteoarthritis (OA) is a degenerative disease with a high incidence worldwide. Most affected patients show no obvious symptoms or imaging findings until OA has progressed, by which time the destruction of articular cartilage and bone is irreversible. Developing new diagnostic biomarkers that reflect articular cartilage injury is therefore crucial for the early diagnosis of OA. This study aims to explore biomarkers related to the immune microenvironment of OA, providing a new research direction for early diagnosis and for the identification of OA risk factors.

Methods

We screened and downloaded relevant datasets from the Gene Expression Omnibus (GEO) database. Immune microenvironment-related differentially expressed genes (Imr-DEGs) were identified by combining the ImmPort dataset with weighted gene co-expression network analysis (WGCNA). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses were conducted to explore the functions of the Imr-DEGs. A random forest machine learning model was constructed to identify the characteristic genes of OA. Their diagnostic significance was assessed with receiver operating characteristic (ROC) curves, and external datasets were used to verify their diagnostic ability. Immune subtypes of OA were identified by unsupervised clustering, and the functions of these subtypes were analyzed by gene set variation analysis (GSVA). The Drug-Gene Interaction Database was used to explore the relationships between the characteristic genes and drugs.
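The subtype identification step can be illustrated with a small sketch. The abstract does not name the clustering algorithm, so plain k-means is used here purely as a stand-in; the function name `kmeans`, the first-k initialisation, and the toy expression values are all assumptions made for illustration, not the authors' pipeline or data.

```python
import math
from typing import List, Sequence


def kmeans(points: Sequence[Sequence[float]], k: int, n_iter: int = 50) -> List[int]:
    """Plain Lloyd's k-means; returns a cluster index for each sample.

    Stand-in for the unspecified unsupervised clustering step (assumption).
    """
    # Deterministic initialisation: the first k samples are the starting centroids.
    centroids = [list(p) for p in points[:k]]
    labels: List[int] = []
    for _ in range(n_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels


# Toy matrix: one row per OA sample, one column per hypothetical gene (made-up values).
samples = [[1.0, 1.2], [5.0, 5.1], [0.8, 1.0], [5.2, 4.9], [1.1, 0.9], [4.8, 5.3]]
subtypes = kmeans(samples, k=2)
print(subtypes)
```

In practice the number of clusters would be chosen from the data rather than fixed in advance, and each resulting subtype would then be characterised functionally, as the abstract describes for GSVA.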

Results

Single-sample gene set enrichment analysis (ssGSEA) revealed that 16 of 28 immune cell subsets in the dataset differed significantly between the OA and normal groups. WGCNA identified 26 Imr-DEGs, and functional enrichment analysis showed that they were related to the immune response. The random forest algorithm yielded nine characteristic genes: BLNK (AUC = 0.809), CCL18 (AUC = 0.692), CD74 (AUC = 0.794), CSF1R (AUC = 0.835), RAC2 (AUC = 0.792), INSR (AUC = 0.765), IL11 (AUC = 0.662), IL18 (AUC = 0.699), and TLR7 (AUC = 0.807). A nomogram was constructed to predict the occurrence and development of OA, and the calibration curve confirmed the accuracy of these nine genes in OA diagnosis.
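To make the per-gene AUC values easier to interpret, the sketch below computes the area under the ROC curve for a single gene when its expression level is used directly as the diagnostic score. It relies on the standard equivalence between AUC and the Mann-Whitney statistic: the AUC is the fraction of OA/normal sample pairs in which the OA sample has the higher expression, with ties counted as half. The function name `single_gene_auc` and the expression values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def single_gene_auc(expression: Sequence[float], is_oa: Sequence[bool]) -> float:
    """AUC of one gene's expression as a classifier score (Mann-Whitney form)."""
    cases = [x for x, y in zip(expression, is_oa) if y]
    controls = [x for x, y in zip(expression, is_oa) if not y]
    if not cases or not controls:
        raise ValueError("need at least one OA and one normal sample")
    # Count OA/normal pairs where the OA sample is higher; ties count half.
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


# Made-up expression values for a hypothetical gene (not data from the study).
expr = [5.1, 6.3, 7.0, 6.8, 4.9, 5.5, 6.0, 5.2]
label = [True, True, True, True, False, False, False, False]
auc = single_gene_auc(expr, label)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation. With this definition, a gene that is lower in OA than in normal tissue gives a value below 0.5, and its discriminative ability is then usually reported as 1 minus that value.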

Conclusion

This study identified characteristic genes related to the immune microenvironment in OA, providing new insight into the risk factors of OA.