AUTHOR=Xu Jingxi, Liang Chaoyang, Li Jiangtao TITLE=A signal recognition particle-related joint model of LASSO regression, SVM-RFE and artificial neural network for the diagnosis of systemic sclerosis-associated pulmonary hypertension JOURNAL=Frontiers in Genetics VOLUME=13 YEAR=2022 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.1078200 DOI=10.3389/fgene.2022.1078200 ISSN=1664-8021 ABSTRACT=

Background: Systemic sclerosis-associated pulmonary hypertension (SSc-PH) is one of the most common causes of death in patients with systemic sclerosis (SSc). The complexity of SSc-PH and the heterogeneity of clinical features in SSc-PH patients contribute to the difficulty of diagnosis. Therefore, there is a pressing need to develop and optimize models for the diagnosis of SSc-PH. Signal recognition particle (SRP) deficiency has been found to promote the progression of multiple cancers, but the relationship between SRP and SSc-PH has not been explored.

Methods: First, we obtained the GSE19617 and GSE33463 datasets from the Gene Expression Omnibus (GEO) database as the training set, GSE22356 as the test set, and the SRP-related gene set from the MSigDB database. Next, we identified differentially expressed SRP-related genes (DE-SRPGs) and performed unsupervised clustering and gene enrichment analyses. Then, we used least absolute shrinkage and selection operator (LASSO) regression and support vector machine-recursive feature elimination (SVM-RFE) to identify SRP-related diagnostic genes (SRP-DGs). We constructed an SRP scoring system and a nomogram model based on the SRP-DGs and established an artificial neural network (ANN) for diagnosis. We used receiver operating characteristic (ROC) curves to evaluate the diagnostic performance of the SRP-related signature in the training and test sets. Finally, we analyzed immune features, signaling pathways, and drugs associated with SRP and investigated the functions of the SRP-DGs using gene set enrichment analysis (GSEA) based on single-gene batch correlation analysis.
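The two feature-selection steps described in the Methods can be sketched as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' pipeline: the abstract does not state which software was used or how the LASSO and SVM-RFE outputs were combined, so taking their intersection is an assumption, as are the gene labels (`gene_0`, ...), the penalty strength `C=0.1`, and the number of features retained by RFE.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix (samples x genes); not real data.
X, y = make_classification(
    n_samples=80, n_features=30, n_informative=6, random_state=0
)
X = StandardScaler().fit_transform(X)
genes = [f"gene_{i}" for i in range(X.shape[1])]  # hypothetical gene labels

# LASSO-style step: L1-penalised logistic regression shrinks some
# coefficients to exactly zero; genes with non-zero coefficients are kept.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
lasso_genes = {g for g, c in zip(genes, lasso.coef_[0]) if c != 0}

# SVM-RFE step: repeatedly fit a linear SVM and drop the gene with the
# smallest weight until 10 genes remain (10 is an arbitrary choice here).
rfe = RFE(SVC(kernel="linear"), n_features_to_select=10, step=1).fit(X, y)
svm_genes = {g for g, keep in zip(genes, rfe.support_) if keep}

# Assumed combination rule: keep genes chosen by both methods.
candidate_genes = sorted(lasso_genes & svm_genes)
print(candidate_genes)
```

In a real analysis the penalty strength and the number of retained features would normally be tuned by cross-validation on the training set only, so that the test set stays untouched until the final evaluation.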

Results: We obtained 30 DE-SRPGs and found that they were enriched in functions and pathways such as “protein targeting to ER,” “cytosolic ribosome,” and “coronavirus disease - COVID-19”. Subsequently, we identified seven SRP-DGs whose expression levels and diagnostic efficacy were validated in the test set. Taken together as a single signature, the seven SRP-DGs achieved area under the ROC curve (AUC) values of 0.769 and 1.000 in the training and test sets, respectively. Predictions made using the nomogram model are likely to be beneficial for SSc-PH patients. The AUC values of the ANN were 0.999 and 0.860 in the training and test sets, respectively. Finally, we discovered that some immune cells and pathways, such as activated dendritic cells, complement activation, and heme metabolism, were significantly associated with the SRP-DGs, and we identified ten drugs targeting the SRP-DGs.
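For readers less familiar with the metric, an AUC can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition in plain Python. The function name `auc` and the toy scores are illustrative only and are unrelated to the study data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    scores: model outputs, higher meaning "more likely a case".
    labels: 1 for cases, 0 for controls.
    """
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Toy example: one of the four case/control pairs is ranked the wrong way.
print(auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))  # 0.75
```

An AUC of 1.000 therefore means every case outscored every control in that set, which on a small test set can occur by chance more easily than on a large one.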

Conclusion: We constructed a reliable SRP-related ANN model for the diagnosis of SSc-PH and used bioinformatics methods to investigate the possible role of SRP in the etiopathogenesis of SSc-PH, providing a basis for precision and personalized medicine.