AUTHOR=Wang Kai, Li Ju, Meng Deqian, Zhang Zhongyuan, Liu Shanshan
TITLE=Machine learning based on metabolomics reveals potential targets and biomarkers for primary Sjogren’s syndrome
JOURNAL=Frontiers in Molecular Biosciences
VOLUME=9
YEAR=2022
URL=https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2022.913325
DOI=10.3389/fmolb.2022.913325
ISSN=2296-889X
ABSTRACT=
Background: Using machine learning based on metabolomics, this study aimed to construct an effective diagnostic model for primary Sjogren’s syndrome (pSS) and to reveal potential targets and biomarkers of pSS.
Methods: Serum specimens were collected from 39 patients with pSS and 38 healthy controls (HCs). The samples were analyzed by ultra-high-performance liquid chromatography coupled with high-resolution mass spectrometry. Three machine learning algorithms, namely the least absolute shrinkage and selection operator (LASSO), random forest (RF), and extreme gradient boosting (XGBoost), were used to build the pSS diagnostic models. Afterward, four machine learning methods were used to reduce the dimensionality of the metabolomics data. Finally, metabolites with significant differences were screened and pathway analysis was conducted.
Results: The area under the curve (AUC), sensitivity, and specificity of the LASSO, RF, and XGBoost models on the test set all reached 1.00. Orthogonal partial least squares discriminant analysis was used to classify the metabolomics data. By combining the univariate false discovery rate with the variable importance in projection, we identified 21 significantly different metabolites. When these 21 metabolites were used for diagnostic modeling, the AUC, sensitivity, and specificity of LASSO, RF, and XGBoost all reached 1.00. Metabolic pathway analysis revealed that these 21 metabolites are highly correlated with amino acid and lipid metabolism. Based on these 21 metabolites, we screened the important variables in each model. Further, five common variables were obtained by intersecting the important variables of the three models. Based on these five common variables, the AUC, sensitivity, and specificity of LASSO, RF, and XGBoost all reached 1.00. 2-Hydroxypalmitic acid, L-carnitine, and cyclic AMP were found to be potential targets and specific biomarkers for pSS.
Conclusion: The combination of machine learning and metabolomics can accurately distinguish between patients with pSS and HCs. 2-Hydroxypalmitic acid, L-carnitine, and cyclic AMP are potential targets and biomarkers for pSS.