AUTHOR=Pan Hong , Sun Jijia , Luo Xin , Ai Heling , Zeng Jing , Shi Rong , Zhang An TITLE=A risk prediction model for type 2 diabetes mellitus complicated with retinopathy based on machine learning and its application in health management JOURNAL=Frontiers in Medicine VOLUME=10 YEAR=2023 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2023.1136653 DOI=10.3389/fmed.2023.1136653 ISSN=2296-858X ABSTRACT=Objective

This study aimed to establish a risk prediction model for diabetic retinopathy (DR) in the Chinese type 2 diabetes mellitus (T2DM) population using few inspection indicators and to propose suggestions for chronic disease management.

Methods

This multi-centered retrospective cross-sectional study was conducted among 2,385 patients with T2DM. The predictors of the training set were, respectively, screened by extreme gradient boosting (XGBoost), a random forest recursive feature elimination (RF-RFE) algorithm, a backpropagation neural network (BPNN), and a least absolute shrinkage selection operator (LASSO) model. Model I, a prediction model, was established through multivariable logistic regression analysis based on the predictors repeated ≥3 times in the four screening methods. Logistic regression Model II built on the predictive factors in the previously released DR risk study was introduced into our current study to evaluate the model’s effectiveness. Nine evaluation indicators were used to compare the performance of the two prediction models, including the area under the receiver operating characteristic curve (AUROC), accuracy, precision, recall, F1 score, balanced accuracy, calibration curve, Hosmer-Lemeshow test, and Net Reclassification Index (NRI).

Results

When including predictors, such as glycosylated hemoglobin A1c, disease course, postprandial blood glucose, age, systolic blood pressure, and albumin/urine creatinine ratio, multivariable logistic regression Model I demonstrated a better prediction ability than Model II. Model I revealed the highest AUROC (0.703), accuracy (0.796), precision (0.571), recall (0.035), F1 score (0.066), Hosmer-Lemeshow test (0.887), NRI (0.004), and balanced accuracy (0.514).

Conclusion

We have built an accurate DR risk prediction model with fewer indicators for patients with T2DM. It can be used to predict the individualized risk of DR in China effectively. In addition, the model can provide powerful auxiliary technical support for the clinical and health management of patients with diabetes comorbidities.