AUTHOR=Hu Xiaoqi, Hu Xiaolin, Yu Ya, Wang Jia

TITLE=Prediction model for gestational diabetes mellitus using the XG Boost machine learning algorithm

JOURNAL=Frontiers in Endocrinology

VOLUME=Volume 14 - 2023

YEAR=2023

URL=https://www.frontiersin.org/journals/endocrinology/articles/10.3389/fendo.2023.1105062

DOI=10.3389/fendo.2023.1105062

ISSN=1664-2392

ABSTRACT=Abstract
Objective: To develop an extreme gradient boosting (XG Boost) machine learning (ML) model for predicting gestational diabetes mellitus (GDM) and to compare it with a model built using traditional logistic regression (LR).
Methods: A case-control study was performed in pregnant women, with the training set recruited from August to November 2019 and the testing set from August 2020. We applied the XG Boost ML approach to identify the best set of predictors from the candidate variables. The performance of each prediction model was assessed by the area under the receiver operating characteristic (ROC) curve (AUC) for discrimination, and by the Hosmer-Lemeshow (HL) test and calibration plots for calibration. Decision curve analysis (DCA) was used to evaluate the clinical utility of each model.
Results: A total of 735 pregnant women were included in the training set and 190 in the testing set. The XG Boost ML model, which included 20 predictors, achieved an AUC of 0.946 and a predictive accuracy of 0.875. The traditional LR model, which included 4 predictors, achieved an AUC of 0.752 and a predictive accuracy of 0.786. The HL test and calibration plots showed that both models had good calibration. DCA indicated that the XG Boost ML model provided a net benefit compared with treating all women or none of the women.
Conclusions: The XG Boost ML model showed better discrimination than the traditional LR model. Both models were well calibrated.
Keywords: gestational diabetes mellitus, machine learning, prediction model, extreme gradient boosting, logistic regression
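
Note on decision curve analysis: the abstract reports that DCA showed a net benefit over treating all women or none. As a minimal sketch of what that comparison computes, the block below uses the standard net-benefit formula, NB = TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability. The function names, the toy labels, and the toy predicted risks are illustrative assumptions; they are not the study's data or code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged women who actually have the outcome.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged women who do not have the outcome.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold weight the harm of a false positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'treat all women' reference strategy."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = GDM, 0 = no GDM); not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

for pt in (0.1, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    # The 'treat none' strategy always has a net benefit of 0.
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat_all={all_nb:.3f}  treat_none=0.000")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both reference strategies; a decision curve plots these values across a range of thresholds.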