AUTHOR=Liu XiaoHuan , Zhang Weiyue , Zhang Qiao , Chen Long , Zeng TianShu , Zhang JiaoYue , Min Jie , Tian ShengHua , Zhang Hao , Huang Hantao , Wang Ping , Hu Xiang , Chen LuLu TITLE=Development and validation of a machine learning-augmented algorithm for diabetes screening in community and primary care settings: A population-based study JOURNAL=Frontiers in Endocrinology VOLUME=13 YEAR=2022 URL=https://www.frontiersin.org/journals/endocrinology/articles/10.3389/fendo.2022.1043919 DOI=10.3389/fendo.2022.1043919 ISSN=1664-2392 ABSTRACT=Background

Opportunely screening for diabetes is crucial to reduce its related morbidity, mortality, and socioeconomic burden. Machine learning (ML) has excellent capability to maximize predictive accuracy. We aim to develop ML-augmented models for diabetes screening in community and primary care settings.

Methods

8425 participants were involved from a population-based study in Hubei, China since 2011. The dataset was split into a development set and a testing set. Seven different ML algorithms were compared to generate predictive models. Non-laboratory features were employed in the ML model for community settings, and laboratory test features were further introduced in the ML+lab models for primary care. The area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (auPR), and the average detection costs per participant of these models were compared with their counterparts based on the New China Diabetes Risk Score (NCDRS) currently recommended for diabetes screening.

Results

The AUC and auPR of the ML model were 0·697and 0·303 in the testing set, seemingly outperforming those of NCDRS by 10·99% and 64·67%, respectively. The average detection cost of the ML model was 12·81% lower than that of NCDRS with the same sensitivity (0·72). Moreover, the average detection cost of the ML+FPG model is the lowest among the ML+lab models and less than that of the ML model and NCDRS+FPG model.

Conclusion

The ML model and the ML+FPG model achieved higher predictive accuracy and lower detection costs than their counterpart based on NCDRS. Thus, the ML-augmented algorithm is potential to be employed for diabetes screening in community and primary care settings.