AUTHOR=Li Jialu , Hao Yiwei , Liu Ying , Wu Liang , Liang Hongyuan , Ni Liang , Wang Fang , Wang Sa , Duan Yujiao , Xu Qiuhua , Xiao Jinjing , Yang Di , Gao Guiju , Ding Yi , Gao Chengyu , Xiao Jiang , Zhao Hongxin TITLE=Supervised machine learning algorithms to predict the duration and risk of long-term hospitalization in HIV-infected individuals: a retrospective study JOURNAL=Frontiers in Public Health VOLUME=11 YEAR=2024 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2023.1282324 DOI=10.3389/fpubh.2023.1282324 ISSN=2296-2565 ABSTRACT=Objective

The study aimed to use supervised machine learning models to predict the length and risk of prolonged hospitalization in PLWHs to help physicians timely clinical intervention and avoid waste of health resources.

Methods

Regression models were established based on RF, KNN, SVM, and XGB to predict the length of hospital stay using RMSE, MAE, MAPE, and R2, while classification models were established based on RF, KNN, SVM, NN, and XGB to predict risk of prolonged hospital stay using accuracy, PPV, NPV, specificity, sensitivity, and kappa, and visualization evaluation based on AUROC, AUPRC, calibration curves and decision curves of all models were used for internally validation.

Results

In regression models, XGB model performed best in the internal validation (RMSE = 16.81, MAE = 10.39, MAPE = 0.98, R2 = 0.47) to predict the length of hospital stay, while in classification models, NN model presented good fitting and stable features and performed best in testing sets, with excellent accuracy (0.7623), PPV (0.7853), NPV (0.7092), sensitivity (0.8754), specificity (0.5882), and kappa (0.4672), and further visualization evaluation indicated that the largest AUROC (0.9779), AUPRC (0.773) and well-performed calibration curve and decision curve in the internal validation.

Conclusion

This study showed that XGB model was effective in predicting the length of hospital stay, while NN model was effective in predicting the risk of prolonged hospitalization in PLWH. Based on predictive models, an intelligent medical prediction system may be developed to effectively predict the length of stay and risk of HIV patients according to their medical records, which helped reduce the waste of healthcare resources.