Cytopenia is a frequent complication among HIV-infected patients who require hospitalization, and it can negatively affect their treatment outcomes. A predictive model built with machine learning techniques on electronic medical records can estimate the risk of cytopenia during hospitalization in HIV patients. Such a model is crucial for designing a more individualized and evidence-based treatment strategy for these patients.
This study included HIV patients admitted to Guangxi Chest Hospital between June 2016 and October 2021. We extracted 66 clinical features from the electronic medical records and used them to train five machine learning prediction models: artificial neural network (ANN), adaptive boosting (AdaBoost), k-nearest neighbour (KNN), support vector machine (SVM), and decision tree (DT). The models were tested on 20% of the data, and their performance was evaluated using indicators such as the area under the receiver operating characteristic curve (AUC). The best predictive models were interpreted using Shapley additive explanations (SHAP).
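The model comparison described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes scikit-learn, uses synthetic data in place of the 66 clinical features, assumes the 80% of data not held out for testing is used for training, and picks hyperparameters arbitrarily. The SHAP interpretation step is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 66 clinical features (assumption: NOT the study data).
X, y = make_classification(
    n_samples=1000, n_features=66, n_informative=10, random_state=0
)

# Hold out 20% of the data for testing, stratified on the outcome label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The five model families named in the text; all settings are illustrative.
# Scale-sensitive models (ANN, KNN, SVM) are wrapped with a standard scaler.
models = {
    "ANN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    ),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
    "DT": DecisionTreeClassifier(max_depth=5, random_state=0),
}


def positive_scores(model, features):
    """Return a continuous score for the positive class, as AUC requires."""
    if hasattr(model, "predict_proba"):
        return model.predict_proba(features)[:, 1]
    # Models without probability estimates (here, the SVM) expose a margin.
    return model.decision_function(features)


# Fit each model on the training split and score it on the held-out split.
auc_by_model = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc_by_model[name] = roc_auc_score(y_test, positive_scores(model, X_test))

# Print the models from highest to lowest test AUC.
for name, auc in sorted(auc_by_model.items(), key=lambda item: -item[1]):
    print(f"{name}: AUC = {auc:.3f}")
```

The ranking printed here reflects only the synthetic data and says nothing about the study's results. In practice, the model with the highest test AUC would then be passed to a SHAP explainer to rank feature contributions.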
The ANN models showed better predictive power than the other models. According to the SHAP interpretation of the ANN model, hypoproteinemia and cancer were the most important predictive features of cytopenia in hospitalized HIV patients. In addition, a lower hemoglobin-to-red cell distribution width ratio (HGB/RDW), lower low-density lipoprotein cholesterol (LDL-C) levels, lower CD4+ T cell counts, and lower creatinine clearance (Ccr) levels were associated with an increased risk of cytopenia in these patients.
This study constructed a risk prediction model for cytopenia in HIV patients during hospitalization using machine learning and electronic medical record information. The prediction model is important for the rational management of hospitalized HIV patients and for the design of personalized treatment plans.