Machine learning-based prediction of cerebral hemorrhage in patients with hemodialysis: A multicenter, retrospective study

Li, Fengda; Chen, Anmin; Li, Zeyi; Gu, Longyuan; Pan, Qiyang; Wang, Pan; Fan, Yuechao; Feng, Jinhong

doi:10.3389/fneur.2023.1139096

ORIGINAL RESEARCH article

Front. Neurol., 03 April 2023

Sec. Stroke

Volume 14 - 2023 | https://doi.org/10.3389/fneur.2023.1139096

This article is part of the Research Topic: Machine learning in data analysis for stroke/endovascular therapy

Fengda Li¹†, Anmin Chen²†, Zeyi Li³, Longyuan Gu⁴, Qiyang Pan⁵, Pan Wang³, Yuechao Fan⁴*, Jinhong Feng⁶*

¹Department of Neurosurgery, Changshu Hospital Affiliated to Soochow University, Changshu, China
²Department of Nephrology, The First People's Hospital of Jintan, Changzhou, China
³School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
⁴Department of Neurosurgery, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
⁵Faculty of Informatics, Università della Svizzera italiana, Lugano, Ticino, Switzerland
⁶Department of Nephrology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China

Background: Intracerebral hemorrhage (ICH) is one of the most serious complications in patients with chronic kidney disease undergoing long-term hemodialysis. It has high mortality and disability rates and imposes a serious economic burden on the patient's family and society. An early prediction of ICH is essential for timely intervention and improving prognosis. This study aims to build an interpretable machine learning-based model to predict the risk of ICH in patients undergoing hemodialysis.

Methods: The clinical data of 393 patients with end-stage kidney disease undergoing hemodialysis at three different centers between August 2014 and August 2022 were retrospectively analyzed. A total of 70% of the samples were randomly selected as the training set, and the remaining 30% were used as the validation set. Five machine learning (ML) algorithms, namely, support vector machine (SVM), extreme gradient boosting (XGB), complement Naïve Bayes (CNB), K-nearest neighbor (KNN), and logistic regression (LR), were used to develop a model to predict the risk of ICH in patients with uremia undergoing long-term hemodialysis. In addition, the area under the curve (AUC) values were evaluated to compare the performance of each algorithmic model. Global and individual interpretive analyses of the model were performed using importance ranking and Shapley additive explanations (SHAP) in the training set.

Results: Of the 393 patients included in the study, 73 developed spontaneous ICH while undergoing hemodialysis. The AUCs of the SVM, CNB, KNN, LR, and XGB models in the validation dataset were 0.725 (95% CI: 0.610 ~ 0.841), 0.797 (95% CI: 0.690 ~ 0.905), 0.675 (95% CI: 0.560 ~ 0.789), 0.922 (95% CI: 0.862 ~ 0.981), and 0.979 (95% CI: 0.953 ~ 1.000), respectively. Therefore, the XGB model had the best performance among the five algorithms. SHAP analysis revealed that the levels of LDL, HDL, CRP, and HGB and pre-hemodialysis blood pressure were the most important factors.

Conclusion: The XGB model developed in this study can efficiently predict the risk of a cerebral hemorrhage in patients with uremia undergoing long-term hemodialysis and can help clinicians to make more individualized and rational clinical decisions. ICH events in patients undergoing maintenance hemodialysis (MHD) are associated with serum LDL, HDL, CRP, HGB, and pre-hemodialysis SBP levels.

1. Introduction

Maintenance hemodialysis (MHD) is the primary renal replacement therapy for patients with uremia (1). Intracerebral hemorrhage (ICH), defined as non-traumatic hemorrhage in the brain parenchyma with or without extension into the ventricles, accounts for 10–15% of all stroke cases and is an important cause of disability and death globally (2). ICH is one of the most serious complications among patients undergoing MHD. Various factors have an important impact on the occurrence and development of ICH. Recent studies have attempted to identify relevant risk factors, and lipid metabolism and inflammatory responses have been reported as important factors regulating the progression of ICH and subsequent brain injury and brain function repair. Despite the continuous development of hemodialysis technology and the gradual improvement of nursing care, the risk of a cerebral hemorrhage in patients undergoing MHD is approximately six times higher than that in healthy individuals (3), and the mortality rate is as high as 41–47% (4). Most patients require admission to the intensive care unit (ICU) for monitoring and treatment, which imposes a serious economic burden on the family and society.

ICH often has no identifiable warning signs or symptoms. Although optimal strategies for the medical and surgical management of ICH have been investigated, survival and functional outcomes have not been significantly improved (5). Therefore, establishing risk prediction models to identify high-risk patients undergoing MHD is important for the early implementation of targeted interventions. To date, only a few studies have attempted to develop such models.

Machine learning (ML), an artificial intelligence method, uses computers to statistically learn from datasets and build corresponding models to identify relationships between various factors. In the field of medicine, ML is increasingly used to overcome possible obstacles in clinical practice (6, 7). In recent years, although ML has been used to analyze clinical data to predict the complications and adverse outcomes of critical illnesses (8–10), few efforts have been made to develop strategies for predicting the prognosis of patients with uremia undergoing dialysis, especially for predicting the risk of cerebral hemorrhage, a serious complication of dialysis. ML has shown good performance in previous studies; however, because of its “black box” nature, the effect of each feature on the final result remains unknown, and it is difficult to explain the factors that lead to a given prediction. This lack of interpretability limits the widespread application of ML methods in medical research (11, 12). Shapley additive explanation (SHAP) is a method inspired by classical game theory that assigns each feature a contribution value for a given prediction, which makes it possible to evaluate the contribution of each feature to the output of an ML model and to balance the accuracy and interpretability of the model (13).

To analyze complex variables that may be related to a cerebral hemorrhage after regular hemodialysis, we integrated the demographic data, laboratory test results, hemodialysis indicators, and other information of patients to construct a model for predicting the risk of a cerebral hemorrhage. To make the model more applicable for the diagnosis of chronic kidney disease with intracerebral hemorrhage, overcome the “black box” nature of ML, and explore the relationship between each feature and its clinical significance, we used the extreme gradient boosting (XGBoost) algorithm to develop the model (14). SHAP was used to provide a more intuitive global and local explanation of the model to understand the prediction of the model and improve the clinical understanding of the risk of a cerebral hemorrhage in patients with hemodialysis.

2. Materials and methods

2.1. Study population and data source

Patients with end-stage kidney disease undergoing hemodialysis from August 2014 to August 2022 at the Affiliated Hospital of Xuzhou Medical University, Xuzhou Central Hospital, and the Second Affiliated Hospital of Xuzhou Medical University were recruited for the study. According to the occurrence of ICH, the patients were divided into ICH and non-ICH groups.

2.2. Data collection

The inclusion criteria were as follows: (a) patients diagnosed with uremia according to chronic kidney disease (CKD) staging and recommendations or the Kidney Disease Outcomes Quality Initiative (KDOQI) guidelines formulated by the National Kidney Foundation, that is, patients with an estimated glomerular filtration rate (eGFR) of < 15 ml/(min·1.73 m²) diagnosed with CKD stage 5, which is the uremia stage (15); (b) patients aged ≥18 years receiving regular hemodialysis, with a dialysis age of ≥3 months and a dialysis schedule of three sessions per week and 4 h per session; and (c) for the ICH group, ICH confirmed via a CT examination of the head. The exclusion criteria were as follows: (a) patients with severe failure of the heart, lung, or other organs, blood system diseases, autoimmune diseases, or malignant tumors; (b) patients with primary subarachnoid hemorrhage or secondary cerebral hemorrhage, such as that caused by trauma, intracranial tumors, hemorrhage after an ischemic stroke, or severe coagulation dysfunction; (c) patients who had taken antiplatelet drugs, hormones, immunosuppressants, or antibacterial agents within the past month; and (d) patients with missing clinical data. Based on the diagnosis and the inclusion and exclusion criteria, 393 patients with end-stage kidney disease undergoing long-term hemodialysis were included. Of these 393 patients, 73 were assigned to the ICH group and 320 to the non-ICH group. Because this study had a retrospective design, it posed no safety risk to patients. The present study was approved by the Ethics Committee of the Affiliated Hospital of Xuzhou Medical University.

2.3. Inclusion of observed variables

The clinical data of patients were collected with reference to clinical experience, reported literature, and medical records in the electronic medical record systems of the three centers. Data regarding the following five aspects were collected: (1) demographic data (sex and age); (2) vascular risk factors (hypertension, diabetes, polycystic kidney disease, and duration of dialysis); (3) baseline blood pressure (systolic blood pressure [SBP] and diastolic blood pressure [DBP] before and after dialysis); (4) treatment during hemodialysis (including anticoagulant dosage, dialysis access, and blood flow velocity); and (5) laboratory tests (white blood cells [WBCs], platelets [PLTs], hemoglobin [HGB], neutrophils [Nes], lymphocytes [Lys], hematocrit [HCT], C-reactive protein [CRP], neutrophil-to-lymphocyte ratio [NLR], platelet-to-lymphocyte ratio [PLR], alanine aminotransferase [ALT], aspartate aminotransferase [AST], serum total protein [TP], serum albumin [ALB], blood urea nitrogen [BUN], serum creatinine [Scr], cystatin C [CysC], eGFR, uric acid [UA], triglyceride [TG], total cholesterol [TC], low-density lipoprotein [LDL], high-density lipoprotein [HDL], blood potassium [K], blood sodium [Na], blood calcium [Ca], calcium–phosphorus product, and blood phosphorus [P]).

2.4. Selection of machine learning models

Before constructing the ML models, the original clinical data were normalized. Normalization can speed up the convergence of gradient descent toward the optimal solution and can improve the accuracy of algorithms that rely on Euclidean distance. In this study, the min–max normalization method was used to rescale the feature values of the clinical data to the range [0, 1].
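The min–max rescaling described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' preprocessing code; the function names and the blood pressure readings are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def min_max_denormalize(scaled, lo, hi):
    """Map a normalized value back to the original units."""
    return lo + scaled * (hi - lo)


# Hypothetical pre-dialysis SBP readings in mmHg (illustrative values only).
sbp = [120.0, 135.0, 150.0, 180.0]
sbp_scaled = min_max_normalize(sbp)
```

The inverse mapping is what allows a cutoff found on the normalized scale to be reported in clinical units, provided the minimum and maximum of the original feature are known.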

Approximately 70% of the samples in the dataset were randomly selected as the training set, whereas the remaining 30% of the samples were used as the validation set. The dataset is represented as D = {(x_i, y_i), i = 1, 2, …, N}, where x_i = [x_i1, x_i2, x_i3, …, x_ip] is a row vector of p real-valued input variables (or features), and y_i ∈ {0, 1} is the binary output label. The task in hand was a binary classification problem, that is, the generation of a model (y = f[x]) in the training set. The model was subsequently verified in the validation set by predicting $\hat{y}_{k} = f(x_{k})$. The predicted output $\hat{y}_{k}$ should be close to the actual output y_k. All models were tested using Python.
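A random 70/30 split of this kind could look like the sketch below. It assumes a simple shuffle without stratification and a fixed seed for reproducibility; neither detail is stated in the text, and the function name is hypothetical. The class counts (73 ICH, 320 non-ICH) are the ones reported in this study.

```python
import random


def split_train_validation(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into two parts."""
    rng = random.Random(seed)  # the seed is an assumption, for reproducibility
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 393 patients: 73 with ICH (label 1) and 320 without (label 0).
labels = [1] * 73 + [0] * 320
dataset = list(enumerate(labels))  # (patient index, label) pairs
train, validation = split_train_validation(dataset)
```

With imbalanced classes such as these, a stratified split that preserves the ICH proportion in both parts is a common alternative.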

We applied five ML algorithms to model the data: logistic regression (LR), support vector machine (SVM), K-nearest neighbor (KNN), complement Naive Bayes (CNB), and XGBoost. To ensure that the training samples used across the models were consistent, we assessed the generalization performance of each model over multiple training sessions using a resampling training/validation mechanism. The XGBoost (version 1.2.1), lightGBM (version 3.2.1), and sklearn (version 0.22.1) packages were used for developing the ML models. For the RF algorithm, “ntree” was set to 100, and “mtree” was set to 3. To avoid overfitting and enhance interpretability, the maximum tree depth was set to eight in the XGBoost algorithm. In addition, to evaluate the predictive performance of the ML models, accuracy, precision, sensitivity, specificity, F1 score, and the area under the receiver operating characteristic (ROC) curve (AUC) were evaluated.
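For reference, the threshold-based metrics listed above all derive from the four cells of the confusion matrix. The sketch below shows the standard definitions in plain Python; the labels and predictions are made-up toy values, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute standard binary-classification metrics from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # also called recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Toy labels (1 = ICH, 0 = no ICH) and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case there are two true positives, one false negative, one false positive, and four true negatives.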

SHAP is available as a model-interpretation package for Python. To understand the output of the model, the SHAP package was used to interpret and rank the features of the trained model and to examine the contribution of each feature to the model's predictions.
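The idea behind a Shapley value can be illustrated by brute force on a tiny example. The sketch below averages each feature's marginal contribution over all feature orderings, replacing "absent" features with baseline values. It is not how the SHAP package computes values for tree models, which relies on faster model-specific algorithms; the scoring function, its coefficients, and the feature values are all hypothetical.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    n = len(instance)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)  # start with every feature "absent"
        previous_output = model(current)
        for feature in order:
            current[feature] = instance[feature]  # switch the feature on
            output = model(current)
            contributions[feature] += output - previous_output
            previous_output = output
    return [c / len(orderings) for c in contributions]


def toy_risk_score(x):
    """Hypothetical score (not the study's model); x = [crp, sbp, ldl] in [0, 1]."""
    crp, sbp, ldl = x
    return 0.4 + 0.2 * crp + 0.1 * sbp - 0.3 * ldl + 0.1 * crp * sbp


instance = [1.0, 1.0, 0.0]  # high CRP, high SBP, low LDL
baseline = [0.0, 0.0, 1.0]  # reference patient
phi = shapley_values(toy_risk_score, instance, baseline)
```

The values sum to the difference between the score for the instance and the score for the baseline, which is the additivity property that force plots rely on. The interaction term between CRP and SBP is shared equally between those two features.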

2.5. Statistical analysis

The R software (version 4.02) was used for data processing and statistical analysis. Categorical variables were expressed as counts and percentages and were compared using Fisher's exact test or the chi-square test. For continuous variables, the Shapiro–Wilk test was first used to determine whether the variables conformed to a normal distribution. Normally distributed data were expressed as mean ± standard deviation and compared using the independent-samples t-test. Non-normally distributed data were expressed as the median (first and third quartiles) and compared using the Mann–Whitney U-test. P < 0.05 was considered statistically significant.
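As an illustration of the categorical comparison, the sketch below computes a Pearson chi-square test for a 2×2 table without continuity correction. The study used R for this step; Python is used here only for consistency with the other examples, and the counts are hypothetical rather than taken from Table 1.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts (not from the study):
# rows = risk factor present / absent, columns = event / no event.
toy_table = [[12, 8], [10, 30]]
chi2, p = chi_square_2x2(toy_table)
significant = p < 0.05
```

When expected cell counts are small, Fisher's exact test is the usual substitute, as noted in the text.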

3. Results

3.1. Baseline patient characteristics

A total of 393 patients were included in this study, and the baseline characteristics of the ICH and non-ICH groups are shown in Table 1. In terms of demographic characteristics, no significant differences were observed in sex or age between the two groups. Among underlying diseases, a history of diabetes and of polycystic kidney disease differed significantly between the groups. Among dialysis indicators, the blood flow rate and the SBP before and after dialysis differed significantly. Laboratory indices, such as the levels of CRP, LDL, and HDL, were also significantly different between the two groups. We further constructed a heat map of Spearman correlation coefficients to visualize the correlations among the variables that differed between groups (Figure 1).

Table 1. Baseline features of patients.

Figure 1. Heat map of the correlation of patient's clinical features.
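Each cell of such a heat map is a Spearman coefficient, which is the Pearson correlation of the rank-transformed variables. A minimal standard-library sketch follows; the function names and the data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monotone but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 9, 20]
rho = spearman(x, y)
```

Because only ranks matter, any strictly increasing relationship gives a coefficient of 1, which is why Spearman correlation suits skewed laboratory values.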

3.2. Comparison of the predictive performance of all models

Five ML algorithms were used to construct predictive models. The training set was used to create and train the models. All ML models were then tested in the validation set, and their accuracy, precision, sensitivity, specificity, and F1 score were compared. The XGBoost model had the highest accuracy, precision, sensitivity, specificity, and F1 score (0.939, 0.949, 0.932, 0.952, and 0.938, respectively) (Table 2). Figure 2 shows the ROC curves demonstrating the predictive performance of all models. The XGBoost model (AUC = 0.979; 95% CI, 0.953–1.000) demonstrated the best performance in the validation set. Therefore, the XGBoost model can be considered an ideal model for predicting the risk of ICH in patients undergoing MHD.

Table 2. Comparison of the predictive performance of five machine learning algorithms in the validation set.

Figure 2. ROC curve demonstrating the performance of ML models in predicting ICH in patients undergoing MHD.
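The AUC summarizing each ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; the scores are toy values, not model outputs from this study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = ICH) and predicted risk scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(y_true, scores)
```

Here 11 of the 12 positive/negative pairs are ordered correctly. The pairwise loop is quadratic in sample size, which is acceptable for a validation set of this scale.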

3.3. Explainable analysis of overall features

XGBoost was used to rank the importance of features. Figure 3 shows the ranking of the most important variables in the model. The top five variables were LDL, HDL, CRP, pre-dialysis SBP, and HGB. The interpretation of the impact of these features is roughly consistent with that reported in previous studies and with clinician perception.

Figure 3. Characteristic ranking of important variables in the model.

Figure 4 shows a characteristic density scatter plot, which demonstrates the effects of the main features in the dataset on the predictive performance of the model. The abscissa represents the SHAP value, which represents the contribution of a feature in the model to the overall output. SHAP values < 0, equal to 0, and >0 represent negative, no, and positive contributions, respectively. The left ordinate represents the features sorted by importance. The color of the right ordinate, from blue to red, represents the feature values from low to high. Lower LDL levels, higher CRP levels, lower HDL levels, lower HGB levels, and higher pre-dialysis SBP have higher SHAP values, indicating a higher likelihood of developing ICH.

Figure 4. SHAP summary plot of the XGBoost model, demonstrating the relationship between each feature in the optimal model (XGBoost) and its SHAP values. The higher the SHAP value of each feature, the higher the risk of ICH in patients undergoing MHD.

3.4. Explainable analysis of individual features

As shown in Figure 5, the SHAP dependence plot demonstrates the effects of a single feature on the final output of the XGBoost model and can be used to select the most significant features of the model. CRP levels and pre-dialysis SBP were positively correlated with SHAP values, that is, the larger the values, the higher the risk of bleeding. However, the levels of LDL, HDL, and HGB were negatively correlated with SHAP values, indicating that the smaller the values, the higher the risk of bleeding (Figure 5A). We selected LDL as a feature to determine the effects of HDL. The red and blue dots represent high and low HDL levels, respectively. In the normalized data, when LDL was below the critical value of 0.3, the SHAP value of LDL was always greater than zero regardless of HDL levels. In addition, when HDL was greater than the critical value, the SHAP value of HDL was always less than zero (Figure 5B). The normalized LDL cutoff corresponds to 1.572 mmol/L in actual clinical practice. Above this threshold, the predicted likelihood of ICH decreases; below it, the predicted likelihood increases. In addition, the values of all main features are distributed differently in different ranges and vary greatly in some regions. It remains unclear whether these patterns have specific significance, although they may have important implications for clinical outcomes. The feature dependence plot provides information within a given range, showing the trend of possible results. However, it is noteworthy that the plot suggests correlation and not causality. Therefore, it is necessary to integrate the results with clinical experience and specific conditions to determine whether they can be used to develop adjunctive intervention strategies.

Figure 5. SHAP dependence plot of main indicators. (A) The SHAP dependence plot demonstrates the effects of a single feature on the final output of the XGBoost model. (B) The SHAP dependence plot selects LDL as a feature to determine the effects of HDL.

In addition, SHAP can be used to analyze the factors influencing cerebral hemorrhage in each patient. Figure 6 shows the interpretation of the XGBoost model for the prediction of two cases. Specifically, the arrows show the effect of each factor on the prediction. Features that increase the predicted risk of developing ICH are shown in red, and those that reduce it are shown in blue. The stripe length of each feature indicates the importance of the feature when making predictions. The longer the stripe, the greater the contribution of the feature to the prediction. The contributions of all factors were then combined to obtain the prediction score for each patient. Figure 6A demonstrates the contribution of different features to prediction in a patient correctly predicted to have ICH. CRP, LDL, and HGB had the largest contributions (red), indicating that they were the main drivers of the predicted risk of cerebral hemorrhage in this patient. The second patient was accurately predicted to have no ICH (Figure 6B), with LDL, CRP, and pre-dialysis SBP identified as protective factors. Although some risk factors were present, the patient had no cerebral hemorrhage.

Figure 6. Interpretation of the SHAP model for the prediction of two cases. The red stripe feature is conducive to the prediction of a cerebral hemorrhage in patients undergoing dialysis, whereas the blue stripe feature is conducive to the prediction of no cerebral hemorrhage. (A) The contribution of different features to prediction in a patient correctly predicted to have ICH. (B) The contribution of different features to prediction in a patient correctly predicted to have no ICH.

4. Discussion

Intracerebral hemorrhage is characterized by a high rate of disability and death, which greatly increases the economic burden on families and society, so it is essential to investigate the factors influencing ICH events in patients undergoing MHD. Many scholars have identified the risk factors of ICH and hematoma expansion in patients undergoing MHD and screened variables, such as serum calcium, serum creatinine, and antiplatelet agents, via multivariate logistic regression (16, 17). Unlike many previous studies, the present study used ML algorithms to screen for variables, and to the best of our knowledge, this is the first study to report the development of an ML-based predictive model to evaluate the probability of concurrent ICH events in patients undergoing MHD. In addition, we applied four other mainstream machine learning models, namely, LR, SVM, KNN, and CNB, to compare their predictive performance with that of the XGBoost algorithm.

XGBoost is a boosting algorithm based on tree models. Since its introduction in 2016, it has been used to deal with non-linear relationships and complex interactions between variables owing to its higher prediction accuracy and faster operation speed (14). The XGBoost algorithm has been widely used in the medical field, especially for the prediction of critical illnesses. Tseng et al. used the combination of RF and XGBoost to predict the risk of acute kidney injury after cardiac surgery, and the final AUC value was 0.843 (8). Pan et al. used XGBoost to predict the mortality of critically ill patients with COVID-19 admitted to the ICU. The AUC values of the training and validation sets were 0.86 and 0.92, respectively (18). The findings of the present study suggest that XGBoost can effectively improve the prediction of ICH in patients undergoing MHD. In this study, the predictors considered to be related to ICH in actual clinical practice and literature were included; patient information was collected as comprehensively as possible; abnormal indicators of various metabolic disorders were refined; ML algorithms were used to analyze variables; and finally, the ROC-AUC value of the optimal model (XGBoost) was as high as 0.979 (Figure 2), with the highest prediction accuracy and better performance than the other mainstream machine learning models.

In addition, in this study, we used SHAP to interpret the results of ML models. Emphasis is placed on features that have the greatest impact on outcome measures, thus helping clinicians to realize the rationale behind predicted outcomes early enough to initiate prompt intervention. The results showed that changes in LDL, HDL, CRP, SBP, and HGB levels were the main predictors of ICH in patients undergoing MHD, which was consistent with clinical studies.

Lipids are indispensable components of the human body. To date, numerous studies have investigated the relationship between lipid metabolism and ICH. Lipid metabolism disorders in patients undergoing long-term hemodialysis are closely related to the occurrence of a cerebral hemorrhage (19, 20), which is consistent with the results of this study. The Genetic and Environmental Risk Factors for Hemorrhagic Stroke (GERFHS) study reported a 33% reduction in the risk of a cerebral hemorrhage in patients with higher cholesterol levels, and a retrospective study (21) reported a significantly increased risk of hemorrhagic stroke in patients with lower HDL levels. One possible mechanism is that lower LDL-C levels are closely associated with an increased number of cerebral microbleeds (CMBs) (22). Lobar CMBs are mainly associated with cerebral amyloid angiopathy (CAA) (23). The ε4 allele of apolipoprotein E (APOE) is a known genetic risk factor for CAA. Genetic studies have shown a higher rate of reduction in LDL-C concentrations in carriers of the APOE ε4 genotype (24). Recent studies have also shown that higher LDL-C genetic risk scores are associated with a higher prevalence of multiple lobar microbleeds (25). CMBs are independent risk factors for ICH and strong predictors of future cerebral hemorrhage (26). In addition, cholesterol is involved in physiological processes such as vascular wall construction. Extremely low cholesterol levels may destroy the integrity of intracranial vascular endothelial cells, aggravate vascular endothelial damage, and increase the risk of cerebral hemorrhage (27). HDL is considered a protective factor against atherosclerosis (28), and low HDL levels can aggravate the progression of atherosclerosis, thus increasing the risk of a cerebral hemorrhage.

CRP is an important part of the immune system and one of the signs of acute inflammation (29). In this study, CRP levels were significantly different between the ICH and non-ICH groups, and CRP was highly correlated with ICH, which is consistent with the findings of previous studies (30–32). Patients undergoing MHD often have comorbid inflammation, which may lead to endothelial damage and atherosclerosis (33, 34), thereby increasing the morbidity and mortality of cerebrovascular diseases (35). Genetic studies have shown that the significantly reduced expression of haplotype H5 in the CRP genotype is closely associated with hemorrhagic stroke (36). CRP induces endothelial dysfunction by directly destroying the blood–brain barrier (BBB) and induces monocytes to release proinflammatory cytokines, leading to increased vascular permeability and cerebral hemorrhage (37, 38).

According to the model results of this study, the SBP before hemodialysis in the cerebral hemorrhage group was higher than that in the control group, which is consistent with the conclusion of previous studies that hypertension is a risk factor for cerebral hemorrhage in patients undergoing MHD. Hypertension is a known traditional risk factor for ICH (39). In patients with chronic kidney disease, renal function and excretion are impaired, blood volume is increased, the renin–angiotensin–aldosterone system is activated in a feedback manner, and water and sodium retention is aggravated. In this study, the higher SBP before dialysis in patients with ICH may be related to inadequate dialysis. In addition, during hemodialysis, the greater hemodynamic changes and the removal of antihypertensive drugs can aggravate hypertension, resulting in increased pressure on cerebral arteries. When the pressure on the vascular wall exceeds what the wall can withstand, the cerebral vessels rupture and bleed, causing cerebral hemorrhage.

Patients undergoing MHD are predisposed to anemia owing to factors such as reduced erythropoietin synthesis (40). HGB is the main indicator reflecting the anemic status of humans. Recent studies have reported that the HGB level of patients with MHD is negatively correlated with the risk of a cerebral hemorrhage (41–43), which is consistent with the results of this study. The underlying mechanisms may include vasoconstriction (44), platelet aggregation (45, 46), and cytotoxic reaction caused by chronic hypoxia (44), leading to brain dysfunction or damage.

This study has some limitations. First, although this study had a multicenter design, it only includes patients from three hospitals in Xuzhou, China. In future studies, we will include datasets from different regions and hospitals for external testing to improve the generalization ability of the model. Second, the number of patients with and without ICH was not well-balanced, which may have led to impaired prediction. Considering that deep learning has been widely used in the medical community in recent years, we will use deep learning models to incorporate a wider range of data in future studies. Overall, compared with traditional models, the prediction model developed in this study contains more information and has better predictive accuracy. In addition, the visualization of results based on SHAP can, to a great extent, alleviate the “black box” problem.

5. Conclusion

A predictive ML model was developed based on XGBoost, and SHAP was used to explain the clinical significance of each risk factor in predicting the occurrence of ICH in patients undergoing MHD. ICH events in patients undergoing MHD are associated with serum LDL, HDL, CRP, HGB, and pre-hemodialysis SBP levels. The combination of the XGBoost algorithm and SHAP can provide a clear explanation for risk prediction, which has great application value in future clinical research. This combination can help clinicians to implement early clinical interventions, provide more comprehensive information for the long-term management of patients undergoing MHD, and prevent and reduce the risk of ICH.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving human participants were reviewed and approved by the Ethics Committee of the Affiliated Hospital of Xuzhou Medical University. Written informed consent from the patients/participants or patients/participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions

FL and AC conceptualized the study, outlined the study design, collected data, analyzed and interpreted results, and wrote the manuscript. ZL and QP preprocessed input data, built machine learning models, analyzed data, and wrote the manuscript. LG collected data and preprocessed input data. YF, JF, and PW helped to adjust the ideas of the manuscript, suggested changes, and revised the manuscript. All authors agreed to take responsibility for their contributions and read and approved the final manuscript.

Funding

The research was sponsored by the National Natural Science Foundation (General Program) Grant No. 61972211, China and the National Key Research and Development Project Grant No. 2020YFB1804700, China.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Himmelfarb J, Vanholder R, Mehrotra R, Tonelli M. The current and future landscape of dialysis. Nat Rev Nephrol. (2020) 16:573–85. doi: 10.1038/s41581-020-0315-4

2. de Oliveira Manoel AL. Surgery for spontaneous intracerebral hemorrhage. Crit Care. (2020) 24:45. doi: 10.1186/s13054-020-2749-2

3. Wang HH, Hung SY, Sung JM, Hung KY, Wang JD. Risk of stroke in long-term dialysis patients compared with the general population. Am J Kidney Dis. (2014) 63:604–11. doi: 10.1053/j.ajkd.2013.10.013

4. Wyld M, Webster AC. Chronic kidney disease is a risk factor for stroke. J Stroke Cerebrovasc Dis. (2021) 30:105730. doi: 10.1016/j.jstrokecerebrovasdis.2021.105730

5. Magid-Bernstein J, Girard R, Polster S, Srinath A, Romanos S, Awad IA, et al. Cerebral hemorrhage: pathophysiology, treatment, and future directions. Circ Res. (2022) 130:1204–29. doi: 10.1161/CIRCRESAHA.121.319949

6. Van Calster B, Wynants L. Machine learning in medicine. N Engl J Med. (2019) 380:2588. doi: 10.1056/NEJMc1906060

7. Handelman GS, Kok HK, Chandra RV, Razavi AH, Lee MJ, Asadi H. eDoctor: machine learning and the future of medicine. J Intern Med. (2018) 284:603–19. doi: 10.1111/joim.12822

8. Tseng PY, Chen YT, Wang CH, Chiu KM, Peng YS, Hsu SP, et al. Prediction of the development of acute kidney injury following cardiac surgery by machine learning. Crit Care. (2020) 24:478. doi: 10.1186/s13054-020-03179-9

9. Li W, Dong Y, Liu W, Tang Z, Sun C, Lowe S, et al. A deep belief network-based clinical decision system for patients with osteosarcoma. Front Immunol. (2022) 13:1003347. doi: 10.3389/fimmu.2022.1003347

10. Peng J, Zou K, Zhou M, Teng Y, Zhu X, Zhang F, et al. An explainable artificial intelligence framework for the deterioration risk prediction of hepatitis patients. J Med Syst. (2021) 45:61. doi: 10.1007/s10916-021-01736-5

11. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of machine learning in medicine. JAMA. (2017) 318:517–8. doi: 10.1001/jama.2017.7797

12. Lundberg SM, Nair B, Vavilala MS, Horibe M, Eisses MJ, Adams T, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomed Eng. (2018) 2:749–60. doi: 10.1038/s41551-018-0304-0

13. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. (2017) 30.

14. Chen T, Guestrin C. “XGBoost: a scalable tree boosting system” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. (2016) p. 785–794. doi: 10.1145/2939672.2939785

15. National Kidney Foundation. KDOQI clinical practice guideline for hemodialysis adequacy: 2015 update. Am J Kidney Dis. (2015) 66:884–930. doi: 10.1053/j.ajkd.2015.07.015

16. Ozelsancak R, Micozkadioglu H, Torun D, Tekkarismaz N. Cerebrovascular events in hemodialysis patients; a retrospective observational study. BMC Nephrol. (2019) 20:466. doi: 10.1186/s12882-019-1629-y

17. Kitamura M, Tateishi Y, Sato S, Kitamura S, Ota Y, Muta K, et al. Association between serum calcium levels and prognosis, hematoma volume, and onset of cerebral hemorrhage in patients undergoing hemodialysis. BMC Nephrol. (2019) 20:210. doi: 10.1186/s12882-019-1400-4

18. Pan P, Li Y, Xiao Y, Han B, Su L, Su M, et al. Prognostic assessment of COVID-19 in the intensive care unit by machine learning methods: model development and validation. J Med Internet Res. (2020) 22:e23128. doi: 10.2196/23128

19. Ferro CJ, Mark PB, Kanbay M, Sarafidis P, Heine GH, Rossignol P, et al. Lipid management in patients with chronic kidney disease. Nat Rev Nephrol. (2018) 14:727–49. doi: 10.1038/s41581-018-0072-9

20. Amarenco P, Bogousslavsky J, Callahan A 3rd, Goldstein LB, Hennerici M, Rudolph AE, et al. High-dose atorvastatin after stroke or transient ischemic attack. N Engl J Med. (2006) 355:549–59. doi: 10.1016/j.jvs.2006.10.008

21. Shen Y, Shi L, Nauman E, Katzmarzyk PT, Price-Haywood EG, Bazzano AN, et al. Inverse association between HDL (High-Density Lipoprotein) cholesterol and stroke risk among patients with type 2 diabetes mellitus. Stroke. (2019) 50:291–7. doi: 10.1161/STROKEAHA.118.023682

22. Ma C, Gurol ME, Huang Z, Lichtenstein AH, Wang X, Wang Y, et al. Low-density lipoprotein cholesterol and risk of intracerebral hemorrhage: a prospective study. Neurology. (2019) 93:e445–57. doi: 10.1212/WNL.0000000000007853

23. Schrag M, Kirshner H. Management of intracerebral hemorrhage: JACC focus seminar. J Am Coll Cardiol. (2020) 75:1819–31. doi: 10.1016/j.jacc.2019.10.066

24. Phuah CL, Raffeld MR, Ayres AM, Gurol ME, Viswanathan A, Greenberg SM, et al. APOE polymorphisms influence longitudinal lipid trends preceding intracerebral hemorrhage. Neurol Genet. (2016) 2:e81. doi: 10.1212/NXG.0000000000000081

25. Akoudad S, Ikram MA, Portegies ML, Adams HH, Bos D, Hofman A, et al. Genetic loci for serum lipid fractions and intracerebral hemorrhage. Atherosclerosis. (2016) 246:287–92. doi: 10.1016/j.atherosclerosis.2016.01.024

26. Charidimou A, Shams S, Romero JR, Ding J, Veltkamp R, Horstmann S, et al. Clinical significance of cerebral microbleeds on MRI: a comprehensive meta-analysis of risk of intracerebral hemorrhage, ischemic stroke, mortality, and dementia in cohort studies (v1). Int J Stroke. (2018) 13:454–68. doi: 10.1177/1747493017751931

27. Lyu J, Yang EJ, Shim JS. Cholesterol trafficking: an emerging therapeutic target for angiogenesis and cancer. Cells. (2019) 8:389. doi: 10.3390/cells8050389

28. Ouimet M, Barrett TJ, Fisher EA. HDL and reverse cholesterol transport. Circ Res. (2019) 124:1505–18. doi: 10.1161/CIRCRESAHA.119.312617

29. Pathak A, Agrawal A. Evolution of C-reactive protein. Front Immunol. (2019) 10:943. doi: 10.3389/fimmu.2019.00943

30. Xue Y, Zhang L, Fan Y, Li Q, Jiang Y, Shen C. C-reactive protein gene contributes to the genetic susceptibility of hemorrhagic stroke in men: a case-control study in Chinese Han population. J Mol Neurosci. (2017) 62:395–401. doi: 10.1007/s12031-017-0945-6

31. Löppönen P, Qian C, Tetri S, Juvela S, Huhtakangas J, Bode MK, et al. Predictive value of C-reactive protein for the outcome after primary intracerebral hemorrhage. J Neurosurg. (2014) 121:1374–9. doi: 10.3171/2014.7.JNS132678

32. Bader ER, Pana TA, Barlas RS, Metcalf AK, Potter JF, Myint PK. Elevated inflammatory biomarkers and poor outcomes in intracerebral hemorrhage. J Neurol. (2022) 269:6330–41. doi: 10.1007/s00415-022-11284-8

33. Sá Martins V, Aguiar L, Dias C, Lourenço P, Pinheiro T, Velez B, et al. Predictors of nutritional and inflammation risk in hemodialysis patients. Clin Nutr. (2020) 39:1878–84. doi: 10.1016/j.clnu.2019.07.029

34. Maraj M, Kuśnierz-Cabala B, Dumnicka P, Gala-Bładzińska A, Gawlik K, Pawlica-Gosiewska D, et al. Malnutrition, inflammation, atherosclerosis syndrome (MIA) and diet recommendations among end-stage renal disease patients treated with maintenance hemodialysis. Nutrients. (2018) 10:69. doi: 10.3390/nu10010069

35. Bihl JC, Zhang C, Zhao Y, Xiao X, Ma X, Chen Y, et al. Angiotensin-(1-7) counteracts the effects of Ang II on vascular smooth muscle cells, vascular remodeling and hemorrhagic stroke: role of the NFκB inflammatory pathway. Vascul Pharmacol. (2015) 73:115–23. doi: 10.1016/j.vph.2015.08.007

36. Wang Q, Ding H, Tang JR, Zhang L, Xu YJ, Yan JT, et al. C-reactive protein polymorphisms and genetic susceptibility to ischemic stroke and hemorrhagic stroke in the Chinese Han population. Acta Pharmacol Sin. (2009) 30:291–8. doi: 10.1038/aps.2009.14

37. Sproston NR, Ashworth JJ. Role of C-reactive protein at sites of inflammation and infection. Front Immunol. (2018) 9:754. doi: 10.3389/fimmu.2018.00754

38. Di Napoli M, Slevin M, Popa-Wagner A, Singh P, Lattanzi S, Divani AA. Monomeric C-reactive protein and cerebral hemorrhage: from bench to bedside. Front Immunol. (2018) 9:1921. doi: 10.3389/fimmu.2018.01921

39. Chaudhary N, Pandey AS, Wang X, Xi G. Hemorrhagic stroke-Pathomechanisms of injury and therapeutic options. CNS Neurosci Ther. (2019) 25:1073–4. doi: 10.1111/cns.13225

40. Babitt JL, Lin HY. Mechanisms of anemia in CKD. J Am Soc Nephrol. (2012) 23:1631–4. doi: 10.1681/ASN.2011111078

41. Diedler J, Sykora M, Hahn P, Heerlein K, Schölzke MN, Kellert L, et al. Low hemoglobin is associated with poor functional outcome after non-traumatic, supratentorial intracerebral hemorrhage. Crit Care. (2010) 14:R63. doi: 10.1186/cc8961

42. Kuragano T, Matsumura O, Matsuda A, Hara T, Kiyomoto H, Murata T, et al. Association between hemoglobin variability, serum ferritin levels, and adverse events/mortality in maintenance hemodialysis patients. Kidney Int. (2014) 86:845–54. doi: 10.1038/ki.2014.114

43. Milionis H, Papavasileiou V, Eskandari A, D'Ambrogio-Remillard S, Ntaios G, Michel P. Anemia on admission predicts short- and long-term outcomes in patients with acute ischemic stroke. Int J Stroke. (2015) 10:224–30. doi: 10.1111/ijs.12397

44. Zhang S, Pan X, Wei C, Wang L, Cheng Y, Hu Z, et al. Associations of anemia with outcomes in patients with spontaneous intracerebral hemorrhage: a meta-analysis. Front Neurol. (2019) 10:406. doi: 10.3389/fneur.2019.00406

45. Dauerman HL, Lessard D, Yarzebski J, Gore JM, Goldberg RJ. Bleeding complications in patients with anemia and acute myocardial infarction. Am J Cardiol. (2005) 96:1379–83. doi: 10.1016/j.amjcard.2005.06.088

46. Lisman T, Caldwell SH, Burroughs AK, Northup PG, Senzolo M, Stravitz RT, et al. Hemostasis and thrombosis in patients with liver disease: the ups and downs. J Hepatol. (2010) 53:362–71. doi: 10.1016/j.jhep.2010.01.042

Keywords: hemodialysis, uremia, intracerebral hemorrhage, machine learning, predictive models, Shapley additive explanations

Citation: Li F, Chen A, Li Z, Gu L, Pan Q, Wang P, Fan Y and Feng J (2023) Machine learning-based prediction of cerebral hemorrhage in patients with hemodialysis: A multicenter, retrospective study. Front. Neurol. 14:1139096. doi: 10.3389/fneur.2023.1139096

Received: 06 January 2023; Accepted: 08 March 2023; Published: 03 April 2023.

Edited by:

Daniel Donoho, Children's National Hospital, United States

Reviewed by:

Ping Hu, Second Affiliated Hospital of Nanchang University, China
Hui Jan Tan, National University of Malaysia, Malaysia

Copyright © 2023 Li, Chen, Li, Gu, Pan, Wang, Fan and Feng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yuechao Fan, fyc626@163.com; Jinhong Feng, feng_jh@189.com.cn

^†These authors have contributed equally to this work
