- 1Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
- 2Molecular Imaging and Translational Medicine Research Center, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Xiamen University, Xiamen, China
- 3Department of Otorhinolaryngology Head and Neck Surgery, Xi’an Central Hospital, Xi’an, China
- 4Eye Institute of Xiamen University, School of Medicine, Xiamen University, Xiamen, China
Background: Obstructive sleep apnea (OSA) is a globally prevalent disease closely associated with hypertension. To date, no predictive model for OSA-related hypertension has been established. We aimed to use machine learning (ML) to construct a model to analyze risk factors and predict OSA-related hypertension.
Materials and methods: We retrospectively collected the clinical data of OSA patients diagnosed by polysomnography from October 2019 to December 2021 and randomly divided them into training and validation sets. A total of 1,493 OSA patients with 27 variables were included. Independent risk factors for the risk of OSA-related hypertension were screened by the multifactorial logistic regression models. Six ML algorithms, including the logistic regression (LR), the gradient boosting machine (GBM), the extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), bootstrapped aggregating (Bagging), and the multilayer perceptron (MLP), were used to develop the model on the training set. The validation set was used to tune the model hyperparameters to determine the final prediction model. We compared the accuracy and discrimination of the models to identify the best machine learning algorithm for predicting OSA-related hypertension. In addition, a web-based tool was developed to promote its clinical application. We used permutation importance and Shapley additive explanations (SHAP) to determine the importance of the selected features and interpret the ML models.
Results: A total of 18 variables were selected for the models. The GBM model achieved the best discriminatory ability (area under the receiver operating characteristic curve = 0.873, accuracy = 0.885, sensitivity = 0.713), and an online tool was built on this model to help clinicians diagnose OSA-related hypertension. Finally, the SHAP method identified age, family history of hypertension, minimum arterial oxygen saturation, body mass index, and percentage of time of SaO2 < 90% as the five variables contributing most to the diagnosis of OSA-related hypertension.
Conclusion: We established a risk prediction model for OSA-related hypertension patients using the ML method and demonstrated that among the six ML models, the gradient boosting machine model performs best. This prediction model could help to identify high-risk OSA-related hypertension patients, provide early and individualized diagnoses and treatment plans, protect patients from the serious consequences of OSA-related hypertension, and minimize the burden on society.
Introduction
Obstructive sleep apnea (OSA) is a sleep disorder characterized by intermittent hypoxemia, autonomic fluctuation, and sleep fragmentation. As of 2019, the prevalence of OSA among adults aged 30–69 years (men and women) in China had reached 24.2%, ranking first in the world (1). Aside from causing troublesome symptoms, OSA is closely associated with many complications, such as cardiovascular diseases, metabolic disorders, and cognitive impairment (2–4). Among them, cardiovascular diseases, especially hypertension, have received extensive attention because of their serious consequences and high morbidity. Observational studies have shown that 45–68% of subjects with OSA have hypertension (5, 6), and the prevalence of OSA is more than 30% among hypertension patients (7).
Hypertension that is primarily caused or exacerbated by OSA is called OSA-related hypertension after excluding other definite secondary etiologies (e.g., renal artery stenosis, renal parenchymal disease, primary aldosteronism, pheochromocytoma, and Cushing’s disease) (8). In addition, OSA-related hypertension is characterized by high rates of masked hypertension, elevated nocturnal blood pressure, a non-dipper pattern of nocturnal hypertension, and an increased blood pressure variability (9). Notably, patients with OSA and hypertension seem to be associated with more severe outcomes. Studies based on ambulatory blood pressure monitoring (ABPM) showed that participants with a non-dipper pattern of nocturnal hypertension and those who have elevated blood pressure at night demonstrate a greater degree of end-organ damage, higher risk of stroke, increased risk of incident heart failure, and increased risk of renal disease progression (10). Regrettably, OSA-related hypertension is easily disregarded by patients.
As for the general population, the reference method for blood pressure testing is primarily an in-office measurement. However, this diagnostic method is unreliable in the OSA population because of the specific characteristics of OSA-related hypertension. Previous studies have shown that among OSA patients, masked hypertension was found in 30% of patients, and white-coat hypertension was found in approximately 33% of patients (11–13). It means that there is a high risk that OSA patients may be underdiagnosed or overdiagnosed with hypertension. The application of ABPM to systematically and correctly assess blood pressure is recommended in clinically normotensive OSA patients (14). However, ABPM is not cost-efficient and often burdensome, and in clinical practice, it seems challenging to propose ABPM to all OSA patients with normal clinic blood pressure. Thus, the necessity of a simple and convenient clinical tool to assess OSA-related hypertension in daily clinical practice is emphasized, which can allow the use of ABPM selectively rather than routinely.
Machine learning (ML) has been widely developed and used in the medical field because of its remarkable performance in recent years. It can extract information from complex and non-linear data, establish models through science, reveal hidden dependencies between factors and diseases in the big data environment, and help clinicians better understand the diseases (15). Especially in cardiovascular diseases, machine learning has a wide range of applications and satisfactory diagnostic performance. For example, Ward et al. demonstrated that the gradient boosting machine (GBM) model has good discrimination for atherosclerotic cardiovascular disease risk (16). Although ML has gained extensive attention because of its powerful predictive capabilities, it is often criticized for being a black box model, making it hard for clinicians to understand and trust these complex models. Hence, this has limited its widespread use in medical decision-making (17).
Timely blood pressure screening and early accurate identification of OSA-related hypertension are crucial in minimizing the associated negative health effects. Regrettably, no ML models are available to predict the risk of OSA-related hypertension. In this study, we aimed to develop ML-based prediction models for OSA-related hypertension based on available clinical data from patients to identify high-risk patients. In addition, we used Shapley additive explanations (SHAP) (18), a method for interpreting the predictions of machine learning models, to explore the relationship between features and the risk of OSA-related hypertension and to provide individual interpretations of the model’s decisions. Moreover, we established a web-based risk calculator based on the most predictive ML algorithm to promote its clinical application and to provide clinicians with a practical tool for risk assessment in OSA-related hypertension.
Materials and methods
Study design and subjects
This is a retrospective observational study of OSA patients admitted to the Department of Otorhinolaryngology Head and Neck Surgery of the Second Affiliated Hospital of Xi’an Jiaotong University between October 2019 and December 2021. All study subjects underwent nighttime polysomnography or home sleep apnea testing and blood pressure monitoring; in addition, cardiologists assessed their blood pressure. OSA was diagnosed on the basis of an apnea–hypopnea index (AHI) ≥ 5 events per hour on polysomnography (19). Hypertension was defined as a previous diagnosis with current antihypertensive therapy. Additionally, patients with elevated nocturnal blood pressure who had no history of hypertension were further examined and identified as newly diagnosed with hypertension by a cardiologist with more than 10 years of working experience. The definition of hypertension is described in detail in the Supplementary material.
The inclusion criteria were as follows: (1) patients with age ≥ 18 years, (2) patients with AHI ≥ 5 events per hour, and (3) patients who have not received OSA-related treatment in the past. The exclusion criteria were as follows: (1) patients with incomplete baseline data; (2) patients with disease potentially affecting blood pressure regulation, such as multiple organ dysfunction syndrome, uremia, severe cardiac heart failure, renal, or cardiac transplantation; (3) patients with the most common causes of secondary hypertension, namely, renal parenchymal disease, renovascular diseases, coarctation of the aorta, Cushing’s syndrome, primary hyperaldosteronism, pheochromocytoma, hyperthyroidism, and hyperparathyroidism; (4) pregnant women; (5) patients with history of snoring shorter than the duration of hypertension; and (6) patients who were diagnosed with central sleep apnea (central AHI ≥ 5 events per hour).
This study was approved by the ethics committee of the Second Affiliated Hospital of Xi’an Jiaotong University (approval no. 2021031). In addition, all patients who participated in the research provided informed consent. The inspection items and processes involved in this study are in line with the Declaration of Helsinki.
Data elements
Twenty-seven candidate clinical variables were collected: (1) demographic characteristics, namely, gender, heart disease, family history of hypertension, diabetes, hypothyroidism, body mass index (BMI), waist circumference, neck circumference, and age/10; (2) lifestyle behaviors, namely, drinking, smoking, high-salt diet, high-fat diet, poor sleep quality, sedentariness, emotional stability, mental stress, and smoking amount; and (3) OSA-related medical history and indicators, namely, memory decline, inattention, Epworth Sleepiness Scale (20), course of snoring, course of choking, AHI, obstructive apnea index (OAI), minimum arterial oxygen saturation/10 (minimum SaO2/10), and percentage of time of SaO2 < 90%/10 (CT90/10).
Development and validation of prediction models
By comparing the clinical characteristics of the hypertension and non-hypertension groups, the risk factors for predicting OSA-related hypertension were analyzed using the univariate analysis, and they were incorporated into machine learning as characteristic variables. Additionally, they were also used in the multivariate logistic regression analysis to obtain independent predictors associated with OSA-related hypertension.
All patients were randomly divided at a ratio of 7:3 into a training set for constructing the predictive model and a validation set for model validation. The following six representative supervised ML algorithms were used for model construction on the training set: adaptive boosting, GBM, multilayer perceptron, bootstrapped aggregating, logistic regression, and extreme gradient boosting (21–24). Within the training set, 10-fold cross-validation was used for internal validation, and the average area under the receiver operating characteristic curve (AUC) was plotted to evaluate the predictive power of each ML classifier. In the validation set, the receiver operating characteristic curves of the six ML models were plotted, and AUC values were calculated to compare the predictive ability of the models. The model with the best predictive performance was selected as the final model. In addition, a confusion matrix was used to evaluate the prediction model performance. Subsequently, on the basis of the best-performing model, an online risk calculator was created that makes predictions from newly entered data of OSA patients.
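The partitioning described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code (the paper states that the random division was performed in R). The function names, the seed, and the interleaved fold assignment are assumptions made for the example; only the cohort size, the 7:3 ratio, and the 10 folds come from the text.

```python
import random

def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Shuffle patient ids and split them into training and validation lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary choice
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def k_fold_indices(ids, k=10):
    """Yield (fit_ids, held_out_ids) pairs; every id is held out exactly once."""
    ids = list(ids)
    for fold in range(k):
        held_out = ids[fold::k]  # every k-th id, starting at position `fold`
        held_out_set = set(held_out)
        fit = [i for i in ids if i not in held_out_set]
        yield fit, held_out

patient_ids = range(1493)  # cohort size reported in the paper
train_ids, validation_ids = split_train_validation(patient_ids)
folds = list(k_fold_indices(train_ids, k=10))
```

In each fold a classifier would be fitted on `fit_ids` and scored on `held_out_ids`, and the ten AUC values averaged. The validation set stays untouched until the models are compared.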
Model interpretation
Shapley additive explanations (SHAP) is a model-agnostic explanation technique based on cooperative game theory that helps interpret the results from a predictive model. The interpretation is based on quantifying the SHAP value for each feature, representing the contribution of a feature to the predicted risk of OSA-related hypertension (25, 26). For each sample, the model produces a prediction value, and each feature receives a Shapley value; the sum or average of the absolute Shapley values of a feature across all individuals gives its overall importance. Features with large absolute Shapley values are the most important. In addition, the SHAP method shows whether each feature value influences the predicted result positively or negatively, similar to the sign of the coefficients in logistic regression. A positive SHAP value indicates that the corresponding feature contributes to a higher risk of the result, whereas a negative SHAP value indicates that the corresponding feature leads to a lower risk of the result. To determine the main predictors of OSA-related hypertension, we ranked the features of the final model by importance using the SHAP summary plot and provided individual interpretations of the model’s decisions.
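To make the definition concrete, the sketch below computes exact Shapley values by enumerating feature coalitions. It is an illustration under stated assumptions, not the procedure used in the study: the SHAP package uses faster model-specific algorithms, and the `toy_risk` function, the three binary features, and the background rows are all hypothetical. Features outside a coalition are replaced by background values and the prediction is averaged over the background rows.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, background):
    """Exact Shapley values of each feature for one sample x."""
    n = len(x)

    def value(coalition):
        # Average prediction when only the coalition's features take x's values.
        total = 0.0
        for row in background:
            mixed = [x[j] if j in coalition else row[j] for j in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi

def mean_abs_importance(predict, samples, background):
    """Global importance: mean absolute Shapley value of each feature."""
    totals = [0.0] * len(samples[0])
    for x in samples:
        for j, p in enumerate(shapley_values(predict, x, background)):
            totals[j] += abs(p)
    return [t / len(samples) for t in totals]

def toy_risk(v):
    # Hypothetical risk score with one interaction term; not the paper's GBM.
    return 0.1 + 0.3 * v[0] + 0.2 * v[1] * v[2]

background = [[0, 0, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
x = [1, 1, 1]
phi = shapley_values(toy_risk, x, background)
base = sum(toy_risk(row) for row in background) / len(background)
importance = mean_abs_importance(toy_risk, background, background)
```

The values in `phi` add up to the difference between the prediction for `x` and the average background prediction `base`, which is the additivity property that lets a single prediction be decomposed feature by feature.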
Statistical analysis
All statistical analyses and the random division into training and validation sets were performed with R software (version 3.6.0). Continuous variables were represented as the median (p25, p75), whereas categorical variables were represented as numbers (n) and proportions (%). Differences between the two groups were compared using the Wilcoxon rank-sum test for continuous variables and the chi-squared test for categorical variables. Logistic regression analysis was used to analyze the relationship between the predictor variables (either categorical or continuous) and the binary outcome. The Python programming language (version 3.8) was used to develop and evaluate the ML models and to build the web-based calculator. For model interpretation, SHAP was implemented using the Python SHAP package. P < 0.05 was considered statistically significant.
Results
Patient characteristics
After the screening process, a total of 1,493 OSA patients were eligible for the study (Figure 1). The baseline characteristics of these patients are summarized in Table 1. For the demographic variables, the two groups were significantly different in heart disease, family history of hypertension, diabetes, BMI, waist circumference, neck circumference, and age/10 (all P < 0.05). For the lifestyle behavior variables, high-salt diet, poor sleep quality, and smoking amount were significant variables (all P < 0.05). For OSA-related medical history and indicators, memory decline, Epworth Sleepiness Scale, course of snoring, course of choking, AHI, OAI, minimum SaO2/10, and CT90/10 were all significantly different between the two groups (all P < 0.05).
Univariate and multivariate logistic regression
Variables with a P < 0.05 in the univariate analysis were selected for multivariate logistic regression analysis to identify the independent risk factors of OSA-related hypertension patients (Table 2), and all regression coefficients are shown in Supplementary Table 1. In addition, the results indicated that family history of hypertension, BMI, age/10, minimum SaO2/10, and CT90/10 were independent risk factors for OSA-related hypertension (all P < 0.05).
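Several predictors carry a "/10" suffix (age/10, minimum SaO2/10, CT90/10). The paper does not state the reason for this scaling. One common motivation, assumed here, is that a logistic regression coefficient then describes a 10-unit change, which gives odds ratios that are easier to read. The sketch below shows the relationship; the coefficient value is hypothetical and not taken from Table 2 or Supplementary Table 1.

```python
import math

def odds_ratio(beta):
    """Odds ratio for a one-step increase in a predictor with coefficient beta."""
    return math.exp(beta)

def rescale_coefficient(beta_per_unit, units_per_step=10):
    """Coefficient after dividing the predictor by units_per_step."""
    return beta_per_unit * units_per_step

beta_age_per_year = 0.04  # hypothetical log-odds per year of age
or_per_year = odds_ratio(beta_age_per_year)
or_per_decade = odds_ratio(rescale_coefficient(beta_age_per_year))
```

Dividing a predictor by 10 multiplies its coefficient by 10, so the odds ratio per decade equals the odds ratio per year raised to the tenth power. The fitted model and its predictions are unchanged.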
Performance of the machine learning algorithm
The average AUC of the six models determined by 10-fold cross-validation is displayed in Figure 2A, with the GBM model achieving the best performance (AUC = 0.837). The model validation results based on the validation set are displayed in Figure 2B, and the GBM model still exhibited the best performance in predicting OSA-related hypertension (AUC = 0.873). Moreover, we further evaluated the stability and accuracy of GBM through five cross-validations, and the results reveal that the GBM has good stability (average AUC = 0.810 ± 0.048) (Figure 2C). The radar plot of the six ML models is shown in Supplementary Figure 1. A comparison of model performance on the validation set is shown in Table 3. Generally, all models performed satisfactorily in AUC but less well in sensitivity. Among them, the GBM exhibited the highest sensitivity at 0.713. Because GBM yielded the best results for AUC and sensitivity, we chose the GBM model as the final prediction model and then evaluated it (Figure 3). On the basis of this model, we developed a web-based risk calculator to facilitate clinical use, available at https://shimunana-true-ml-vmz425.streamlitapp.com/ (Figure 4). The receiver operating characteristic properties of other ML models are shown in Supplementary Figure 2.
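The three metrics reported above can be computed from predicted probabilities as in the following sketch. It uses only the Python standard library and a small made-up set of labels and scores; the 0.5 decision threshold is an assumption, since the text does not state the cut-off used for the confusion matrix.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a given probability threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Made-up example: 3 hypertensive (1) and 5 normotensive (0) patients.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
example_auc = auc(labels, scores)
example_metrics = confusion_metrics(labels, scores)
```

In this toy example the accuracy (0.75) exceeds the sensitivity (2/3) because most patients are negatives that are classified correctly. The same pattern, accuracy above sensitivity, appears in the reported GBM results.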
Figure 2. (A) Area under the curve (AUC) values of 10-fold cross-validation. (B) Validation of machine learning algorithms. (C) Receiver operating characteristic curve in the gradient boosting machine (GBM) model. AdaBoost, adaptive boosting; LR, logistic regression; Bagging, bootstrapped aggregating; MLP, multilayer perceptron; GBM, gradient boosting machine; XGBoost, extreme gradient boost; AUC, average area under the curve; ROC, receiver operating characteristic. AUC was used as the indicator of performance; the GBM model achieved the best predictive performance, and the Bagging model had the lowest predictive performance.
Model interpretability
To identify the features that influenced the prediction model the most, we illustrated the SHAP summary plot of GBM and the top 15 features of the prediction model in decreasing order (Figures 5A,B). The SHAP summary plot shows that age/10, family history of hypertension, minimum SaO2/10, BMI, and CT90/10 were the five most critical predictive features of the GBM model and had the most significant impact on the prediction results.
Figure 5. Shapley additive explanations (SHAP). (A,B) The standard and classified bar charts of the SHAP summary plots showed the influence of each parameter on the gradient boosting machine (GBM) model. (C,D) SHAP model explanation of two typical predictions. The features are ranked according to the sum of the SHAP values for all patients, and the SHAP values are used to show the distribution of the effect of each feature on the GBM model outputs. Each dot represents a case in the dataset. The color of a dot indicates the value of the feature, with blue indicating the lowest range and red the highest range. The horizontal axis shows the corresponding SHAP value of the feature. A positive SHAP value contributes to the prediction of OSA-related hypertension and vice versa. SHAP, Shapley additive explanations; GBM, gradient boosting machine; SaO2, arterial oxygen saturation; BMI, body mass index; AHI, apnea–hypopnea index; CT90/10, percentage of time of SaO2 < 90%/10.
Shapley additive explanations (SHAP) values not only show the contribution of each feature to the final prediction but also clarify and explain model predictions for individual patients. We provide two representative examples to illustrate the role of the SHAP method in explaining the machine learning model: a 46-year-old female patient who was diagnosed with OSA but had normal blood pressure and a 54-year-old male patient who was diagnosed with OSA-related hypertension (Figures 5C,D). The constructed model predicted the probability of OSA-related hypertension to be 23% and 57%, respectively. For the female patient, the model predicted no OSA-related hypertension, which was consistent with the actual outcome (true negative). For the male patient, the model predicted OSA-related hypertension, which was also consistent with the actual outcome (true positive).
Discussion
The present study is the first to assess the predictive performance of several machine learning algorithms for OSA-related hypertension, to obtain a GBM model that can be used to predict OSA-related hypertension clinically, and to explain that model. GBM is a commonly used ML algorithm with satisfactory performance in managing large and complex non-linear datasets and avoiding overfitting (27). Subsequently, we designed a web-based risk calculator based on the GBM model to estimate the probability of hypertension in individuals with OSA, so as to help clinicians make targeted diagnoses and treatment plans and to support precision medicine.
As hypothesized, our multivariate logistic regression suggested that BMI, age/10, and minimum SaO2/10 were significant independent risk factors for OSA-related hypertension, which is consistent with previous research. Pan et al. found that among police officers in southern China, the prevalence of OSA-related hypertension was associated with the age of the patients. However, their study population was small and occupation-specific (28). Furthermore, Natsios and colleagues reported that age, BMI, comorbidity, daytime SaO2, and indices of hypoxia during sleep were the most precise predictors of hypertension (29). Because of differences in study design and study population, some of our results differ from those of previous studies: family history of hypertension and CT90/10 were also found to be risk factors for OSA-related hypertension in our study. To further confirm how input factors contribute to the model, we calculated SHAP feature importance and feature effects. The variable importance also showed that BMI, age/10, minimum SaO2/10, family history of hypertension, and CT90/10 were the most important input parameters contributing to the predicted risk of OSA-related hypertension. This agreement indicates that these five variables were significant contributors to OSA-related hypertension and supports the accuracy and reliability of the GBM model. Nevertheless, a prospective study and animal experiments are needed to confirm the accuracy and reliability of our proposed model.
Interestingly, in addition to identifying several known risk factors, multivariate logistic regression and SHAP analysis also found that CT90/10, a variable that had been overlooked in previous cardiovascular studies, plays an important role in OSA-related hypertension. Previous studies have shown a significant association between CT90 and coronary artery calcium, cerebral small vessel disease, and diabetic nephropathy (30–32), but the relationship between CT90 and hypertension has not been explored. The mechanisms linking OSA and hypertension have not been fully elucidated. Several pathophysiological mechanisms have been suggested to participate, such as elevated sympathetic nervous system activity, renin-angiotensin-aldosterone system activity, endothelial dysfunction, inflammation, and metabolic dysregulation (33). López-Cano et al. showed a positive and significant association between the nocturnal concentration of urine metanephrines and CT90 (34), suggesting that CT90 may influence sympathetic activity. This may explain the important role of CT90 in OSA-related hypertension, which deserves more attention in the future. Surprisingly, in our statistical model, AHI, the diagnostic indicator of adult OSA, contributed only weakly. Whether there is a dose–response relationship between the severity of OSA and the cumulative incidence of hypertension has been debated. The Wisconsin Sleep Cohort Study discovered a dose–response association between OSA and the presence of hypertension 4 years later (35). In contrast, the Sleep Heart Health Study and the Victoria Sleep Cohort Study found that the relationship between hypertension and OSA was no longer significant after age and BMI were controlled for (36, 37).
Additionally, AHI is a simple measure of the average number of respiratory events (apneas and hypopneas) per hour of sleep, and it does not adequately reflect the various phenotypes and comorbidities of OSA. Our results suggest that blood oxygen indicators (e.g., minimum SaO2/10 and CT90/10) might be better predictors of OSA-related hypertension than AHI.
Notably, in the multivariate logistic regression, the risk for OSA-related hypertension is increased most by family history of hypertension, followed by age/10. However, the SHAP analysis showed that minimum SaO2/10 has the highest predictive value for OSA-related hypertension. The discrepancy between multivariate logistic regression and SHAP values can be explained by the prevalence of a variable. An odds ratio describes the effect only in patients who have the variable, whereas the mean SHAP value is calculated over all patients. The average SHAP value was then used to evaluate the importance of features and rank them. Hence, variables with low impact and high prevalence will have low odds ratios but high SHAP values.
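A simplified calculation illustrates this prevalence effect. The sketch assumes a linear model on the log-odds scale with independent features, for which the SHAP value of a feature is its coefficient times the deviation of the feature from its mean. The GBM in this study is not linear, so this is an idealization, and the two risk factors and their odds ratios are hypothetical.

```python
import math

def mean_abs_linear_shap(beta, values):
    """Mean |SHAP| of one feature in a linear log-odds model.

    Assumes independent features, so SHAP_i = beta * (x_i - mean(x)).
    """
    mean_x = sum(values) / len(values)
    return sum(abs(beta * (v - mean_x)) for v in values) / len(values)

def binary_column(prevalence, n=1000):
    """Column of n patients in which round(prevalence * n) carry the factor."""
    carriers = round(prevalence * n)
    return [1] * carriers + [0] * (n - carriers)

# Hypothetical risk factors, not taken from the paper.
rare_strong = mean_abs_linear_shap(math.log(4.0), binary_column(0.02))  # odds ratio 4.0, 2% prevalence
common_weak = mean_abs_linear_shap(math.log(1.5), binary_column(0.50))  # odds ratio 1.5, 50% prevalence
```

For a binary feature with prevalence p, the mean absolute SHAP value under these assumptions is 2p(1 − p)|β|. The common factor with the smaller odds ratio therefore receives the larger mean absolute SHAP value, because it changes the prediction for many more patients.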
In our study, the standard clinical variables were fully integrated with polysomnography parameters during the construction of the ML model. The model can thus stratify OSA-related hypertension risk for a patient using all relevant covariates rather than individual measures. Our approach was also validated with repeated 10-fold cross-validation to provide a robust estimate of prediction accuracy with minimal bias. The six models had AUC values ranging from 0.698 to 0.873 and sensitivity from 0.353 to 0.713 in the validation set. The GBM prediction model, which had the highest AUC, accuracy, and sensitivity, was identified as the final model for this study and for clinical use. With an AUC of 0.873 and a sensitivity of 0.713, the GBM model shows good discrimination and stability. Furthermore, we introduced the Shapley value to explain the GBM model. SHAP is a model-agnostic interpretation technique that explains black box models globally and locally and can provide a rational explanation for each prediction, which can significantly enhance the confidence of clinicians in the model.
However, this study still has some limitations. First, this is a single-center retrospective study, and the performance of machine learning algorithms may vary across institutions and across datasets with different distributions of patient characteristics. Therefore, more patients from multiple sources are required to validate our model’s robustness and repeatability in the future. Second, the suboptimal sensitivity may arise because the ML algorithm learns only from the input features, so some subtle relationships may have been lost because of unknown or disregarded features that were not recorded by doctors. In the future, we will conduct prospective validation based on this model, continue to explore crucial risk factors for OSA-related hypertension, and modify the model further to improve the accuracy and reliability of the GBM prediction model.
Conclusion
We established a risk prediction model for OSA-related hypertension patients using the ML method and demonstrated that the GBM model performs best among the six ML models. This prediction model could help to identify high-risk OSA-related hypertension patients, provide early and individualized diagnoses and treatment plans, protect patients from the serious consequences of OSA-related hypertension, and reduce the burden on society.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving human participants were reviewed and approved by the Ethics Committee of The Second Affiliated Hospital of Xi’an Jiaotong University (approval no. 2021031). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.
Author contributions
YS, SW, and XR designed the research. XC and LM wrote the manuscript. YF, YY, HL, and LY collected the data. YX, ZC, and CZ performed data curation. YZ, LM, and WL analyzed and processed the data. XR directed the research, is the guarantor of the manuscript, and takes full responsibility for the integrity of the work, from its inception to the published manuscript. All authors reviewed the results and approved the final version of the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (grant no. 62076198) and the Key Research and Development Program in the Social Development Field of Shaanxi, China (grant nos. 2020GXLH-Y005 and 2021SF-286).
Acknowledgments
We wish to thank all who volunteered to participate in this study.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcvm.2022.1042996/full#supplementary-material
Abbreviations
OSA, obstructive sleep apnea; ML, machine learning; AUC, area under the curve; SHAP, Shapley additive explanations; SaO2, arterial oxygen saturation; OAI, obstructive apnea index; BMI, body mass index; GBM, gradient boosting machine; ABPM, ambulatory blood pressure monitoring; AHI, apnea–hypopnea index; CT90/10, percentage of time of SaO2 < 90%/10.
References
1. Benjafield AV, Ayas NT, Eastwood PR, Heinzer R, Ip MSM, Morrell MJ, et al. Estimation of the global prevalence and burden of obstructive sleep apnoea: a literature-based analysis. Lancet Respir Med. (2019) 7:687–98.
2. Shi Y, Luo H, Liu H, Hou J, Feng Y, Chen J, et al. Related biomarkers of neurocognitive impairment in children with obstructive sleep apnea. Int J Pediatr Otorhinolaryngol. (2019) 116:38–42. doi: 10.1016/j.ijporl.2018.10.015
3. Shi Y, Feng Y, Chen X, Ma L, Cao Z, Shang L, et al. Serum neurofilament light reflects cognitive dysfunctions in children with obstructive sleep apnea. BMC Pediatr. (2022) 22:449. doi: 10.1186/s12887-022-03514-9
4. Yeghiazarians Y, Jneid H, Tietjens JR, Redline S, Brown DL, El-Sherif N, et al. Obstructive sleep apnea and cardiovascular disease: a scientific statement from the American heart association. Circulation. (2021) 144:e56–67. doi: 10.1161/CIR.0000000000000988
5. Kiely JL, McNicholas WT. Cardiovascular risk factors in patients with obstructive sleep apnoea syndrome. Eur Respir J. (2000) 16:128–33. doi: 10.1034/j.1399-3003.2000.16a23.x
6. Millman RP, Redline S, Carlisle CC, Assaf AR, Levinson PD. Daytime hypertension in obstructive sleep apnea. Prevalence and contributing risk factors. Chest. (1991) 99:861–6. doi: 10.1378/chest.99.4.861
7. Worsnop CJ, Naughton MT, Barter CE, Morgan TO, Anderson AI, Pierce RJ. The prevalence of obstructive sleep apnea in hypertensives. Am J Respir Crit Care Med. (1998) 157:111–5. doi: 10.1164/ajrccm.157.1.9609063
8. HPCO Chinese. Medical association expert consensus on clinical diagnosis and treatment of obstructive sleep apnea-related hypertension. Chin J Pract Intern Med. (2013).
9. Kario K, Hettrick DA, Prejbisz A, Januszewicz A. Obstructive sleep apnea-induced neurogenic nocturnal hypertension: a potential role of renal denervation? Hypertension. (2021) 77:1047–60. doi: 10.1161/HYPERTENSIONAHA.120.16378
10. Ahmad M, Makati D, Akbar S. Review of and updates on hypertension in obstructive sleep apnea. Int J Hypertens. (2017) 2017:1848375. doi: 10.1155/2017/1848375
11. Pio-Abreu A, Moreno H Jr, Drager LF. Obstructive sleep apnea and ambulatory blood pressure monitoring: current evidence and research gaps. J Hum Hypertens. (2021) 35:315–24. doi: 10.1038/s41371-020-00470-8
12. Baguet JP, Lévy P, Barone-Rochette G, Tamisier R, Pierre H, Peeters M, et al. Masked hypertension in obstructive sleep apnea syndrome. J Hypertens. (2008) 26:885–92. doi: 10.1097/HJH.0b013e3282f55049
13. García-Río F, Pino JM, Alonso A, Arias MA, Martínez I, Alvaro D, et al. White coat hypertension in patients with obstructive sleep apnea-hypopnea syndrome. Chest. (2004) 125:817–22. doi: 10.1378/chest.125.3.817
14. Parati G, Lombardi C, Hedner J, Bonsignore MR, Grote L, Tkacova R, et al. Recommendations for the management of patients with obstructive sleep apnoea and hypertension. Eur Respir J. (2013) 41:523–38. doi: 10.1183/09031936.00226711
15. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. (2020) 395:1579–86. doi: 10.1016/S0140-6736(20)30226-9
16. Ward A, Sarraju A, Chung S, Li J, Harrington R, Heidenreich P, et al. Machine learning and atherosclerotic cardiovascular disease risk prediction in a multi-ethnic population. NPJ Digit Med. (2020) 3:125. doi: 10.1038/s41746-020-00331-1
17. Molnar C. Interpretable Machine Learning. Morrisville: Lulu.com (2020).
18. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. In: Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S editors. Advances in Neural Information Processing Systems 30. San Jose, CA: Curran Associates, Inc (2017).
19. Sateia MJ. International classification of sleep disorders-third edition: highlights and modifications. Chest. (2014) 146:1387–94. doi: 10.1378/chest.14-0970
20. Johns MW. A new method for measuring daytime sleepiness: the epworth sleepiness scale. Sleep. (1991) 14:540–5. doi: 10.1093/sleep/14.6.540
21. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining-KDD 2016. San Francisco, CA: (2016). p. 785–94. doi: 10.1145/2939672.2939785
22. Singh A, Thakur N, Sharma A. A review of supervised machine learning algorithms. 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom). Piscataway: IEEE (2016). p. 1310–5.
23. Ngiam KY, Khor IW. Big data and machine learning algorithms for health-care delivery. Lancet Oncol. (2019) 20:e262–73. doi: 10.1016/S1470-2045(19)30149-4
24. Zhang Z, Zhao Y, Canes A, Steinberg D, Lyashevska O. Predictive analytics with gradient boosting in clinical medicine. Ann Transl Med. (2019) 7:152. doi: 10.21037/atm.2019.03.29
25. Rodríguez-Pérez R, Bajorath J. Interpretation of compound activity predictions from complex machine learning models using local approximations and shapley values. J Med Chem. (2020) 63:8761–77. doi: 10.1021/acs.jmedchem.9b01101
26. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. (2020) 2:56–67. doi: 10.1038/s42256-019-0138-9
27. Sheridan RP, Wang WM, Liaw A, Ma J, Gifford EM. Extreme gradient boosting as a method for quantitative structure-activity relationships. J Chem Inf Model. (2016) 56:2353–60. doi: 10.1021/acs.jcim.6b00591
28. Pan M, Ou Q, Chen B, Hong Z, Liu H. Risk factors for obstructive sleep apnea-related hypertension in police officers in Southern China. J Thorac Dis. (2019) 11:4169–78. doi: 10.21037/jtd.2019.09.83
29. Natsios G, Pastaka C, Vavougios G, Zarogiannis SG, Tsolaki V, Dimoulis A, et al. Age, body mass index, and daytime and nocturnal hypoxia as predictors of hypertension in patients with obstructive sleep apnea. J Clin Hypertens. (2016) 18:146–52. doi: 10.1111/jch.12645
30. Liu X, Lam DC, Mak HK, Ip MS, Lau KK. Associations of sleep apnea risk and oxygen desaturation indices with cerebral small vessel disease burden in patients with stroke. Front Neurol. (2022) 13:956208. doi: 10.3389/fneur.2022.956208
31. Seo MY, Lee SH, Hong SD, Chung SK, Kim HY. Hypoxemia during sleep and the progression of coronary artery calcium. Cardiovasc Toxicol. (2021) 21:42–8. doi: 10.1007/s12012-020-09593-3
32. Zhang R, Zhang P, Zhao F, Han X, Ji L. Association of diabetic microvascular complications and parameters of obstructive sleep apnea in patients with type 2 diabetes. Diabetes Technol Ther. (2016) 18:415–20. doi: 10.1089/dia.2015.0433
33. Salman LA, Shulman R, Cohen JB. Obstructive sleep apnea, hypertension, and cardiovascular risk: epidemiology, pathophysiology, and management. Curr Cardiol Rep. (2020) 22:6. doi: 10.1007/s11886-020-1257-y
34. López-Cano C, Gutiérrez-Carrasquilla L, Sánchez E, González J, Yeramian A, Martí R, et al. Sympathetic hyperactivity and sleep disorders in individuals with type 2 diabetes. Front Endocrinol. (2019) 10:752. doi: 10.3389/fendo.2019.00752
35. Peppard PE, Young T, Palta M, Skatrud J. Prospective study of the association between sleep-disordered breathing and hypertension. N Engl J Med. (2000) 342:1378–84. doi: 10.1056/NEJM200005113421901
36. O’Connor GT, Caffo B, Newman AB, Quan SF, Rapoport DM, Redline S, et al. Prospective study of sleep-disordered breathing and hypertension: the Sleep Heart Health Study. Am J Respir Crit Care Med. (2009) 179:1159–64. doi: 10.1164/rccm.200712-1809OC
37. Cano-Pumarega I, Durán-Cantolla J, Aizpuru F, Miranda-Serrano E, Rubio R, Martínez-Null C, et al. Obstructive sleep apnea and systemic hypertension: longitudinal study in the general population: the Vitoria sleep cohort. Am J Respir Crit Care Med. (2011) 184:1299–304. doi: 10.1164/rccm.201101-0130OC
Keywords: obstructive sleep apnea, hypertension, machine learning, risk factor, Shapley additive explanations, gradient boosting machine (GBM)
Citation: Shi Y, Ma L, Chen X, Li W, Feng Y, Zhang Y, Cao Z, Yuan Y, Xie Y, Liu H, Yin L, Zhao C, Wu S and Ren X (2022) Prediction model of obstructive sleep apnea–related hypertension: Machine learning–based development and interpretation study. Front. Cardiovasc. Med. 9:1042996. doi: 10.3389/fcvm.2022.1042996
Received: 13 September 2022; Accepted: 21 November 2022;
Published: 05 December 2022.
Edited by:
Yanwu Xu, Baidu, China
Reviewed by:
Hongliang Yi, Shanghai Jiao Tong University, China
Xiangdong Tang, Sichuan University, China
Copyright © 2022 Shi, Ma, Chen, Li, Feng, Zhang, Cao, Yuan, Xie, Liu, Yin, Zhao, Wu and Ren. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Shinan Wu, wshinana99@163.com; Xiaoyong Ren, cor_renxiaoyong@126.com
†These authors have contributed equally to this work and share first authorship