Prediction model of obstructive sleep apnea–related hypertension: Machine learning–based development and interpretation study

Shi, Yewen; Ma, Lina; Chen, Xi; Li, Wenle; Feng, Yani; Zhang, Yitong; Cao, Zine; Yuan, Yuqi; Xie, Yushan; Liu, Haiqin; Yin, Libo; Zhao, Changying; Wu, Shinan; Ren, Xiaoyong

doi:10.3389/fcvm.2022.1042996

ORIGINAL RESEARCH article

Front. Cardiovasc. Med., 05 December 2022

Sec. Hypertension

Volume 9 - 2022 | https://doi.org/10.3389/fcvm.2022.1042996

This article is part of the Research Topic Systems Biology and Data-Driven Machine Learning-Based Models in Personalized Cardiovascular Medicine

Yewen Shi^1†

Lina Ma^1†

Xi Chen^1†

Wenle Li²

Yani Feng¹

Yitong Zhang¹

Zine Cao¹

Yuqi Yuan¹

Yushan Xie¹

Haiqin Liu¹

Libo Yin³

Changying Zhao¹

Shinan Wu^4*

Xiaoyong Ren^1*

¹Department of Otorhinolaryngology Head and Neck Surgery, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
²Molecular Imaging and Translational Medicine Research Center, State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Xiamen University, Xiamen, China
³Department of Otorhinolaryngology Head and Neck Surgery, Xi’an Central Hospital, Xi’an, China
⁴Eye Institute of Xiamen University, School of Medicine, Xiamen University, Xiamen, China

Background: Obstructive sleep apnea (OSA) is a globally prevalent disease closely associated with hypertension. To date, no predictive model for OSA-related hypertension has been established. We aimed to use machine learning (ML) to construct a model to analyze risk factors and predict OSA-related hypertension.

Materials and methods: We retrospectively collected the clinical data of OSA patients diagnosed by polysomnography from October 2019 to December 2021 and randomly divided them into training and validation sets. A total of 1,493 OSA patients with 27 variables were included. Independent risk factors for OSA-related hypertension were screened using multivariate logistic regression. Six ML algorithms, namely logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), bootstrapped aggregating (Bagging), and multilayer perceptron (MLP), were used to develop the model on the training set. The validation set was used to tune the model hyperparameters and determine the final prediction model. We compared the accuracy and discrimination of the models to identify the best ML algorithm for predicting OSA-related hypertension. In addition, a web-based tool was developed to promote clinical application. We used permutation importance and Shapley additive explanations (SHAP) to determine the importance of the selected features and interpret the ML models.

Results: A total of 18 variables were selected for the models. The GBM model achieved the best discriminatory ability (area under the receiver operating characteristic curve = 0.873, accuracy = 0.885, sensitivity = 0.713), and on the basis of this model, an online tool was built to help clinicians optimize the diagnosis of OSA-related hypertension. Finally, age, family history of hypertension, minimum arterial oxygen saturation, body mass index, and percentage of time of SaO₂ < 90% were identified by the SHAP method as the top five variables contributing to the diagnosis of OSA-related hypertension.

Conclusion: We established a risk prediction model for OSA-related hypertension patients using the ML method and demonstrated that among the six ML models, the gradient boosting machine model performs best. This prediction model could help to identify high-risk OSA-related hypertension patients, provide early and individualized diagnoses and treatment plans, protect patients from the serious consequences of OSA-related hypertension, and minimize the burden on society.

Introduction

Obstructive sleep apnea (OSA) is a sleep disorder characterized by intermittent hypoxemia, autonomic fluctuation, and sleep fragmentation. As of 2019, the prevalence of OSA among adults aged 30–69 years (men and women) in China had reached 24.2%, ranking first in the world (1). Beyond its burdensome symptoms, OSA is closely associated with many complications, such as cardiovascular diseases, metabolic disorders, and cognitive impairment (2–4). Among them, cardiovascular diseases, especially hypertension, have received extensive attention because of their serious consequences and high morbidity. Observational studies have shown that 45–68% of subjects with OSA have hypertension (5, 6), and the prevalence of OSA is more than 30% among hypertension patients (7).

Hypertension that is primarily caused or exacerbated by OSA, after other definite secondary etiologies (e.g., renal artery stenosis, renal parenchymal disease, primary aldosteronism, pheochromocytoma, and Cushing’s disease) have been excluded, is called OSA-related hypertension (8). In addition, OSA-related hypertension is characterized by high rates of masked hypertension, elevated nocturnal blood pressure, a non-dipper pattern of nocturnal hypertension, and increased blood pressure variability (9). Notably, patients with both OSA and hypertension appear to have more severe outcomes. Studies based on ambulatory blood pressure monitoring (ABPM) showed that participants with a non-dipper pattern of nocturnal hypertension and those with elevated blood pressure at night have a greater degree of end-organ damage, a higher risk of stroke, an increased risk of incident heart failure, and an increased risk of renal disease progression (10). Regrettably, OSA-related hypertension is easily overlooked by patients.

In the general population, the reference method for blood pressure testing is primarily in-office measurement. However, this diagnostic method is unreliable in the OSA population because of the specific characteristics of OSA-related hypertension. Previous studies have shown that among OSA patients, masked hypertension was found in 30% of patients, and white-coat hypertension was found in approximately 33% of patients (11–13). This means that OSA patients are at high risk of being underdiagnosed or overdiagnosed with hypertension. The application of ABPM to systematically and correctly assess blood pressure is recommended in clinically normotensive OSA patients (14). However, ABPM is costly and often burdensome, and in clinical practice it is challenging to offer ABPM to all OSA patients with normal clinic blood pressure. Thus, a simple and convenient clinical tool for assessing OSA-related hypertension in daily clinical practice is needed, which would allow ABPM to be used selectively rather than routinely.

Machine learning (ML) has been widely developed and used in the medical field because of its remarkable performance in recent years. It can extract information from complex and non-linear data, establish models through science, reveal hidden dependencies between factors and diseases in the big data environment, and help clinicians better understand the diseases (15). Especially in cardiovascular diseases, machine learning has a wide range of applications and satisfactory diagnostic performance. For example, Ward et al. demonstrated that the gradient boosting machine (GBM) model has good discrimination for atherosclerotic cardiovascular disease risk (16). Although ML has gained extensive attention because of its powerful predictive capabilities, it is often criticized for being a black box model, making it hard for clinicians to understand and trust these complex models. Hence, this has limited its widespread use in medical decision-making (17).

Timely blood pressure screening and early, accurate identification of OSA-related hypertension are crucial in minimizing the associated negative health effects. Regrettably, no ML models are available to predict the risk of OSA-related hypertension. In this study, we aimed to develop ML-based prediction models for OSA-related hypertension from available clinical data in order to identify high-risk patients. We used Shapley additive explanations (SHAP) (18), a method for interpreting the output of ML models, to explore the relationship between features and the risk of OSA-related hypertension and to provide individual interpretations of the model’s decisions. Moreover, we established a web-based risk calculator based on the most predictive ML algorithm to promote clinical application and provide clinicians with a practical tool for risk assessment in OSA-related hypertension.

Materials and methods

Study design and subjects

This is a retrospective observational study of OSA patients admitted to the Department of Otorhinolaryngology Head and Neck Surgery of the Second Affiliated Hospital of Xi’an Jiaotong University between October 2019 and December 2021. All study subjects underwent nighttime polysomnography or home sleep apnea testing and blood pressure monitoring; in addition, cardiologists assessed their blood pressure. OSA was diagnosed on the basis of an apnea–hypopnea index (AHI) ≥ 5 events per hour on polysomnography (19). Hypertension was defined as a previous diagnosis with current antihypertensive therapy. Additionally, patients with elevated nocturnal blood pressure who had no history of hypertension were further examined and identified as newly diagnosed with hypertension by a cardiologist with more than 10 years of working experience. The definition of hypertension is described in detail in the Supplementary material.
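
As a minimal illustration of the diagnostic threshold described above, the following Python sketch computes AHI as respiratory events per hour of sleep and applies the AHI ≥ 5 cutoff. The function names and the example counts are hypothetical and are not taken from the study data.

```python
def apnea_hypopnea_index(n_apneas, n_hypopneas, total_sleep_hours):
    """Return AHI: apneas plus hypopneas per hour of sleep."""
    if total_sleep_hours <= 0:
        raise ValueError("total_sleep_hours must be positive")
    return (n_apneas + n_hypopneas) / total_sleep_hours


def has_osa(ahi, threshold=5.0):
    """Apply the diagnostic cutoff used in this study (AHI >= 5 events/h)."""
    return ahi >= threshold


# Hypothetical recording: 42 apneas and 63 hypopneas over 7 hours of sleep.
example_ahi = apnea_hypopnea_index(42, 63, 7.0)
print(example_ahi, has_osa(example_ahi))
```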

The inclusion criteria were as follows: (1) patients with age ≥ 18 years, (2) patients with AHI ≥ 5 events per hour, and (3) patients who had not previously received OSA-related treatment. The exclusion criteria were as follows: (1) patients with incomplete baseline data; (2) patients with diseases potentially affecting blood pressure regulation, such as multiple organ dysfunction syndrome, uremia, severe heart failure, or renal or cardiac transplantation; (3) patients with the most common causes of secondary hypertension, namely, renal parenchymal disease, renovascular diseases, coarctation of the aorta, Cushing’s syndrome, primary hyperaldosteronism, pheochromocytoma, hyperthyroidism, and hyperparathyroidism; (4) pregnant women; (5) patients with a history of snoring shorter than the duration of hypertension; and (6) patients who were diagnosed with central sleep apnea (central AHI ≥ 5 events per hour).

This study was approved by the ethics committee of the Second Affiliated Hospital of Xi’an Jiaotong University (approval no. 2021031). In addition, all patients who participated in the research provided informed consent. The inspection items and processes involved in this study are in line with the Declaration of Helsinki.

Data elements

Twenty-seven candidate clinical variables were collected: (1) demographic characteristics, namely, gender, heart disease, family history of hypertension, diabetes, hypothyroidism, body mass index (BMI), waist circumference, neck circumference, and age/10; (2) lifestyle behaviors, namely, drinking, smoking, high-salt diet, high-fat diet, poor sleep quality, sedentariness, emotionally stable, mental stress, and smoking amount; and (3) OSA-related medical history and indicators, namely, memory decline, inattention, Epworth Sleepiness Scale (20), course of snoring, course of choking, AHI, obstructive apnea index (OAI), minimum arterial oxygen saturation/10 (minimum SaO₂/10), and percentage of time of SaO₂ < 90%/10 (CT90/10).
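
To make the scaled oximetry variables concrete, the sketch below derives CT90 from a series of SaO₂ readings and expresses age, minimum SaO₂, and CT90 per 10 units. It assumes evenly spaced samples, so that the share of samples below 90% equals the share of time. The helper names and the readings are hypothetical, and this is not the study’s actual preprocessing code.

```python
def ct90(sao2_samples, threshold=90.0):
    """Percentage of samples with SaO2 below the threshold.

    Assumes evenly spaced samples, so the share of samples equals the share of time.
    """
    if not sao2_samples:
        raise ValueError("sao2_samples must not be empty")
    below = sum(1 for s in sao2_samples if s < threshold)
    return 100.0 * below / len(sao2_samples)


def scale_by_ten(value):
    """Express a variable per 10 units, as for age/10, minimum SaO2/10, and CT90/10."""
    return value / 10.0


# Hypothetical overnight trace: ten evenly spaced SaO2 readings (%).
trace = [96, 95, 93, 89, 85, 88, 91, 94, 95, 96]
features = {
    "age/10": scale_by_ten(54),
    "minimum SaO2/10": scale_by_ten(min(trace)),
    "CT90/10": scale_by_ten(ct90(trace)),
}
print(features)
```

With this scaling, a one-unit change in a model input corresponds to 10 years of age or 10 percentage points of saturation or time.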

Development and validation of prediction models

By comparing the clinical characteristics of the hypertension and non-hypertension groups, the risk factors for predicting OSA-related hypertension were analyzed using the univariate analysis, and they were incorporated into machine learning as characteristic variables. Additionally, they were also used in the multivariate logistic regression analysis to obtain independent predictors associated with OSA-related hypertension.

All patients were randomly divided at a ratio of 7:3 into a training set for constructing the predictive model and a test set (the validation set) for model validation. The following six representative supervised ML algorithms were used for model construction in the training dataset: adaptive boosting, GBM, multilayer perceptron, bootstrapped aggregating, logistic regression, and extreme gradient boosting (21–24). Within the training cohort, 10-fold cross-validation was used as internal validation to evaluate the predictive power of each ML classifier, summarized as the average area under the receiver operating characteristic curve (AUC). In the validation cohort, the receiver operating characteristic curves of the six ML models were plotted, and AUC values were calculated to compare the predictive ability of the models. The model with the best predictive performance was selected as the final model. In addition, a confusion matrix was used to evaluate the performance of the prediction model. Subsequently, on the basis of the best-performing model, an online risk calculator that can make predictions from newly entered data of OSA patients was created.
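
The data partitioning described above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the study’s procedure: the random division in the study was performed in R, the seed and function names here are hypothetical, and no stratification by outcome is assumed.

```python
import random


def train_test_split_indices(n, train_fraction=0.7, seed=0):
    """Randomly split indices 0..n-1 into training and held-out sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_fraction)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(indices, k=10, seed=0):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, validate in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate


train_idx, test_idx = train_test_split_indices(1493)
print(len(train_idx), len(test_idx))
for fit_idx, val_idx in k_fold_indices(train_idx):
    # A classifier would be fitted on fit_idx and scored (e.g. by AUC) on val_idx.
    assert not set(fit_idx) & set(val_idx)
```

Averaging the ten per-fold scores gives the cross-validated estimate for one algorithm. The held-out indices are used only for the final comparison.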

Model interpretation

Shapley additive explanations (SHAP) is a model-agnostic explanation technique based on cooperative game theory that helps interpret the results of a predictive model. The interpretation is based on quantifying the SHAP value of each feature, which represents the contribution of that feature to the predicted risk of OSA-related hypertension (25, 26). For each sample, the model produces a prediction value, and the sum or average of the absolute Shapley values of each feature across all individuals gives the overall feature importance. Features with large absolute Shapley values are the most important. In addition, the SHAP method shows whether each feature value has a positive or negative influence on the predicted result, similar to the sign of coefficients in logistic regression. A positive SHAP value indicates that the corresponding feature contributes to a higher risk of the outcome, whereas a negative SHAP value indicates that the corresponding feature contributes to a lower risk. To determine the main predictors of OSA-related hypertension, we ranked the importance of features in the final model through the SHAP summary plot and provided individual interpretations of the model’s decisions.
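
The game-theoretic idea can be illustrated with an exact Shapley computation on a toy model. The sketch below is a simplification: it enumerates all feature coalitions and replaces absent features with a single baseline value, whereas the shap package uses more efficient estimators and a background dataset. The risk function, patient values, and names are hypothetical and do not represent the study’s GBM model.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    Features outside a coalition are set to their baseline value. The cost grows
    as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features):
    """Hypothetical risk score with an interaction term (not the study's GBM)."""
    age10, family_history, min_sao2_10 = features
    return (0.3 * age10 + 0.8 * family_history - 0.4 * min_sao2_10
            + 0.1 * age10 * family_history)


patients = [[5.4, 1, 7.5], [4.6, 0, 8.8], [6.1, 1, 8.0]]
baseline = [sum(col) / len(patients) for col in zip(*patients)]
all_phi = [shapley_values(toy_risk, p, baseline) for p in patients]

# Global importance: mean absolute Shapley value of each feature across patients.
importance = [sum(abs(phi[i]) for phi in all_phi) / len(all_phi) for i in range(3)]
print(importance)
```

For each patient, the Shapley values sum to the difference between that patient’s score and the baseline score, which is the additivity property that makes individual explanations possible.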

Statistical analysis

All analyses and the random division of training and validation sets were performed with R software (version 3.6.0). Continuous variables were represented as the median (p25, p75), whereas categorical variables were represented as numbers (n) and proportions (%). The Wilcoxon rank-sum test was used to compare continuous variables between the two groups, and categorical variables were compared using the chi-squared test. Logistic regression analysis was used to analyze the relationship between the predictor variables (categorical or continuous) and the binary outcome. The Python programming language (version 3.8) was used to develop and evaluate the ML models and to build the web calculator. For model interpretation, SHAP was implemented using the Python shap package. P < 0.05 was considered statistically significant.
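
For readers unfamiliar with the categorical comparison, the following sketch computes a Pearson chi-squared test for a 2×2 table using only the Python standard library. It is illustrative only: the study ran its tests in R, no continuity correction is applied here, and the counts are hypothetical.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    Rows are groups and columns are categories:
        [[a, b],
         [c, d]]
    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: family history present/absent in each group.
stat, p = chi_squared_2x2(60, 40, 30, 70)
print(round(stat, 3), p < 0.05)
```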

Results

Patient characteristics

After the screening process, a total of 1,493 OSA patients were eligible for the study (Figure 1). The baseline characteristics of these patients are summarized in Table 1. For the demographic variables, the two groups were significantly different in heart disease, family history of hypertension, diabetes, BMI, waist circumference, neck circumference, and age/10 (all P < 0.05). For the lifestyle behavior variables, high-salt diet, poor sleep quality, and smoking amount were significant variables (all P < 0.05). For OSA-related medical history and indicators, memory decline, Epworth Sleepiness Scale, course of snoring, course of choking, AHI, OAI, minimum SaO₂/10, and CT90/10 were all significantly different between the two groups (all P < 0.05).

Figure 1. Summary of patient inclusion. AHI, apnea–hypopnea index; OSA, obstructive sleep apnea.

Table 1. Demographic and clinical characteristics.

Univariate and multivariate logistic regression

Variables with P < 0.05 in the univariate analysis were entered into multivariate logistic regression analysis to identify the independent risk factors for OSA-related hypertension (Table 2), and all regression coefficients are shown in Supplementary Table 1. The results indicated that family history of hypertension, BMI, age/10, minimum SaO₂/10, and CT90/10 were independent risk factors for OSA-related hypertension (all P < 0.05).
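
Because several predictors are expressed per 10 units, their odds ratios refer to a 10-unit change. The sketch below shows the conversion between a logistic regression coefficient and odds ratios per decade and per year. The coefficient is a hypothetical value chosen for illustration and is not an estimate from Table 2 or Supplementary Table 1.

```python
from math import exp, log


def odds_ratio(beta, units=1.0):
    """Odds ratio for an increase of `units` in a predictor with coefficient beta."""
    return exp(beta * units)


# Hypothetical coefficient for age/10 (log-odds per 10 years); not a study estimate.
beta_age10 = log(1.5)

or_per_decade = odds_ratio(beta_age10)       # one unit of age/10 = 10 years
or_per_year = odds_ratio(beta_age10, 0.1)    # 0.1 unit of age/10 = 1 year
print(round(or_per_decade, 3), round(or_per_year, 4))
```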

Table 2. Univariate analysis and multivariate logistic regression analysis of variables.

Performance of the machine learning algorithm

The average AUC of the six models determined by 10-fold cross-validation is displayed in Figure 2A, with the GBM model achieving the best performance (AUC = 0.837). The model validation results based on the validation set are displayed in Figure 2B, and the GBM model again exhibited the best performance in predicting OSA-related hypertension (AUC = 0.873). Moreover, we further evaluated the stability and accuracy of GBM through five cross-validations, and the results reveal that the GBM has good stability (average AUC = 0.810 ± 0.048) (Figure 2C). The radar plot of the six ML models is shown in Supplementary Figure 1. A comparison of model performance on the validation set is shown in Table 3. Generally, all models performed satisfactorily in terms of AUC but less well in terms of sensitivity. Among them, the GBM exhibited the highest sensitivity at 0.713. Because GBM yielded the best results for AUC and sensitivity, we chose the GBM model as the final prediction model and then evaluated it (Figure 3). On the basis of this model, we also developed a web-based risk calculator to facilitate clinical use, which can be accessed at https://shimunana-true-ml-vmz425.streamlitapp.com/ (Figure 4). The receiver operating characteristic curves of the other ML models are shown in Supplementary Figure 2.
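
The performance measures reported here can be computed from predicted probabilities as sketched below. The labels and scores are hypothetical, the function names are illustrative, and the 0.5 classification threshold is an assumption because the threshold used in the study is not stated.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(positives) * len(negatives))


def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity; assumes both classes are present."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels (1 = hypertension) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.7, 0.05]
print(roc_auc(y_true, y_prob), confusion_metrics(y_true, y_prob))
```

When negative cases outnumber positive ones, accuracy can be higher than sensitivity, which is one way a model can combine good accuracy with only moderate sensitivity.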

Figure 2. (A) Area under the curve (AUC) values of 10-fold cross-validation. (B) Validation of machine learning algorithms. (C) Receiver operating characteristic curve in the gradient boosting machine (GBM) model. AdaBoost, adaptive boosting; LR, logistic regression; Bagging, bootstrapped aggregating; MLP, multilayer perceptron; GBM, gradient boosting machine; XGBoost, extreme gradient boost; AUC, average area under the curve; ROC, receiver operating characteristic. With AUC as the indicator of performance, the GBM model achieved the best predictive performance and the Bagging model the lowest.

Table 3. Performance comparison of six machine learning (ML) models.

Figure 3. Confusion matrix of GBM. GBM, gradient boosting machine.

Figure 4. Web calculator for predicting OSA-related hypertension. OSA, obstructive sleep apnea.

Model interpretability

To identify the features that influenced the prediction model the most, we illustrated the SHAP summary plot of GBM and the top 15 features of the prediction model in decreasing order (Figures 5A,B). The SHAP summary plot shows that age/10, family history of hypertension, minimum SaO₂/10, BMI, and CT90/10 were the five most critical predictive features of the GBM model and had the most significant impact on the prediction results.

Figure 5. Shapley additive explanations (SHAP). (A,B) The standard and classified bar charts of the SHAP summary plots showed the influence of each parameter on the gradient boosting machine (GBM) model. (C,D) SHAP model explanation of two typical predictions. The features are ranked according to the sum of the SHAP values for all patients, and the SHAP values are used to show the distribution of the effect of each feature on the GBM model outputs. Each dot represents a case in the dataset. The color of a dot indicates the value of the feature, with blue indicating the lowest range and red the highest range. The horizontal axis shows the corresponding SHAP value of the feature. A positive SHAP value contributes to the prediction of OSA-related hypertension and vice versa. SHAP, Shapley additive explanations; GBM, gradient boosting machine; SaO₂, arterial oxygen saturation; BMI, body mass index; AHI, apnea–hypopnea index; CT90/10, percentage of time of SaO₂ < 90%/10.

SHAP values not only show the contribution of each feature to the final prediction but also clarify and explain model predictions for individual patients. We provide two real examples to illustrate the role of the SHAP method in explaining the machine learning model: a 46-year-old female patient who was diagnosed with OSA but had normal blood pressure and a 54-year-old male patient who was diagnosed with OSA-related hypertension (Figures 5C,D). The constructed model predicted the probability of OSA-related hypertension to be 23% and 57%, respectively. The model predicted the outcome as non-OSA-related hypertension for the female patient, which was consistent with the actual outcome (true negative). In addition, the model prediction result was OSA-related hypertension for the male patient, which was consistent with the actual situation (true positive).
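
An individual explanation of this kind is additive: the per-feature contributions are summed with a base value to give the model output for that patient. The sketch below assumes that the contributions are expressed in log-odds units and that a probability of 0.5 is the decision threshold. Both are assumptions, and all numbers are hypothetical rather than the values behind Figures 5C,D.

```python
from math import exp


def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + exp(-z))


def explained_probability(base_value, shap_values):
    """Add per-feature contributions to the base value, then convert to a probability.

    Assumes the explanation is expressed in log-odds units.
    """
    return sigmoid(base_value + sum(shap_values.values()))


# Hypothetical numbers for illustration only; not the values behind Figures 5C,D.
base_value = -1.2
contributions = {
    "age/10": 0.55,
    "family history of hypertension": 0.60,
    "minimum SaO2/10": 0.25,
    "BMI": 0.15,
    "CT90/10": -0.05,
}
probability = explained_probability(base_value, contributions)
print(round(probability, 3), "hypertension" if probability >= 0.5 else "no hypertension")
```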

Discussion

The present study is the first to assess the predictive performance of several machine learning algorithms for OSA-related hypertension, obtain a GBM model that can be used to predict OSA-related hypertension clinically, and explain the model. GBM is a commonly used ML algorithm with satisfactory performance in managing large and complex non-linear datasets and avoiding overfitting (27). We then designed a web-based risk calculator based on the GBM model to estimate the probability of hypertension in individuals with OSA, so as to help clinicians make targeted diagnoses and treatment plans and make precision medicine possible.

As hypothesized, our multivariate logistic regression suggested that BMI, age/10, and minimum SaO₂/10 were significant independent risk factors for OSA-related hypertension, which is consistent with previous research. Pan et al. found that among police officers in southern China, the prevalence of OSA-related hypertension was associated with the age of the patients; however, their study population was small and occupation-specific (28). Furthermore, Natsios and colleagues reported that age, BMI, comorbidity, daytime SaO₂, and indices of hypoxia during sleep were the most precise predictors of hypertension (29). Because of differences in study design and study population, some of our results differ from those of previous studies: family history of hypertension and CT90/10 were also found to be risk factors for OSA-related hypertension in our study. To further examine how input factors contribute to the model, we calculated SHAP feature importance and feature effects. The variable importance also showed that BMI, age/10, minimum SaO₂/10, family history of hypertension, and CT90/10 were the most important input parameters contributing to the predicted risk of OSA-related hypertension. This agreement indicates that these five variables are significant contributors to OSA-related hypertension and supports the accuracy and reliability of the GBM model. Nevertheless, a prospective study and animal experiments are essential to confirm the accuracy and reliability of our proposed model.

Interestingly, in addition to identifying several known risk factors, multivariate logistic regression and SHAP analysis also found that CT90/10, a variable that had been overlooked in previous cardiovascular studies, plays an important role in OSA-related hypertension. Previous studies have shown a significant association between CT90 and coronary artery calcium, cerebral small vessel disease, and diabetic nephropathy (30–32), but the relationship between CT90 and hypertension has not been explored. The mechanisms linking OSA and hypertension have not been fully elucidated. Several pathophysiological mechanisms have been suggested, such as elevated sympathetic nervous system activity, renin-angiotensin-aldosterone system activity, endothelial dysfunction, inflammation, and metabolic dysregulation (33). López-Cano et al. showed a positive and significant association between the nocturnal concentration of urine metanephrines and CT90 (34), suggesting that CT90 may influence sympathetic activity. This may explain the important role of CT90 in OSA-related hypertension, which deserves more attention in the future. Surprisingly, in our statistical model, AHI, the diagnostic indicator of adult OSA, contributed only weakly. Whether there is a dose–response relationship between the severity of OSA and the cumulative incidence of hypertension has been debated. The Wisconsin Sleep Cohort Study found a dose–response association between OSA and the presence of hypertension 4 years later (35). In contrast, the Sleep Heart Health Study and the Victoria Sleep Cohort Study found that the relationship between hypertension and OSA was no longer significant after age and BMI were controlled for (O’Connor et al., 36; Cano-Pumarega et al., 37). Additionally, AHI is a simple measure of the average number of respiratory events (apneas and hypopneas) per hour of sleep, and it does not adequately reflect the various phenotypes and comorbidities of OSA. Our results suggest that blood oxygen indicators (e.g., minimum SaO₂/10 and CT90/10) might be better predictors of OSA-related hypertension than AHI.

Notably, in the multivariate logistic regression, the risk of OSA-related hypertension was increased most by family history of hypertension, followed by age/10. However, the SHAP analysis showed that minimum SaO₂/10 has the highest predictive value for OSA-related hypertension. The discrepancy between multivariate logistic regression and SHAP values can be explained by the prevalence of a variable. Odds ratios are calculated only for patients who have the variable, whereas the mean SHAP value is calculated across all patients. The average SHAP value is then used to evaluate the importance of features and rank them. Hence, variables with low impact and high prevalence will have low odds ratios but high SHAP values.
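
The prevalence argument can be checked with a small calculation. The sketch below uses a simplified setting that is an assumption, not the study’s GBM: in a linear log-odds model with independent features, the SHAP value of a feature x is beta × (x − mean(x)), so for a binary feature with prevalence p the mean absolute SHAP value is |beta| × 2p(1 − p). The coefficients and prevalences are hypothetical.

```python
def mean_abs_shap_binary(beta, prevalence):
    """Mean |SHAP| of a binary feature in a linear log-odds model.

    Patients with the feature (share p) contribute |beta| * (1 - p) and patients
    without it (share 1 - p) contribute |beta| * p.
    """
    p = prevalence
    return abs(beta) * (p * (1 - p) + (1 - p) * p)


# Hypothetical features: a rare strong risk factor and a common weak one.
rare_strong = mean_abs_shap_binary(beta=2.0, prevalence=0.02)
common_weak = mean_abs_shap_binary(beta=0.5, prevalence=0.50)
print(round(rare_strong, 4), round(common_weak, 4))
```

In this toy setting the common but weak factor receives the larger mean absolute SHAP value even though its coefficient, and therefore its odds ratio, is smaller.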

In our study, standard clinical variables were fully integrated with polysomnography parameters during the construction of the ML model. The model can thus stratify the risk of OSA-related hypertension for a patient using all relevant covariates rather than individual measures. Our approach was also validated with repeated 10-fold cross-validation to provide a robust estimate of prediction accuracy with minimal bias. The six models performed well, with AUC ranging from 0.698 to 0.873 and sensitivity from 0.353 to 0.713 in the test dataset. The GBM prediction model, which had the highest AUC, accuracy, and sensitivity, was identified as the final model for this study and for clinical use. With an AUC of 0.873 and a sensitivity of 0.713, the GBM model shows good discrimination and stability. Furthermore, we used Shapley values to explain the GBM model. SHAP is a model-agnostic interpretation technique that interprets black box models globally and locally and can provide a rational explanation for each prediction, which can significantly enhance the confidence of clinicians in the model.

However, despite our best efforts, this study still has some limitations. First, this is a single-center retrospective study, and the performance of machine learning algorithms may vary across institutions and across datasets with different distributions of patient characteristics. Therefore, more patients from multiple sources are required to validate the robustness and repeatability of our model in the future. Second, the suboptimal sensitivity may arise because the ML algorithm learns only from the input features, and some subtle relationships may have been lost because of unknown or disregarded features not recorded by doctors. In the future, we will conduct prospective validation based on this model, continue to explore crucial risk factors for OSA-related hypertension, and modify the model further to improve the accuracy and reliability of the GBM prediction model.

Conclusion

We established a risk prediction model for OSA-related hypertension patients using the ML method and demonstrated that the GBM model performs best among the six ML models. This prediction model could help to identify high-risk OSA-related hypertension patients, provide early and individualized diagnoses and treatment plans, protect patients from the serious consequences of OSA-related hypertension, and reduce the burden on society.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving human participants were reviewed and approved by the Ethics Committee of The Second Affiliated Hospital of Xi’an Jiaotong University (approval no. 2021031). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author contributions

YS, SW, and XR designed the research. XC and LM wrote the manuscript. YF, YY, HL, and LY collected the data. YX, ZC, and CZ performed data curation. YZ, LM, and WL analyzed and processed the data. XR directed the research, is the guarantor of the manuscript, and takes full responsibility for the integrity of the work, from its inception to the published manuscript. All authors reviewed the results and approved the final version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (grant no. 62076198) and the Key Research and Development Program in the Social Development Field of Shaanxi, China (grant nos. 2020GXLH-Y005 and 2021SF-286).

Acknowledgments

We wish to thank all who volunteered to participate in this study.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcvm.2022.1042996/full#supplementary-material

Abbreviations

OSA, obstructive sleep apnea; ML, machine learning; AUC, area under the curve; SHAP, Shapley additive explanations; SaO₂, arterial oxygen saturation; OAI, obstructive apnea index; BMI, body mass index; GBM, gradient boosting machine; ABPM, ambulatory blood pressure monitoring; AHI, apnea–hypopnea index; CT90/10, percentage of time of SaO₂ < 90%/10.

References

1. Benjafield AV, Ayas NT, Eastwood PR, Heinzer R, Ip MSM, Morrell MJ, et al. Estimation of the global prevalence and burden of obstructive sleep apnoea: a literature-based analysis. Lancet Respir Med. (2019) 7:687–98.

2. Shi Y, Luo H, Liu H, Hou J, Feng Y, Chen J, et al. Related biomarkers of neurocognitive impairment in children with obstructive sleep apnea. Int J Pediatr Otorhinolaryngol. (2019) 116:38–42. doi: 10.1016/j.ijporl.2018.10.015

3. Shi Y, Feng Y, Chen X, Ma L, Cao Z, Shang L, et al. Serum neurofilament light reflects cognitive dysfunctions in children with obstructive sleep apnea. BMC Pediatr. (2022) 22:449. doi: 10.1186/s12887-022-03514-9

4. Yeghiazarians Y, Jneid H, Tietjens JR, Redline S, Brown DL, El-Sherif N, et al. Obstructive sleep apnea and cardiovascular disease: a scientific statement from the American Heart Association. Circulation. (2021) 144:e56–67. doi: 10.1161/CIR.0000000000000988

5. Kiely JL, McNicholas WT. Cardiovascular risk factors in patients with obstructive sleep apnoea syndrome. Eur Respir J. (2000) 16:128–33. doi: 10.1034/j.1399-3003.2000.16a23.x

6. Millman RP, Redline S, Carlisle CC, Assaf AR, Levinson PD. Daytime hypertension in obstructive sleep apnea. Prevalence and contributing risk factors. Chest. (1991) 99:861–6. doi: 10.1378/chest.99.4.861

7. Worsnop CJ, Naughton MT, Barter CE, Morgan TO, Anderson AI, Pierce RJ. The prevalence of obstructive sleep apnea in hypertensives. Am J Respir Crit Care Med. (1998) 157:111–5. doi: 10.1164/ajrccm.157.1.9609063

8. HPCO Chinese. Medical association expert consensus on clinical diagnosis and treatment of obstructive sleep apnea-related hypertension. Chin J Pract Intern Med. (2013).

9. Kario K, Hettrick DA, Prejbisz A, Januszewicz A. Obstructive sleep apnea-induced neurogenic nocturnal hypertension: a potential role of renal denervation? Hypertension. (2021) 77:1047–60. doi: 10.1161/HYPERTENSIONAHA.120.16378

10. Ahmad M, Makati D, Akbar S. Review of and updates on hypertension in obstructive sleep apnea. Int J Hypertens. (2017) 2017:1848375. doi: 10.1155/2017/1848375

11. Pio-Abreu A, Moreno H Jr, Drager LF. Obstructive sleep apnea and ambulatory blood pressure monitoring: current evidence and research gaps. J Hum Hypertens. (2021) 35:315–24. doi: 10.1038/s41371-020-00470-8

12. Baguet JP, Lévy P, Barone-Rochette G, Tamisier R, Pierre H, Peeters M, et al. Masked hypertension in obstructive sleep apnea syndrome. J Hypertens. (2008) 26:885–92. doi: 10.1097/HJH.0b013e3282f55049

13. García-Río F, Pino JM, Alonso A, Arias MA, Martínez I, Alvaro D, et al. White coat hypertension in patients with obstructive sleep apnea-hypopnea syndrome. Chest. (2004) 125:817–22. doi: 10.1378/chest.125.3.817

14. Parati G, Lombardi C, Hedner J, Bonsignore MR, Grote L, Tkacova R, et al. Recommendations for the management of patients with obstructive sleep apnoea and hypertension. Eur Respir J. (2013) 41:523–38. doi: 10.1183/09031936.00226711

15. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. (2020) 395:1579–86. doi: 10.1016/S0140-6736(20)30226-9

16. Ward A, Sarraju A, Chung S, Li J, Harrington R, Heidenreich P, et al. Machine learning and atherosclerotic cardiovascular disease risk prediction in a multi-ethnic population. NPJ Digit Med. (2020) 3:125. doi: 10.1038/s41746-020-00331-1

17. Molnar C. Interpretable Machine Learning. Morrisville: Lulu.com (2020).

18. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. In: Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S editors. Advances in Neural Information Processing Systems 30. San Jose, CA: Curran Associates, Inc (2017).

19. Sateia MJ. International classification of sleep disorders-third edition: highlights and modifications. Chest. (2014) 146:1387–94. doi: 10.1378/chest.14-0970

20. Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep. (1991) 14:540–5. doi: 10.1093/sleep/14.6.540

21. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining-KDD 2016. San Francisco, CA: (2016). p. 785–94. doi: 10.1145/2939672.2939785

22. Singh A, Thakur N, Sharma A. A review of supervised machine learning algorithms. 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom). Piscataway: IEEE (2016). p. 1310–5.

23. Ngiam KY, Khor IW. Big data and machine learning algorithms for health-care delivery. Lancet Oncol. (2019) 20:e262–73. doi: 10.1016/S1470-2045(19)30149-4

24. Zhang Z, Zhao Y, Canes A, Steinberg D, Lyashevska O. Predictive analytics with gradient boosting in clinical medicine. Ann Transl Med. (2019) 7:152. doi: 10.21037/atm.2019.03.29

25. Rodríguez-Pérez R, Bajorath J. Interpretation of compound activity predictions from complex machine learning models using local approximations and Shapley values. J Med Chem. (2020) 63:8761–77. doi: 10.1021/acs.jmedchem.9b01101

26. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. (2020) 2:56–67. doi: 10.1038/s42256-019-0138-9

27. Sheridan RP, Wang WM, Liaw A, Ma J, Gifford EM. Extreme gradient boosting as a method for quantitative structure-activity relationships. J Chem Inf Model. (2016) 56:2353–60. doi: 10.1021/acs.jcim.6b00591

28. Pan M, Ou Q, Chen B, Hong Z, Liu H. Risk factors for obstructive sleep apnea-related hypertension in police officers in Southern China. J Thorac Dis. (2019) 11:4169–78. doi: 10.21037/jtd.2019.09.83

29. Natsios G, Pastaka C, Vavougios G, Zarogiannis SG, Tsolaki V, Dimoulis A, et al. Age, body mass index, and daytime and nocturnal hypoxia as predictors of hypertension in patients with obstructive sleep apnea. J Clin Hypertens. (2016) 18:146–52. doi: 10.1111/jch.12645

30. Liu X, Lam DC, Mak HK, Ip MS, Lau KK. Associations of sleep apnea risk and oxygen desaturation indices with cerebral small vessel disease burden in patients with stroke. Front Neurol. (2022) 13:956208. doi: 10.3389/fneur.2022.956208

31. Seo MY, Lee SH, Hong SD, Chung SK, Kim HY. Hypoxemia during sleep and the progression of coronary artery calcium. Cardiovasc Toxicol. (2021) 21:42–8. doi: 10.1007/s12012-020-09593-3

32. Zhang R, Zhang P, Zhao F, Han X, Ji L. Association of diabetic microvascular complications and parameters of obstructive sleep apnea in patients with type 2 diabetes. Diabetes Technol Ther. (2016) 18:415–20. doi: 10.1089/dia.2015.0433

33. Salman LA, Shulman R, Cohen JB. Obstructive sleep apnea, hypertension, and cardiovascular risk: epidemiology, pathophysiology, and management. Curr Cardiol Rep. (2020) 22:6. doi: 10.1007/s11886-020-1257-y

34. López-Cano C, Gutiérrez-Carrasquilla L, Sánchez E, González J, Yeramian A, Martí R, et al. Sympathetic hyperactivity and sleep disorders in individuals with type 2 diabetes. Front Endocrinol. (2019) 10:752. doi: 10.3389/fendo.2019.00752

35. Peppard PE, Young T, Palta M, Skatrud J. Prospective study of the association between sleep-disordered breathing and hypertension. N Engl J Med. (2000) 342:1378–84. doi: 10.1056/NEJM200005113421901

36. O’Connor GT, Caffo B, Newman AB, Quan SF, Rapoport DM, Redline S, et al. Prospective study of sleep-disordered breathing and hypertension: the Sleep Heart Health Study. Am J Respir Crit Care Med. (2009) 179:1159–64. doi: 10.1164/rccm.200712-1809OC

37. Cano-Pumarega I, Durán-Cantolla J, Aizpuru F, Miranda-Serrano E, Rubio R, Martínez-Null C, et al. Obstructive sleep apnea and systemic hypertension: longitudinal study in the general population: the Vitoria sleep cohort. Am J Respir Crit Care Med. (2011) 184:1299–304. doi: 10.1164/rccm.201101-0130OC

Keywords: obstructive sleep apnea, hypertension, machine learning, risk factor, Shapley additive explanations, gradient boosting machine (GBM)

Citation: Shi Y, Ma L, Chen X, Li W, Feng Y, Zhang Y, Cao Z, Yuan Y, Xie Y, Liu H, Yin L, Zhao C, Wu S and Ren X (2022) Prediction model of obstructive sleep apnea–related hypertension: Machine learning–based development and interpretation study. Front. Cardiovasc. Med. 9:1042996. doi: 10.3389/fcvm.2022.1042996

Received: 13 September 2022; Accepted: 21 November 2022;
Published: 05 December 2022.

Edited by:

Yanwu Xu, Baidu, China

Reviewed by:

Hongliang Yi, Shanghai Jiao Tong University, China
Xiangdong Tang, Sichuan University, China

Copyright © 2022 Shi, Ma, Chen, Li, Feng, Zhang, Cao, Yuan, Xie, Liu, Yin, Zhao, Wu and Ren. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shinan Wu, wshinana99@163.com; Xiaoyong Ren, cor_renxiaoyong@126.com

†These authors have contributed equally to this work and share first authorship
