The use of heart rate variability, oxygen saturation, and anthropometric data with machine learning to predict the presence and severity of obstructive sleep apnea

dos Santos, Rafael Rodrigues; Marumo, Matheo Bellini; Eckeli, Alan Luiz; Salgado, Helio Cesar; Silva, Luiz Eduardo Virgílio; Tinós, Renato; Fazan, Rubens

doi:10.3389/fcvm.2025.1389402

ORIGINAL RESEARCH article

Front. Cardiovasc. Med., 14 March 2025

Sec. Cardioneurology

Volume 12 - 2025 | https://doi.org/10.3389/fcvm.2025.1389402

This article is part of the Research Topic "Updates on Cardiovascular Variability: Underlying Mechanisms and Non-Pharmacological Therapeutic Targets"

Rafael Rodrigues dos Santos¹

Matheo Bellini Marumo²

Alan Luiz Eckeli³

Helio Cesar Salgado¹

Luiz Eduardo Virgílio Silva⁴

Renato Tinós²†

Rubens Fazan Jr¹*†

¹Department of Physiology, School of Medicine of Ribeirao Preto, University of Sao Paulo, Ribeirão Preto, Brazil
²Department of Computing and Mathematics, Faculty of Philosophy, Sciences and Letters, University of Sao Paulo, Ribeirão Preto, Brazil
³Department of Neuroscience and Behavior Sciences, Division of Neurology, School of Medicine of Ribeirao Preto, University of Sao Paulo, Ribeirão Preto, Brazil
⁴Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA, United States

Introduction: Obstructive sleep apnea (OSA) is a prevalent sleep disorder with a high rate of undiagnosed patients, primarily due to the complexity of its diagnosis by polysomnography (PSG). Considering the severe comorbidities associated with OSA, especially in the cardiovascular system, the development of early screening tools for this disease is imperative. Heart rate variability (HRV) is a simple and non-invasive approach used to evaluate cardiac autonomic modulation, and many newly developed HRV indices have not yet been studied in OSA patients.

Objectives: We aimed to evaluate numerous HRV indices, derived from linear but mainly nonlinear methods, combined or not with oxygen saturation indices, for detecting the presence and severity of OSA using machine learning models.

Methods: ECG waveforms were collected from 291 PSG recordings to calculate 34 HRV indices. Minimum oxygen saturation value during sleep (SatMin), the percentage of total sleep time the patient spent with oxygen saturation below 90% (T90), and patient anthropometric data were also considered as inputs to the models. The Apnea-Hypopnea Index (AHI) was used to categorize patients into OSA severity classes (normal, mild, moderate, severe) to train multiclass or binary (normal-to-mild and moderate-to-severe) classification models, using the Random Forest (RF) algorithm. Since the OSA severity groups were unbalanced, we used the Synthetic Minority Over-sampling Technique (SMOTE) to oversample the minority classes.

Results: Multiclass models achieved a mean area under the ROC curve (AUROC) of 0.92 and 0.86 in classifying normal individuals and severe OSA patients, respectively, when using all attributes. When the groups were dichotomized into normal-to-mild OSA vs. moderate-to-severe OSA, an AUROC of 0.83 was obtained. As revealed by RF, the importance of features indicates that all feature modalities (HRV, SpO₂, and anthropometric variables) contribute to the top 10 ranks.

Conclusion: The present study demonstrates the feasibility of using classification models to detect the presence and severity of OSA using these indices. Our findings have the potential to contribute to the development of rapid screening tools aimed at assisting individuals affected by this condition, to expedite diagnosis and initiate timely treatment.

Introduction

Obstructive sleep apnea (OSA) is the most prevalent sleep disorder, characterized by repetitive events of partial and/or total obstruction of the upper airway during sleep. These obstructive events cause recurrent episodes of hypoxia and hypercapnia, leading to marked physiological disturbances (1). The gold-standard diagnostic method for OSA is polysomnography (PSG), a comprehensive exam that simultaneously records multiple physiological signals during sleep, enabling the analysis of sleep stages and their disturbances (2, 3). The diagnosis and severity assessment of OSA are determined using the Apnea-Hypopnea Index (AHI), a quantitative measure based on the number of apnea and hypopnea events per hour of total sleep time. This index is calculated by clinical specialists who analyze PSG recordings (4). Nonetheless, the time-consuming nature and associated costs of PSG tests contribute to a substantial backlog of subjects awaiting examination, leading to an increased likelihood of underdiagnosis of OSA at both the individual and population levels (5). The recent COVID-19 pandemic worsened this scenario by reducing the number of PSG exams, mainly during lockdown periods (6).

It is well established that OSA is associated with the development of several comorbidities, especially those related to the cardiovascular system. In addition, due to poor sleep, patients with OSA show a general decrease in quality of life and have an increased risk of being involved in work and traffic accidents, putting their own lives and those of others in danger (7–9). Since the diagnosis of OSA is a bottleneck in this scenario, the development of new diagnostic techniques for OSA is of utmost relevance.

An increasingly valuable tool in screening for cardiovascular and systemic diseases is the examination of heart rate variability (HRV). It is a non-invasive approach that evaluates time series of cardiac intervals derived from the electrocardiogram and can provide insights into the autonomic nervous system's modulation of cardiac function (10). Many studies have reported HRV as an important marker of cardiovascular risk, and a markedly increasing number of approaches to studying HRV have been proposed in recent years (10–12). A robust body of studies has highlighted alterations in HRV indices in OSA (11, 13). However, most of these studies are limited to evaluating “traditional” HRV indices derived from conventional linear methods. Notably, there is a scarcity of literature exploring the use of more recent nonlinear approaches for HRV assessment (13). Given the complex dynamics of biological systems, techniques capable of addressing their non-stationarity, stochasticity, and nonlinear characteristics can be highly beneficial (14–16). Since HRV fluctuations exhibit nonlinear dynamics, linear methods are inherently limited in fully capturing the information contained in such signals (17). Therefore, there is a consensus that using a comprehensive set of HRV indices derived from both linear and nonlinear methods is the most effective approach for characterizing health and disease (18–20).

Another tool that has recently taken a vital role in medicine and biomedical sciences is artificial intelligence, in particular, machine learning. Machine learning models automatically identify patterns in a dataset and use them to make decisions (21–23). Machine learning can be robust in leveraging large and complex data, allowing the creation of predictive models in several clinical settings, including diagnosis and treatment decisions, gene expression analysis, drug response, and pharmacokinetics, among others (24, 25). Hence, machine learning has emerged as a promising tool to assist clinicians in decision-making.

The present study aims to evaluate the utility of a comprehensive set of HRV indices in predicting the classification of individuals with suspected OSA into different severity levels (no OSA, mild OSA, moderate OSA, severe OSA) or in a binary classification based on an AHI cutoff of 15 (normal-to-mild OSA vs. moderate-to-severe OSA). To conduct this evaluation, machine learning models were trained using a comprehensive set of linear and nonlinear HRV indices, along with demographic and anthropometric variables. Additionally, we investigate how models incorporating HRV indices perform compared to models using only SpO₂ indices or a combination of both. We hypothesize that a thorough HRV profiling during sleep can function as a screening tool to identify individuals more likely to have OSA, thereby assisting in the management of the waiting list for PSG exams.

Methods

Data acquisition

Four hundred thirty-eight (438) PSG exams, performed between 2015 and 2022 in the University Hospital of the Ribeirao Preto Medical School from the University of Sao Paulo, were collected. The entire protocol was approved by the Human Research Ethics Committee of the same hospital (Protocol: 42058720.6.000.5440/4.550.2327). The PSG exam records high-resolution waveforms such as the electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), electrocardiogram (ECG), thoracic and abdominal respiratory inductive plethysmography straps, pulse oximetry (SpO₂), a nasal pressure transducer system and a nasal and mouth thermocouple airflow sensor to monitor the airflow, a microphone to detect snores, a sensor to determine body position, and a video camera to monitor the patient during sleep. The signals were exported to a European Data Format (EDF) file for further analysis.

The ECG waveforms (sampled at 512 Hz) were read from the EDF files and analyzed using the software LabChart (ECG module for LabChart, ADInstruments, Dunedin, New Zealand). Six segments of 15 min were obtained from each patient, one for each of the first 6 h of ECG recording during sleep. Segments were selected based on the quality of the ECG signal by visual inspection of the best 15 min period of the ECG within each hour, i.e., the period with the least interference from external noise, movement artifacts, arrhythmias, or gasps. Next, the RR intervals (RRi) of each segment were calculated, generating one RRi series per segment. The RRi series were corrected for spurious values (e.g., beat misdetections and ectopic beats) using PyBioS software with the following procedure: for each RRi series, the baseline was estimated using a moving median window of size W. Upper and lower tolerance threshold lines were then created by shifting the baseline series up and down by a percentage T of the baseline average (26). RRi values lying below the lower or above the upper tolerance line were replaced using linear interpolation. Corrections were at most 2.5% of total estimated beats (27).
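The correction procedure described above can be sketched as follows. This is a minimal illustration and not the PyBioS implementation: the function name `correct_rri` and the default values of `window` (W) and `tolerance` (T) are assumptions, since the text does not report the values used.

```python
import bisect
from statistics import median


def correct_rri(rri, window=11, tolerance=0.2):
    """Replace spurious RR intervals by linear interpolation.

    `window` (W) and `tolerance` (T) are illustrative defaults,
    not the values used in the study.
    Returns the corrected series and the fraction of corrected beats.
    """
    n = len(rri)
    half = window // 2
    # Moving-median baseline (the window is truncated at the edges).
    baseline = [median(rri[max(0, i - half):min(n, i + half + 1)])
                for i in range(n)]
    # Tolerance band: baseline shifted up/down by T times the baseline average.
    shift = tolerance * sum(baseline) / n
    good = [i for i in range(n) if abs(rri[i] - baseline[i]) <= shift]
    if not good:
        return list(rri), 0.0
    good_set = set(good)
    corrected = list(rri)
    for i in range(n):
        if i in good_set:
            continue
        # Nearest accepted beats on each side of the spurious one.
        pos = bisect.bisect_left(good, i)
        left = good[pos - 1] if pos > 0 else None
        right = good[pos] if pos < len(good) else None
        if left is None:
            corrected[i] = rri[right]
        elif right is None:
            corrected[i] = rri[left]
        else:
            w = (i - left) / (right - left)
            corrected[i] = rri[left] + w * (rri[right] - rri[left])
    return corrected, (n - len(good)) / n
```

The returned fraction can be compared against the 2.5% limit stated in the text to decide whether a series should be kept.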

Inclusion/exclusion criteria

The inclusion criteria for the study were patients at least 18 years old, with a minimum PSG recording time of 6 h. The exclusion criteria were exams with corrupted files, poor ECG signal quality, RRi series with more than 2.5% spurious values, CPAP titration, or a diagnosis of central sleep apnea. After applying the inclusion and exclusion criteria, 291 out of the 438 PSG exams were eligible. A total of 147 exams were excluded due to corrupted files (n = 6), poor ECG quality (n = 75), insufficient recording time (n = 22), age less than 18 (n = 4), CPAP titration (n = 3), or missing data in the PSG report (n = 37).

Heart rate variability

Linear and nonlinear HRV indices were calculated for each patient's six corrected RRi series. As linear methods, we used both time- and frequency-domain indices (10). In the time domain, the mean of RRi, the standard deviation of normal-to-normal intervals (SDNN), and the root mean square of the successive RRi differences (RMSSD) were calculated. In the frequency domain, the spectral analysis of RRi was used. For this approach, the RRi series were resampled at 3 Hz using cubic spline interpolation and divided into segments of 512 values overlapped by 50%. Then, after the application of a Hanning window, the spectrum of each segment was calculated by the periodogram (Fourier transform) and integrated into bands of very low (VLF; <0.04 Hz), low (LF; 0.04–0.15 Hz), and high frequencies (HF; 0.15–0.4 Hz). The mean power over all 512-point segments represented the full 15 min segment. Spectral powers were presented in absolute values (“abs”) and normalized units (LFnu and HFnu). The LF/HF ratio was also calculated (28).
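As a rough sketch of the linear indices described above, the snippet below computes the time-domain indices and the band powers using NumPy and SciPy. The function names are hypothetical, `scipy.signal.welch` is used as one way to average windowed periodograms over 50%-overlapping 512-point segments, and the normalized units are taken as LF/(LF + HF) and HF/(LF + HF), which is one common convention and an assumption here.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import welch


def time_domain(rri_ms):
    """Mean RRi, SDNN and RMSSD (all in ms) of one RRi series."""
    rri = np.asarray(rri_ms, dtype=float)
    diff = np.diff(rri)
    return {"meanRR": rri.mean(),
            "SDNN": rri.std(ddof=1),
            "RMSSD": np.sqrt(np.mean(diff ** 2))}


def frequency_domain(rri_ms, fs=3.0, nperseg=512):
    """VLF, LF and HF powers from an evenly resampled RRi series."""
    rri = np.asarray(rri_ms, dtype=float)
    beat_times = np.cumsum(rri) / 1000.0           # seconds
    grid = np.arange(beat_times[0], beat_times[-1], 1.0 / fs)
    resampled = CubicSpline(beat_times, rri)(grid)  # 3 Hz cubic spline
    # Averaged Hann-windowed periodograms, 50% overlap between segments.
    freqs, psd = welch(resampled, fs=fs, window="hann",
                       nperseg=nperseg, noverlap=nperseg // 2,
                       detrend="constant")
    df = freqs[1] - freqs[0]

    def band(lo, hi):
        mask = (freqs >= lo) & (freqs < hi)
        return float(np.sum(psd[mask]) * df)

    # The zero-frequency bin is excluded from VLF (mean already removed).
    vlf = band(df / 2, 0.04)
    lf = band(0.04, 0.15)
    hf = band(0.15, 0.40)
    return {"VLFabs": vlf, "LFabs": lf, "HFabs": hf,
            "LFnu": 100 * lf / (lf + hf),   # assumed normalization
            "HFnu": 100 * hf / (lf + hf),
            "LF/HF": lf / hf}
```

A 15 min segment resampled at 3 Hz yields about 2,700 points, so several 512-point segments contribute to each averaged spectrum.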

Several indices were used from the “family” of nonlinear methods. Detrended fluctuation analysis (DFA), which estimates the fractal (or self-similarity) scaling present in the time series, was calculated for the scaling range 5 < n < 15 (α1), where n is the window size of RRi values considered (29). Also, seven entropy measures were calculated. Entropy is generally characterized as an unpredictability/irregularity analysis of time series, with different nuances among the entropy estimators (30). In this study, we calculated sample entropy (SampEn; sequence length m = 2; tolerance r = 0.15), fuzzy entropy (FuzzyEn; sequence length m = 2; tolerance r = 0.15; fuzzy exponent n = 2), distribution entropy (DistEn; sequence length m = 3; number of bins M = 512), attention entropy (AttEn), dispersion entropy (DispEn; sequence length m = 3; number of classes nc = 6), phase entropy (PhaseEn; number of sectors k = 16), and permutation entropy (PermEn; sequence length m = 3; noise added to deal with equal values). The formalism and parameters of all the entropy measures mentioned above are described in detail elsewhere (31–37).
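To illustrate one of the entropy estimators listed above, here is a minimal sketch of sample entropy with m = 2 and r = 0.15. It assumes that the tolerance r is expressed as a fraction of the standard deviation of the series, which is a common convention but is not stated in the text; the function name is hypothetical.

```python
import math
from statistics import pstdev


def sample_entropy(x, m=2, r=0.15):
    """Sample entropy of a series (no self-matches).

    Assumption: the tolerance is r times the standard deviation of x.
    """
    n = len(x)
    tol = r * pstdev(x)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b)
                           for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    matches += 1
        return matches

    b = count_matches(m)        # matching pairs of length m
    a = count_matches(m + 1)    # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")     # undefined for this series length
    return -math.log(a / b)
```

A perfectly regular series yields a value of zero, while irregular series yield larger values.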

Besides fractal and entropy measures, two symbolic dynamics analysis methods were calculated in our study. Briefly, the Max-Min method, proposed by Porta and co-workers (38, 39), splits the full range of RRi values into 6 equal bins, ranging from the maximum to the minimum RRi, and each RRi value is assigned a symbol (0–5) according to the bin to which it belongs. Next, “words” composed of a sequence of 3 consecutive symbols are created and classified into one out of four families, namely 0V (zero variation), 1V (one variation), 2LV (two-like variation), and 2UV (two-unlike variation). The percentage of occurrence of each pattern is calculated, generating the indices Symb-0V, Symb-1V, Symb-2LV, and Symb-2UV. Another symbolic approach used is the binary method (19). This approach is similar to Max-Min, differing in how RRi values are converted into symbols. In the binary approach, each RRi is assigned a binary symbol (0 or 1), depending on the sign of the difference between the RRi and its successive RRi. Following this, “words” composed of three successive symbols are classified into one of three families, and their percentage of occurrence is calculated. This results in the generation of the indices Bin-0V, Bin-1V, and Bin-2V. Both symbolic dynamics approaches were previously shown to be associated with the autonomic modulation of the heart and could be considered a nonlinear alternative to the spectral analysis (19, 39, 40).
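The two symbolic schemes above can be sketched as below. Function names are hypothetical, and two details are assumptions because the text does not specify them: in the binary method an increase in RRi is coded as 1, and equal consecutive values are coded as 0.

```python
from collections import Counter


def _family(word):
    """Classify a 3-symbol word into 0V, 1V, 2LV or 2UV."""
    d1, d2 = word[1] - word[0], word[2] - word[1]
    if d1 == 0 and d2 == 0:
        return "0V"
    if d1 == 0 or d2 == 0:
        return "1V"
    # Two variations: same direction (like) or opposite (unlike).
    return "2LV" if d1 * d2 > 0 else "2UV"


def symbolic_max_min(rri, levels=6):
    """Percentages of the Symb-0V, -1V, -2LV and -2UV patterns."""
    lo, hi = min(rri), max(rri)
    width = (hi - lo) / levels or 1.0   # guard against a constant series
    symbols = [min(int((v - lo) / width), levels - 1) for v in rri]
    words = [tuple(symbols[i:i + 3]) for i in range(len(symbols) - 2)]
    counts = Counter(_family(w) for w in words)
    return {k: 100 * counts[k] / len(words)
            for k in ("0V", "1V", "2LV", "2UV")}


def symbolic_binary(rri):
    """Percentages of the Bin-0V, Bin-1V and Bin-2V patterns."""
    # Assumption: 1 codes an increase in RRi, 0 a decrease or no change.
    symbols = [1 if b > a else 0 for a, b in zip(rri, rri[1:])]
    words = [symbols[i:i + 3] for i in range(len(symbols) - 2)]
    counts = Counter(sum(s != t for s, t in zip(w, w[1:])) for w in words)
    return {f"{k}V": 100 * counts[k] / len(words) for k in (0, 1, 2)}
```

For example, a series that alternates between its minimum and maximum produces only 2UV words in the Max-Min scheme, while a monotonic ramp produces only 0V words in the binary scheme.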

Heart rate fragmentation (HRF), a recent HRV method proposed by Costa and co-workers (18, 41), was also evaluated in the present study. HRF consists of analyzing the inflection points in the RRi series, i.e., changes from HR acceleration to deceleration and vice-versa. The symbolic dynamics approach proposed for HRF assigns each RRi difference a symbol (1 when RRi is decreasing, −1 when RRi is increasing, and 0 when there is no difference). A transition between two consecutive different symbols, i.e., any transition except 1 to 1, −1 to −1, and 0 to 0, characterizes an inflection point. Then, words of 4 consecutive symbols are evaluated, and the percentage of words with zero (W0), one (W1), two (W2), or three (W3) inflection points is calculated. The overall percentage of inflection points (PIP) was also obtained. Although the HRF approach adopted here is based on a symbolic dynamics analysis (except for PIP), we did not include HRF in the symbolic dynamics methods to clearly distinguish their interpretation. HRF is intended to evaluate the degradation of heart rate dynamics, which appears as a fragmented heart rate.
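The HRF word counts described above can be sketched as follows. The function name is hypothetical, and PIP is computed here relative to the number of consecutive symbol pairs, which is an assumption about the denominator.

```python
def heart_rate_fragmentation(rri):
    """Percentages of W0 to W3 words and of inflection points (PIP)."""
    # Symbol per RRi difference: 1 = RRi decreasing (HR accelerating),
    # -1 = RRi increasing (HR decelerating), 0 = no change.
    symbols = []
    for a, b in zip(rri, rri[1:]):
        d = b - a
        symbols.append(1 if d < 0 else (-1 if d > 0 else 0))
    # An inflection point is a transition between two different symbols.
    changes = [int(s != t) for s, t in zip(symbols, symbols[1:])]
    pip = 100 * sum(changes) / len(changes)   # assumed denominator
    # A word of 4 symbols spans 3 transitions, hence 0 to 3 inflections.
    counts = [0, 0, 0, 0]
    n_words = len(symbols) - 3
    for i in range(n_words):
        counts[sum(changes[i:i + 3])] += 1
    result = {f"W{k}": 100 * c / n_words for k, c in enumerate(counts)}
    result["PIP"] = pip
    return result
```

A monotonic series gives only "fluent" W0 words, whereas a series that alternates on every beat gives only W3 words.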

Other important HRV approaches, such as asymmetry and acceleration/deceleration (AC/DC) capacity, were also used. Asymmetry methods estimate whether the changes in the RRi series are similar when the series is time-reversed. Here, we calculated three asymmetry methods, namely Porta's, Guzik's, and Ehlers' indices. Porta's and Guzik's indices return a value of 50% for perfectly symmetric RRi series, representing the balance between positive and negative variations within the series (42, 43). In contrast, Ehlers' index is based on the skewness of RRi differences, in which values near 0 represent a symmetric (time-reversible) series (43). On the other hand, the AC/DC calculates from the RRi series the average magnitude (capacity) of the heart to accelerate and decelerate (44).
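A minimal sketch of the three asymmetry indices follows, using commonly cited formulations that are assumptions here, since the exact definitions are given in the cited references rather than in the text: Porta's index as the percentage of negative RRi differences among the non-zero ones, Guzik's index as the share of the squared differences contributed by positive differences, and Ehlers' index as the skewness of the differences. AC/DC is not covered.

```python
def asymmetry_indices(rri):
    """Porta's (%), Guzik's (%) and Ehlers' (skewness) asymmetry indices.

    The formulas are commonly cited definitions and are assumed here.
    """
    diffs = [b - a for a, b in zip(rri, rri[1:])]
    nonzero = [d for d in diffs if d != 0]
    # Porta: share of negative differences among non-zero differences.
    porta = 100 * sum(d < 0 for d in nonzero) / len(nonzero)
    # Guzik: share of squared differences coming from positive differences.
    guzik = (100 * sum(d * d for d in diffs if d > 0)
             / sum(d * d for d in diffs))
    # Ehlers: skewness of the differences (third standardized moment).
    mean = sum(diffs) / len(diffs)
    m2 = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    m3 = sum((d - mean) ** 3 for d in diffs) / len(diffs)
    ehlers = m3 / m2 ** 1.5
    return {"Porta": porta, "Guzik": guzik, "Ehlers": ehlers}
```

With these definitions, a series whose increases and decreases mirror each other returns 50% for Porta's and Guzik's indices and 0 for Ehlers' index, matching the interpretation given above.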

Altogether, the aforementioned methods yield 34 indices for HRV analysis. For each patient, the value of each HRV index was averaged over the six hourly RRi segments.

PSG reports

PSG reports were provided by a certified sleep medicine physician, and some scores related to the presence and severity of OSA, as well as anthropometric information, were selected. The PSG scores collected were the AHI, the minimum oxygen saturation value during sleep (SatMin), and the percentage of total sleep time the patient stayed with oxygen saturation below 90% (T90). The AHI is based on the number of apnea/hypopnea events in each hour, averaged over the entire sleep time. An obstructive apnea event can be defined as a reduction >90% of the thermistor airflow lasting at least 10 s, associated with a respiratory effort, while hypopnea is characterized by a reduction >30% of the nasal pressure airflow in a period >10 s, associated with an oxygen desaturation >3% or arousal (4). The AHI was used to classify the level of severity of OSA. Patients with an AHI between 5 and 15 were considered mild-OSA patients; between 15 and 30 moderate; and above 30 severe-OSA patients. Individuals with an AHI below 5 were considered normal (45). As anthropometric data, the patient's gender, age, height, weight, and body mass index (BMI = kg/m²) were obtained.
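The labeling rules above translate directly into a few small functions. The names are hypothetical, and the handling of values that fall exactly on a boundary (5, 15, or 30) is an assumption, since the text gives the ranges without specifying which side is inclusive.

```python
def apnea_hypopnea_index(n_events, total_sleep_hours):
    """Apnea/hypopnea events per hour of total sleep time."""
    return n_events / total_sleep_hours


def osa_severity(ahi):
    """Four-class label; boundary values go to the higher class (assumed)."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


def osa_binary(ahi, cutoff=15):
    """Two-class label based on the AHI cutoff of 15."""
    return "moderate-to-severe" if ahi >= cutoff else "normal-to-mild"


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m²."""
    return weight_kg / height_m ** 2
```

For instance, 84 events over 7 h of sleep give an AHI of 12, which falls in the mild class and in the normal-to-mild group of the binary scheme.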

Machine learning

The machine learning (ML) models were implemented in Python using the scikit-learn library (46). Models were trained to classify the patient's OSA class using three different input sets of features: (1) HRV indices + anthropometric data; (2) SpO₂ indices + anthropometric data; and (3) HRV + SpO₂ indices + anthropometric data. Both multiclass (normal, mild, moderate, severe) and binary (normal-to-mild, moderate-to-severe) classification models were evaluated, with the latter utilizing an AHI cutoff of 15 (47).

The models were created using the random forest (RF) algorithm, trained with 100 trees and without a depth limit. The performance of the models was assessed by a 10-fold cross-validation scheme (48). Since our dataset is unbalanced (see Table 1), we used the Synthetic Minority Over-sampling Technique (SMOTE) (49, 50) to oversample the classes with fewer samples. SMOTE creates synthetic data from randomly chosen samples of the minority classes and their k-nearest neighbors (here, k = 5), generating each artificial sample by randomly choosing a point on the line segment between a sample and one of its neighbors.
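The interpolation step of SMOTE described above can be illustrated with a short standard-library sketch. This is not the implementation used in the study; the function name `smote_sketch` and its arguments are hypothetical, and Euclidean distance is assumed for the neighbor search.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from a list of minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbors of the base sample (Euclidean, assumed).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()   # position along the segment, in [0, 1)
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic
```

When oversampling is combined with cross-validation, it is usually applied to the training folds only, so that synthetic points derived from test samples do not leak into training; the text does not state how this was handled.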

Table 1. Anthropometric and PSG report information about normal and OSA individuals in different severity classes. Sex is the number of men (%) for each group. The other variables represent the median (1st–3rd quartiles).

To assess the importance of each feature, the feature importance assigned by the RF algorithm using the entire dataset was evaluated. RF computes feature importance by measuring the impurity decrease associated with each feature within each decision tree created in the model.
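To make the notion of impurity decrease concrete, the sketch below computes the weighted Gini impurity decrease of a single split. It assumes the Gini criterion, which is the scikit-learn default but is not stated in the text, and the function names and the example values are hypothetical. In a forest, such decreases are accumulated per feature over all splits and trees and then normalized to sum to one.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(values, labels, threshold, n_total=None):
    """Weighted impurity decrease of the split `value <= threshold`.

    n_total is the number of training samples in the whole tree; it
    defaults to the node size, i.e., the node is treated as the root.
    """
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    n_total = n_total or n
    children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return (n / n_total) * (gini(labels) - children)


# Hypothetical SatMin values (%) for two severe and two normal subjects.
sat_min = [85, 88, 95, 97]
classes = ["severe", "severe", "normal", "normal"]
decrease = impurity_decrease(sat_min, classes, threshold=90)
```

In this toy example the split separates the two classes perfectly, so the decrease equals the parent impurity of 0.5.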

Statistical analysis

The differences in clinical and anthropometric variables among the four groups of OSA were tested using Chi-squared (gender differences) and Kruskal–Wallis with Dunn's post-hoc tests (other variables). To evaluate the performance of the classification models, the area under the receiver operating characteristic curve (AUROC) was calculated. AUROC = 0.5 represents a random model, while AUROC = 1 points to a perfect classification model (20–24). The AUROC 95% confidence intervals were calculated using bootstrap with 2,000 repetitions. The difference between AUROC values was tested using the DeLong test followed by multi-comparison correction for a false positive rate (51–53).
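The AUROC and its bootstrap confidence interval can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, a percentile bootstrap is assumed because the text does not name the bootstrap variant, and for the multiclass models each class would be scored one-vs-rest. The DeLong test is not covered.

```python
import random


def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC (assumed)."""
    rng = random.Random(seed)
    indices = list(range(len(y_true)))
    stats = []
    while len(stats) < n_boot:
        sample = rng.choices(indices, k=len(indices))
        ys = [y_true[i] for i in sample]
        if 0 < sum(ys) < len(ys):   # both classes must be present
            stats.append(auroc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Calling `bootstrap_ci(y, s, n_boot=2000)` mirrors the 2,000 repetitions mentioned above and returns the bounds of the 95% interval.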

Results

Table 1 summarizes the characteristics of the 291 individuals included in the study. The summary of variables used as inputs for the machine learning models is presented in Table 2.

Table 2. Summary of features used as input attributes to create machine learning models in OSA prediction and severity.

Classification models

Table 3 shows the AUROC results obtained for the multiclass and binary models. The AUROC values for the normal and severe classes were the best, regardless of the input dataset. In contrast, the moderate OSA class always had the worst performance. The model created using all features (HRV + SpO₂ + Anthrop.) resulted in higher AUROC for the normal and severe classes compared to the SpO₂ model. For the moderate class, combining HRV and SpO₂ showed superior AUROC compared to either HRV or SpO₂ alone. For the mild class, however, no statistical differences were observed between the three models.

Table 3. AUROC (95% confidence interval) of the multiclass and binary models created from different input datasets.

Table 4 shows the top 10 most important features for the models trained using all features available (i.e., HRV + SpO₂ + Anthrop.) and either four or two classes. The oxygen saturation indices (SatMin and T90) were always the top 2 attributes. Anthropometric data, such as weight and BMI, were always in the top 5 attributes. Among HRV indices, VLFabs appears in the top 5 for both multiclass and binary models, while HRF-W0 and DFA-α1 were always in the top 10. Other HRV indices present in the top 10 list were Bin-0V, Bin-1V, HRF-W0, and SampEn. The attribute that contributed the most to any model was T90, reaching a maximum percentage of 8.2%.

Table 4. Top 10 attributes for the multiclass and binary models. Values represent the percentage contribution of each feature to the total impurity decrease [rank within the model].

Discussion

In this study, we analyzed a set of non-invasive indices derived from different data sources for the creation of machine-learning predictive models of OSA. The combination of indices (HRV, SpO₂) and anthropometric variables contributed to a consistent and strong performance across OSA severity classes, achieving AUROC values as high as 0.83 in the binary classification.

Predictors of OSA

The features considered as inputs to the machine learning models were chosen based on three main factors: (1) the clinical relevance that those indices can bring related to the pathophysiology of OSA; (2) the low cost and facility to obtain the feature, particularly when compared to the PSG exam; and (3) the lack of studies reporting the predictive value of these features with OSA patients.

In OSA, personal information and anthropometric data have demonstrated clinical relevance, and many are included in risk factor calculation. It is well documented in the literature that male, obese, and older individuals have a high chance of developing OSA (54). Standard oxygen saturation indices are also well studied in OSA. As hypoxia is one of the central physiological disturbances in patients with OSA, indices that a pulse oximeter can detect are clinically relevant in this population (55, 56). The assessment of these parameters has been proposed in several questionnaires and screening tools seeking to differentiate between OSA and healthy individuals (55, 57). Techniques involving ECG-derived features aiming to analyze OSA have been studied for almost 40 years, and more recently, HRV has emerged as a widely studied tool in this disease (58, 59). It is already established that certain HRV indices exhibit significant differences when comparing patients with OSA to healthy subjects, likely attributed to alterations in cardiac autonomic modulation induced by OSA (60). However, most studies involving HRV and OSA have focused on traditional indices derived from the time and frequency domain, with the predictive value of most nonlinear indices being unknown in this condition (11, 13). In the present study, we demonstrated that some nonlinear HRV features are ranked in the top 10 most important for predicting OSA, although none of them rank in the top 5.

Machine learning predictions of OSA presence and severity

Artificial intelligence models have already been used in sleep medicine, and studies aiming at building screening tests for patients with OSA using machine learning predictive models can be widely found in the literature (61). In the present study, we demonstrated an improvement in AUROC when HRV, SpO₂, and anthropometric data were combined, compared to models that used only HRV or only SpO₂ features together with anthropometric data. Some studies corroborate our findings. A recent study by Park & Kim in a large sample of Koreans with OSA showed an AUROC of 0.69 for the RF algorithm, using linear HRV indices and anthropometric data for an AHI cutoff of 15 (62). Using HRV, SpO₂ indices, and anthropometric data, Li and co-workers achieved an AUROC of 0.97 in differentiating OSA from normal subjects using a feed-forward neural network algorithm (63). Similarly, Zhu and co-workers used HRV and SpO₂ indices to detect OSA and showed that the combination of HRV and SpO₂ values had a higher performance than using HRV indices alone (64). This reinforces that combining these indices can enhance the performance of screening and classification models.

Despite the differences between methodologies, many studies using HRV and other clinical indices to create classification models for OSA used binary OSA categorization based on different AHI cutoffs. Ravelo-Garcia and co-workers showed that the combinations of HRV and SpO₂ indices enhanced the metrics of classifiers compared to those created with HRV indices alone, achieving an AUROC of 0.91 for an AHI cutoff of 10 (65). Baty and co-workers used the principal component analysis to select some HRV linear indices to classify OSA individuals and achieved an AUROC of 0.82 with an AHI cutoff of 18 (66). In our study, an AHI cutoff of 15 was considered because AHI > 15 is the clinical threshold established for diagnosing OSA, even when no associated symptoms are present (47).

Creating models based solely on two classes is a strategic approach, aiming to develop a screening tool focused on detecting OSA in its most severe stages. This strategy is intended to prioritize cases that would benefit from increased attention from the healthcare team. It is well-documented that patients with high AHI values can be at a higher risk of developing comorbidities (67). Moreover, severe patients remaining untreated can be at higher risk of presenting cardiovascular events in the future (68). Our best binary model provided an AUROC of 0.83, showing great potential as a screening tool for the most urgent cases, identifying the patients that could be prioritized for a complete PSG evaluation.

It is important to note that, as this is a retrospective study with data collected from a specialized sleep outpatient clinic, it was not surprising that the dataset contains a higher number of patients with OSA than normal subjects. Therefore, normal individuals composed the minority class in our sample (16.15% of the sample considering an AHI < 5, and 37.8% of the sample considering an AHI < 15). Since machine learning models are sensitive to class imbalance, the SMOTE technique was implemented to avoid bias, seeking to improve the models' performance and reliability. This technique has been applied in several clinical studies, including studies of OSA, and is important when the number of individuals per class is imbalanced, showing an improvement in classifier models for different scenarios (62, 69–71).

Ranking of the features by their importance

The importance of the features obtained with RF revealed that SatMin, T90, weight, and BMI consistently ranked among the top 5 features. This is consistent with the clinical importance of these indices described previously. Moreover, the two SpO₂ indices were always the two most important features in all models, consistent with the clinical importance of these variables for OSA diagnosis (55). Nevertheless, the importance assigned to SpO₂-derived attributes (5.9%–8.2%) does not stand out compared to the importance associated with the other features (2.4%–5.2%). Notably, the list of top 10 features comprises a combination of all three types of variables, i.e., SpO₂, HRV, and anthropometric.

Among the HRV indices, VLFabs stands out for consistently being in the top 5 list. The VLF band of the spectral analysis does not have a clear definition of its physiological meaning. Studies have attributed its value to thermoregulation, the activity of the renin-angiotensin-aldosterone system, and other humoral factors (28). A study by Francis and co-workers showed that changes in periodic breathing and oxygen desaturation can also affect the VLF component, which can be a finding in apneic patients (72). Several other studies have highlighted the significance of the VLF band in the context of OSA (73). Apneic patients exhibit higher VLF compared to normal individuals, and it has been suggested that this alteration can be reversed through therapeutic strategies for OSA (72, 74–76). The study from Baty and co-workers also demonstrated the VLF band as an essential feature in classifying patients with OSA (66). These findings suggest that VLFabs carries valuable information about the condition of OSA. Therefore, it should be considered in the development of OSA risk factors.

In addition to VLFabs, the W0 pattern of the HRF method was also present in all top 10 feature rankings. HRF is a recent approach that requires additional investigation to further elucidate its biological meaning. HRF indices with the most inflection points (W3 and PIP) are associated with a high cardiovascular risk (18). Nevertheless, the authors who introduced HRF emphasized that the interpretation of each index may vary among different diseases (41). In OSA, W0 may be particularly affected by the cyclic variation found in the ECG (58). Studies by Guzik and co-workers and Jiang and co-workers, employing an asymmetry approach that quantifies the length of acceleration and deceleration runs, confirm that patients with severe OSA exhibit a high number of long runs for both acceleration and deceleration (20, 77). With long sequences of acceleration and/or deceleration runs, the RRi will be changing in the same direction (up or down) most of the time, creating a high number of “fluent” patterns in HRF (W0). This particularity of OSA may explain the higher importance associated with W0 and also with the VLF power of RRi spectra, both reflecting patterns of slow oscillations of heart rate.

Other nonlinear HRV indices that appear in all the top 10 lists of features include DFA-α1, SampEn, and Bin-1V. These indices are often associated with the analysis of the system's complexity, and some of them have been evaluated in studies of OSA (12, 28, 73, 78). A previous study from our research group demonstrated that nonlinear HRV indices were sensitive in detecting differences between OSA classes (particularly severe cases) and normal individuals, with significant correlations observed between these HRV indices and the AHI (79). While the physiological interpretation of these indices may not be entirely clear, they are acknowledged to offer valuable information about the organism's health status, closely tied to the concept of “physiological complexity” (15, 80). Therefore, the physiological changes induced by OSA can also be observed at a system level, as captured by these nonlinear HRV indices, contributing to a better assessment of OSA.

Limitations

We acknowledge important limitations of the present study. First, the OSA severity classes are imbalanced in sample size. While the SMOTE technique helps address this limitation, it generates artificial data and may produce noisy instances that are not entirely comparable to real data. Second, several relevant pieces of patient information were not available, including comorbidities, medication usage, and personal history (e.g., level of physical activity, smoking, alcohol consumption, and race). We recognize that these factors may play a crucial role in the disease, and their impact on predictive models should be assessed in future studies. Third, the selection of RRi segments was based on a visual quality assessment to avoid artifacts, without considering the sleep stage or the presence of respiratory events.
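The first limitation can be made concrete with a minimal sketch of how SMOTE-style oversampling creates artificial instances. It is not the code used in this study: the function name `smote_like_samples`, the number of neighbors `k`, and the toy data are assumptions for illustration. Each synthetic sample is placed at a random position on the line segment between a real minority-class sample and one of its nearest minority-class neighbors, which is why synthetic points never correspond to a real patient.

```python
import numpy as np


def smote_like_samples(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbors."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)
    # Pairwise Euclidean distances; a sample is never its own neighbor
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]
    synthetic = np.empty((n_new, minority.shape[1]))
    for j in range(n_new):
        i = rng.integers(n)                    # random real sample
        nb = neighbors[i, rng.integers(k)]     # one of its k neighbors
        gap = rng.random()                     # position along the segment
        synthetic[j] = minority[i] + gap * (minority[nb] - minority[i])
    return synthetic


# Toy minority class with two hypothetical features
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic_points = smote_like_samples(minority_class, n_new=20, k=2, seed=1)
```

Because every synthetic point is a convex combination of two real samples, the new instances stay within the range of the observed minority data. If the real samples are noisy or overlap with another class, the interpolated points inherit that noise.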

This proof-of-concept study encourages follow-up studies with data collected during the waking period, which may be more compelling in some cases. However, it is important to emphasize that, even if the models derived from electrocardiographic recordings collected during sleep do not replicate with data collected during the waking period, a Holter electrocardiographic recording, conducted overnight, is undeniably simpler, easier, and more cost-effective than a PSG. Therefore, the models obtained in the present study emerge as valuable screening tools for patients suspected of having OSA.

Conclusion

The present study demonstrated that a comprehensive set of HRV features, combined with SpO₂ and patient information, can be used to train highly effective predictive models for OSA classification. These models showed strong performance across different OSA severity levels, highlighting their potential as reliable diagnostic tools. Given the non-invasive nature and ease of obtaining the evaluated features, they offer a promising approach for quick and cost-effective screening of patients suspected of having OSA. This combination of HRV, SpO₂, and anthropometric data could enable early detection and stratification of OSA severity, facilitating timely interventions and improving patient outcomes.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by the Research Ethics Committee of HC-FMRP/USP (Protocol: 42058720.6.000.5440/4.550.2327). The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants' legal guardians/next of kin because the data were collected retrospectively from exams previously conducted at the University Hospital.

Author contributions

RS: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Visualization, Writing – original draft, Writing – review & editing. MM: Data curation, Formal analysis, Investigation, Validation, Writing – review & editing. HS: Writing – review & editing. AE: Data curation, Writing – review & editing. LS: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Writing – review & editing. RT: Data curation, Formal analysis, Investigation, Supervision, Validation, Writing – review & editing. RF: Conceptualization, Methodology, Project administration, Resources, Supervision, Visualization, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. We acknowledge the funding agencies FAPESP (2020/06043-7), CAPES (88887.596933/2021-00), and CNPq (139305/2019-0 & 423999/2021-4) for the financial support.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Dempsey JA, Veasey SC, Morgan BJ, O’Donnell CP. Pathophysiology of sleep apnea. Physiol Rev. (2010) 90(1):47–112. doi: 10.1152/physrev.00043.2008

2. Kapur VK, Auckley DH, Chowdhuri S, Kuhlmann DC, Mehra R, Ramar K, et al. Clinical practice guideline for diagnostic testing for adult obstructive sleep apnea: an American academy of sleep medicine clinical practice guideline. J Clin Sleep Med. (2017) 13(3):479–504. doi: 10.5664/jcsm.6506

3. Jafari B, Mohsenin V. Polysomnography. Clin Chest Med. (2010) 31(2):287–97. doi: 10.1016/j.ccm.2010.02.005

4. Berry RB, Budhiraja R, Gottlieb DJ, Gozal D, Iber C, Kapur VK, et al. Rules for scoring respiratory events in sleep: update of the 2007 AASM manual for the scoring of sleep and associated events. J Clin Sleep Med. (2012) 8(5):597–619. doi: 10.5664/jcsm.2172

5. Flemons WW, Douglas NJ, Kuna ST, Rodenstein DO, Wheatley J. Access to diagnosis and treatment of patients with suspected sleep apnea. Am J Respir Crit Care Med. (2004) 169(6):668–72. doi: 10.1164/rccm.200308-1124PP

6. Grote L, McNicholas WT, Hedner J. Sleep apnoea management in Europe during the COVID-19 pandemic: data from the European sleep apnoea database (ESADA). Eur Respir J. (2020) 55(6):2001323. doi: 10.1183/13993003.01323-2020

7. Sassani A, Findley LJ, Kryger M, Goldlust E, George C, Davidson TM. Reducing motor-vehicle collisions, costs, and fatalities by treating obstructive sleep apnea syndrome. Sleep. (2004) 27(3):453–8. doi: 10.1093/sleep/27.3.453

8. Somers VK, White DP, Amin R, Abraham WT, Costa F, Culebras A, et al. Sleep apnea and cardiovascular disease: an American heart association/American college of cardiology foundation scientific statement from the American heart association council for high blood pressure research professional education committee, council on clinical cardiology, stroke council, and council on cardiovascular nursing. In collaboration with the national heart, lung, and blood institute national center on sleep disorders research (national institutes of health). Circulation. (2008) 118(10):1080–111. doi: 10.1161/CIRCULATIONAHA.107.189420

9. Borsoi L, Armeni P, Donin G, Costa F, Ferini-Strambi L. The invisible costs of obstructive sleep apnea (OSA): systematic review and cost-of-illness analysis. PLoS One. (2022) 17(5):e0268677. doi: 10.1371/journal.pone.0268677

10. Electrophysiology TF. Heart rate variability: standards of measurement, physiological interpretation and clinical use. Task force of the European society of cardiology and the North American society of pacing and electrophysiology. Circulation. (1996) 93(5):1043–65. doi: 10.1161/01.CIR.93.5.1043

11. Tobaldini E, Nobili L, Strada S, Casali KR, Braghiroli A, Montano N. Heart rate variability in normal and pathological sleep. Front Physiol. (2013) 4:294. doi: 10.3389/fphys.2013.00294

12. Sassi R, Cerutti S, Lombardi F, Malik M, Huikuri HV, Peng CK, et al. Advances in heart rate variability signal analysis: a joint position statement by the e-cardiology ESC working group and the European heart rhythm association co-endorsed by the Asia Pacific heart rhythm society. Ep Europace. (2015) 17(9):1341–53. doi: 10.1093/europace/euv015

13. Sequeira VCC, Bandeira PM, Azevedo JCM. Heart rate variability in adults with obstructive sleep apnea: a systematic review. Sleep Sci. (2019) 12(3):214–21. doi: 10.5935/1984-0063.20190082

14. Borowska M. Entropy-based algorithms in the analysis of biomedical signals. Stud Logic Grammar Rhetoric. (2015) 43(1):21–32. doi: 10.1515/slgr-2015-0039

15. Goldberger AL, Peng CK, Lipsitz LA. What is physiologic complexity and how does it change with aging and disease? Neurobiol Aging. (2002) 23(1):23–6. doi: 10.1016/S0197-4580(01)00266-4

16. Silva LEV, Silva CAA, Salgado HC, Fazan R. The role of sympathetic and vagal cardiac control on the complexity of heart rate dynamics. Am J Physiol Heart Circ Physiol. (2017) 312(3):H469–77. doi: 10.1152/ajpheart.00507.2016

17. Silva LEV, Lataro RM, Castania JA, Silva CAA, Salgado HC, Fazan R, et al. Nonlinearities of heart rate variability in animal models of impaired cardiac control: contribution of different time scales. J Appl Physiol. (2017) 123(2):344–51. doi: 10.1152/japplphysiol.00059.2017

18. Costa MD, Davis RB, Goldberger AL. Heart rate fragmentation: a new approach to the analysis of cardiac interbeat interval dynamics. Front Physiol. (2017) 8:255. doi: 10.3389/fphys.2017.00255

19. Cysarz D, Van Leeuwen P, Edelhäuser F, Montano N, Somers VK, Porta A. Symbolic transformations of heart rate variability preserve information about cardiac autonomic control. Physiol Meas. (2015) 36(4):643–57. doi: 10.1088/0967-3334/36/4/643

20. Guzik P, Piskorski J, Awan K, Krauze T, Fitzpatrick M, Baranchuk A. Obstructive sleep apnea and heart rate asymmetry microstructure during sleep. Clin Auton Res. (2013) 23(2):91–100. doi: 10.1007/s10286-013-0188-8

21. Choi RY, Coyner AS, Kalpathy-Cramer J, Chiang MF, Campbell JP. Introduction to machine learning, neural networks, and deep learning. Transl Vis Sci Technol. (2020) 9(2):14. doi: 10.1167/tvst.9.2.14

22. Greener JG, Kandathil SM, Moffat L, Jones DT. A guide to machine learning for biologists. Nat Rev Mol Cell Biol. (2022) 23(1):40–55. doi: 10.1038/s41580-021-00407-0

23. Uddin S, Khan A, Hossain ME, Moni MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform Decis Mak. (2019) 19(1):281. doi: 10.1186/s12911-019-1004-8

24. Handelman GS, Kok HK, Chandra RV, Razavi AH, Lee MJ, Asadi H. eDoctor: machine learning and the future of medicine. J Intern Med. (2018) 284(6):603–19. doi: 10.1111/joim.12822

25. van IJzendoorn DGP, Szuhai K, Briaire-de Bruijn IH, Kostine M, Kuijjer ML, Bovée JVMG. Machine learning analysis of gene expression data reveals novel diagnostic and prognostic biomarkers and identifies therapeutic targets for soft tissue sarcomas. PLoS Comput Biol. (2019) 15(2):e1006826. doi: 10.1371/journal.pcbi.1006826

26. Silva LEV, Fazan R, Marin-Neto JA. PyBioS: a freeware computer software for analysis of cardiovascular signals. Comput Methods Programs Biomed. (2020) 197:105718. doi: 10.1016/j.cmpb.2020.105718

27. Rincon Soler AI, Silva LEV, Fazan R, Murta LO. The impact of artifact correction methods of RR series on heart rate variability parameters. J Appl Physiol. (2018) 124(3):646–52. doi: 10.1152/japplphysiol.00927.2016

28. Shaffer F, Ginsberg JP. An overview of heart rate variability metrics and norms. Front Public Health. (2017) 5:258. doi: 10.3389/fpubh.2017.00258

29. Peng CK, Havlin S, Stanley HE, Goldberger AL. Quantification of scaling exponents and crossover phenomena in nonstationary heartbeat time series. Chaos. (1995) 5(1):82–7. doi: 10.1063/1.166141

30. Ribeiro M, Henriques T, Castro L, Souto A, Antunes L, Costa-Santos C, et al. The entropy universe. Entropy. (2021) 23(2):222. doi: 10.3390/e23020222

31. Richman JS, Moorman JR. Physiological time-series analysis using approximate entropy and sample entropy. Am J Physiol Heart Circ Physiol. (2000) 278(6):H2039–2049. doi: 10.1152/ajpheart.2000.278.6.H2039

32. Rostaghi M, Azami H. Dispersion entropy: a measure for time-series analysis. IEEE Signal Process Lett. (2016) 23(5):610–4. doi: 10.1109/LSP.2016.2542881

33. Li P, Liu C, Li K, Zheng D, Liu C, Hou Y. Assessing the complexity of short-term heartbeat interval series by distribution entropy. Med Biol Eng Comput. (2015) 53(1):77–87. doi: 10.1007/s11517-014-1216-0

34. Chen W, Wang Z, Xie H, Yu W. Characterization of surface EMG signal based on fuzzy entropy. IEEE Trans Neural Syst Rehabil Eng. (2007) 15(2):266–72. doi: 10.1109/TNSRE.2007.897025

35. Bandt C, Pompe B. Permutation entropy: a natural complexity measure for time series. Phys Rev Lett. (2002) 88(17):174102. doi: 10.1103/PhysRevLett.88.174102

36. Rohila A, Sharma A. Phase entropy: a new complexity measure for heart rate variability. Physiol Meas. (2019) 40(10):105006. doi: 10.1088/1361-6579/ab499e

37. Yang J, Choudhary GI, Rahardja S, Fränti P. Classification of interbeat interval time-series using attention entropy. IEEE Trans Affect Comput. (2023) 14(1):321–30. doi: 10.1109/TAFFC.2020.3031004

38. Porta A, Guzzetti S, Montano N, Furlan R, Pagani M, Malliani A, et al. Entropy, entropy rate, and pattern classification as tools to typify complexity in short heart period variability series. IEEE Trans Biomed Eng. (2001) 48(11):1282–91. doi: 10.1109/10.959324

39. Porta A, Tobaldini E, Guzzetti S, Furlan R, Montano N, Gnecchi-Ruscone T. Assessment of cardiac autonomic modulation during graded head-up tilt by symbolic analysis of heart rate variability. Am J Physiol-Heart Circ Physiol. (2007) 293(1):H702–8. doi: 10.1152/ajpheart.00006.2007

40. Silva LEV, Geraldini VR, de Oliveira BP, Silva CAA, Porta A, Fazan R. Comparison between spectral analysis and symbolic dynamics for heart rate variability analysis in the rat. Sci Rep. (2017) 7(1):8428. doi: 10.1038/s41598-017-08888-w

41. Costa MD, Davis RB, Goldberger AL. Heart rate fragmentation: a symbolic dynamical approach. Front Physiol. (2017) 8:827. doi: 10.3389/fphys.2017.00827

42. Guzik P, Piskorski J, Krauze T, Wykretowicz A, Wysocki H. Heart rate asymmetry by Poincaré plots of RR intervals. Biomed Tech. (2006) 51(4):272–5. doi: 10.1515/BMT.2006.054

43. Porta A, Casali KR, Casali AG, Gnecchi-Ruscone T, Tobaldini E, Montano N, et al. Temporal asymmetries of short-term heart period variability are linked to autonomic regulation. Am J Physiol Regul Integr Comp Physiol. (2008) 295(2):R550–557. doi: 10.1152/ajpregu.00129.2008

44. Bauer A, Kantelhardt JW, Barthel P, Schneider R, Mäkikallio T, Ulm K, et al. Deceleration capacity of heart rate as a predictor of mortality after myocardial infarction: cohort study. Lancet. (2006) 367(9523):1674–81. doi: 10.1016/S0140-6736(06)68735-7

45. American Academy of Sleep Medicine. Sleep-related breathing disorders in adults: recommendations for syndrome definition and measurement techniques in clinical research. Sleep. (1999) 22(5):667–89. doi: 10.1093/sleep/22.5.667

46. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in python. J Mach Learn Res. (2011) 12(85):2825–30.

47. Sateia MJ. International classification of sleep disorders-third edition: highlights and modifications. Chest. (2014) 146(5):1387–94. doi: 10.1378/chest.14-0970

48. Lo Vercio L, Amador K, Bannister JJ, Crites S, Gutierrez A, MacDonald ME, et al. Supervised machine learning tools: a tutorial for clinicians. J Neural Eng. (2020) 17(6):062001. doi: 10.1088/1741-2552/abbff2

49. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. (2002) 16:321–57. doi: 10.1613/jair.953

50. Elreedy D, Atiya AF. A comprehensive analysis of synthetic minority oversampling technique (SMOTE) for handling class imbalance. Inf Sci. (2019) 505:32–64. doi: 10.1016/j.ins.2019.07.070

51. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. (1995) 57(1):289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x

52. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. (1988) 44(3):837–45. doi: 10.2307/2531595

53. Sun X, Xu W. Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process Lett. (2014) 21(11):1389–93. doi: 10.1109/LSP.2014.2337313

54. Gharibeh T, Mehra R. Obstructive sleep apnea syndrome: natural history, diagnosis, and emerging treatment options. Nat Sci Sleep. (2010) 2:233–55. doi: 10.2147/NSS.S6844

55. Del Campo F, Crespo A, Cerezo-Hernández A, Gutiérrez-Tobal GC, Hornero R, Álvarez D. Oximetry use in obstructive sleep apnea. Expert Rev Respir Med. (2018) 12(8):665–81. doi: 10.1080/17476348.2018.1495563

56. Azarbarzin A, Sands SA, Stone KL, Taranto-Montemurro L, Messineo L, Terrill PI, et al. The hypoxic burden of sleep apnoea predicts cardiovascular disease-related mortality: the osteoporotic fractures in men study and the sleep heart health study. Eur Heart J. (2019) 40(14):1149–57. doi: 10.1093/eurheartj/ehy624

57. Costa J C, Rebelo-Marques A, Machado JN, Gama JMR, Santos C, Teixeira F, et al. Validation of NoSAS (neck, obesity, snoring, age, sex) score as a screening tool for obstructive sleep apnea: analysis in a sleep clinic. Pulmonology. (2019) 25(5):263–70. doi: 10.1016/j.pulmoe.2019.04.004

58. Guilleminault C, Connolly S, Winkle R, Melvin K, Tilkian A. Cyclical variation of the heart rate in sleep apnoea syndrome: mechanisms, and usefulness of 24 h electrocardiography as a screening technique. Lancet. (1984) 1(8369):126–31. doi: 10.1016/S0140-6736(84)90062-X

59. Ucak S, Dissanayake HU, Sutherland K, de Chazal P, Cistulli PA. Heart rate variability and obstructive sleep apnea: current perspectives and novel technologies. J Sleep Res. (2021) 30(4):e13274. doi: 10.1111/jsr.13274

60. Somers VK, Dyken ME, Clary MP, Abboud FM. Sympathetic neural mechanisms in obstructive sleep apnea. J Clin Invest. (1995) 96(4):1897–904. doi: 10.1172/JCI118235

61. Goldstein CA, Berry RB, Kent DT, Kristo DA, Seixas AA, Redline S, et al. Artificial intelligence in sleep medicine: background and implications for clinicians. J Clin Sleep Med. (2020) 16(4):609–18. doi: 10.5664/jcsm.8388

62. Park P, Kim JW. A classifying model of obstructive sleep apnea based on heart rate variability in a large Korean population. J Korean Med Sci. (2023) 38(7):e49. doi: 10.3346/jkms.2023.38.e49

63. Li Z, Li Y, Zhao G, Zhang X, Xu W, Han D. A model for obstructive sleep apnea detection using a multi-layer feed-forward neural network based on electrocardiogram, pulse oxygen saturation, and body mass index. Sleep Breathing. (2021) 25(4):2065–72. doi: 10.1007/s11325-021-02302-6

64. Zhu J, Zhou A, Gong Q, Zhou Y, Huang J, Chen Z. Detection of sleep apnea from electrocardiogram and pulse oximetry signals using random forest. Appl Sci. (2022) 12(9):4218. doi: 10.3390/app12094218

65. Ravelo-García AG, Kraemer JF, Navarro-Mesa JL, Hernández-Pérez E, Navarro-Esteva J, Juliá-Serdá G, et al. Oxygen saturation and RR intervals feature selection for sleep apnea detection. Entropy. (2015) 17(5):2932–57. doi: 10.3390/e17052932

66. Baty F, Boesch M, Widmer S, Annaheim S, Fontana P, Camenzind M, et al. Classification of sleep apnea severity by electrocardiogram monitoring using a novel wearable device. Sensors. (2020) 20(1):286. doi: 10.3390/s20010286

67. Loke YK, Brown JWL, Kwok CS, Niruban A, Myint PK. Association of obstructive sleep apnea with risk of serious cardiovascular events: a systematic review and meta-analysis. Circ Cardiovasc Qual Outcomes. (2012) 5(5):720–8. doi: 10.1161/CIRCOUTCOMES.111.964783

68. Marin JM, Carrizo SJ, Vicente E, Agusti AGN. Long-term cardiovascular outcomes in men with obstructive sleep apnoea-hypopnoea with or without treatment with continuous positive airway pressure: an observational study. Lancet. (2005) 365(9464):1046–53. doi: 10.1016/S0140-6736(05)71141-7

69. Waqar M, Dawood H, Dawood H, Majeed N, Banjar A, Alharbey R. An efficient SMOTE-based deep learning model for heart attack prediction. Sci Program. (2021) 2021:e6621622. doi: 10.1155/2021/6621622

70. Ishaq A, Sadiq S, Umer M, Ullah S, Mirjalili S, Rupapara V, et al. Improving the prediction of heart failure patients’ survival using SMOTE and effective data mining techniques. IEEE Access. (2021) 9:39707–16. doi: 10.1109/ACCESS.2021.3064084

71. Mencar C, Gallo C, Mantero M, Tarsia P, Carpagnano GE, Foschino Barbaro MP, et al. Application of machine learning to predict obstructive sleep apnea syndrome severity. Health Inf J. (2020) 26(1):298–317. doi: 10.1177/1460458218824725

72. Francis DP, Davies LC, Willson K, Ponikowski P, Coats AJ, Piepoli M. Very-low-frequency oscillations in heart rate and blood pressure in periodic breathing: role of the cardiovascular limb of the hypoxic chemoreflex. Clin Sci. (2000) 99(2):125–32. doi: 10.1042/cs0990125

73. Qin H, Steenbergen N, Glos M, Wessel N, Kraemer JF, Vaquerizo-Villar F, et al. The different facets of heart rate variability in obstructive sleep apnea. Front Psychiatry. (2021) 12:642333. doi: 10.3389/fpsyt.2021.642333

74. Nastałek P, Bochenek G, Kania A, Celejewska-Wójcik N, Mejza F, Sładek K. Heart rate variability in the diagnostics and CPAP treatment of obstructive sleep apnea. Adv Biomed. (2019) 1176:25–33. doi: 10.1007/5584_2019_385

75. Noda A, Hayano J, Ito N, Miyata S, Yasuma F, Yasuda Y. Very low-frequency component of heart rate variability as a marker for therapeutic efficacy in patients with obstructive sleep apnea: preliminary study. J Res Med Sci. (2019) 24:84. doi: 10.4103/jrms.JRMS_62_18

76. Shiomi T, Guilleminault C, Sasanabe R, Hirota I, Maekawa M, Kobayashi T. Augmented very low frequency component of heart rate variability during obstructive sleep apnea. Sleep. (1996) 19(5):370–7. doi: 10.1093/sleep/19.5.370

77. Jiang J, Chen X, Zhang C, Wang G, Fang J, Ma J, et al. Heart rate acceleration runs and deceleration runs in patients with obstructive sleep apnea syndrome. Sleep Breathing. (2017) 21(2):443–51. doi: 10.1007/s11325-016-1437-6

78. Liang D, Wu S, Tang L, Feng K, Liu G. Short-term HRV analysis using nonparametric sample entropy for obstructive sleep apnea. Entropy. (2021) 23(3):267. doi: 10.3390/e23030267

79. Dos Santos RR, da Silva TM, Silva LEV, Eckeli AL, Salgado HC, Fazan R. Correlation between heart rate variability and polysomnography-derived scores of obstructive sleep apnea. Front Networks Physiol. (2022) 2:958550. doi: 10.3389/fnetp.2022.958550

80. Seely AJ, Macklem PT. Complex systems and the technology of variability analysis. Crit Care. (2004) 8(6):R367–84. doi: 10.1186/cc2948

Keywords: obstructive sleep apnea, autonomic modulation of the heart, heart rate variability, oxygen saturation, machine learning

Citation: dos Santos RR, Marumo MB, Eckeli AL, Salgado HC, Silva LEV, Tinós R and Fazan Jr R (2025) The use of heart rate variability, oxygen saturation, and anthropometric data with machine learning to predict the presence and severity of obstructive sleep apnea. Front. Cardiovasc. Med. 12:1389402. doi: 10.3389/fcvm.2025.1389402

Received: 21 February 2024; Accepted: 3 March 2025;
Published: 14 March 2025.

Edited by:

Jesus Lazaro, University of Zaragoza, Spain

Reviewed by:

J. S. Murguia, Autonomous University of San Luis Potosí, Mexico
Beatrice Cairo, University of Milan, Italy
José Javier Reyes-Lagos, Universidad Autónoma del Estado de México, Mexico

Copyright: © 2025 dos Santos, Marumo, Eckeli, Salgado, Silva, Tinós and Fazan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Rubens Fazan, rfazan@usp.br

†These authors have contributed equally to this work
