Artificial intelligence-enabled electrocardiographic screening for left ventricular systolic dysfunction and mortality risk prediction

Huang, Yu-Chang; Hsu, Yu-Chun; Liu, Zhi-Yong; Lin, Ching-Heng; Tsai, Richard; Chen, Jung-Sheng; Chang, Po-Cheng; Liu, Hao-Tien; Lee, Wen-Chen; Wo, Hung-Ta; Chou, Chung-Chuan; Wang, Chun-Chieh; Wen, Ming-Shien; Kuo, Chang-Fu

doi:10.3389/fcvm.2023.1070641

ORIGINAL RESEARCH article

Front. Cardiovasc. Med., 03 March 2023

Sec. General Cardiovascular Medicine

Volume 10 - 2023 | https://doi.org/10.3389/fcvm.2023.1070641

Yu-Chang Huang¹^†

Yu-Chun Hsu^2,3^†

Richard Tsai²

Po-Cheng Chang^1,4

Wen-Chen Lee¹

Hung-Ta Wo^1,4

Chung-Chuan Chou^1,4

Chun-Chieh Wang^1,4

Ming-Shien Wen^1,4^*

Chang-Fu Kuo^2,4,5^*

¹Division of Cardiology, Chang Gung Memorial Hospital, Taoyuan, Taiwan
²Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan
³School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, United States
⁴School of Medicine, Chang Gung University, Taoyuan, Taiwan
⁵Division of Rheumatology, Allergy and Immunology, Chang Gung Memorial Hospital, Taoyuan, Taiwan

Background: Left ventricular systolic dysfunction (LVSD) characterized by a reduced left ventricular ejection fraction (LVEF) is associated with adverse patient outcomes. We aimed to build a deep neural network (DNN)-based model using standard 12-lead electrocardiogram (ECG) to screen for LVSD and stratify patient prognosis.

Methods: This retrospective chart review study was conducted using data from consecutive adults who underwent ECG examinations at Chang Gung Memorial Hospital in Taiwan between October 2007 and December 2019. DNN models were developed to recognize LVSD, defined as LVEF <40%, using original ECG signals or transformed images from 190,359 patients with paired ECG and echocardiogram within 14 days. The 190,359 patients were divided into a training set of 133,225 and a validation set of 57,134. The accuracy of recognizing LVSD and subsequent mortality predictions were tested using ECGs from 190,316 patients with paired data. Of these 190,316 patients, we further selected 49,564 patients with multiple echocardiographic data to predict LVSD incidence. We additionally used data from 1,194,982 patients who underwent ECG only to assess mortality prognostication. External validation was performed using data of 91,425 patients from Tri-Service General Hospital, Taiwan.

Results: The mean age of patients in the testing dataset was 63.7 ± 16.3 years (46.3% women), and 8,216 patients (4.3%) had LVSD. The median follow-up period was 3.9 years (interquartile range 1.5–7.9 years). The area under the receiver-operating characteristic curve (AUROC), sensitivity, and specificity of the signal-based DNN (DNN-signal) to identify LVSD were 0.95, 0.91, and 0.86, respectively. DNN signal-predicted LVSD was associated with age- and sex-adjusted hazard ratios (HRs) of 2.57 (95% confidence interval [CI], 2.53–2.62) for all-cause mortality and 6.09 (5.83–6.37) for cardiovascular mortality. In patients with multiple echocardiograms, a positive DNN prediction in patients with preserved LVEF was associated with an adjusted HR (95% CI) of 8.33 (7.71 to 9.00) for incident LVSD. Signal- and image-based DNNs performed equally well in the primary and additional datasets.

Conclusion: Using DNNs, ECG becomes a low-cost, clinically feasible tool to screen for LVSD and facilitate accurate prognostication.

Introduction

Heart failure (HF) is a major health issue affecting over 26 million people worldwide. It causes a significant increase in both morbidity and mortality and imposes a financial burden on society (1). Echouffo-Tcheugui et al. have classified left ventricular dysfunction into two categories: left ventricular systolic dysfunction (LVSD) and left ventricular diastolic dysfunction. LVSD is characterized by a reduced left ventricular ejection fraction (LVEF) and is associated with three times the risk of developing overt HF (2). Early identification of individuals with asymptomatic LVSD can lead to effective interventions, such as lifestyle changes, and medications, including angiotensin-converting enzyme inhibitors, angiotensin II receptor blockers, mineralocorticoid receptor antagonists, and beta-blockers (3–7), which can delay the onset of HF, reduce the rate of cardiac events, and improve survival (8–10).

The most commonly used method to assess LVSD is the transthoracic echocardiogram (TTE), but its limited portability, cost, and operator dependency restrict its use as a screening tool. More accessible screening tools have therefore been proposed to identify LVSD in asymptomatic patients, such as a weighted scoring model incorporating clinical characteristics and plasma natriuretic peptides. However, these tools lack the specificity to predict LVSD in asymptomatic populations (11, 12).

The electrocardiogram (ECG) is an inexpensive and widely available method that measures the collective electrical activity of the heart and may contain information related to LVSD. While ECG recording is a standardized process, the accuracy and consistency of human interpretation can vary widely based on the experience and expertise of the interpreter. In addition, subtle ECG features that are invisible to the human eye may be useful for LVSD detection and prognostication. To overcome these challenges, the use of deep neural networks (DNNs) is proposed.

In recent years, DNNs have been applied successfully in the healthcare industry, including image analysis (13), predictive modeling (14), natural language processing (15), and drug discovery (16). They are superior to traditional pattern recognition methods (17) and form the foundation of clinical applications such as fracture detection (18), retinopathy grading (19), and lung nodule identification (20). DNN tools can interpret ECGs with similar accuracy to experienced physicians. Attia et al. developed a DNN-based ECG screening tool to identify individuals with LVEF ≤35% (21). A subsequent pragmatic clinical trial showed that a DNN-based intervention increased the likelihood of identifying patients with low LVEF during routine primary care (22). However, the effectiveness of DNN-based models in predicting incident LVSD and mortality has not been studied in a large clinical setting.

With data from approximately 1.7 million individuals, we conducted this study to evaluate the feasibility of using DNN-based ECG interpretation as a screening tool for LVSD and to assess its utility in risk assessment. The primary outcome was the ability of the DNN model to accurately identify individuals with LVSD (defined as LVEF <40%) based solely on the ECG. The secondary outcome was the ability of the DNN model to identify individuals at increased risk of death and at increased risk of developing LVSD.

Materials and methods

Data sources and study population

This study was conducted at Chang Gung Memorial Hospital (CGMH), the largest private hospital system in Taiwan. The study population included consecutive adult patients (age ≥ 18) who underwent standard 12-lead ECG at CGMH between October 2007 and December 2019 (1,777,039 individuals, 5,148,718 ECG tracings). ECGs with poor recording quality or unavailable leads were excluded. The ECG data were linked to the Chang Gung Research Database (CGRD), which included the electronic health records of all patients who visited any one of the following seven hospitals: Keelung, Taipei, Linkou (headquarters), Taoyuan, Yunlin, Chiayi, and Kaohsiung.

The patients’ survival status was confirmed by linking the CGRD to the National Death Registry. Valid internal patient record linkage was achieved by using unique patient identifiers, and these were encrypted before the data were released to researchers to protect patient confidentiality. This study was approved by the Institutional Review Board of CGMH and Tri-Service General Hospital. This study used anonymous and nontraceable data, so the need for patient consent was waived.

Collection of data

Standard 12-lead ECGs with 10-s voltage-time traces were acquired at a sampling rate of 500 Hz using a MAC 5000, MAC 5500, or MAC 5500 HD ECG machine (GE Healthcare, Chicago, IL, United States) and stored using the Marquette Universal System for Electrocardiography (MUSE). Each standard 12-lead ECG was stored as a 12 × 5,000 matrix (12 leads, each with 10 s × 500 samples per second). Both the raw ECG signal data and processed ECG images at a 400 × 600-pixel resolution were obtained.

Transthoracic echocardiograms were performed and interpreted in accordance with the guidelines set forth by the American Society of Echocardiography and the American College of Cardiology/American Heart Association. Comprehensive two-dimensional (2D) or three-dimensional (3D) Doppler echocardiographic profiles and quantitative measurements were recorded in Chang Gung’s health information system. For this study, we only extracted LVEF values for analysis. LVEF was routinely measured using standardized methodologies. If different methods were used to measure LVEF in a report, the order of data preference was as follows: 3D echocardiogram, the Simpson biplane method, 2D method, linear measurement using M-mode. If multiple LVEF values were obtained using one method, the mean value was used for analysis.
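The LVEF selection rule described above can be sketched in a few lines of Python. This is only an illustration of the stated preference order and averaging step, not the extraction code used in the study: the method labels, the `select_lvef` function, and the (method, value) input format are hypothetical.

```python
from statistics import mean

# Preference order described in the text, most preferred first.
# The string labels are hypothetical; the source does not give field names.
METHOD_PREFERENCE = ["3d", "simpson_biplane", "2d", "m_mode"]


def select_lvef(measurements):
    """Return a single LVEF value (%) for one echocardiography report.

    measurements: list of (method, lvef) pairs found in the report.
    The most preferred method that is present is used; if that method
    has several values, their mean is returned.
    """
    for method in METHOD_PREFERENCE:
        values = [value for m, value in measurements if m == method]
        if values:
            return mean(values)
    return None  # no usable LVEF value in this report


# Made-up report: the two Simpson biplane values outrank the M-mode value.
report = [("m_mode", 48.0), ("simpson_biplane", 38.0), ("simpson_biplane", 40.0)]
print(select_lvef(report))  # 39.0
```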

To achieve proper correlation between ECG and TTE data, only TTEs obtained within 2 weeks of the index ECG were used for DNN model creation.
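As a hedged illustration of this pairing rule, the sketch below finds the TTE closest to an index ECG within a 14-day window. The function name is hypothetical, and the window is assumed to be symmetric (TTE before or after the ECG), which the text does not state explicitly.

```python
from datetime import date

MAX_GAP_DAYS = 14  # pairing window stated in the text


def closest_tte_within_window(ecg_date, tte_dates, max_gap_days=MAX_GAP_DAYS):
    """Return the TTE date nearest to the ECG date within the window, or None.

    Assumption: the window applies in both directions (before or after the ECG).
    """
    in_window = [d for d in tte_dates if abs((d - ecg_date).days) <= max_gap_days]
    if not in_window:
        return None
    return min(in_window, key=lambda d: abs((d - ecg_date).days))


# Made-up dates: only the second TTE (4 days after the ECG) falls in the window.
ecg = date(2015, 3, 10)
ttes = [date(2015, 2, 20), date(2015, 3, 14), date(2015, 4, 1)]
print(closest_tte_within_window(ecg, ttes))  # 2015-03-14
```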

Development of DNN models for identification of LVSD

In this study, we implemented two types of DNNs using the PyTorch framework and Python 3.6. All training was performed on an NVIDIA DGX-1 platform with 8 V100 GPUs and 32 GB of memory per GPU. For the DNN that used signal inputs (DNN-signal), we used the deep residual network (ResNet) (23) modified to fit the signal input (Supplementary Figure 1). We used a wider kernel for the first convolution layer than in the original ResNet framework used for images. This architecture used skip connections, which allowed information to pass directly to the next layer to avoid the degradation caused by deeper neural networks. The network consisted of a convolution layer followed by eight residual blocks. Each residual block contained two convolution layers. The output of the last block was fed into hybrid pooling because combining max- and average-pooling methods improved the generalization ability while reducing dimensionality (24, 25). The output of hybrid pooling was subsequently sent to a fully connected layer to perform the final classification. The output of each convolutional layer was followed by batch normalization for distribution normalization and fed into a rectified linear activation unit (26). Cross-entropy loss with an Adam optimizer (27) was used in the model. Dropout was applied to reduce overfitting by breaking up co-adaptation on the training data (28).
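The hybrid pooling step can be illustrated with a small framework-free sketch. The text does not state how the max- and average-pooled features are combined; concatenation is assumed here, and the function name and toy feature map are hypothetical. In a PyTorch model the same idea would operate on tensors rather than nested lists.

```python
def hybrid_pool(feature_map):
    """Global max- and average-pool each channel, then concatenate.

    feature_map: list of channels, each a list of floats along the time axis.
    Returns a flat vector of length 2 * n_channels, so the time dimension
    is removed regardless of the input length.
    Assumption: the two pooled vectors are concatenated (not summed).
    """
    max_part = [max(channel) for channel in feature_map]
    avg_part = [sum(channel) / len(channel) for channel in feature_map]
    return max_part + avg_part


# Toy feature map with 2 channels and 3 time steps.
features = [[0.0, 2.0, 4.0], [1.0, 1.0, 1.0]]
print(hybrid_pool(features))  # [4.0, 1.0, 2.0, 1.0]
```

The pooled vector is what a fully connected classification layer would receive; its length depends only on the number of channels.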

For the DNN using the image inputs (DNN-image), we prepared a 400 × 600-pixel image similar to standard 12-lead ECG images (Supplementary Figure 2) using the signal data (12 × 5,000 matrix). The resolution was determined by a series of experiments using different image resolutions. The images were fed to ResNet-18 (23), and the output layer had two classes (Softmax function). The validation set was used to optimize the network architecture and network hyperparameters. The DNN-signal and DNN-image used the same training and validation sets for model building and were tested on the same testing set. A receiver operating characteristic (ROC) curve was plotted to assess the performance. The model with the highest area under the ROC curve (AUROC) was selected as the final model. We used the validation dataset ROC to select the optimal threshold for the probability of LVSD by applying the Youden index (J) method.
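Threshold selection with the Youden index, J = sensitivity + specificity − 1, can be sketched as follows. This is a minimal illustration on made-up labels and probabilities, not the study's code; `youden_threshold` is a hypothetical name, and a prediction is assumed to be positive when the probability is at or above the threshold.

```python
def youden_threshold(y_true, y_prob):
    """Return (threshold, J) that maximizes J = sensitivity + specificity - 1.

    y_true: list of 0/1 labels (1 = LVSD); y_prob: predicted probabilities.
    Every observed probability is tried as a candidate threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Made-up validation data, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]
threshold, j = youden_threshold(y_true, y_prob)
print(threshold)  # 0.35
```

Choosing the threshold on the validation set and then fixing it for the testing set keeps the reported test performance free of threshold tuning.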

We further assessed the network performance in different age, sex, and comorbidity strata. The odds ratio (OR), sensitivity, and specificity were calculated for each stratum.

Division of dataset

Among 1,684,298 adult patients with ECG tracings, 380,675 had at least one TTE within 2 weeks of the index ECG during the study period (Figure 1). For patients with multiple ECG–TTE pairs, the earliest pair with the shortest ECG–TTE interval was selected for model development. A total of 380,675 ECG–TTE pairs were used for the primary analysis. These ECG–TTE pairs were randomly allocated to a training, validation, or testing set using simple random sampling without replacement, in which each pair had an equal probability of selection. The final DNN development cohort included 133,225 patients in the training set, 57,134 in the validation set, and 190,316 in the testing set. No patient was allocated to more than one group (Figure 1).
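A patient-level random split of this kind can be sketched as below. The set sizes are those reported in the text; the integer patient IDs, the fixed seed, and the `split_patients` function are placeholders introduced for illustration.

```python
import random


def split_patients(patient_ids, n_train, n_val, seed=0):
    """Shuffle unique patient IDs and cut them into three disjoint sets.

    Sampling is without replacement, so no patient appears in two sets.
    The seed is an arbitrary choice made only for reproducibility.
    """
    ids = list(dict.fromkeys(patient_ids))  # drop duplicate IDs, keep order
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


# One placeholder ID per patient with a paired ECG and TTE.
train, val, test = split_patients(range(380_675), n_train=133_225, n_val=57_134)
print(len(train), len(val), len(test))  # 133225 57134 190316
```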

Figure 1. Data flow for ECG and TTE data pairing.

We further conducted an external validation using paired ECG–TTE data from Tri-Service General Hospital. The external validation cohort included 91,425 consecutive adults examined between April 2010 and September 2021. The criteria for patient selection and the echocardiographic methodology were the same as for the derivation cohort. Unlike the GE machines used at CGMH, ECGs from Tri-Service General Hospital were obtained using the Philips system.

Performance evaluation of the DNN models in predicting mortality

The ability of the DNN to predict all-cause and cardiovascular mortality was assessed. Based on the agreement between echocardiographic measurements and DNN predictions, we defined the following categories: (i) ‘true positive’ DNN prediction represents both DNN-predicted and echo-measured LVEF <40%; (ii) ‘true negative’ DNN prediction represents both DNN-predicted and echo-measured LVEF ≥40%; (iii) ‘false positive’ DNN prediction represents DNN-predicted LVEF <40% and contemporaneous echo-measured LVEF ≥40%; and (iv) ‘false negative’ DNN prediction represents DNN-predicted LVEF ≥40% and contemporaneous echo-measured LVEF <40%. The associations of these groups with all-cause or cardiovascular mortality were also assessed. The National Death Registry was linked to the study dataset. In Taiwan, it is mandatory for physicians to report deaths and causes of death to the Department of Health and Welfare. Therefore, death records within the National Death Registry are considered complete and accurate. A previous validation study estimated the effect of the misrecorded causes of death in the National Death Registry on cardiovascular mortality rates. The effect was less than 4%, suggesting accurate cause-of-death coding in Taiwan (29).
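The four groups defined above amount to a simple two-by-two rule, sketched here for one patient. The function and constant names are hypothetical; the 40% cutoff is the LVSD definition used in the study.

```python
LVSD_CUTOFF = 40.0  # LVEF (%) below this value defines LVSD in the study


def prediction_group(echo_lvef, dnn_predicts_lvsd):
    """Label one patient by comparing the DNN prediction with the echo reference.

    echo_lvef: echo-measured LVEF in percent.
    dnn_predicts_lvsd: True if the DNN predicts LVEF <40%.
    """
    echo_lvsd = echo_lvef < LVSD_CUTOFF
    if dnn_predicts_lvsd and echo_lvsd:
        return "true positive"
    if dnn_predicts_lvsd and not echo_lvsd:
        return "false positive"
    if not dnn_predicts_lvsd and echo_lvsd:
        return "false negative"
    return "true negative"


print(prediction_group(55.0, True))  # false positive
```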

Sensitivity analyses

We conducted sensitivity analyses in patients who were not included in the primary analysis. These patients were included in the following sub-analyses (Figure 1): (i) among patients with multiple TTE examinations in the original testing dataset (dataset A1, n = 49,564), the incidence of LVSD and mortality were compared in patients with ‘false-positive’ versus ‘true-negative’ predictions of LVSD; (ii) among patients who underwent TTE more than 2 weeks after the index ECG (dataset B), the incidence of LVSD and mortality were compared in patients with positive versus negative predictions of LVSD; and (iii) among patients without echocardiographic data (dataset C), the mortality rate was compared in patients with positive versus negative predictions of LVSD. Age- and sex-weighted Kaplan–Meier analysis was used to determine the incidence of LVSD or mortality. Cox proportional hazard regression was used to estimate the age- and sex-adjusted hazard ratios (HR; 95% confidence intervals [CI]) for LVSD and mortality.
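For readers unfamiliar with the Kaplan–Meier estimator, the sketch below computes an unweighted survival curve on made-up follow-up data. It is a simplification: the study used age- and sex-weighted curves and SAS, whereas this example applies no weighting, and the function name and data are hypothetical.

```python
def kaplan_meier(times, events):
    """Unweighted Kaplan-Meier estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Made-up follow-up times (years) and event indicators.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
print([(t, round(s, 3)) for t, s in kaplan_meier(times, events)])
# [(1, 0.833), (2, 0.667), (3, 0.444), (5, 0.0)]
```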

Statistical methods

Only the testing datasets were evaluated for performance measures. The model’s diagnostic performance was evaluated by calculating the AUROC, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The F1 score, the harmonic mean of the PPV and sensitivity, was also computed at the selected threshold. Continuous variables are expressed as means ± standard deviation (SD). Categorical variables are expressed as numbers and percentages. Adjusted odds ratios (OR; 95% CI) were calculated. For comparisons of population characteristics, the chi-square test was used for categorical variables and the unpaired Student’s t-test for continuous variables. Cox proportional hazards models were used to estimate hazard ratios (HR; 95% CI) for LVSD, all-cause, and cardiovascular mortality. A value of p < 0.05 was considered statistically significant. Statistical analyses were conducted using SAS 9.4 software.
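The threshold-based performance measures listed above follow directly from the confusion-matrix counts, as the sketch below shows. The counts are made up for illustration and are not study data; the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute threshold-based diagnostic measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Made-up counts: 100 patients with LVSD and 1,000 without.
metrics = diagnostic_metrics(tp=90, fp=140, tn=860, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```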

Results

The testing dataset contained 190,316 patients (46.3% women), and 8,216 patients (4.3%) had LVSD. The mean age was 63.7 ± 16.3 years. The median follow-up time was 3.9 years (interquartile range 1.5–7.9 years) for the testing dataset. Table 1 shows the characteristics of the patients in the training, validation, and testing sets. There were no significant differences between groups.

Table 1. Patient characteristics and comorbidities.

Performance of the DNN models in identifying LVSD

The AUROC values of DNN-signal and DNN-image for identifying LVSD in the testing dataset were 0.95 and 0.94, respectively (Supplementary Figure 3). At the threshold maximizing Youden’s index, the overall accuracy of DNN-signal was 0.86, with a sensitivity of 0.91, specificity of 0.86, PPV of 0.22, and NPV of 0.995. The DNN-image model performed similarly to DNN-signal (sensitivity, 0.91; specificity, 0.84; PPV, 0.20; NPV, 0.995). Both DNN-signal and DNN-image performed consistently across different age, sex, and comorbidity strata (Figure 2). External validation was conducted using ECGs obtained with the Philips system. The AUROC of the DNN-signal for data from Tri-Service General Hospital was 0.95. The overall accuracy of DNN-signal was 0.87, with a sensitivity of 0.90, specificity of 0.87, PPV of 0.19, and NPV of 0.99. Supplementary Tables 1, 2 show the patient characteristics and the performance of DNN-signal using data from Tri-Service General Hospital.

Figure 2. Deep neural network sensitivity, specificity, and odds ratio for detecting LVSD across different subgroups. The neural network’s sensitivity and specificity for detecting LVSD are tabulated across subgroups. The odds ratio (OR), which is the ratio of the positive likelihood ratio [sensitivity / (1−specificity)] to the negative likelihood ratio [(1−sensitivity) / specificity], is shown with its 95% CI for the subgroups and the overall study sample. (A) LVSD prediction using signal. (B) LVSD prediction using image.
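The odds ratio defined in the caption can be computed directly from sensitivity and specificity, as in the sketch below. The inputs are the rounded overall DNN-signal values from the text, so the result only approximates a value derived from the underlying counts; the function name is hypothetical.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Ratio of the positive likelihood ratio to the negative likelihood ratio."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive / lr_negative


# Rounded overall DNN-signal sensitivity and specificity reported in the text.
print(round(diagnostic_odds_ratio(0.91, 0.86), 1))  # 62.1
```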

Performance of the DNN models in predicting mortality

Age- and sex-weighted Kaplan–Meier curves for mortality of patients with DNN signal-predicted LVSD and echo-derived LVSD are shown in Figure 3. A total of 8,216 LVSD patients were identified using echocardiographic data, and 33,535 LVSD patients were identified using DNN-signal. DNN signal-predicted LVSD was associated with age- and sex-adjusted HRs (95% CI) of 2.57 (2.53–2.62) for all-cause mortality and 6.09 (5.83–6.37) for cardiovascular mortality at a median follow-up of 3.9 years. Echo-derived LVSD was associated with age- and sex-adjusted HRs (95% CI) of 2.68 (2.60–2.76) for all-cause mortality and 7.79 (7.39–8.22) for cardiovascular mortality. The DNN-image performed similarly to DNN-signal with age- and sex-adjusted HRs (95% CI) of 2.70 (2.66–2.75) for all-cause mortality and 6.47 (6.19–6.77) for cardiovascular mortality (Supplementary Figure 4).

Figure 3. Associations of echocardiogram and DNN-signal predictions with all-cause and cardiovascular mortalities. Age- and sex-weighted Kaplan–Meier curves, death rates, and adjusted HRs (95% CI) stratified by (A) echo-derived LVSD for all-cause mortality (blue line, LVEF ≥40%; yellow line, LVEF <40%), (B) DNN signal-predicted LVSD for all-cause mortality (blue line, LVEF ≥40%; yellow line, LVEF <40%), (C) echo-derived LVSD for cardiovascular mortality (blue line, LVEF ≥40%; yellow line, LVEF <40%), (D) DNN signal-predicted LVSD for cardiovascular mortality (blue line, LVEF ≥40%; yellow line, LVEF <40%). ^a Adjusted K-M curves were adjusted by inverse probability of treatment weighting, which was calculated using sex and age. ^b Incidence rates are expressed per 1,000 person-years. CI, confidence interval; DNN, deep neural network; LVEF, left ventricular ejection fraction.

Compared with ‘true negative’ DNN predictions, ‘true positive’ DNN-signal predictions were associated with HRs (95% CI) of 3.27 (3.17–3.38) for all-cause mortality and 12.46 (11.75–13.21) for cardiovascular mortality. ‘True positive’ DNN-image predictions were associated with HRs (95% CI) of 3.47 (3.36–3.58) for all-cause mortality and 13.8 (13.03–14.67) for cardiovascular mortality (Figure 4).

Figure 4. Associations of DNN-signal and DNN-image predictions with all-cause and cardiovascular mortalities. Age- and sex-weighted Kaplan–Meier curves, death rates, and adjusted HRs (95% CI) stratified by both echocardiography and DNN (true negative: blue line, both echo-measured and DNN-predicted LVEF ≥40%; false negative: green line, echo-measured LVEF <40% and DNN-predicted LVEF ≥40%; true positive: red line, both echo-measured and DNN-predicted LVEF <40%; and false positive: yellow line, echo-measured LVEF ≥40% and DNN-predicted LVEF <40%) for all-cause and cardiovascular mortality: (A) DNN-signal predictions and all-cause mortality, (B) DNN-image predictions and all-cause mortality, (C) DNN-signal predictions and cardiovascular mortality, and (D) DNN-image predictions and cardiovascular mortality. ^a Adjusted K-M curves were adjusted by inverse probability of treatment weighting, which was calculated using sex and age. ^b Incidence rates are expressed per 1,000 person-years. CI, confidence interval; DNN, deep neural network; EF, ejection fraction; FN, false negative; FP, false positive; HR, hazard ratio; K-M, Kaplan–Meier; LVSD, left ventricular systolic dysfunction; No., number; TN, true negative; TP, true positive.

Among patients with ‘false positive’ DNN predictions, a higher mortality rate was also observed during follow-up. ‘False positive’ DNN-signal predictions were associated with HRs (95% CI) of 2.43 (2.38–2.47) for all-cause mortality and 4.78 (4.55–5.03) for cardiovascular mortality. ‘False positive’ DNN-image predictions were associated with HRs (95% CI) of 2.57 (2.52–2.61) for all-cause mortality and 5.16 (4.92–5.42) for cardiovascular mortality (Figure 4).

Sensitivity analyses

Table 2 summarizes the performance of the DNN models in additional datasets. Subset A1 included 49,564 patients with multiple echocardiograms. Within this subset, ‘false positive’ DNN-signal predictions were associated with HRs (95% CI) of 8.33 (7.71–9.00) for incident LVSD, 1.99 (1.92–2.06) for all-cause mortality, and 3.51 (3.25–3.80) for cardiovascular mortality compared to ‘true negative’ DNN-signal predictions. ‘False positive’ DNN-image predictions were associated with HRs (95% CI) of 8.19 (7.57–8.87) for incident LVSD, 2.05 (1.98–2.12) for all-cause mortality, and 3.77 (3.49–4.07) for cardiovascular mortality compared to ‘true negative’ DNN-image predictions.

Table 2. Sensitivity analyses of model performance to identify patients with future left ventricular systolic dysfunction (LVSD) and those at risk of all-cause and cardiovascular mortalities.

Within subset B, including 83,787 patients, positive DNN-signal predictions were associated with HRs (95% CI) of 19.23 (16.56–22.33) for incident LVSD, 2.18 (2.09–2.26) for all-cause mortality, and 5.20 (4.70–5.75) for cardiovascular mortality. Positive DNN-image predictions were associated with HRs (95% CI) of 19.52 (16.72–22.80) for incident LVSD, 2.32 (2.24–2.41) for all-cause mortality, and 4.99 (4.52–5.52) for cardiovascular mortality.

Within subset C, including 1,194,982 patients, DNN signal-predicted LVSD was associated with a HR (95% CI) of 3.24 (3.19–3.29) for all-cause mortality and 6.83 (6.51–7.16) for cardiovascular mortality. DNN image-predicted LVSD was associated with a HR (95% CI) of 3.46 (3.40–3.51) for all-cause mortality and 6.82 (6.51–7.14) for cardiovascular mortality. Supplementary Figures 5–12 show Kaplan–Meier curves for incident LVSD, all-cause and cardiovascular mortality for subsets A1, B, and C.

Discussion

The prevalence of LVSD ranges from 2 to 8% in adults depending on the study population and cut-off value used (8–10). In both symptomatic and asymptomatic cases, LVSD is associated with increased morbidity and mortality. The Framingham cohort study showed that individuals with asymptomatic LVSD (LVEF <40%) have around eight-fold increased risk of developing HF (30). The combination of definitive treatment and primary prevention of incident HF can reduce the disease burden. One such strategy is to screen for asymptomatic LVSD; however, the best method for this is unclear (11, 31, 32). Our study demonstrated the potential of DNNs for screening asymptomatic LVSD. In addition, comprehensive real-world testing demonstrated the robustness of the DNN in identifying LVSD and patients at risk of future LVSD and mortality. Furthermore, we constructed DNN models based on both raw ECG signals and transformed images. In clinical settings in which raw ECG signals are not available, the image-based model can process ECG image tracings and provide similar performance. Consequently, the applicability of DNN-enabled ECG is broadened.

ECG is a ubiquitous and economical point-of-care diagnostic tool in cardiology. Previous research has demonstrated that LVSD might be characterized by specific ECG changes, such as Q-waves (33, 34), left bundle branch block (35), and wide QRS duration (>120 ms) (36). However, no single feature had high enough predictive value to offer clinical utility. These various features seemed to interact in a non-linear fashion that could not be accounted for by traditional statistical methods or algorithmic approaches. DNNs afford the ability to consider complex datasets in the context of all of the contained data rather than preselected discrete data elements. Identifying these features may offer novel findings that can provide new diagnostic approaches or therapeutic targets. Finding ways to understand what drives the network’s interpretation is also the direction of future efforts.

We used DNN algorithms to perform binary classification of LVEF in a hospital-based population, with excellent performance (AUROC, 0.95) superior to known screening tests (e.g., natriuretic peptides) (11). The DNN performed well across all age, sex, and comorbidity groups (Figure 2). In addition, the model performance was validated externally using data from the Philips system, suggesting its robustness across different machine types. The diagnostic performance was characterized by a high NPV, which helps exclude LVSD with high confidence. The ‘false positive’ rates were high. However, we further demonstrated that ‘false positive’ DNN predictions were associated with an eight-fold increased risk of incident LVSD (confirmed by TTE), a two-fold increased risk of all-cause mortality, and a five-fold increased risk of cardiovascular mortality compared to ‘true negative’ DNN predictions. This suggests that the DNN could detect early, subclinical, electrical or structural abnormalities shown on the ECG. These abnormalities may include cardiac arrhythmias, left ventricular deformation, valvular heart disease, or metabolic derangements, which increase the risk of incident LVSD and death. In this case, DNN-enabled ECG is an effective screening tool to identify patients at risk.

Several studies have demonstrated the potential of AI in turning ECGs into functional screening and diagnostic tools for various heart disorders. For instance, Mayo Clinic researchers have applied AI to automatically detect LVSD and to identify atrial fibrillation from ECGs recorded during sinus rhythm. Compared with prior studies (21, 37), we not only verified the diagnostic effectiveness of AI-assisted ECG reading for LVSD screening, but also explored the use of ECGs as an outcome prediction tool with the assistance of AI. A positive DNN prediction was associated with a two-fold increased risk of all-cause mortality and a six-fold increased risk of cardiovascular mortality at a median follow-up of 3.9 years. This finding suggests that some subtle electrical abnormalities due to metabolic or myocardial disturbances may precede LVSD. We speculate that some of these disturbances might be irreversible or progressive, eventually causing long-term adverse effects.

While this study reveals that DNN-enabled ECG interpretation is a reliable method of detecting LVSD, the selection of target populations for screening remains to be addressed. Galasko et al. evaluated a variety of LVSD screening strategies and demonstrated that LVSD screening is more cost-effective in high-risk subjects than in the general population (38). High-risk subjects were defined as those with hypertension, diabetes, atherosclerotic cardiovascular disease, and heavy alcohol consumption (39). Our research included individuals who visited the hospital for various reasons, not just for known heart disease. This hospital-based population did have a high prevalence of diabetes mellitus (28.2%), hypertension (53.6%), and coronary heart disease (7.6%), which fits the definition of a high-risk group.

Based on this study, we propose a prototype approach for in-hospital LVSD screening. Step one involves ECG screening using the DNN-enabled classification of individuals who will undergo high-risk invasive treatment or those with pre-existing cardiovascular risk. Step two involves TTE evaluation of individuals identified as abnormal by DNN models. This DNN-enabled screening strategy offers an advantage, as ECG machines and internet services are widely available in modern hospitals, and the strategy is also financially sustainable. This DNN model also provides a potential complementary care approach to plasma natriuretic peptide measurement for primary LVSD screening. Further studies are needed to assess the impacts of the proposed DNN-enabled screening strategy on the incidence and prognosis of in-hospital HF-associated adverse events. Furthermore, a comprehensive analysis may be conducted to examine the cost-effectiveness of the proposed strategy.

In summary, DNN-enabled ECG is a valuable tool to screen for LVSD and predict outcomes. Given the low cost of DNN-enabled ECG, serial screening is possible, which also helps optimize the screening strategy for LVSD without using invasive laboratory testing, particularly in settings with limited medical resources.

Limitations of the study

This study has several limitations. First, some of the LVEF data used for analysis were measured using the M-mode method. The major limitation of M-mode is its one-dimensional nature and lack of direct spatial information; when regional LV deformation exists, the M-mode-derived LVEF is not reliable. Although most operators choose 2D or 3D methods when measuring LVEF in patients with structural heart disease, we cannot completely rule out this potential bias. Second, echocardiographic parameters other than LVEF, such as left ventricular diameter, left ventricular diastolic function, right ventricular function, and valvular heart disease, also affect mortality risk. However, the present study did not include these parameters or evaluate their impact on prognosis. Further research should assess the differences in clinical characteristics between patients with and without DNN-predicted LVSD. Third, the study was conducted at an academic medical center that treats patients with more complex diseases. The primary analysis consisted of patients with a higher prevalence of HF and other cardiovascular comorbidities, whom clinicians identified as needing a TTE evaluation. Given these cohort characteristics, the findings may not be generalizable to relatively healthy and truly asymptomatic populations. To verify the generalizability of our DNN models, we conducted multiple additional analyses in more than 1 million patients with different clinical characteristics. In addition, the stratified analysis of patients without known comorbidities showed similar model performance. Finally, although the sensitivity and specificity were both satisfactory in our study, we observed a relatively low PPV. PPV depends strongly on the proportion of positive subjects in the tested group, and the low prevalence of LVSD (4.3%) in the testing dataset resulted in a low PPV. Despite this, adequate sensitivity is more critical when applying ECG as an LVSD screening tool, because the purpose of the tool is to detect all subjects potentially at risk of developing LVSD so that they can be referred for a subsequent echocardiogram.
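The dependence of PPV on prevalence can be illustrated with a short calculation based on Bayes' rule. The following is a minimal sketch, not the analysis code used in this study: the function name `ppv` and the sensitivity and specificity values (0.85 each) are hypothetical placeholders chosen for illustration, and only the 4.3% prevalence is taken from the text above.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule.

    PPV = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_pos = sensitivity * prevalence                   # diseased and flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # healthy but flagged
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point (assumption, not the study's reported values).
SENS = 0.85
SPEC = 0.85

if __name__ == "__main__":
    # 0.043 is the LVSD prevalence in the testing dataset; the other
    # prevalences are illustrative higher-risk settings.
    for prev in (0.043, 0.10, 0.30):
        print(f"prevalence={prev:.3f}  PPV={ppv(SENS, SPEC, prev):.3f}")
```

With sensitivity and specificity held fixed, the computed PPV rises as prevalence increases, which is why a screening test can perform well on sensitivity and specificity and still yield a low PPV in a low-prevalence cohort.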

Conclusion

The DNN algorithms established in this study enable rapid LVSD detection and represent an essential step in transforming the ECG into an effective, real-time screening tool. Their ability to predict LVSD incidence and long-term mortality may help stratify patient risk and initiate relevant interventions. With good accuracy and accessibility, DNN-enabled ECG has the potential to optimize the screening process for LVSD among at-risk populations and to advance HF care significantly.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving human participants were reviewed and approved by the Chang Gung Medical Foundation—Institutional Review Board. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author contributions

M-SW and C-FK conceived and designed the study. Y-CHu and Y-CHs did the literature search, acquired data, and wrote the manuscript. C-HL, RT, and J-SC did the statistical analyses. Y-CHs and Z-YL developed, trained, and applied the deep neural network. J-SC prepared the figures and tables. C-FK accessed and verified the data. H-TL, W-CL, H-TW, P-CC, C-CC, C-CW, and M-SW provided commentary. All authors contributed to the interpretation of data and the revision of the manuscript, and approved the final manuscript.

Funding

This work was supported by the Ministry of Science and Technology of Taiwan (grant numbers MOST 109-2321-B-182A-007, MOST 110-2314-B-182A-123, and MOST 110-2745-B-075A-001) and Chang Gung Memorial Hospital (grant numbers CLRPG3H0013, CORPG3L0161, and CORPG3L0461). We also received methodological assistance from the University of Nottingham.

Acknowledgments

We thank Tri-Service General Hospital for providing data for external validation.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcvm.2023.1070641/full#supplementary-material

Abbreviations

AI, artificial intelligence; DNN, deep neural network; ECG, electrocardiogram; HF, heart failure; LVEF, left ventricular ejection fraction; LVSD, left ventricular systolic dysfunction; TTE, transthoracic echocardiogram.

References

1. Ponikowski, P, Anker, SD, AlHabib, KF, Cowie, MR, Force, TL, Hu, S, et al. Heart failure: preventing disease and death worldwide. ESC Heart Fail. (2014) 1:4–25. doi: 10.1002/ehf2.12005

2. Echouffo-Tcheugui, JB, Erqou, S, Butler, J, Yancy, CW, and Fonarow, GC. Assessing the risk of progression from asymptomatic left ventricular dysfunction to overt heart failure: a systematic overview and meta-analysis. JACC Heart Fail. (2016) 4:237–48. doi: 10.1016/j.jchf.2015.09.015

3. SOLVD Investigators, Yusuf, S, Pitt, B, Davis, CE, Hood, WB Jr, and Cohn, JN. Effect of enalapril on mortality and the development of heart failure in asymptomatic patients with reduced left ventricular ejection fractions. [published correction appears in N Engl J Med 1992 Dec 10;327(24):1768]. N Engl J Med. (1992) 327:685–91. doi: 10.1056/NEJM199209033271003

4. Jong, P, Yusuf, S, Rousseau, MF, Ahn, SA, and Bangdiwala, SI. Effect of enalapril on 12-year survival and life expectancy in patients with left ventricular systolic dysfunction: a follow-up study. Lancet. (2003) 361:1843–8. doi: 10.1016/S0140-6736(03)13501-5

5. Dahlöf, B, Devereux, RB, Kjeldsen, SE, Julius, S, Beevers, G, de Faire, U, et al. Cardiovascular morbidity and mortality in the losartan intervention for endpoint reduction in hypertension study (LIFE): a randomised trial against atenolol. Lancet. (2002) 359:995–1003. doi: 10.1016/S0140-6736(02)08089-3

6. Exner, DV, Dries, DL, Waclawiw, MA, Shelton, B, and Domanski, MJ. Beta-adrenergic blocking agent use and mortality in patients with asymptomatic and symptomatic left ventricular systolic dysfunction: a post hoc analysis of the studies of left ventricular dysfunction. J Am Coll Cardiol. (1999) 33:916–23. doi: 10.1016/s0735-1097(98)00675-5

7. Colucci, WS, Kolias, TJ, Adams, KF, Armstrong, WF, Ghali, JK, Gottlieb, SS, et al. Metoprolol reverses left ventricular remodeling in patients with asymptomatic systolic dysfunction: the REversal of VEntricular remodeling with Toprol-XL (REVERT) trial. Circulation. (2007) 116:49–56. doi: 10.1161/CIRCULATIONAHA.106.666016

8. Wang, TJ, Levy, D, Benjamin, EJ, and Vasan, RS. The epidemiology of "asymptomatic" left ventricular systolic dysfunction: implications for screening. Ann Intern Med. (2003) 138:907–16. doi: 10.7326/0003-4819-138-11-200306030-00012

9. McDonagh, TA, Metra, M, Adamo, M, Gardner, RS, Baumbach, A, Böhm, M, et al. 2021 ESC guidelines for the diagnosis and treatment of acute and chronic heart failure [published correction appears in Eur Heart J. 2021 Oct 14;]. Eur Heart J. (2021) 42:3599–726. doi: 10.1093/eurheartj/ehab368

10. Heidenreich, PA, Bozkurt, B, Aguilar, D, Allen, LA, Byun, JJ, Colvin, MM, et al. 2022 AHA/ACC/HFSA guideline for the management of heart failure: a report of the American College of Cardiology/American Heart Association joint committee on clinical practice guidelines [published correction appears in Circulation. 2022 May 3;145(18):e1033] [published correction appears in Circulation. 2022 Sep 27;146(13):e185]. Circulation. (2022) 145:e895–e1032. doi: 10.1161/CIR.0000000000001063

11. Vasan, RS, Benjamin, EJ, Larson, MG, Leip, EP, Wang, TJ, Wilson, PWF, et al. Plasma natriuretic peptides for community screening for left ventricular hypertrophy and systolic dysfunction: the Framingham heart study. JAMA. (2002) 288:1252–9. doi: 10.1001/jama.288.10.1252

12. Betti, I, Castelli, G, Barchielli, A, Beligni, C, Boscherini, V, de Luca, L, et al. The role of N-terminal PRO-brain natriuretic peptide and echocardiography for screening asymptomatic left ventricular dysfunction in a population at high risk for heart failure. The PROBE-HF study. J Card Fail. (2009) 15:377–84. doi: 10.1016/j.cardfail.2008.12.002

13. Suganyadevi, S, Seethalakshmi, V, and Balasamy, K. A review on deep learning in medical image analysis. Int J Multimed Inf Retr. (2022) 11:19–38. doi: 10.1007/s13735-021-00218-1

14. Li, X, Zhu, D, and Levy, P. Predicting Clinical Outcomes with Patient Stratification via Deep Mixture Neural Networks. AMIA Jt Summits Transl Sci Proc. (2020) 2020:367–76.

15. Collobert, R, and Weston, J. A unified architecture for natural language processing: deep neural networks with multitask learning. Proceedings of the 25th International Conference on Machine Learning. ACM: New York. (2008). 160–167.

16. Grebner, C, Matter, H, Kofink, D, Wenzel, J, Schmidt, F, and Hessler, G. Application of deep neural network models in drug discovery programs. ChemMedChem. (2021) 16:3772–86. doi: 10.1002/cmdc.202100418

17. LeCun, Y, Bengio, Y, and Hinton, G. Deep learning. Nature. (2015) 521:436–44. doi: 10.1038/nature14539

18. Lindsey, R, Daluiski, A, Chopra, S, Lachapelle, A, Mozer, M, Sicular, S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci U S A. (2018) 115:11591–6. doi: 10.1073/pnas.1806905115

19. Gulshan, V, Peng, L, Coram, M, Stumpe, MC, Wu, D, Narayanaswamy, A, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. (2016) 316:2402–10. doi: 10.1001/jama.2016.17216

20. Ardila, D, Kiraly, AP, Bharadwaj, S, Choi, B, Reicher, JJ, Peng, L, et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography [published correction appears in Nat Med. (2019);25(8):1319]. Nat Med. (2019) 25:954–61. doi: 10.1038/s41591-019-0447-x

21. Attia, ZI, Kapa, S, Lopez-Jimenez, F, McKie, PM, Ladewig, DJ, Satam, G, et al. Screening for cardiac contractile dysfunction using an artificial intelligence-enabled electrocardiogram. Nat Med. (2019) 25:70–4. doi: 10.1038/s41591-018-0240-2

22. Yao, X, Rushlow, DR, Inselman, JW, McCoy, RG, Thacher, TD, Behnken, EM, et al. Artificial intelligence-enabled electrocardiograms for identification of patients with low ejection fraction: a pragmatic, randomized clinical trial. Nat Med. (2021) 27:815–9. doi: 10.1038/s41591-021-01335-4

23. He, K, Zhang, X, Ren, S, and Sun, J. Deep residual learning for image recognition. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA. (2016).

24. Tong, Z, Aihara, K, and Tanaka, G. A hybrid pooling method for convolutional neural networks. In: A Hirose, S Ozawa, K Doya, K Ikeda, M Lee, and D Liu, editors. International conference on neural information processing. ICONIP 2016. Cham: Springer (2016). 454–61.

25. Tong, Z, and Tanaka, G. Hybrid pooling for enhancement of generalization ability in deep convolutional neural networks. Neurocomputing. (2019) 333:76–85. doi: 10.1016/j.neucom.2018.12.036

26. Gu, J, Wang, Z, Kuen, J, Ma, L, Shahroudy, A, Shuai, B, et al. Recent advances in convolutional neural networks. Pattern Recogn. (2018) 77:354–77. doi: 10.48550/arXiv.1512.07108

27. Kingma, DP, and Ba, J. Adam: a method for stochastic optimization. arXiv Preprint arXiv:1412.6980. (2014). doi: 10.48550/arXiv.1412.6980

28. Srivastava, N, Hinton, G, Krizhevsky, A, Sutskever, I, and Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. (2014) 15:1929–58.

29. Lu, TH, Lee, MC, and Chou, MC. Accuracy of cause-of-death coding in Taiwan: types of miscoding and effects on mortality statistics. Int J Epidemiol. (2000) 29:336–43. doi: 10.1093/ije/29.2.336

30. Wang, TJ, Evans, JC, Benjamin, EJ, Levy, D, LeRoy, EC, and Vasan, RS. Natural history of asymptomatic left ventricular systolic dysfunction in the community. Circulation. (2003) 108:977–82. doi: 10.1161/01.CIR.0000085166.44904.79

31. Cincin, A, Ozben, B, and Erdogan, O. Diagnostic utility of specific electrocardiographical parameters in predicting left ventricular function. Exp Clin Cardiol. (2012) 17:210–4.

32. Atherton, JJ. Screening for left ventricular systolic dysfunction: is imaging a solution? JACC Cardiovasc Imaging. (2010) 3:421–8. doi: 10.1016/j.jcmg.2009.11.014

33. Nielsen, OW, Hansen, JF, Hilden, J, Larsen, CT, and Svanegaard, J. Risk assessment of left ventricular systolic dysfunction in primary care: cross sectional study evaluating a range of diagnostic tests. BMJ. (2000) 320:220–4. doi: 10.1136/bmj.320.7229.220

34. Sheifer, SE, Gersh, BJ, Yanez, ND 3rd, Ades, PA, Burke, GL, and Manolio, TA. Prevalence, predisposing factors, and prognosis of clinically unrecognized myocardial infarction in the elderly. J Am Coll Cardiol. (2000) 35:119–26. doi: 10.1016/s0735-1097(99)00524-0

35. Boonman-de Winter, LJ, Rutten, FH, Cramer, MJ, Landman, MJ, Zuithoff, NP, Liem, AH, et al. Efficiently screening heart failure in patients with type 2 diabetes. Eur J Heart Fail. (2015) 17:187–95. doi: 10.1002/ejhf.216

36. Madias, JE. The resting electrocardiogram in the management of patients with congestive heart failure: established applications and new insights. Pacing Clin Electrophysiol. (2007) 30:123–8. doi: 10.1111/j.1540-8159.2007.00586.x

37. Attia, ZI, Noseworthy, PA, Lopez-Jimenez, F, Asirvatham, SJ, Deshmukh, AJ, Gersh, BJ, et al. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. Lancet. (2019) 394:861–7. doi: 10.1016/S0140-6736(19)31721-0

38. Galasko, GI, Barnes, SC, Collinson, P, Lahiri, A, and Senior, R. What is the most cost-effective strategy to screen for left ventricular systolic dysfunction: natriuretic peptides, the electrocardiogram, hand-held echocardiography, traditional echocardiography, or their combination? Eur Heart J. (2006) 27:193–200. doi: 10.1093/eurheartj/ehi559

39. Arnett, DK, Blumenthal, RS, Albert, MA, Buroker, AB, Goldberger, ZD, Hahn, EJ, et al. 2019 ACC/AHA guideline on the primary prevention of cardiovascular disease: a report of the American College of Cardiology/American Heart Association task force on clinical practice guidelines [published correction appears in Circulation. 2019 Sep 10;140(11):e649-e650] [published correction appears in Circulation. 2020 Jan 28;141(4):e60] [published correction appears in Circulation. 2020 Apr 21;141(16):e774]. Circulation. (2019) 140:e596–646. doi: 10.1161/CIR.0000000000000678

Keywords: electrocardiogram, left ventricular systolic dysfunction, left ventricular ejection fraction, all-cause mortality, deep neural network

Citation: Huang Y-C, Hsu Y-C, Liu Z-Y, Lin C-H, Tsai R, Chen J-S, Chang P-C, Liu H-T, Lee W-C, Wo H-T, Chou C-C, Wang C-C, Wen M-S and Kuo C-F (2023) Artificial intelligence-enabled electrocardiographic screening for left ventricular systolic dysfunction and mortality risk prediction. Front. Cardiovasc. Med. 10:1070641. doi: 10.3389/fcvm.2023.1070641

Received: 15 October 2022; Accepted: 14 February 2023;
Published: 03 March 2023.

Edited by:

Jiong-Wei Wang, National University of Singapore, Singapore

Reviewed by:

Pier Paolo Bocchino, “Città della Salute e della Scienza” Hospital, Italy
Qianqian Ni, National University of Singapore, Singapore
Rabia Saleem, University of Leicester, United Kingdom
Richard Segall, Arkansas State University, United States

Copyright © 2023 Huang, Hsu, Liu, Lin, Tsai, Chen, Chang, Liu, Lee, Wo, Chou, Wang, Wen and Kuo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Ming-Shien Wen, wenms123@gmail.com; Chang-Fu Kuo, zandis@gmail.com

^†These authors have contributed equally to this work
