- 1Division of Cardiology, Department of Internal Medicine, Chang Gung Memorial Hospital, Linkou and Chang Gung University Medical School, Taoyuan, Taiwan
- 2Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- 3School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, United States
- 4Division of Rheumatology, Allergy and Clinical Immunology, Chang Gung Memorial Hospital, Linkou and Chang Gung University Medical School, Taoyuan, Taiwan
Background: The risk of mortality is relatively high among patients who visit the emergency department (ED), and stratifying patients at high risk can help improve medical care. This study aimed to create a machine-learning model that utilizes the standard 12-lead ECG to forecast acute mortality risk in ED patients.
Methods: The database included patients who visited the EDs and underwent standard 12-lead ECG between October 2007 and December 2017. A convolutional neural network (CNN) ECG model was developed to classify survival and mortality using 12-lead ECG tracings acquired from 345,593 ED patients. For machine learning model development, the patients were randomly divided into training, validation and testing datasets. The performance of the mortality risk prediction in this model was evaluated for various causes of death.
Results: Patients who visited the ED and underwent one or more ECG examinations experienced a high incidence of 30-day mortality [18,734 (5.42%)]. The developed CNN model demonstrated high accuracy in predicting acute mortality (hazard ratio 8.50, 95% confidence interval 8.20–8.80) with areas under the receiver operating characteristic (ROC) curve of 0.84 for the 30-day mortality risk prediction models. This CNN model also demonstrated good performance in predicting one-year mortality (hazard ratio 3.34, 95% confidence interval 3.30–3.39). This model exhibited good predictive performance for 30-day mortality not only for cardiovascular diseases but also across various diseases.
Conclusions: The machine learning-based ECG model utilizing CNN screens the risks for 30-day mortality. This model can complement traditional early warning scoring indexes as a useful screening tool for mortality prediction.
1. Introduction
Patients admitted to the emergency department (ED) have a considerable risk of mortality, estimated to be between 3% and 8% for 30-day mortality (1, 2). Identifying high-risk patients early on can help make appropriate medical management decisions. Early warning scores (EWS) based on simple and widely available parameters are valuable tools for predicting acute mortality risk. Various EWS, including the National Early Warning Score (NEWS) (3), Modified Early Warning Score (MEWS) (4), Rapid Acute Physiology Score (RAPS) (5), Rapid Emergency Medicine Score (REMS) (6), and Cardiac Arrest Risk Triage Score (CART) (7), have been developed to assess acute mortality risk. Immediate risk stratification guides medical staff in making appropriate emergent management decisions and arranging admission to the intensive care unit.
The 12-lead electrocardiogram (ECG) is an important medical test in the ED, and most high-risk patients who present to the ED undergo one or more ECG examinations. Physicians diagnose various medical disorders, such as cardiovascular diseases, arrhythmias, and electrolyte disorders, by reading the ECG. Changes in a patient's medical condition can alter the ECG; some of these alterations are easily recognized, while others are subtle and difficult for physicians to interpret visually. It can therefore be challenging for physicians to assess mortality risk from a 12-lead ECG examination. However, convolutional neural network (CNN) machine learning allows these subtle ECG changes to be recognized.
Detecting a high risk of acute mortality early on with a 12-lead ECG examination can aid in risk stratification for ED patients. In this study, we aimed to develop a CNN machine learning model using the standard 12-lead ECG to predict acute mortality risk in patients who visit the ED.
2. Methods and materials
2.1. Study population
This study was approved by the Institutional Review Board (IRB No. 202002464B0). The database of this study included all patients who visited the emergency departments and underwent standard 12-lead ECG at seven hospitals between October 2007 and December 2017. Patients who visited the EDs and received one or more standard 12-lead ECG examinations during their visit were included in this study. The demographics, medical history, medications, and laboratory data were acquired from the Chang Gung Research Database. The survival status was acquired from the National Death Registry Database of Taiwan. All the data were de-identified before analyses, and all personal information was encrypted before the data were released to researchers to protect patient confidentiality. Because the NEWS, MEWS, RAPS, REMS, and CART indexes are derived from the Glasgow Coma Scale (GCS), oxygen saturation, body temperature, pulse rate, blood pressure, and respiratory rate, we excluded subjects missing any of these EWS index data in the database.
2.2. ECG collection and artificial intelligence (AI) model development
Standard 12-lead ECGs with 10-second voltage-time traces were acquired using MAC 5000 or MAC 5500 ECG recorders (GE Healthcare, Chicago, IL, USA) at a sampling rate of 500 Hz. After ECG acquisition, the ECG tracings were processed and stored using the Marquette Universal System for Electrocardiography (MUSE, GE Healthcare, Chicago, IL, USA). If a patient had two or more ECG records, we used all ECGs acquired during their ED visit. Each standard 12-lead ECG was stored as a 5,000 × 12 matrix.
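The matrix dimensions follow directly from the acquisition settings given above. A minimal sketch of that arithmetic (the function name is hypothetical and only serves the illustration):

```python
# Worked check of the ECG matrix dimensions stated in the text.
SAMPLING_RATE_HZ = 500   # samples per second, from the text
DURATION_S = 10          # length of each tracing in seconds, from the text
N_LEADS = 12             # standard 12-lead ECG


def ecg_matrix_shape(sampling_rate_hz, duration_s, n_leads):
    """Return (time samples, leads) for one recording."""
    return (sampling_rate_hz * duration_s, n_leads)


shape = ecg_matrix_shape(SAMPLING_RATE_HZ, DURATION_S, N_LEADS)
print(shape)  # (5000, 12)
```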
For signal input, we used a convolutional neural network (CNN) based on the residual network (ResNet 18) framework (8), modified to fit our signal input (Supplementary Figure S1). We used a wider kernel (size 15) for the first convolution layer than the original ResNet framework, which was designed for images. This architecture uses skip connections, which allow information to pass directly to the next layer to avoid the degradation caused by deeper neural networks. The network consisted of a convolution layer followed by 4 residual blocks, each containing two convolution layers. The output of the last block was fed into hybrid pooling (9), which combines max- and average-pooling to improve generalization ability while reducing dimensionality. The output of hybrid pooling was then sent to a fully connected layer to perform the final classification. Each convolutional layer was followed by batch normalization and a rectified linear activation unit (10). The model was trained with cross-entropy loss and the Adam optimizer (11). Dropout was applied to reduce overfitting by breaking up co-adaptation on the training data (12).
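Two of the building blocks described above, the skip connection and the hybrid pooling step, can be illustrated without a deep learning library. The sketch below is not the authors' implementation: it assumes that hybrid pooling concatenates global max-pooled and global average-pooled features per channel (the cited method may combine them differently), and all function names are hypothetical.

```python
def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def residual_block(x, transform):
    """Skip connection: the input is added back to the transformed signal."""
    fx = transform(x)
    return relu([a + b for a, b in zip(fx, x)])


def global_hybrid_pool(feature_map):
    """Pool each channel over time with both max and mean, then concatenate.

    feature_map is a list of channels, each a list of values over time.
    Concatenation is an assumption made for this illustration.
    """
    max_pooled = [max(channel) for channel in feature_map]
    avg_pooled = [sum(channel) / len(channel) for channel in feature_map]
    return max_pooled + avg_pooled


# With a transform that outputs zeros, the block passes relu(input) through.
print(residual_block([1.0, -2.0], lambda v: [0.0 for _ in v]))  # [1.0, 0.0]

features = [[0.0, 2.0, 4.0], [1.0, 1.0, 1.0]]  # 2 channels, 3 time steps
pooled = global_hybrid_pool(features)
print(pooled)  # [4.0, 1.0, 2.0, 1.0]
```

The pooled vector has twice as many entries as there are channels, and its length no longer depends on the number of time steps, which is what allows a fixed-size fully connected layer to follow.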
A multimodal AI model that incorporated both the ECG and the mortality scoring systems was built on the abovementioned model (Supplementary Figure S2). The additional scoring system variables were sent to a fully connected layer, and the result was combined with the max- and average-pooled output of the ECG model. The combined output was then sent to a fully connected layer to perform the final classification.
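The fusion step can be sketched as follows, assuming that "combined" means concatenating the pooled ECG features with the output of the scoring-system layer. The layer sizes, weights, and function names are hypothetical placeholders, and any activation between the layers is omitted because the text does not specify one.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Convert logits to probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(ecg_features, ews_scores, ews_layer, head_layer):
    """Embed the EWS variables, concatenate with ECG features, classify."""
    ews_embedding = dense(ews_scores, *ews_layer)
    fused = ecg_features + ews_embedding  # list concatenation
    return softmax(dense(fused, *head_layer))


# Hypothetical toy values: 2 pooled ECG features and 2 EWS variables.
ecg_features = [0.5, 1.0]
ews_scores = [3.0, 2.0]
ews_layer = ([[0.1, 0.2]], [0.0])                              # 2 -> 1
head_layer = ([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]], [0.0, 0.0])  # 3 -> 2

probs = fuse_and_classify(ecg_features, ews_scores, ews_layer, head_layer)
print(probs)  # [P(survival), P(mortality)] under these toy weights
```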
2.3. Statistical analyses
Because the sample size exceeded 2,000, the Kolmogorov-Smirnov test was used to assess normality. All P-values were less than 0.05, so the assumption of normality was rejected. Continuous variables are expressed as median and interquartile range (IQR), and categorical variables are expressed as numbers and percentages. Adjusted odds ratios (OR) and 95% confidence intervals (CI) were calculated. For comparisons of population characteristics, the chi-square test was used for categorical variables and the unpaired Student's t-test for continuous variables. Cox proportional hazards models were used to estimate hazard ratios (HR) for mortality. A P-value < 0.05 was considered statistically significant. Statistical analyses were conducted using SAS 9.4 software (SAS Institute, Cary, NC, USA).
3. Results
3.1. Clinical characteristics
The dataset comprised 5,148,498 standard 12-lead ECG examinations from 1,776,968 patients collected between October 2007 and December 2019 (Figure 1). Among these patients, 1,684,298 had recorded data in the National Health Insurance or National Death Registry Databases, from which we obtained the mortality outcome and the primary cause of death. After excluding patients with inadequate ECG quality and those aged less than 18 years, a total of 610,611 patients were included. We excluded 265,018 patients due to incomplete ED triage data, and the remaining 345,593 patients were randomly divided into training, validation, and testing datasets. Table 1 shows the clinical characteristics.
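The final cohort size can be checked from the counts above, and the random division can be sketched at the patient level so that every patient falls into exactly one partition. The split fractions, the seed, and the function name below are assumptions for illustration; the text does not report the proportions used.

```python
import random

# Cohort arithmetic taken from the text.
ELIGIBLE_PATIENTS = 610_611
EXCLUDED_INCOMPLETE_TRIAGE = 265_018
analyzed_patients = ELIGIBLE_PATIENTS - EXCLUDED_INCOMPLETE_TRIAGE
print(analyzed_patients)  # 345593


def split_patients(patient_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Randomly assign each patient to exactly one partition.

    The 80/10/10 fractions are hypothetical, not the study's values.
    """
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * fractions[0])
    n_val = int(len(ids) * fractions[1])
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


parts = split_patients(range(100))
print({name: len(ids) for name, ids in parts.items()})
```

Splitting by patient rather than by ECG keeps multiple tracings from the same person out of different partitions, which would otherwise leak information into the evaluation.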
Figure 1. Data flow for ECG and data pairing. Patients who visited the emergency departments between 2006 and 2017 were included in this study. ECG, electrocardiogram; ED, emergency department.
3.2. Acute mortality prediction outcomes
Among these patients, 18,734 (5.42%) died within 30 days, indicating a relatively high 30-day mortality risk among ED patients who received one or more ECG examinations. The CNN model showed good performance in predicting 30-day mortality. The sensitivity, specificity, and negative predictive value were 0.81, 0.71, and 0.99, respectively. Patients predicted to die had a 19% risk of mortality within 30 days, whereas patients predicted to survive had a 1% risk of mortality within 30 days. Figure 2A shows the ROC curve of acute mortality prediction, with an area under the ROC curve of 0.84. Figure 2B shows the Kaplan–Meier curve of 30-day mortality (hazard ratio 8.50, 95% CI 8.20–8.80). Although the CNN model was originally developed for short-term mortality, it demonstrated good performance in predicting long-term mortality as well (Figure 2C). The group predicted to be at high mortality risk had a significantly higher one-year mortality rate than the group predicted to survive (hazard ratio 3.34, 95% CI 3.30–3.39).
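The screening metrics reported above are all derived from a two-by-two table of predicted versus observed outcomes. The sketch below shows the definitions; the confusion-matrix counts are hypothetical and are not the study's data. Only the 30-day mortality proportion is computed from figures given in the text.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # deaths correctly flagged
        "specificity": tn / (tn + fp),  # survivors correctly cleared
        "ppv": tp / (tp + fp),          # risk among predicted deaths
        "npv": tn / (tn + fn),          # survival among predicted survivors
    }


# 30-day mortality proportion, using the counts reported in the text.
mortality_pct = round(100 * 18_734 / 345_593, 2)
print(mortality_pct)  # 5.42

# Hypothetical counts chosen only to illustrate the formulas.
metrics = screening_metrics(tp=81, fp=290, tn=710, fn=19)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```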
Figure 2. The performance of the AI ECG model in predicting acute mortality. (A) The receiver operating characteristic curves. (B) The Kaplan–Meier curves of 30-day survival. (C) The Kaplan–Meier curves of extended survival prediction using the same model. The graphs demonstrate that this model performed well in predicting one-year mortality. AUC, area under the receiver operating characteristic curve; HR, hazard ratio; NPV, negative predictive value; PPV, positive predictive value.
The subgroup analyses showed that the model predicted acute mortality in all subgroups, including older patients (>60 years old); patients with hypertension, diabetes, heart failure, atrial fibrillation, chronic kidney disease, liver cirrhosis, or chronic obstructive lung disease; and those taking angiotensin-converting enzyme inhibitors/angiotensin II receptor blockers, calcium channel blockers, or statins (Figure 3). The model exhibited good predictive performance for 30-day mortality across a range of diseases, including cardiovascular, respiratory, kidney, liver, and cerebrovascular diseases and malignancies (Figure 4).
Figure 3. Subgroup analyses for 30-day emergency department mortality. ACEI/ARB, angiotensin-converting enzyme inhibitor/angiotensin receptor blocker; CCB, calcium channel blocker; CI, confidence interval; CKD, chronic kidney disease; COPD, chronic obstructive pulmonary disease; DM, diabetes mellitus; OR, odds ratio.
Figure 4. Kaplan–Meier curves of acute survival prediction in patients among various major diseases, including cardiovascular, respiratory, kidney, liver, and cerebrovascular diseases.
Since we employed a multimodal machine learning approach, we also analyzed the performances of machine learning using ECG and the EWS index (see Supplementary Table S1). Our findings indicate that multimodal machine learning, incorporating both ECG and EWS indexes, performed better than machine learning using ECG data alone. Furthermore, the performance of machine learning using only the EWS indexes was the poorest.
4. Discussions
In this study, machine learning models were developed to predict 30-day mortality in patients admitted to the ED. The models showed good performance in predicting 30-day mortality. The Kaplan–Meier curves demonstrate that the model performed well for predicting 30-day survival as well as long-term survival. The performance was good across all subgroups, as demonstrated by subgroup analyses. The ECG model also performed well in predicting mortality across various diseases, including cardiovascular, respiratory, kidney, liver, and cerebrovascular diseases.
4.1. Acute mortality prediction for patients admitted to the emergency department
High-risk patients with cardiovascular, respiratory, kidney, and liver diseases may require early intensive monitoring and care. EWS indexes, including MEWS, NEWS, and qSOFA, are widely used in the ED to stratify risk for early intensive health care. The NEWS index is a commonly used prediction model for early detection of clinical deterioration, which is based on vital signs and consciousness levels, making it a simple and straightforward scoring system (13). Compared to the MEWS and the qSOFA scoring system, NEWS is one of the most accurate tools for predicting mortality within 24 h (14). In previous reports (14–17), the area under the ROC curve for predicting short-term (7–30 days) mortality in patients who visited ED was 0.61–0.81.
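The area under the ROC curve used in these comparisons can be read as the probability that a randomly chosen patient who died received a higher risk score than a randomly chosen survivor. A minimal sketch of that pairwise definition follows; the scores are hypothetical, and the quadratic loop is for clarity rather than efficiency.

```python
def roc_auc(scores_died, scores_survived):
    """AUC as the fraction of (died, survived) pairs ranked correctly.

    Ties between a died and a survived score count as half a correct pair.
    """
    correct = 0.0
    for died in scores_died:
        for survived in scores_survived:
            if died > survived:
                correct += 1.0
            elif died == survived:
                correct += 0.5
    return correct / (len(scores_died) * len(scores_survived))


# Hypothetical risk scores for illustration only.
died = [0.9, 0.6, 0.4]
survived = [0.5, 0.3, 0.2, 0.1]
auc = roc_auc(died, survived)
print(round(auc, 3))  # 0.917 (11 of 12 pairs ranked correctly)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.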
Predicting acute mortality can aid clinicians in managing patients at the appropriate time. The EWS indexes use readily available clinical data, including vital signs, oxygen saturation, and consciousness levels. However, the ECG might be altered by cardiovascular disease, electrolytes, autonomic activities, intracranial diseases, and other systemic diseases. Subtle changes associated with these systemic diseases can be identified using the ECG CNN model for predicting acute mortality. The ECG model complements EWS indexes and can effectively predict the risk of acute mortality, enabling clinicians to make early decisions in critical medical care. As the 12-lead ECG is one of the most readily available examinations in the emergency department, combining an ECG examination with the scoring systems could aid in risk stratification and reduce waiting times for intensive care.
4.2. The performance of acute vs. one-year mortality prediction
Initially, we trained a model to predict one-year mortality and evaluated its ability to predict mortality on a monthly basis. The ECG machine learning model exhibited superior predictive accuracy for patients with a high risk of mortality within a month (refer to Supplementary Figure S2). The model's monthly accuracy indicated that it performed best during the first month and gradually declined after the first month. The model's superior performance in predicting acute mortality during the first month indicates that it is better suited for predicting acute mortality. This better performance during the first month may be attributed to ECG changes that reflect the immediate systemic clinical condition. In addition, the most critical concern for patients who seek emergency medical attention is the prediction of acute mortality, rather than one-year mortality. Thus, we modified the model to predict acute mortality. However, the acute-mortality model still performs effectively in predicting long-term mortality.
4.3. The future application of mortality prediction in preventive medicine
Cardiovascular diseases are the primary cause of death in developed countries, with atherosclerotic coronary heart disease (CHD) having major risk factors that include diabetes, hypertension, increased total serum cholesterol, high LDL level, low HDL level, cigarette smoking, obesity, and family history of CHD. Accurate prediction of CHD events is crucial for guiding decisions on preventive therapy for hypertension, diabetes, and dyslipidemia. The Pooled Cohort Equation is a commonly used method to estimate 10-year absolute rates of CHD events in a primary prevention population (18). However, most CHD prediction models are based on major risk factors (19–22) and provide no information on overall survival, although they do offer valuable risk information for clinicians to decide whether to prescribe preventive therapy. This ECG model exhibited good prediction performance for most major causes of death, suggesting that the ECG signals are influenced by not only cardiovascular diseases but also other systemic diseases. Disease progression may cause changes in the ECG, some of which may be subtle, but recognizable by the CNN ECG model. Therefore, the CNN ECG model may help clinicians evaluate the future risk of acute as well as long-term mortality for various diseases.
For clinical ED staff, risk/mortality screening tools, such as early warning indexes, can be helpful in identifying patients at a relatively higher risk. For patients with a lower risk, the staff can continue to observe them until they exhibit high-risk warning signs. Clinical staff can allocate more attention to the high-risk group. Therefore, a clinical screening tool with low PPV is acceptable.
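The reason a low PPV can coexist with a useful screening tool is that PPV depends strongly on how common the outcome is, while NPV stays high when the outcome is rare. The sketch below applies Bayes' rule with hypothetical sensitivity, specificity, and prevalence values; these are not the study's results.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test, two hypothetical outcome prevalences.
ppv_rare, npv_rare = predictive_values(0.8, 0.7, prevalence=0.05)
ppv_common, npv_common = predictive_values(0.8, 0.7, prevalence=0.50)
print(f"prevalence 5%:  PPV={ppv_rare:.3f}, NPV={npv_rare:.3f}")
print(f"prevalence 50%: PPV={ppv_common:.3f}, NPV={npv_common:.3f}")
```

With a rare outcome, most positive predictions are false alarms, yet a negative prediction remains highly reassuring, which is the property a screening tool relies on.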
4.4. Limitations
This study has several limitations. First, only ECGs recorded using GE Healthcare ECG recorders were analyzed, so the model's performance may be poorer for ECGs recorded with other recorders. Second, because the model is based on a convolutional neural network, the interpretability of its mortality predictions is currently limited. Third, the majority of ECGs analyzed in this study were obtained from Asian patients, and hence the generalizability of the model for mortality prediction may be restricted. Finally, patients in the ED who undergo one or more ECG exams typically have more comorbidities and a higher risk of mortality, so the ECG-based machine learning model may not be effective for all ED patients.
5. Conclusions
The machine learning-driven ECG model is a reliable screening tool that provides reasonably accurate predictions for 30-day mortality. The study confirms that this model performs well not only for cardiovascular diseases but also for other medical disorders. The machine learning model's ability to swiftly predict acute mortality based on twelve-lead ECGs can assist clinicians in managing patients who visit the ED. Furthermore, the machine learning-driven ECG model can complement traditional EWS as a useful screening tool.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.
Ethics statement
The studies involving humans were approved by the Institutional Review Board of Chang Gung Memorial Hospital (IRB No. 202002464B0). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.
Author contributions
The authors confirm contribution to the paper as follows: study conception and design: P-CC, Y-CHu, C-CW, and C-FK; data collection: Z-YL, Y-CHs, J-SC, C-HL, and RT; analysis and interpretation of results: P-CC, Y-CHu, Z-YL, Y-CHs, C-CC, M-SW, H-TW, W-CL, and H-TL; draft manuscript preparation: P-CC, Y-CHu, and C-CW. All authors contributed to the article and approved the submitted version.
Funding
This study was supported by the Taiwan Ministry of Science and Technology (grant No. MOST 110-2314-B-182A-123).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcvm.2023.1245614/full#supplementary-material
References
1. Conway R, Cournane S, Byrne D, O'Riordan D, Silke B. Time patterns in mortality after an emergency medical admission; relationship to weekday or weekend admission. Eur J Intern Med. (2016) 36:44–9. doi: 10.1016/j.ejim.2016.08.010
2. Walker AS, Mason A, Quan TP, Fawcett NJ, Watkinson P, Llewelyn M, et al. Mortality risks associated with emergency admissions during weekends and public holidays: an analysis of electronic health records. Lancet. (2017) 390(10089):62–72. doi: 10.1016/S0140-6736(17)30782-1
3. Jones M. Newsdig: the national early warning score development and implementation group. Clin Med (Lond). (2012) 12(6):501–3. doi: 10.7861/clinmedicine.12-6-501
4. Gardner-Thorpe J, Love N, Wrightson J, Walsh S, Keeling N. The value of modified early warning score (mews) in surgical in-patients: a prospective observational study. Ann R Coll Surg Engl. (2006) 88(6):571–5. doi: 10.1308/003588406X130615
5. Rhee KJ, Fisher CJ Jr., Willitis NH. The rapid acute physiology score. Am J Emerg Med. (1987) 5(4):278–82. doi: 10.1016/0735-6757(87)90350-0
6. Olsson T, Terent A, Lind L. Rapid emergency medicine score: a new prognostic tool for in-hospital mortality in nonsurgical emergency department patients. J Intern Med. (2004) 255(5):579–87. doi: 10.1111/j.1365-2796.2004.01321.x
7. Churpek MM, Yuen TC, Park SY, Meltzer DO, Hall JB, Edelson DP. Derivation of a cardiac arrest prediction model using ward vital signs*. Crit Care Med. (2012) 40(7):2102–8. doi: 10.1097/CCM.0b013e318250aa5a
8. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 27–30 June 2016 (2016).
9. Tong Z, Tanaka G. Hybrid pooling for enhancement of generalization ability in deep convolutional neural networks. Neurocomputing. (2019) 333:76–85. doi: 10.1016/j.neucom.2018.12.036
10. Gu J, Wang Z, Kuen J, Ma L, Shahroudy A, Shuai B, et al. Recent advances in convolutional neural networks. Pattern Recogn. (2018) 77:354–77. doi: 10.1016/j.patcog.2017.10.013
11. Kingma DP, Ba J. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
12. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. (2014) 15(1):1929–58. doi: 10.5555/2627435.2670313
13. Smith GB, Prytherch DR, Meredith P, Schmidt PE, Featherstone PI. The ability of the national early warning score (news) to discriminate patients at risk of early cardiac arrest, unanticipated intensive care unit admission, and death. Resuscitation. (2013) 84(4):465–70. doi: 10.1016/j.resuscitation.2012.12.016
14. Chen L, Zheng H, Chen L, Wu S, Wang S. National early warning score in predicting severe adverse outcomes of emergency medicine patients: a retrospective cohort study. J Multidiscip Healthc. (2021) 14:2067–78. doi: 10.2147/JMDH.S324068
15. Mitsunaga T, Hasegawa I, Uzura M, Okuno K, Otani K, Ohtaki Y, et al. Comparison of the national early warning score (news) and the modified early warning score (mews) for predicting admission and in-hospital mortality in elderly patients in the Pre-hospital setting and in the emergency department. PeerJ. (2019) 7:e6947. doi: 10.7717/peerj.6947
16. Graham CA, Leung LY, Lo RSL, Yeung CY, Chan SY, Hung KKC. News and qsirs superior to qsofa in the prediction of 30-day mortality in emergency department patients in Hong Kong. Ann Med. (2020) 52(7):403–12. doi: 10.1080/07853890.2020.1782462
17. Alam N, Vegting IL, Houben E, van Berkel B, Vaughan L, Kramer MH, et al. Exploring the performance of the national early warning score (news) in a European emergency department. Resuscitation. (2015) 90:111–5. doi: 10.1016/j.resuscitation.2015.02.011
18. Goff DC Jr., Lloyd-Jones DM, Bennett G, Coady S, D'Agostino RB, Gibbons R, et al. 2013 Acc/Aha guideline on the assessment of cardiovascular risk: a report of the American college of cardiology/American heart association task force on practice guidelines. Circulation. (2014) 129(25 Suppl 2):S49–73. doi: 10.1161/01.cir.0000437741.48606.98
19. D'Agostino RB Sr., Vasan RS, Pencina MJ, Wolf PA, Cobain M, Massaro JM, et al. General cardiovascular risk profile for use in primary care: the Framingham heart study. Circulation. (2008) 117(6):743–53. doi: 10.1161/CIRCULATIONAHA.107.699579
20. Ridker PM, Buring JE, Rifai N, Cook NR. Development and validation of improved algorithms for the assessment of global cardiovascular risk in women: the Reynolds risk score. JAMA. (2007) 297(6):611–9. doi: 10.1001/jama.297.6.611
21. Catapano AL, Graham I, De Backer G, Wiklund O, Chapman MJ, Drexel H, et al. 2016 Esc/eas guidelines for the management of dyslipidaemias. Eur Heart J. (2016) 37(39):2999–3058. doi: 10.1093/eurheartj/ehw272
Keywords: mortality, emergency department, convolutional neural network, machine learning, electrocardiogram
Citation: Chang P-C, Liu Z-Y, Huang Y-C, Hsu Y-C, Chen J-S, Lin C-H, Tsai R, Chou C-C, Wen M-S, Wo H-T, Lee W-C, Liu H-T, Wang C-C and Kuo C-F (2023) Machine learning-based prediction of acute mortality in emergency department patients using twelve-lead electrocardiogram. Front. Cardiovasc. Med. 10:1245614. doi: 10.3389/fcvm.2023.1245614
Received: 23 June 2023; Accepted: 13 October 2023;
Published: 27 October 2023.
Edited by:
Bert Vandenberk, University Hospitals Leuven, Belgium
Reviewed by:
Gau-Jun Tang, National Yang-Ming University, Taiwan
Tsung-Chien Lu, National Taiwan University Hospital, Taiwan
© 2023 Chang, Liu, Huang, Hsu, Chen, Lin, Tsai, Chou, Wen, Wo, Lee, Liu, Wang and Kuo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Chun-Chieh Wang, chcwang@ms17.hinet.net; Chang-Fu Kuo, zandis@gmail.com