Machine Learning Patient-Specific Prediction of Heart Failure Hospitalization Using Cardiac MRI-Based Phenotype and Electronic Health Information

Cornhill, Aidan K.; Dykstra, Steven; Satriano, Alessandro; Labib, Dina; Mikami, Yoko; Flewitt, Jacqueline; Prosio, Easter; Rivest, Sandra; Sandonato, Rosa; Howarth, Andrew G.; Lydell, Carmen; Eastwood, Cathy A.; Quan, Hude; Fine, Nowell; Lee, Joon; White, James A.

doi:10.3389/fcvm.2022.890904

ORIGINAL RESEARCH article

Front. Cardiovasc. Med., 16 June 2022

Sec. Cardiovascular Imaging

Volume 9 - 2022 | https://doi.org/10.3389/fcvm.2022.890904

This article is part of the Research Topic Systems Biology and Data-Driven Machine Learning-Based Models in Personalized Cardiovascular Medicine

Aidan K. Cornhill¹

Steven Dykstra¹

Alessandro Satriano¹,²,³

Dina Labib¹,²,³

Yoko Mikami¹,²,³

Jacqueline Flewitt¹

Easter Prosio¹

Sandra Rivest¹

Rosa Sandonato¹

Andrew G. Howarth¹,²,³

Carmen Lydell¹,²,³,⁴

Cathy A. Eastwood³,⁵

Hude Quan³,⁵

Nowell Fine²,³

Joon Lee⁵,⁶,⁷

James A. White¹,²,³*

¹Stephenson Cardiac Imaging Centre, University of Calgary, Calgary, AB, Canada
²Division of Cardiology, Department of Cardiac Sciences, Libin Cardiovascular Institute of Alberta, Calgary, AB, Canada
³Libin Cardiovascular Institute of Alberta, Calgary, AB, Canada
⁴Department of Diagnostic Imaging, University of Calgary, Calgary, AB, Canada
⁵Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
⁶Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
⁷Department of Cardiac Science, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

Background: Heart failure (HF) hospitalization is a dominant contributor to morbidity and healthcare expenditures in patients with systolic HF. Cardiovascular magnetic resonance (CMR) imaging is increasingly employed for the evaluation of HF given its capacity to provide highly reproducible phenotypic markers of disease. The combined value of CMR phenotypic markers and patient health information to deliver predictions of future HF events has not been explored. We sought to develop and validate a novel risk model for the patient-specific prediction of time to HF hospitalization using routinely reported CMR variables, patient-reported health status, and electronic health information.

Methods: Standardized data capture was performed for 1,775 consecutive patients with chronic systolic HF referred for CMR imaging. Patient demographics, symptoms, Health-related Quality of Life, pharmacy, and routinely reported CMR features were provided to both machine learning (ML) and competing risk Fine-Gray-based models (FGM) for the prediction of time to HF hospitalization.

Results: The mean age was 59 years with a mean LVEF of 36 ± 11%. The population was evenly distributed between ischemic (52%) and idiopathic non-ischemic cardiomyopathy (48%). Over a median follow-up of 2.79 years (IQR: 1.59–4.04) 333 patients (19%) experienced HF related hospitalization. Both ML and competing risk FGM based models achieved robust performance for the prediction of time to HF hospitalization. Respective 90-day, 1 and 2-year AUC values were 0.87, 0.83, and 0.80 for the ML model, and 0.89, 0.84, and 0.80 for the competing risk FGM-based model in a holdout validation cohort. Patients classified as high-risk by the ML model experienced a 34-fold higher occurrence of HF hospitalization at 90 days vs. the low-risk group.

Conclusion: In this study we demonstrated capacity for routinely reported CMR phenotypic markers and patient health information to be combined for the delivery of patient-specific predictions of time to HF hospitalization. This work supports an evolving migration toward multi-domain data collection for the delivery of personalized risk prediction at time of diagnostic imaging.

Introduction

Heart failure (HF) is estimated to affect approximately 64 million people worldwide (1) and is associated with a high incidence of disease-related hospitalization (2). HF hospitalization is increasingly prioritized as an important clinical outcome by patients and healthcare organizations given strong associations with morbidity, mortality and dominant contribution to healthcare expenditures (3). In 2017 it was estimated that each HF hospitalization incurred a mean cost of $14,631 USD with 40% of patients readmitted within 90 days (4). Of all patient populations, those with systolic HF provide greatest contributions to HF hospitalization costs (2), justifying an expanding focus on this population for risk modeling. While the prediction of HF re-admission early following index HF hospitalization has been explored from administrative health data (5–7), risk models for incident HF hospitalization applicable to broader HF populations are required. The deployment of such models at time of diagnostic imaging, delivering descriptors of disease with opportunity for the capture of contextual health information, provides an attractive solution for personalized prediction modeling.

Cardiovascular magnetic resonance (CMR) imaging has become a routinely engaged test for the diagnosis and management of systolic heart failure. This has been justified by its versatility for the delivery of a broad range of phenotypic markers that accurately differentiate ischemic from non-ischemic etiologies (8–10), describe patterns of tissue injury (8), identify valvular pathology (11), and deliver reference standard quantification of chamber volumes, function, and ventricular mass (12, 13). While demonstrated to provide independent value for the prediction of composite outcomes (14–17), the combined value of CMR-reported phenotypic features and contextual patient health information to deliver personalized predictions of HF-related outcomes remains unexplored.

We hypothesized that CMR-reported markers of disease contextualized to patient-reported and EHR-derived markers of health can permit patient-specific predictions of time to HF hospitalization. To achieve this, we explored both machine learning (ML)-based modeling and competing risk Fine-Gray (FGM)-based risk modeling techniques for individualized predictions of time to HF hospitalization at time of CMR.

Materials and Methods

Dataset Available for Risk Modeling

CMR imaging data, patient-reported measures of health status and electronic health record (EHR) abstracted data was provided by the Cardiovascular Imaging Registry of Calgary (CIROC, NCT04367220). CIROC is a prospectively recruiting clinical outcomes Registry of the Libin Cardiovascular Institute engaging patients clinically referred for cardiac diagnostic imaging. Consenting patients undergoing CMR imaging between February 2015 and October 2019 for the evaluation of systolic HF and completing a minimum 1-year follow-up period were included. All data was collected at time of diagnostic test performance using a commercial workflow, data integration, and diagnostic test reporting software platform (cardioDI™, Cohesic Inc., Calgary).

Patients with chronic systolic HF resulting from ischemic cardiomyopathy or idiopathic non-ischemic cardiomyopathy were identified. All patients were required to have CMR-based confirmation of reduced global systolic function, defined as a left ventricular ejection fraction (LVEF) ≤ 50%. Recognizing the unique natural history of patients with specific non-ischemic cardiomyopathy states, all patients with confirmed cardiac amyloidosis, cardiac sarcoidosis, or hypertrophic cardiomyopathy were excluded. Patients with an acute cardiomyopathy state due to recent (within 90 days) acute coronary syndrome, takotsubo cardiomyopathy, peripartum cardiomyopathy, or viral infection (suspected or confirmed acute myocarditis) were also excluded. This established a final patient cohort with chronic systolic HF of either ischemic or idiopathic non-ischemic etiology. Ischemic cardiomyopathy (ICM) was defined by occurrence of prior myocardial infarction, percutaneous coronary intervention and/or coronary bypass surgery, or presence of ischemic (subendocardial) pattern injury on late gadolinium enhancement (LGE) imaging corresponding to one or more vascular territories. Patients not meeting these criteria were classified as having idiopathic non-ischemic dilated cardiomyopathy. For patients who underwent multiple CMR studies, the index study was used for prediction modeling.

A total of 8,773 unique patients enrolled in the CIROC Registry were considered. Of these, 2,455 had an LVEF ≤ 50% by CMR. Following application of the inclusion and exclusion criteria, 1,775 unique patients satisfied cohort eligibility.

The study was approved by the University of Calgary Conjoint Health Research Ethics Board. All subjects provided written informed consent. All research activities were performed in accordance with the Declaration of Helsinki.

Data Element Generation and Collection

Patient Reported Health Data

Patient health questionnaires were electronically deployed prior to each CMR examination to collect patient demographics, comorbid cardiac and non-cardiac illness, smoking, and alcohol history, patient-reported shortness of breath [based on New-York Heart Association (NYHA) classification], and HRQoL using the EQ-5D tool (18).

Cardiovascular Magnetic Resonance Imaging-Based Phenotype Data

CMR imaging was performed on 3 Tesla clinical scanners (Prisma or Skyra, Siemens Healthcare, Erlangen, Germany) using standardized imaging protocols inclusive of routine cine and LGE imaging techniques in sequential short-axis views and 2-, 3-, and 4-chamber long axis views. Quantitative image analysis was performed using standardized operating procedures developed according to guidelines of the Society for Cardiovascular Magnetic Resonance (19). Image analysis was performed using commercially available software (cvi42; Circle Cardiovascular Inc., Calgary) to obtain left ventricular (LV) and right ventricular (RV) volumes and function from semi-automated contour tracing of the endocardial and epicardial borders followed by manual adjustment. Papillary muscles were considered part of the LV mass. Maximal left atrial volume was assessed in the phase immediately prior to mitral valve opening using the bi-plane area-length method on matched 2- and 4-chamber cine images. All measurements were indexed to body surface area, where appropriate, using the Mosteller formula.
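As a minimal illustration of the indexing step described above, the following sketch computes body surface area with the Mosteller formula, indexes a volume to it, and evaluates the standard bi-plane area-length expression for left atrial volume. The function names and the example input values are hypothetical and are not taken from the study dataset. Taking the shorter of the two long-axis lengths is a common convention and is assumed here rather than stated in the text.

```python
import math

def mosteller_bsa(height_cm: float, weight_kg: float) -> float:
    """Body surface area in m^2: sqrt(height_cm * weight_kg / 3600)."""
    return math.sqrt(height_cm * weight_kg / 3600.0)

def index_to_bsa(value: float, bsa_m2: float) -> float:
    """Divide a raw measurement (e.g., volume in mL) by body surface area."""
    return value / bsa_m2

def biplane_area_length_volume(area_4ch_cm2: float,
                               area_2ch_cm2: float,
                               length_cm: float) -> float:
    """Bi-plane area-length volume in mL: 8 / (3 * pi) * A1 * A2 / L.

    length_cm is assumed to be the shorter of the two long-axis lengths.
    """
    return 8.0 / (3.0 * math.pi) * area_4ch_cm2 * area_2ch_cm2 / length_cm

# Hypothetical patient: 180 cm, 80 kg, LV end-diastolic volume of 200 mL
bsa = mosteller_bsa(180.0, 80.0)
lvedv_index = index_to_bsa(200.0, bsa)
la_volume = biplane_area_length_volume(20.0, 20.0, 5.0)
la_volume_index = index_to_bsa(la_volume, bsa)
print(f"BSA {bsa:.2f} m^2, LVEDVi {lvedv_index:.1f} mL/m^2, "
      f"LAVi {la_volume_index:.1f} mL/m^2")
```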

Standardized software was used to receive and code quantitative markers of chamber volumes and function, and to code disease-specific phenotypes (cardioDI™; Cohesic Inc., Calgary). LGE images were visually scored for the presence, extent, and pattern of fibrosis: the latter scored as subendocardial, mid-wall striae, right ventricular insertion site, mid-wall patchy, subepicardial, and diffuse patterns, as previously described (20, 21). Valvular pathology was coded based upon visually graded assessments of regurgitation and stenosis severity. The presence of pleural and pericardial effusions was routinely coded.

Electronic Health Record Abstracted Data

Electronic health information was abstracted from institutional EHR data warehouses and was inclusive of pharmacy, laboratory and ICD-10 coded diagnostic and procedural data. Historic data was abstracted at time of index CMR, and every 3-months perpetually thereafter. The primary clinical outcome of HF-related hospitalization was identified by ICD-10 coding registered in the Discharge Abstract Database System, using primary ICD-10 codes of I50.X. All documented events were manually adjudicated by medical chart review. Mortality data, used for competing risk analysis, was collected from Vital Statistics Alberta.

Statistical Analysis

Continuous variables are reported as mean ± standard deviation, whereas categorical variables are expressed as counts with percentages. Comparisons between groups for continuous variables were performed using Student’s t-test or Welch’s t-test, as appropriate. Chi-squared tests without Yates correction and Fisher’s exact tests were used for comparisons between groups for categorical variables. Univariable Cox proportional hazards (CoxPH) analysis of baseline variables was performed to identify associations with the primary outcome; these results were used to identify candidate variables for FGM-based modeling.

Risk Model Development

We aimed to develop and compare performance of machine learning (ML) and non-ML based modeling for the patient-specific prediction of time to HF hospitalization with reference comparison to a historic HF prediction model. As a ML-based approach we used Random Survival Forest (RSF) based modeling, this compared to a FGM-based model. For the development of our novel prediction models our study dataset was split into a 70% (n = 1,245) development and 30% (n = 530) holdout validation cohort, balanced for both event rate and follow-up duration. The development cohort was partitioned into five training and testing datasets for 5-fold cross-validation-based model development and selection. Within each cross-validation fold, missing data was independently imputed by multivariable feature imputation (Hmisc: aregImpute) (22). Manual variable reduction was executed to remove variables with a missingness rate > 15% and those believed to have poor generalizability to other clinical settings (i.e., unique to the local institution). This led to 63 consistently available disease phenotype (imaging) and patient health variables for the development of our risk models (Supplementary Table 1).
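The partitioning scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions: it stratifies the 70/30 split on event status only (the study also balanced follow-up duration), the cross-validation folds are not stratified, and the cohort is synthetic. The function names `stratified_split` and `k_folds` are hypothetical and do not correspond to the software used by the authors.

```python
import random
from collections import defaultdict

def stratified_split(events, holdout_fraction=0.30, seed=0):
    """Split patient indices into development and holdout sets,
    keeping the event rate approximately equal in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, event in enumerate(events):
        by_class[event].append(idx)
    development, holdout = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_holdout = round(len(members) * holdout_fraction)
        holdout.extend(members[:n_holdout])
        development.extend(members[n_holdout:])
    return sorted(development), sorted(holdout)

def k_folds(ids, k=5, seed=0):
    """Yield (train_ids, test_ids) pairs for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]

# Synthetic cohort (hypothetical): 1,000 patients, 19% with an event
events = [1] * 190 + [0] * 810
dev_ids, holdout_ids = stratified_split(events)
folds = list(k_folds(dev_ids, k=5))
print(len(dev_ids), len(holdout_ids), len(folds))
```

In a full pipeline, imputation and variable selection would be fitted inside each training fold only, as described in the text, so that no information from the test fold leaks into model development.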

Machine Learning Model Development and Performance Evaluation

For each development fold, 100 bootstrap samples with replacement were generated and 100 RSF models were trained for variable selection. These models were applied to out-of-bag data where variable importance was then assessed using permutation importance rank. The top 15 performing variables for each training fold were selected by their mean variable rank across all 100 out-of-bag datasets. A comprehensive grid search technique was used for hyperparameter tuning, as summarized in Supplementary Table 2. Optimized models in each training fold were applied to the test sets for final model evaluation and selection using time-dependent area under the curve (tAUC) and C-index. Models containing a range of 13–17 features were assessed in the test set, comparing tAUC and C-index to identify the optimal number of model features. The final model was then applied to the holdout dataset where performance was assessed using C-index, tAUC, average positive predictive value (PPV), average recall and F1 score. A model threshold for discriminating “High” from “Low” risk cohorts was then defined by observing the inflection point of observed events across deciles of predicted risk in the development cohort. Cumulative incidence plots accounting for competing events and stratifying patients by predicted risk category were generated. Calibration plots were generated by plotting mean difference in predicted and observed event rates for each decile of risk across 100 bootstrap replicates at 2 years. The Aalen-Johansen method was used to account for competing events.
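For readers less familiar with the C-index used for model selection above, a minimal sketch of Harrell's concordance index for right-censored data follows. It is illustrative only: it ignores tied event times and does not adjust for competing events, and it is not claimed to reproduce the implementation used in this study. The example data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i has an observed event
    and a shorter follow-up time than subject j. The pair is concordant
    when the subject with the earlier event has the higher risk score.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical example: follow-up in years, event = 1, censored = 0
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.2, 0.1, 0.3]
c_index = harrell_c_index(times, events, risk_scores)
print(c_index)
```

Of the five comparable pairs in this example, four are ordered correctly by the risk score, so the sketch returns 0.8.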

Competing Risk Fine-Gray-Based Risk Model Development and Performance Evaluation

For development of the FGM based model variable candidacy was defined by a threshold p-value of 0.1 in univariable analysis. Backward variable selection was performed to select features, based on order of variable exclusion. Highly correlated features were excluded using a threshold of a Pearson’s coefficient greater than 0.7. The competing risk FGM model was applied to generate coefficients for each variable that could then be used to estimate 90 day, 1 and 2-year probability of HF hospitalization for individual patients, as previously described (23). The test dataset was used to assess model performance for each development and test fold, resulting in five candidate risk models. C-index and tAUC were then used to select an optimal model for validation in the holdout dataset. Model performance in the holdout dataset was assessed using C-index, tAUC, average PPV, average recall and F1 score. Calibration was assessed using the method described above.
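The step from fitted coefficients to an individual probability can be illustrated with the general form of a Fine-Gray prediction, in which the cumulative incidence at time t for a patient with linear predictor βx is 1 − (1 − F0(t))^exp(βx), where F0(t) is the baseline cumulative incidence. The sketch below is a hedged illustration of that formula only. The baseline value, coefficient values, and variable names are hypothetical and are not the fitted CIROC-HF-FGM parameters.

```python
import math

def fine_gray_cif(baseline_cif, coefficients, covariates):
    """Predicted cumulative incidence under a Fine-Gray model.

    baseline_cif: baseline cumulative incidence F0(t) at the horizon of
        interest, i.e., for a patient whose covariates all equal zero.
    coefficients: dict of subdistribution log-hazard ratios (beta).
    covariates: dict of patient values, coded on the same scale
        (and relative to the same reference) as the fitted model.
    """
    linear_predictor = sum(coefficients[name] * covariates[name]
                           for name in coefficients)
    return 1.0 - (1.0 - baseline_cif) ** math.exp(linear_predictor)

# Hypothetical values for illustration only
baseline_1yr = 0.05
coefs = {"loop_diuretic": 0.7, "lvef_per_10pct_above_ref": -0.3}

reference_patient = {"loop_diuretic": 0, "lvef_per_10pct_above_ref": 0.0}
higher_risk_patient = {"loop_diuretic": 1, "lvef_per_10pct_above_ref": -1.0}

risk_reference = fine_gray_cif(baseline_1yr, coefs, reference_patient)
risk_higher = fine_gray_cif(baseline_1yr, coefs, higher_risk_patient)
print(f"{risk_reference:.3f} vs {risk_higher:.3f}")
```

Repeating the calculation with the baseline cumulative incidence at 90 days, 1 year, and 2 years would give the three horizon-specific estimates described in the text.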

MAGGIC Score-Based Risk Model Performance Evaluation

Originally developed for mortality prediction in systolic HF populations (24), the MAGGIC risk score served as the best available surrogate model for the estimation of HF outcomes in our referral population. MAGGIC risk scores were applied to the holdout cohort to provide matched assessments of performance vs. novel risk models.

All statistical analysis and modeling was performed in R version 4.0.3 and Python version 3.8.8 (25).

Results

Population Characteristics

Our chronic systolic HF population consisted of 1,775 unique patients with a mean age of 59 ± 13 years and 24% being female. Baseline clinical and CMR characteristics are summarized in Table 1. The population was composed of 52% ischemic cardiomyopathy and 48% non-ischemic dilated cardiomyopathy patients.

Table 1. Baseline clinical and CMR characteristics of the study cohort.

During a median follow-up period of 2.79 years (IQR: 1.59–4.04) 333 patients (19%) experienced the primary outcome of HF hospitalization. Ninety-five patients (5%) died, 40 of these (2%) dying without prior HF hospitalization.
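Deaths occurring without prior HF hospitalization are the competing events that the Aalen-Johansen method named in the Methods is designed to handle. A minimal sketch of that estimator for a single cause is given below, assuming status codes of 0 (censored), 1 (HF hospitalization), and 2 (death). The coding scheme and the five-patient example are hypothetical and are shown only to make the calculation concrete.

```python
from itertools import groupby

def aalen_johansen_cif(times, statuses, cause=1):
    """Cumulative incidence of `cause` in the presence of competing events.

    statuses: 0 = censored, any other integer = event type.
    Returns a list of (time, cumulative_incidence) at each event time.
    """
    records = sorted(zip(times, statuses))
    at_risk = len(records)
    overall_survival = 1.0  # probability of being free of any event so far
    cif = 0.0
    curve = []
    for t, group in groupby(records, key=lambda r: r[0]):
        group_statuses = [status for _, status in group]
        d_cause = sum(1 for s in group_statuses if s == cause)
        d_any = sum(1 for s in group_statuses if s != 0)
        if d_any > 0:
            # Increment uses event-free survival just before time t
            cif += overall_survival * d_cause / at_risk
            overall_survival *= 1.0 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= len(group_statuses)
    return curve

# Hypothetical follow-up (years) for five patients
times = [1, 2, 3, 4, 5]
statuses = [1, 2, 0, 1, 0]
curve = aalen_johansen_cif(times, statuses, cause=1)
print(curve)
```

In this example the final cumulative incidence is 0.5, whereas treating the death as censoring in a one-minus-Kaplan-Meier calculation would give 0.6, which illustrates the overestimation that competing risk methods avoid.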

No significant differences were observed between development and validation cohorts (Supplementary Table 3). In the development cohort, 233 patients (19%) experienced the primary outcome over a median follow-up period of 2.8 (IQR: 1.59–4.04) years. In the validation cohort 100 patients (19%) experienced the primary outcome over a median follow-up period of 2.74 (1.58–4.04) years.

Machine Learning Risk Model Performance

The final RSF risk model contained 15 predictive variables, 9 of which were sourced from the CMR-reported phenotype. The variable selection process produced a model containing LVESVi, LVEDVi, LVEF, RVESVi, RVEDVi, and RVEF. Given that end systolic volumes are implicit in a model containing end diastolic volumes and ejection fraction, two predictive late gadolinium enhancement patterns (subendocardial and mid-wall striae) were added to the model and performance in the test datasets was compared. The model containing LGE features achieved higher C-index and tAUC values and this feature set was subsequently used to train the final RSF model. The mean permutation importance of each selected variable is shown in Figure 1. In the holdout cohort, the RSF model achieved a C-index of 0.77 and provided 90-day, 1 and 2-year AUC values of 0.87, 0.83, and 0.80, respectively (Figure 2). The RSF model delivered a mean PPV of 0.50 and an F1 score of 0.60, with good calibration across all deciles of risk (Figure 3). We defined patients with risk estimates above the seventh risk decile to be “high-risk,” these patients experiencing 66% of all observed outcomes in the holdout cohort. Cumulative incidence curves (Figure 4) for patients predicted to be “high-risk” vs. “low-risk” showed significantly higher occurrence of HF hospitalization. The respective event rates for high vs. low-risk cohorts were 19 vs. 0.6% (p < 0.0001) at 90 days; 28 vs. 4% (p < 0.0001) at 1-year; and 35 vs. 7% (p < 0.0001) at 2-years.
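The decile-based risk classification can be sketched as below. This is an illustration under an explicit assumption: “above the seventh risk decile” is interpreted as a predicted risk exceeding the 70th percentile of development-cohort predictions. The predicted risks and outcomes are synthetic, and the function names are hypothetical.

```python
import statistics

def high_risk_cutoff(development_risks, decile=7):
    """Predicted-risk value at the upper boundary of the given decile,
    estimated from development-cohort predictions."""
    cut_points = statistics.quantiles(development_risks, n=10)  # 9 values
    return cut_points[decile - 1]

def share_of_events_in_high_risk(risks, events, cutoff):
    """Fraction of all observed events occurring in patients whose
    predicted risk exceeds the cutoff."""
    total_events = sum(events)
    high_risk_events = sum(e for r, e in zip(risks, events) if r > cutoff)
    return high_risk_events / total_events

# Synthetic development-cohort predictions: 0.01, 0.02, ..., 1.00
development_risks = [i / 100 for i in range(1, 101)]
cutoff = high_risk_cutoff(development_risks)

# Synthetic holdout predictions and outcomes (1 = HF hospitalization)
holdout_risks = [0.10, 0.20, 0.50, 0.72, 0.80, 0.90, 0.95, 0.30]
holdout_events = [0, 0, 1, 1, 0, 1, 1, 0]
share = share_of_events_in_high_risk(holdout_risks, holdout_events, cutoff)
print(f"cutoff {cutoff:.3f}, share of events in high-risk group {share:.2f}")
```

Deriving the cutoff from the development cohort and applying it unchanged to the holdout cohort mirrors the separation between threshold definition and validation described in the Methods.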

Figure 1. Mean permutation importance values over 100 bootstrap samples for the features included in the final CIROC-HF-RSF model.

Figure 2. (A) Receiver operating characteristic curves for the CIROC-HF-RSF Model, CIROC-HF-FGM Risk Model, and modified MAGGIC risk score at 90 days, 1 and 2 years follow-up in the holdout cohort. (B) Summary of CIROC-HF-RSF model, CIROC-HF-FGM Risk Model, and modified MAGGIC risk score performance in the holdout cohort.

Figure 3. Calibration plots for (A) CIROC-HF-RSF risk model, and (B) CIROC-HF-FGM risk model for the prediction of HF hospitalization in the holdout cohort. Plots display difference between observed and expected event rates at each decile of risk. Confidence intervals are derived from 100 bootstrapped datasets.

Figure 4. Cumulative incidence curves describing time to HF hospitalization in the holdout dataset stratified by “High-risk” vs. “Low-risk” classification by the (A) CIROC-HF-RSF model, and (B) CIROC-HF-FGM Risk Model.

Competing Risk Fine-Gray-Based Model Performance

Similar to machine learning-based modeling, variables from all data domains were shown to provide value toward an optimal FGM-based model, with respective associations summarized in Supplementary Table 4. In holdout validation, CIROC-HF-FGM delivered a mean C-index of 0.77 with 90-day, 1 and 2-year tAUCs of 0.89, 0.84, and 0.80, respectively. The mean PPV and F1 score were 0.49 and 0.59, respectively (Figure 2). Similar to the ML-based model, patients with a predicted risk above the seventh risk decile were defined as “high-risk.” High-risk patients experienced 62% of all observed outcomes in the holdout cohort, with cumulative incidence curves shown in Figure 4. The respective event rates for high vs. low-risk groups were 18 vs. 1% (p < 0.0001) at 90-days; 28 vs. 3% (p < 0.0001) at 1-year; and 35 vs. 7% (p < 0.0001) at 2-years.

Comparison of CIROC-HF Risk Models to the MAGGIC Risk Score

Both CIROC-HF risk models were compared to the MAGGIC Risk Score (24) in the validation cohort. The MAGGIC Risk Score delivered a mean C-index of 0.72 with respective 90-day, 1-year, and 2-year tAUCs of 0.81, 0.78, and 0.74. Comparisons of tAUC performance between the CIROC-HF risk models and MAGGIC Risk Score are shown in Figure 2, demonstrating superior performance for both novel CIROC-HF models.

Discussion

In this study we demonstrated the capacity of routinely reported CMR disease markers to be contextualized by patient health information at the time of diagnostic testing for delivery of patient-specific estimations of time to HF hospitalization. Our modeling identified unique and independent value from each of the imaging phenotype, patient-reported health, and EHR data domains; their collective availability permitting improved prediction performance vs. the MAGGIC Risk Score. Using our ML-based model, patients classified to the high-risk category experienced a 34-fold higher occurrence of HF hospitalization at 90-days, 8-fold at 1-year, and 5-fold at 2-years. To our knowledge, this represents the first validated model for the prediction of HF hospitalization in HF patients undergoing CMR imaging.

HF hospitalization risk models have, to date, focussed on the prediction of re-admission following index hospitalizations for acute decompensation (5–7, 26–28). These models have consistently focussed on data sourced from in-patient electronic health records to identify those at higher likelihood of re-admission, typically at 90-days. All have struggled to achieve the performance of models trained to predict mortality (29, 30), suggesting elevated need to consider patient-specific disease phenotypes. The latter concept was explored in a study of 3,189 HF in-patients where multi-domain phenotypic data, gathered from routine echocardiography reporting, enabled prediction of all-cause early re-hospitalization with higher predictive accuracy than prior administrative data supported models, achieving an AUC of 0.76 at 90-days (31). While demonstrating value from multi-domain imaging phenotypes, this study was limited to high-risk inpatient populations, preventing generalizability to those patients routinely encountered by diagnostic imaging services.

Supported by a prior study showing incremental value from ML-based modeling for the prediction of HF re-admission using EHR sourced data (32), we postulated similar performance gains in our referral population. In contrast, we observed very similar performance for our modeled clinical outcome when compared to a FGM-based model provided matched multi-domain data resources. The exception was improved stability in time-dependent AUC seen using the ML-based approach at 2-years (Figure 2). However, a distinct advantage of ML-based modeling is its capacity to consider non-linear interactions between features without the limitations introduced by the proportional hazards assumption. Through this, we were afforded the opportunity to objectively evaluate the respective contributions of imaging phenotype, patient-reported health, and EHR-based markers to the incident occurrence of HF hospitalization. As shown in Figure 1, we identified that current use of loop diuretics was the strongest contributor to model performance, followed by left atrial volume, LVEF, age and LV mass index. Other relevant features included volumetric markers of right ventricular health and patterns of myocardial fibrosis, demonstrating the unique contributory value that CMR-based phenotyping can provide in this patient population. Collectively, these selected features appropriately represented phenotypic markers recognized to have strong predictive value in HF populations from prior studies (14, 15, 17, 20, 21, 33–46).

The capacity of contextual patient health information to contribute value for HF hospitalization prediction has been previously reported (47–49). To our knowledge, our current study is the first to describe the routine clinical deployment of patient-reported health questionnaires at time of diagnostic imaging for the delivery of this unique data domain. Of the top fifteen variables selected by our ML-based model, three were selected from the EQ-5D health related quality of life instrument (18). This demonstrates strong value from the synchronous capture of patient-reported measures of health at time of diagnostic testing, features recognized to be critical for the optimal prediction of HF-related events (29).

Limitations

This study was executed at a single tertiary care healthcare institution and therefore requires external site validation prior to deployment beyond the local environment. This is of particular importance for unique clinical environments that may be exposed to a local referral bias in diagnostic testing or have altered socio-demographic profiles. While systematically explored, we did not report results of other classification-based ML techniques for event prediction at specific time-points given lower performance metrics. Because natriuretic peptide testing was not routinely performed around the time of CMR imaging, we were unable to incorporate BNP or NT-proBNP values into our predictive models. In addition, given the high engagement of private out-patient echocardiography laboratories in clinical practice, direct comparison to models trained from echocardiographic variables in the same patient population was not feasible. Advanced CMR based markers of myocardial disease of recognized value, such as tissue mapping (50), were not undertaken in this large cohort study given the desire for generalizability to routine practice and a high degree of vendor and hardware dependence for such measures.

Conclusion

In this study we developed and validated a machine learning based model for the prediction of time to HF hospitalization in systolic HF patients undergoing CMR. Our study was focussed on demonstrating the respective value provided by imaging phenotypes, patient-reported measures of health, and EHR-sourced data for the delivery of personalized HF predictions. Overall, our study supports strong value provided by the routine capture of multi-domain health data resources at time of diagnostic imaging, this approach facilitating the implementation of personalized outcome prediction.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, upon appropriate request.

Ethics Statement

The studies involving human participants were reviewed and approved by the University of Calgary Conjoint Health Research Ethics Board. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

AC performed all data analysis, modeling, and manuscript authorship. JW conceived, designed, edited, and finalized manuscript content. SD developed data structures and data matching for the CIROC Registry. DL, YM, JF, SR, RS, EP, and AS assisted in data collection and adjudication. NF, CL, AH, HQ, and CE performed study design and manuscript review. JL assisted in data modeling. All authors contributed to the article and approved the submitted version.

Funding

This study was funded, in part, by the Calgary Health Foundation.

Conflict of Interest

JW, AH, and JF were shareholders in Cohesic Inc. JW has received research funding from Siemens Healthineers, Circle Cardiovascular Inc., Pfizer Inc. AH received funding from Amgen. JW was Chief Medical Officer of Cohesic Inc.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcvm.2022.890904/full#supplementary-material

Abbreviations

CIROC, Cardiovascular Imaging Registry of Calgary; CMR, Cardiovascular Magnetic Resonance Imaging; CoxPH, Cox Proportional Hazard; EHR, Electronic Health Record; FGM, Fine-Gray (Model); HF, Heart Failure; HR-QOL, Health Related Quality of Life; LGE, Late Gadolinium Enhancement; LVEF, Left Ventricular Ejection Fraction; ML, Machine Learning; PPV, Positive Predictive Value; RSF, Random Survival Forest; tAUC, Time Dependent Area Under the Receiver Operating Characteristic Curve.

References

1. James SL, Abate D, Abate KH, Abay SM, Abbafati C, Abbasi N, et al. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories: a systematic analysis for the global burden of disease study 2017. Lancet. (2018) 392:1789–858. doi: 10.1016/S0140-6736(18)32279-7

2. Urbich M, Globe G, Pantiri K, Heisen M, Bennison C, Wirtz HS, et al. A systematic review of medical costs associated with heart failure in the USA (2014-2020). Pharmacoeconomics. (2020) 38:1219–36. doi: 10.1007/s40273-020-00952-0

3. Ruigomez A, Michel A, Martin-Perez M, Garcia Rodriguez LA. Heart failure hospitalization: an important prognostic factor for heart failure re-admission and mortality. Int J Cardiol. (2016) 220:855–61. doi: 10.1016/j.ijcard.2016.06.080

4. Kilgore M, Patel HK, Kielhorn A, Maya JF, Sharma P. Economic burden of hospitalizations of medicare beneficiaries with heart failure. Risk Manag Healthc Policy. (2017) 10:63–70. doi: 10.2147/RMHP.S130341

5. Formiga F, Masip J, Chivite D, Corbella X. Applicability of the heart failure readmission risk score: a first European study. Int J Cardiol. (2017) 236:304–9. doi: 10.1016/j.ijcard.2017.02.024

6. Lenzi J, Avaldi VM, Hernandez-Boussard T, Descovich C, Castaldini I, Urbinati S, et al. Risk-adjustment models for heart failure patients’ 30-day mortality and readmission rates: the incremental value of clinical data abstracted from medical charts beyond hospital discharge record. BMC Health Serv Res. (2016) 16:473. doi: 10.1186/s12913-016-1731-9

PubMed Abstract | CrossRef Full Text | Google Scholar

7. Sudhakar S, Zhang W, Kuo YF, Alghrouz M, Barbajelata A, Sharma G. Validation of the readmission risk score in heart failure patients at a tertiary hospital. J Card Fail. (2015) 21:885–91. doi: 10.1016/j.cardfail.2015.07.010

PubMed Abstract | CrossRef Full Text | Google Scholar

8. Stirrat J, White JA. The prognostic role of late gadolinium enhancement magnetic resonance imaging in patients with cardiomyopathy. Can J Cardiol. (2013) 29:329–36. doi: 10.1016/j.cjca.2012.11.033

PubMed Abstract | CrossRef Full Text | Google Scholar

9. Mordi I, Bezerra H, Carrick D, Tzemos N. The combined incremental prognostic value of LVEF, late gadolinium enhancement, and global circumferential strain assessed by CMR. JACC Cardiovasc Imaging. (2015) 8:540–9. doi: 10.1016/j.jcmg.2015.02.005

PubMed Abstract | CrossRef Full Text | Google Scholar

10. McCrohon JA, Moon JC, Prasad SK, McKenna WJ, Lorenz CH, Coats AJ, et al. Differentiation of heart failure related to dilated cardiomyopathy and coronary artery disease using gadolinium-enhanced cardiovascular magnetic resonance. Circulation. (2003) 108:54–9. doi: 10.1161/01.CIR.0000078641.19365.4C

PubMed Abstract | CrossRef Full Text | Google Scholar

11. Gajjar K, Kashyap K, Badlani J, Williams RB, Biederman RWW. A review of the pivotal role of cardiac MRI in mitral valve regurgitation. Echocardiography. (2021) 38:128–41. doi: 10.1111/echo.14941

PubMed Abstract | CrossRef Full Text | Google Scholar

12. Grothues F, Smith GC, Moon JC, Bellenger NG, Collins P, Klein HU, et al. Comparison of interstudy reproducibility of cardiovascular magnetic resonance with two-dimensional echocardiography in normal subjects and in patients with heart failure or left ventricular hypertrophy. Am J Cardiol. (2002) 90:29–34. doi: 10.1016/s0002-9149(02)02381-0

PubMed Abstract | CrossRef Full Text | Google Scholar

13. Grothues F, Moon JC, Bellenger NG, Smith GS, Klein HU, Pennell DJ. Interstudy reproducibility of right ventricular volumes, function, and mass with cardiovascular magnetic resonance. Am Heart J. (2004) 147:218–23. doi: 10.1016/j.ahj.2003.10.005

PubMed Abstract | CrossRef Full Text | Google Scholar

14. Bluemke DA, Kronmal RA, Lima JA, Liu K, Olson J, Burke GL, et al. The relationship of left ventricular mass and geometry to incident cardiovascular events: the MESA (Multi-ethnic study of atherosclerosis) study. J Am Coll Cardiol. (2008) 52:2148–55. doi: 10.1016/j.jacc.2008.09.014

PubMed Abstract | CrossRef Full Text | Google Scholar

15. Modin D, Sengelov M, Jorgensen PG, Olsen FJ, Bruun NE, Fritz-Hansen T, et al. Prognostic value of left atrial functional measures in heart failure with reduced ejection fraction. J Card Fail. (2019) 25:87–96. doi: 10.1016/j.cardfail.2018.11.016

PubMed Abstract | CrossRef Full Text | Google Scholar

16. Wong TC, Piehler KM, Zareba KM, Lin K, Phrampus A, Patel A, et al. Myocardial damage detected by late gadolinium enhancement cardiovascular magnetic resonance is associated with subsequent hospitalization for heart failure. J Am Heart Assoc. (2013) 2:e000416. doi: 10.1161/JAHA.113.000416

PubMed Abstract | CrossRef Full Text | Google Scholar

17. Meyer P, Filippatos GS, Ahmed MI, Iskandrian AE, Bittner V, Perry GJ, et al. Effects of right ventricular ejection fraction on outcomes in chronic systolic heart failure. Circulation. (2010) 121:252–8. doi: 10.1161/CIRCULATIONAHA.109.887570

PubMed Abstract | CrossRef Full Text | Google Scholar

18. EuroQol Group. EuroQol–a new facility for the measurement of health-related quality of life. Health Policy. (1990) 16:199–208.

Google Scholar

19. Hundley WG, Bluemke D, Bogaert JG, Friedrich MG, Higgins CB, Lawson MA, et al. Society for cardiovascular magnetic resonance guidelines for reporting cardiovascular magnetic resonance examinations. J Cardiovasc Magn Reson. (2009) 11:5.

Google Scholar

20. Almehmadi F, Joncas SX, Nevis I, Zahrani M, Bokhari M, Stirrat J, et al. Prevalence of myocardial fibrosis patterns in patients with systolic dysfunction: prognostic significance for the prediction of sudden cardiac arrest or appropriate implantable cardiac defibrillator therapy. Circ Cardiovasc Imaging. (2014) 7:593–600. doi: 10.1161/CIRCIMAGING.113.001768

PubMed Abstract | CrossRef Full Text | Google Scholar

21. Halliday BP, Baksi AJ, Gulati A, Ali A, Newsome S, Izgi C, et al. Outcome in dilated cardiomyopathy related to the extent, location, and pattern of late gadolinium enhancement. JACC Cardiovasc Imaging. (2019) 12:1645–55. doi: 10.1016/j.jcmg.2018.07.015

PubMed Abstract | CrossRef Full Text | Google Scholar

22. White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. (2011) 30:377–99. doi: 10.1002/sim.4067

PubMed Abstract | CrossRef Full Text | Google Scholar

23. Harrell F. Regression Modeling Strategies. 2nd ed. Berlin: Springer International Publishing (2015).

Google Scholar

24. Pocock SJ, Ariti CA, McMurray JJ, Maggioni A, Kober L, Squire IB, et al. Predicting survival in heart failure: a risk score based on 39 372 patients from 30 studies. Eur Heart J. (2013) 34:1404–13. doi: 10.1093/eurheartj/ehs337

PubMed Abstract | CrossRef Full Text | Google Scholar

25. R Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing (2020).

Google Scholar

26. Amarasingham R, Moore BJ, Tabak YP, Drazner MH, Clark CA, Zhang S, et al. An automated model to identify heart failure patients at risk for 30-day readmission or death using electronic medical record data. Med Care. (2010) 48:981–8. doi: 10.1097/MLR.0b013e3181ef60d9

PubMed Abstract | CrossRef Full Text | Google Scholar

27. Hammill BG, Curtis LH, Fonarow GC, Heidenreich PA, Yancy CW, Peterson ED, et al. Incremental value of clinical data beyond claims data in predicting 30-day outcomes after heart failure hospitalization. Circ Cardiovasc Qual Outcomes. (2011) 4:60–7. doi: 10.1161/CIRCOUTCOMES.110.954693

PubMed Abstract | CrossRef Full Text | Google Scholar

28. Tan BY, Gu JY, Wei HY, Chen L, Yan SL, Deng N. Electronic medical record-based model to predict the risk of 90-day readmission for patients with heart failure. BMC Med Inform Decis Mak. (2019) 19:193. doi: 10.1186/s12911-019-0915-8

PubMed Abstract | CrossRef Full Text | Google Scholar

29. Ouwerkerk W, Voors AA, Zwinderman AH. Factors influencing the predictive power of models for predicting mortality and/or heart failure hospitalization in patients with heart failure. JACC Heart Fail. (2014) 2:429–36. doi: 10.1016/j.jchf.2014.04.006

PubMed Abstract | CrossRef Full Text | Google Scholar

30. Rahimi K, Bennett D, Conrad N, Williams TM, Basu J, Dwight J, et al. Risk prediction in patients with heart failure: a systematic review and analysis. JACC Heart Fail. (2014) 2:440–6. doi: 10.1016/j.jchf.2014.04.008

PubMed Abstract | CrossRef Full Text | Google Scholar

31. Sarijaloo F, Park J, Zhong X, Wokhlu A. Predicting 90 day acute heart failure readmission and death using machine learning-supported decision analysis. Clin Cardiol. (2021) 44:230–7. doi: 10.1002/clc.23532

PubMed Abstract | CrossRef Full Text | Google Scholar

32. Shameer K, Johnson KW, Yahi A, Miotto R, Li LI, Ricks D, et al. Predictive modeling of hospital readmission rates using electronic medical record-wide machine learning: a case-study using mount sinai heart failure cohort. Pac Symp Biocomput. (2017) 22:276–87. doi: 10.1142/9789813207813_0027

PubMed Abstract | CrossRef Full Text | Google Scholar

33. Krittayaphong R, Boonyasirinant T, Saiviroonporn P, Thanapiboonpol P, Nakyen S, Ruksakul K, et al. Prognostic significance of left ventricular mass by magnetic resonance imaging study in patients with known or suspected coronary artery disease. J Hypertens. (2009) 27:2249–56. doi: 10.1097/HJH.0b013e3283309ac4

PubMed Abstract | CrossRef Full Text | Google Scholar

34. Dini FL, Cortigiani L, Baldini U, Boni A, Nuti R, Barsotti L, et al. Prognostic value of left atrial enlargement in patients with idiopathic dilated cardiomyopathy and ischemic cardiomyopathy. Am J Cardiol. (2002) 89:518–23. doi: 10.1016/s0002-9149(01)02290-1

PubMed Abstract | CrossRef Full Text | Google Scholar

35. Wong M, Staszewsky L, Latini R, Barlera S, Glazer R, Aknay N, et al. Severity of left ventricular remodeling defines outcomes and response to therapy in heart failure: valsartan heart failure trial (Val-HeFT) echocardiographic data. J Am Coll Cardiol. (2004) 43:2022–7. doi: 10.1016/j.jacc.2003.12.053

PubMed Abstract | CrossRef Full Text | Google Scholar

36. Gulati A, Jabbour A, Ismail TF, Guha K, Khwaja J, Raza S, et al. Association of fibrosis with mortality and sudden cardiac death in patients with nonischemic dilated cardiomyopathy. JAMA. (2013) 309:896–908. doi: 10.1001/jama.2013.1363

PubMed Abstract | CrossRef Full Text | Google Scholar

37. Gulati A, Ismail TF, Jabbour A, Alpendurada F, Guha K, Ismail NA, et al. The prevalence and prognostic significance of right ventricular systolic dysfunction in nonischemic dilated cardiomyopathy. Circulation. (2013) 128:1623–33. doi: 10.1161/CIRCULATIONAHA.113.002518

PubMed Abstract | CrossRef Full Text | Google Scholar

38. Doesch C, Dierks DM, Haghi D, Schimpf R, Kuschyk J, Suselbeck T, et al. Right ventricular dysfunction, late gadolinium enhancement, and female gender predict poor outcome in patients with dilated cardiomyopathy. Int J Cardiol. (2014) 177:429–35. doi: 10.1016/j.ijcard.2014.09.004

PubMed Abstract | CrossRef Full Text | Google Scholar

39. Purmah Y, Lei LY, Dykstra S, Mikami Y, Cornhill A, Satriano A, et al. Right ventricular ejection fraction for the prediction of major adverse cardiovascular and heart failure-related events: a cardiac MRI based study of 7131 patients with known or suspected cardiovascular disease. Circ Cardiovasc Imaging. (2021) 14:e011337. doi: 10.1161/CIRCIMAGING.120.011337

PubMed Abstract | CrossRef Full Text | Google Scholar

40. Mikami Y, Jolly U, Heydari B, Peng M, Almehmadi F, Zahrani M, et al. Right ventricular ejection fraction is incremental to left ventricular ejection fraction for the prediction of future arrhythmic events in patients with systolic dysfunction. Circ Arrhythm Electrophysiol. (2017) 10:e004067. doi: 10.1161/CIRCEP.116.004067

PubMed Abstract | CrossRef Full Text | Google Scholar

41. Abdi-Ali A, Miller RJH, Southern D, Zhang M, Mikami Y, Knudtson M, et al. LV mass independently predicts mortality and need for future revascularization in patients undergoing diagnostic coronary angiography. JACC Cardiovasc Imaging. (2018) 11:423–33. doi: 10.1016/j.jcmg.2017.04.012

PubMed Abstract | CrossRef Full Text | Google Scholar

42. Halliday BP, Gulati A, Ali A, Guha K, Newsome S, Arzanauskaite M, et al. Association between midwall late gadolinium enhancement and sudden cardiac death in patients with dilated cardiomyopathy and mild and moderate left ventricular systolic dysfunction. Circulation. (2017) 135:2106–15.

Google Scholar

43. Di Marco A, Anguera I, Schmitt M, Klem I, Neilan TG, White JA, et al. Late gadolinium enhancement and the risk for ventricular arrhythmias or sudden death in dilated cardiomyopathy: systematic review and meta-analysis. JACC Heart Fail. (2017) 5:28–38. doi: 10.1016/j.jchf.2016.09.017

PubMed Abstract | CrossRef Full Text | Google Scholar

44. Alba AC, Gaztanaga J, Foroutan F, Thavendiranathan P, Merlo M, Alonso-Rodriguez D, et al. Prognostic value of late gadolinium enhancement for the prediction of cardiovascular outcomes in dilated cardiomyopathy: an international, multi-institutional study of the MINICOR group. Circ Cardiovasc Imaging. (2020) 13:e010105. doi: 10.1161/CIRCIMAGING.119.010105

PubMed Abstract | CrossRef Full Text | Google Scholar

45. Mikami Y, Cornhill A, Heydari B, Joncas SX, Almehmadi F, Zahrani M, et al. Objective criteria for septal fibrosis in non-ischemic dilated cardiomyopathy: validation for the prediction of future cardiovascular events. J Cardiovasc Magn Reson. (2016) 18:82. doi: 10.1186/s12968-016-0300-z

PubMed Abstract | CrossRef Full Text | Google Scholar

46. Mikami Y, Cornhill A, Dykstra S, Satriano A, Hansen R, Flewitt J, et al. Right ventricular insertion site fibrosis in a dilated cardiomyopathy referral population: phenotypic associations and value for the prediction of heart failure admission or death. J Cardiovasc Magn Reson. (2021) 23:79. doi: 10.1186/s12968-021-00761-0

PubMed Abstract | CrossRef Full Text | Google Scholar

47. Voors AA, Ouwerkerk W, Zannad F, van Veldhuisen DJ, Samani NJ, Ponikowski P, et al. Development and validation of multivariable models to predict mortality and hospitalization in patients with heart failure. Eur J Heart Fail. (2017) 19:627–34. doi: 10.1002/ejhf.785

PubMed Abstract | CrossRef Full Text | Google Scholar

48. Bello NA, Claggett B, Desai AS, McMurray JJ, Granger CB, Yusuf S, et al. Influence of previous heart failure hospitalization on cardiovascular events in patients with reduced and preserved ejection fraction. Circ Heart Fail. (2014) 7:590–5. doi: 10.1161/CIRCHEARTFAILURE.113.001281

PubMed Abstract | CrossRef Full Text | Google Scholar

49. Rodríguez-Artalejo F, Guallar-Castillón P, Pascual CR, Otero CM, Montes AO, García AN, et al. Health-related quality of life as a predictor of hospital readmission and death among patients with heart failure. Arch Intern Med. (2005) 165:1274–9. doi: 10.1001/archinte.165.11.1274

PubMed Abstract | CrossRef Full Text | Google Scholar

50. Puntmann VO, Carr-White G, Jabbour A, Yu CY, Gebker R, Kelle S, et al. T1-mapping and outcome in nonischemic cardiomyopathy: all-cause mortality and heart failure. JACC Cardiovasc Imaging. (2016) 9:40–50.

PubMed Abstract | Google Scholar

Keywords: cardiovascular magnetic resonance imaging, machine learning, heart failure hospitalization, prediction, systolic heart failure (HF)

Citation: Cornhill AK, Dykstra S, Satriano A, Labib D, Mikami Y, Flewitt J, Prosio E, Rivest S, Sandonato R, Howarth AG, Lydell C, Eastwood CA, Quan H, Fine N, Lee J and White JA (2022) Machine Learning Patient-Specific Prediction of Heart Failure Hospitalization Using Cardiac MRI-Based Phenotype and Electronic Health Information. Front. Cardiovasc. Med. 9:890904. doi: 10.3389/fcvm.2022.890904

Received: 07 March 2022; Accepted: 10 May 2022; Published: 16 June 2022.

Edited by:

Ali Yilmaz, University Hospital Münster, Germany

Reviewed by:

Giulia Barbati, University of Trieste, Italy
Minglong Chen, The First Affiliated Hospital of Nanjing Medical University, China

Copyright © 2022 Cornhill, Dykstra, Satriano, Labib, Mikami, Flewitt, Prosio, Rivest, Sandonato, Howarth, Lydell, Eastwood, Quan, Fine, Lee and White. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: James A. White, jawhit@ucalgary.ca

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
