AUTHOR=Ntalianis Evangelos , Cauwenberghs Nicholas , Sabovčik František , Santana Everton , Haddad Francois , Claus Piet , Kuznetsova Tatiana TITLE=Feature-based clustering of the left ventricular strain curve for cardiovascular risk stratification in the general population JOURNAL=Frontiers in Cardiovascular Medicine VOLUME=10 YEAR=2023 URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2023.1263301 DOI=10.3389/fcvm.2023.1263301 ISSN=2297-055X ABSTRACT=Objective

Identifying individuals with subclinical cardiovascular (CV) disease could improve monitoring and risk stratification. While peak left ventricular (LV) systolic strain has emerged as a strong prognostic factor, few studies have analyzed the full temporal profile of the deformation curves over the complete cardiac cycle. Therefore, in this longitudinal study, we applied an unsupervised machine learning approach based on time-series-derived features from the LV strain curve to identify distinct strain phenogroups that might be related to the risk of adverse CV events in the general population.

Method

We prospectively studied 1,185 community-dwelling individuals (mean age, 53.2 years; 51.3% women), in whom we acquired clinical and echocardiographic data, including LV strain traces, at baseline and collected adverse events on average 9.1 years later. A Gaussian Mixture Model (GMM) was applied to features derived from the LV strain curves, including the slopes during systole and during early and late diastole, the peak strain, and the duration and height of diastasis. We evaluated the performance of the model using the clinical characteristics of the participants and the incidence of adverse events in the training dataset. To ascertain the validity of the trained model, we used an additional community-based cohort (n = 545) as an external validation cohort.
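The curve-derived features listed above can be illustrated with a minimal sketch. The abstract does not give the exact feature definitions, so everything below is an assumption made for illustration: the function name `strain_features`, the flatness threshold `flat_tol`, the choice of the steepest post-peak segment as the early diastolic slope, and the synthetic piecewise-linear strain curve.

```python
import math

def strain_features(t, s, flat_tol=5.0):
    """Extract illustrative features from one cardiac cycle of LV strain.

    t: sample times in seconds; s: strain in percent (negative = shortening).
    flat_tol: hypothetical threshold (%/s) below which a segment counts as flat.
    All definitions here are assumptions, not the study's own.
    """
    if len(t) != len(s) or len(s) < 3:
        raise ValueError("t and s must have equal length of at least 3")
    # Peak systolic strain is the most negative sample.
    i_peak = min(range(len(s)), key=lambda i: s[i])
    if i_peak in (0, len(s) - 1):
        raise ValueError("peak strain must lie inside the cycle")
    # Systolic slope: mean rate of change from cycle start to the peak.
    systolic_slope = (s[i_peak] - s[0]) / (t[i_peak] - t[0])
    # Segment-wise slopes after the peak (diastole).
    slopes = [(s[i + 1] - s[i]) / (t[i + 1] - t[i])
              for i in range(i_peak, len(s) - 1)]
    early_diastolic_slope = max(slopes)  # steepest relengthening segment
    # Diastasis: longest run of consecutive near-flat post-peak segments.
    best_start, best_len, run_start = None, 0, None
    for k, slope in enumerate(slopes):
        if abs(slope) < flat_tol:
            if run_start is None:
                run_start = k
            if k - run_start + 1 > best_len:
                best_start, best_len = run_start, k - run_start + 1
        else:
            run_start = None
    duration, height, late_slope = 0.0, math.nan, math.nan
    if best_len > 0:
        a = i_peak + best_start      # first sample of the plateau
        b = a + best_len             # last sample of the plateau
        duration = t[b] - t[a]
        height = sum(s[a:b + 1]) / (b - a + 1)
        if b < len(s) - 1:
            # Late diastolic slope: mean rate from plateau end to cycle end.
            late_slope = (s[-1] - s[b]) / (t[-1] - t[b])
    return {
        "peak_strain": s[i_peak],
        "systolic_slope": systolic_slope,
        "early_diastolic_slope": early_diastolic_slope,
        "diastasis_duration": duration,
        "diastasis_height": height,
        "late_diastolic_slope": late_slope,
    }

# Synthetic curve (made-up values, not study data).
t = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
s = [0.0, -7.0, -14.0, -20.0, -12.0, -6.0, -6.0, -6.0, -6.0, -3.0, 0.0]
features = strain_features(t, s)
```

Each participant's curve would yield one such feature vector, and the set of vectors would then be the input to the clustering model.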

Results

The most appropriate number of clusters to separate the LV strain curves was four. Clusters 1 and 2 differed in their age and heart rate distributions, but had a similarly low prevalence of CV risk factors. Cluster 4 had the worst combination of CV risk factors and a higher prevalence of LV hypertrophy and diastolic dysfunction than the other clusters. In cluster 3, the reported values were in between those of strain clusters 2 and 4. After adjusting for traditional covariables, we observed that clusters 3 and 4 had a significantly higher risk for CV (28% and 20%, P ≤ 0.038) and cardiac (57% and 43%, P ≤ 0.024) adverse events. Using SHAP values, we observed that the features that incorporate temporal information, such as the slopes during systole and early diastole, had a higher impact on the model's decision than peak LV systolic strain.
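How a number of mixture components might be chosen can be sketched as follows. The abstract does not state which criterion led to four clusters; the Bayesian Information Criterion (BIC) used here is an assumption, as are the function names `fit_gmm_1d` and `bic` and the made-up peak strain values. For brevity the sketch fits a one-dimensional mixture with a hand-written EM loop, whereas the study clustered several features jointly.

```python
import math

def _log_terms(v, weights, means, variances):
    """Log of weight * Gaussian density for each component at value v."""
    return [math.log(w) - 0.5 * math.log(2 * math.pi * var)
            - (v - m) ** 2 / (2 * var)
            for w, m, var in zip(weights, means, variances)]

def _logsumexp(logs):
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))

def fit_gmm_1d(x, k, n_iter=200, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    xs = sorted(x)
    n = len(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    mu = sum(xs) / n
    var0 = max(sum((v - mu) ** 2 for v in xs) / n, var_floor)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        resp = []
        for v in xs:
            logs = _log_terms(v, weights, means, variances)
            lse = _logsumexp(logs)
            resp.append([math.exp(l - lse) for l in logs])
        # M-step: update weights, means, and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, xs)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, xs)) / nj
            variances[j] = max(var, var_floor)
    log_lik = sum(_logsumexp(_log_terms(v, weights, means, variances))
                  for v in xs)
    return weights, means, variances, log_lik

def bic(log_lik, k, n):
    """BIC for a 1-D mixture: k means, k variances, k - 1 free weights."""
    return (3 * k - 1) * math.log(n) - 2 * log_lik

# Made-up peak strain values (%), forming two well-separated groups.
peak_strain = [-22.0, -21.8, -21.5, -21.2, -21.0, -20.9, -20.6, -20.4, -20.1, -19.8,
               -15.2, -14.9, -14.6, -14.4, -14.1, -13.9, -13.7, -13.4, -13.1, -12.8]

bic_by_k = {}
for k in range(1, 4):
    *_, log_lik = fit_gmm_1d(peak_strain, k)
    bic_by_k[k] = bic(log_lik, k, len(peak_strain))

# Under this criterion, the candidate with the lowest BIC would be preferred.
best_k = min(bic_by_k, key=bic_by_k.get)
```

BIC trades goodness of fit against the number of free parameters, which is why it is a common way to choose the component count for a mixture model; other criteria, such as the Akaike Information Criterion or cluster stability measures, could be substituted without changing the rest of the sketch.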

Conclusion

Employing a GMM on features derived from the raw LV strain curves, we extracted clinically significant phenogroups, which could provide incremental prognostic information beyond peak LV strain.