Decades of research have established the association between adverse childhood experiences (ACEs) and adult onset of chronic diseases, influenced by health behaviors and social determinants of health (SDoH). Machine Learning (ML) is a powerful tool for computing these complex associations and accurately predicting chronic health conditions.
Using the 2021 Behavioral Risk Factor Surveillance Survey, we developed several ML models—random forest, logistic regression, support vector machine, Naïve Bayes, and K-Nearest Neighbor—over data from a sample of 52,268 respondents. We predicted 13 chronic health conditions based on ACE history, health behaviors, SDoH, and demographics. We further assessed each variable’s importance in outcome prediction for model interpretability. We evaluated model performance via the Area Under the Curve (AUC) score.
With the inclusion of data on ACEs, our models outperformed or demonstrated similar accuracies to existing models in the literature that used SDoH to predict health outcomes. The most accurate models predicted diabetes, pulmonary diseases, and heart attacks. The random forest model was the most effective for diabetes (AUC = 0.784) and heart attacks (AUC = 0.732), and the logistic regression model most accurately predicted pulmonary diseases (AUC = 0.753). The strongest predictors across models were age, ever monitored blood sugar or blood pressure, count of the monitoring behaviors for blood sugar or blood pressure, BMI, time of last cholesterol check, employment status, income, count of vaccines received, health insurance status, and total ACEs. A cumulative measure of ACEs was a stronger predictor than individual ACEs.
Our models can provide an interpretable, trauma-informed framework to identify and intervene with at-risk individuals early to prevent chronic health conditions and address their inequalities in the U.S.