
SYSTEMATIC REVIEW article

Front. Public Health

Sec. Aging and Public Health

Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1509458

This article is part of the Research Topic New Environmental Pollutants, Aging, and Age-Related Diseases

The Relationship Between Epigenetic Biomarkers and the Risk of Diabetes and Cancer: A Machine Learning Modeling Approach

Provisionally accepted
  • 1 Zhejiang Chinese Medical University, Hangzhou, China
  • 2 Zhejiang University, Hangzhou, Zhejiang Province, China
  • 3 Shulan Hangzhou Hospital, Hangzhou, Zhejiang Province, China

The final, formatted version of the article will be published soon.

    Epigenetic biomarkers are molecular indicators of epigenetic changes, and some studies have suggested that these biomarkers have predictive power for disease risk. This study aims to analyze the relationship between 30 epigenetic biomarkers and the risk of diabetes and cancer using machine learning modeling. Methods: The data for this study were sourced from the NHANES database, which includes DNA methylation arrays and epigenetic biomarker datasets. Nine machine learning algorithms were used to build models: AdaBoost, GBM, KNN, LightGBM, MLP, RF, SVM, XGBoost, and logistic regression. Model stability was evaluated using metrics such as accuracy, Matthews correlation coefficient (MCC), and sensitivity. The performance and decision-making ability of the models were displayed using ROC curves and decision curve analysis (DCA) curves, while SHAP values were used to visualize the importance of each epigenetic biomarker. Results: Epigenetic age acceleration was strongly associated with cancer risk but had a weaker relationship with diabetes. In the diabetes model, the top three contributing features were logA1Mort, family income-to-poverty ratio, and marital status. In the cancer model, the top three contributing features were gender, non-Hispanic White ethnicity, and PACKYRSMort. Conclusion: Our study identified the relationship between epigenetic biomarkers and the risk of diabetes and cancer, and used machine learning techniques to analyze the contributions of various epigenetic biomarkers to disease risk.
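
The abstract names accuracy, MCC, and sensitivity as evaluation metrics but does not say how they were computed. The sketch below shows the standard definitions for a binary classifier. It is an illustration only, not the authors' code: the function names and the toy labels are assumptions made here, and no study data are used.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def sensitivity(tp, fn):
    """True positive rate: fraction of actual positives that are detected."""
    positives = tp + fn
    return tp / positives if positives else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels (hypothetical, not from NHANES): 1 = disease, 0 = no disease.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("sensitivity:", sensitivity(tp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated than accuracy when one class is much rarer than the other, which is common for disease outcomes in survey data.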

    Keywords: Epigenetic biomarkers, Epigenetic clocks, Epigenetic age acceleration, diabetes, Cancer, machine learning

    Received: 11 Oct 2024; Accepted: 24 Feb 2025.

    Copyright: © 2025 Zhang, Jin, Zheng and Mou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Jianan Jin, Zhejiang Chinese Medical University, Hangzhou, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
