Zhang S, Jin J, Xu B, Zheng Q, Mou H. The relationship between epigenetic biomarkers and the risk of diabetes and cancer: a machine learning modeling approach. Front Public Health 2025;13:1509458. [PMID: 40190762 PMCID: PMC11968389 DOI: 10.3389/fpubh.2025.1509458]
[Received: 10/11/2024] [Accepted: 02/24/2025] [Indexed: 04/09/2025]
Abstract
Introduction
Epigenetic biomarkers are molecular indicators of epigenetic changes, and some studies have suggested that these biomarkers have predictive power for disease risk. This study aims to analyze the relationship between 30 epigenetic biomarkers and the risk of diabetes and cancer using machine learning modeling.
Methods
The data for this study were sourced from the NHANES database, which includes DNA methylation arrays and epigenetic biomarker datasets. Nine machine learning algorithms were used to build models: AdaBoost, GBM, KNN, LightGBM, MLP, RF, SVM, XGBoost, and logistic regression. Model stability was evaluated using metrics such as accuracy, Matthews correlation coefficient (MCC), and sensitivity. Model performance and decision-making utility were displayed using ROC curves and decision curve analysis (DCA) curves, while SHAP values were used to visualize the importance of each epigenetic biomarker.
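As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, sensitivity, and MCC from binary labels. It is not the authors' code: the function names `confusion_counts` and `evaluate_binary` are hypothetical, and the sketch assumes labels coded as 1 (disease present) and 0 (disease absent).

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1/0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate_binary(y_true, y_pred):
    """Compute accuracy, sensitivity (recall), and MCC.

    Hypothetical helper for illustration; ratios with a zero
    denominator are reported as 0.0 by convention here.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "mcc": mcc}
```

Calling `evaluate_binary(y_true, y_pred)` on held-out labels and predictions returns the three metrics in one dictionary. MCC is often reported alongside accuracy because it stays informative when the outcome classes are imbalanced, which is common for disease outcomes in survey data.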
Results
Epigenetic age acceleration was strongly associated with cancer risk but had a weaker relationship with diabetes. In the diabetes model, the top three contributing features were logA1Mort, family income-to-poverty ratio, and marital status. In the cancer model, the top three contributing features were gender, non-Hispanic White ethnicity, and PACKYRSMort.
Conclusion
Our study identified the relationship between epigenetic biomarkers and the risk of diabetes and cancer, and used machine learning techniques to analyze the contributions of various epigenetic biomarkers to disease risk.