Dong R, Wang Y, Yao H, Chen T, Zhou Q, Zhao B, Xu J. Development and Validation of Predictive Models for Inflammatory Bowel Disease Diagnosis: A Machine Learning and Nomogram-Based Approach. J Inflamm Res 2025;18:5115-5131. [PMID: 40255659 PMCID: PMC12009038 DOI: 10.2147/jir.s378069]
[Received: 12/10/2024] [Accepted: 03/21/2025] [Indexed: 04/22/2025]
Abstract
Background
Inflammatory bowel disease (IBD) is a chronic, incurable gastrointestinal disease without a gold standard for diagnosis. This study aimed to develop predictive models for diagnosing IBD, Crohn's disease (CD), and ulcerative colitis (UC) by combining two approaches: machine learning (ML) and traditional nomogram models.
Methods
Cohort 1 was drawn from the UK Biobank (UKB) and Cohort 2 from the First Hospital of Jilin University. Both comprised the initial laboratory tests upon admission: Cohort 1 included 1135 CD patients, 2192 UC patients, and 1798 non-IBD patients; Cohort 2 included 237 CD patients, 326 UC patients, and 298 non-IBD patients. Cohorts 1 and 2 were used to create predictive models. The parameters of the machine learning models established in Cohorts 1 and 2 were merged, and nomogram models were developed using logistic regression. Cohort 3 comprised initial laboratory tests from 117 CD patients, 197 UC patients, and 241 non-IBD patients at a tertiary hospital in different regions of China and was used for external testing of the three nomogram models.
Results
For Cohort 1, the ML-IBD-1, ML-CD-1, and ML-UC-1 models developed using the LightGBM algorithm demonstrated exceptional discrimination (ML-IBD-1: AUC = 0.788; ML-CD-1: AUC = 0.772; ML-UC-1: AUC = 0.841). For Cohort 2, the ML-IBD-2, ML-CD-2, and ML-UC-2 models developed using the XGBoost and logistic regression algorithms demonstrated exceptional discrimination (ML-IBD-2: AUC = 0.894; ML-CD-2: AUC = 0.932; ML-UC-2: AUC = 0.778). The nomogram models exhibited good diagnostic capability (nomogram-IBD: AUC = 0.778, 95% CI 0.688-0.868; nomogram-CD: AUC = 0.744, 95% CI 0.710-0.778; nomogram-UC: AUC = 0.702, 95% CI 0.591-0.814). The predictive ability of the three nomogram models was validated in Cohort 3 (nomogram-IBD: AUC = 0.758, 95% CI 0.683-0.832; nomogram-CD: AUC = 0.791, 95% CI 0.717-0.865; nomogram-UC: AUC = 0.817, 95% CI 0.702-0.932).
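For readers unfamiliar with the metric, an AUC with a 95% CI of the kind reported above can be sketched as follows. This is only an illustration: the abstract does not state how the authors computed their intervals, so the percentile bootstrap used here is an assumption, the function names `auc` and `bootstrap_ci` are hypothetical, and the labels and scores are invented toy values rather than study data.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class; AUC undefined, draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy example: 1 = disease, 0 = non-disease; scores are model outputs.
labels = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5]
point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% CI {lower:.3f}-{upper:.3f}")
```

With only eight toy cases the interval is very wide; cohorts of the size described in the Methods would give much narrower intervals.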
Conclusion
This study used three cohorts to develop risk prediction models for IBD, CD, and UC with good diagnostic capability, based on conventional laboratory data and using ML and nomogram models.