1
Teng X, Liu M, Wang Z, Dong X. Machine learning prediction of preterm birth in women under 35 using routine biomarkers in a retrospective cohort study. Sci Rep 2025; 15:10213. [PMID: 40133418 PMCID: PMC11937320 DOI: 10.1038/s41598-025-92814-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2024] [Accepted: 03/03/2025] [Indexed: 03/27/2025]
Abstract
Preterm birth (PTB), defined as delivery before 37 weeks, affects 15 million infants annually, accounting for 11% of live births and over 35% of neonatal deaths. While advanced maternal age (≥ 35 years) is a known risk factor, PTB risk in women under 35 is underexplored. This study aimed to develop a machine learning-based model for PTB prediction in women under 35. A retrospective cohort of 2606 cases (2019-2022) equally split between full-term and preterm births was analyzed. Logistic Regression, LightGBM, Gradient Boosting Decision Tree (GBDT), and XGBoost models were evaluated. External validation was conducted using 803 independent cases (2023). Model performance was assessed using area under the curve (AUC), accuracy, sensitivity, and specificity. SHAP (SHapley Additive exPlanations) values were used to interpret model predictions. The XGBoost model demonstrated superior performance with an AUC of 0.893 (95% CI: 0.860-0.925) on the validation set. In comparison, Logistic Regression, LightGBM, and GBDT achieved AUCs of 0.872, 0.840, and 0.879, respectively. External validation of the XGBoost model yielded an AUC of 0.91 (95% CI: 0.889-0.931). SHAP analysis highlighted seven key predictors: alkaline phosphatase (ALP), alpha-fetoprotein (AFP), hemoglobin (HGB), urea (UREA), lymphocyte count (Lym1), sodium (Na), and red cell distribution width coefficient of variation (RDWCV). The XGBoost model provides accurate PTB risk prediction and key insights for early intervention in women under 35, supporting its potential clinical utility.
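The AUC figures quoted in this abstract have a direct reading: the probability that a randomly chosen preterm case is scored higher than a randomly chosen full-term case. A minimal Python sketch of that pairwise definition follows; the function name `pairwise_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for preterm, 0 for full-term (toy encoding, an assumption).
    scores: model outputs where higher means higher predicted risk.
    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with made-up values (not study data).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(pairwise_auc(toy_labels, toy_scores))
```

In the toy example three of the four preterm/full-term pairs are ordered correctly, so the sketch returns 0.75. The quadratic loop is chosen for readability; a rank-based formula gives the same value on large cohorts.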
Affiliation(s)
- Xiaojing Teng
  - Department of Laboratory Medicine, Affiliated Hangzhou First People's Hospital, School of Medicine, Westlake University, Hangzhou, Zhejiang, China
- Mengting Liu
  - The Fourth School of Clinical Medicine, Zhejiang Chinese Medical University (Hangzhou First People's Hospital), Hangzhou, China
- Zhiyi Wang
  - Department of Clinical Laboratory, Hangzhou Women's Hospital (Hangzhou Maternity and Child Health Care Hospital), Hangzhou, Zhejiang, China
  - Department of Clinical Laboratory, Hangzhou Women's Hospital, No. 369, Kunpeng Road, Shangcheng District, Hangzhou, 310008, Zhejiang, China
- Xueyan Dong
  - Department of Laboratory Medicine, Affiliated Hangzhou First People's Hospital, School of Medicine, Westlake University, Hangzhou, Zhejiang, China
  - Department of Laboratory Medicine, Affiliated Hangzhou First People's Hospital, School of Medicine, Westlake University, No. 261, Huansha Road, Shangcheng District, Hangzhou, 31000, Zhejiang, China
2
Larsen A, Pintye J, Abuna F, Dettinger JC, Gomez L, Marwa MM, Ngumbau N, Odhiambo B, Richardson BA, Watoyi S, Stern J, Kinuthia J, John-Stewart G. Identifying psychosocial predictors and developing a risk score for preterm birth among Kenyan pregnant women. BMC Pregnancy Childbirth 2025; 25:2. [PMID: 39748327 PMCID: PMC11697889 DOI: 10.1186/s12884-024-07058-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 12/10/2024] [Indexed: 01/04/2025]
Abstract
BACKGROUND Preterm birth (PTB) is a leading cause of neonatal mortality, particularly in sub-Saharan Africa where 40% of global neonatal deaths occur. We identified and combined demographic, clinical, and psychosocial correlates of PTB among Kenyan women to develop a risk score. METHODS We used data from a prospective study enrolling HIV-negative women from 20 antenatal clinics in Western Kenya (NCT03070600). Depressive symptoms were assessed by study nurses using the Center for Epidemiologic Studies Depression Scale (CESD-10), intimate partner violence (IPV) with the Hurt, Insult, Threaten, Scream scale (HITS), and social support using the Medical Outcomes Survey scale (MOS-SSS). Predictors of PTB (birth < 37 weeks gestation) were identified using multivariable Cox proportional hazards models, clustered by facility. We used stratified k-fold cross-validation methods for risk score derivation and validation. Area under the receiver operating characteristic curve (AUROC) was used to evaluate discrimination of the risk score and Brier score for calibration. RESULTS Among 4084 women, 19% had PTB (incidence rate: 70.9 PTB per 100 fetus-years (f-yrs)). Predictors of PTB included being unmarried (HR:1.29, 95% CI:1.08-1.54), lower education (years) (HR:0.97, 95% CI:0.94-0.99), IPV (HITS score ≥ 5, HR:1.28, 95% CI:0.98-1.68), higher CESD-10 score (HR:1.02, 95% CI:0.99-1.04), lower social support score (HR:0.99, 95% CI:0.97-1.01), and moderate-to-severe depressive symptoms (MSD; CESD-10 score ≥ 5, HR:1.46, 95% CI:1.07-1.99). The final risk score included being unmarried, social support score, IPV, and MSD. The risk score had modest discrimination between PTB and term deliveries (AUROC:0.56, 95% CI:0.54-0.58), and the Brier score was 0.4672. Women considered "high risk" for PTB (optimal risk score cut-point) had 40% higher risk of PTB (83.6 cases per 100 f-yrs) than "low risk" women (59.6 cases per 100 f-yrs; HR:1.4, 95% CI:1.2-1.7, p < 0.001).
CONCLUSION A fifth of pregnancies were PTB in this large multi-site cohort; PTB was associated with several social factors amenable to intervention. Combining these factors in a risk score did not predict PTB, reflecting the multifactorial nature of PTB and need to include other unmeasured factors. However, our findings suggest PTB risk could be better understood by integrating mental health and support services into routine antenatal care.
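For readers unfamiliar with the Brier score reported in this abstract, it is the mean squared difference between the predicted probability and the observed 0/1 outcome: 0 is perfect, and a constant prediction of 0.5 scores 0.25. A minimal sketch, assuming binary outcomes coded 0/1; the function name `brier_score` and the toy inputs are hypothetical and not taken from the study.

```python
def brier_score(outcomes, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not outcomes or len(outcomes) != len(probabilities):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - y) ** 2 for y, p in zip(outcomes, probabilities))
    return total / len(outcomes)


# Toy values only (not study data): an uninformative 0.5 prediction
# for one preterm (1) and one term (0) delivery.
print(brier_score([1, 0], [0.5, 0.5]))
```

Lower values indicate predictions closer to the observed outcomes, which is why the score is usually read alongside the AUROC rather than in place of it.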
Affiliation(s)
- Anna Larsen
  - Department of Epidemiology, University of Washington, 3980 15th Ave NE, Box 351619, Seattle, WA, 98195, USA
  - Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA, USA
- Jillian Pintye
  - Department of Global Health, University of Washington, Seattle, WA, USA
  - Department of Biobehavioral Nursing and Health Informatics, University of Washington, Seattle, WA, USA
- Julia C Dettinger
  - Department of Global Health, University of Washington, Seattle, WA, USA
- Laurén Gomez
  - Department of Global Health, University of Washington, Seattle, WA, USA
- Nancy Ngumbau
  - Department of Research and Programs, Kenyatta National Hospital, Nairobi, Kenya
- Barbra A Richardson
  - Department of Global Health, University of Washington, Seattle, WA, USA
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Joshua Stern
  - Department of Global Health, University of Washington, Seattle, WA, USA
- John Kinuthia
  - Department of Global Health, University of Washington, Seattle, WA, USA
  - University of Nairobi, Nairobi, Kenya
  - Department of Research and Programs, Kenyatta National Hospital, Nairobi, Kenya
- Grace John-Stewart
  - Department of Epidemiology, University of Washington, 3980 15th Ave NE, Box 351619, Seattle, WA, 98195, USA
  - Department of Global Health, University of Washington, Seattle, WA, USA
  - School of Medicine, Department of Pediatrics, University of Washington, Seattle, WA, USA
  - School of Medicine, Department of Allergy and Infectious Disease, University of Washington, Seattle, WA, USA
3
Kassahun EA, Gebreyesus SH, Tesfamariam K, Endris BS, Roro MA, Getnet Y, Hassen HY, Brusselaers N, Coenen S. Development and validation of a simplified risk prediction model for preterm birth: a prospective cohort study in rural Ethiopia. Sci Rep 2024; 14:4845. [PMID: 38418507 PMCID: PMC10901814 DOI: 10.1038/s41598-024-55627-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 02/26/2024] [Indexed: 03/01/2024]
Abstract
Preterm birth is one of the most common obstetric complications in low- and middle-income countries, where access to advanced diagnostic tests and imaging is limited. Therefore, we developed and validated a simplified risk prediction tool to predict preterm birth based on easily applicable and routinely collected characteristics of pregnant women in the primary care setting. We used logistic regression to develop the model from data collected from 481 pregnant women. Model accuracy was evaluated through discrimination (measured by the area under the Receiver Operating Characteristic curve; AUC) and calibration (via calibration graphs and the Hosmer-Lemeshow goodness of fit test). Internal validation was performed using a bootstrapping technique. A simplified risk score was developed, and the cut-off point was determined using the "Youden index" to classify pregnant women into high or low risk for preterm birth. The incidence of preterm birth was 19.5% (95% CI: 16.2, 23.3) of pregnancies. The final prediction model incorporated mid-upper arm circumference, gravidity, history of abortion, antenatal care, comorbidity, intimate partner violence, and anemia as predictors of preterm birth. The AUC of the model was 0.687 (95% CI: 0.62, 0.75). The calibration plot demonstrated good calibration, with a p-value of 0.713 for the Hosmer-Lemeshow goodness of fit test. The model can identify pregnant women at high risk of preterm birth. It is applicable in daily clinical practice and could contribute to the improvement of the health of women and newborns in primary care settings with limited resources. Healthcare providers in rural areas could use this prediction model to improve clinical decision-making and reduce obstetric complications.
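The Youden index mentioned in this abstract picks the cut-off that maximises J = sensitivity + specificity - 1. The sketch below illustrates that selection on an integer risk score, assuming a woman is classified as high risk when her score is at or above the cut-off; the function name `youden_cutoff` and the toy scores and outcomes are hypothetical, not the study's data or code.

```python
def youden_cutoff(labels, scores):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    labels: 1 for preterm birth, 0 for term birth.
    scores: risk scores; a case is high risk when score >= cutoff.
    On ties in J the lowest cutoff is kept.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case of each class")
    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cut)
        true_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cut)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Toy risk scores and outcomes (made up for illustration).
toy_scores = [1, 2, 3, 4, 5, 6, 7]
toy_labels = [0, 0, 0, 1, 0, 1, 1]
print(youden_cutoff(toy_labels, toy_scores))
```

With these toy values the best cut-off is 4, where sensitivity is 1.0 and specificity is 0.75. The Youden index weights false positives and false negatives equally, which may not match clinical priorities in every setting.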
Affiliation(s)
- Eskeziaw Abebe Kassahun
  - Department of Family Medicine & Population Health, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Seifu Hagos Gebreyesus
  - Department of Nutrition and Dietetics, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Kokeb Tesfamariam
  - Department of Food Technology, Safety, and Health, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
- Bilal Shikur Endris
  - Department of Nutrition and Dietetics, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Meselech Assegid Roro
  - Department of Reproductive Health and Health Service Management, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Yalemwork Getnet
  - Department of Nutrition and Dietetics, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Hamid Yimam Hassen
  - Department of Family Medicine & Population Health, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Nele Brusselaers
  - Global Health Institute, Department of Family Medicine & Population Health, Antwerp University, Antwerp, Belgium
  - Centre for Translational Microbiome Research, Department of Microbiology, Tumour and Cell Biology, Karolinska Institute, Stockholm, Sweden
- Samuel Coenen
  - Centre for General Practice, Department of Family Medicine & Population Health, Faculty of Medicine and Health Sciences, University of Antwerp, 2000, Antwerp, Belgium
4
Fente BM, Asaye MM, Tesema GA, Gudayu TW. Development and validation of a prognosis risk score model for preterm birth among pregnant women who had antenatal care visit, Northwest, Ethiopia, retrospective follow-up study. BMC Pregnancy Childbirth 2023; 23:732. [PMID: 37848836 PMCID: PMC10583360 DOI: 10.1186/s12884-023-06018-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/03/2023] [Accepted: 09/21/2023] [Indexed: 10/19/2023]
Abstract
BACKGROUND Prematurity is the leading cause of neonatal morbidity and mortality, specifically in low-resource settings. The majority of prematurity can be prevented if early interventions are implemented for high-risk pregnancies. Developing a prognosis risk score for preterm birth based on easily available predictors could support health professionals as a simple clinical tool in their decision-making. Therefore, the study aims to develop and validate a prognosis risk score model for preterm birth among pregnant women who had an antenatal care visit at Debre Markos Comprehensive and Specialized Hospital, Ethiopia. METHODS A retrospective follow-up study was conducted among a total of 1,132 pregnant women. Client charts were selected using a simple random sampling technique. Data were extracted using a structured checklist prepared in the Kobo Toolbox application and exported to STATA version 14 and R version 4.2.2 for data management and analysis. Stepwise backward multivariable analysis was done. A simplified risk prediction model was developed based on a binary logistic model, and the model's performance was assessed by discrimination power and calibration. The internal validity of the model was evaluated by bootstrapping. Decision Curve Analysis was used to determine the clinical impact of the model. RESULT The incidence of preterm birth was 10.9%. The developed risk score model comprised six predictors that remained in the reduced multivariable logistic regression, including age < 20, late initiation of antenatal care, unplanned pregnancy, recent pregnancy complications, hemoglobin < 11 g/dl, and multiparity, for a total score of 17. The discriminatory power of the model was 0.931, and the calibration test was p > 0.05. The optimal cut-off for classifying risks as low or high was 4. At this cut point, the sensitivity, specificity and accuracy were 91.0%, 82.1%, and 83.1%, respectively. The model was internally validated and has an optimism of 0.003. The model was found to have clinical benefit. CONCLUSION The developed risk score has excellent discrimination performance and clinical benefit. It can be used in clinical settings by healthcare providers for early detection, timely decision making, and improving care quality.
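The sensitivity, specificity, and accuracy reported at the cut-off of 4 can be cross-checked against the reported incidence, because overall accuracy is a prevalence-weighted average of the two rates. A small worked check follows, assuming the three metrics were computed on the same cohort in which 10.9% of women had a preterm birth (the abstract does not state this explicitly); the function name `accuracy_from_rates` is illustrative.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Overall accuracy implied by sensitivity, specificity and prevalence.

    accuracy = sensitivity * prevalence + specificity * (1 - prevalence)
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Figures quoted in the abstract: 91.0% sensitivity, 82.1% specificity,
# 10.9% incidence (assumed here to equal the sample prevalence).
implied = accuracy_from_rates(0.910, 0.821, 0.109)
print(f"{implied:.3f}")
```

Under that assumption the implied accuracy is about 0.831, matching the 83.1% quoted in the abstract. Because preterm births are the minority class, accuracy is driven mostly by specificity here.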
Affiliation(s)
- Bezawit Melak Fente
  - Department of General Midwifery, School of Midwifery, College of Medicine & Health Sciences, University of Gondar, Gondar, Ethiopia
- Mengstu Melkamu Asaye
  - Department of Women’s and Family Health, School of Midwifery, College of Medicine & Health Sciences, University of Gondar, Gondar, Ethiopia
- Getayeneh Antehunegn Tesema
  - Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
- Temesgen Worku Gudayu
  - Department of Clinical Midwifery, School of Midwifery, College of Medicine & Health Sciences, University of Gondar, Gondar, Ethiopia
5
Pons-Duran C, Wilder B, Hunegnaw BM, Haneuse S, Goddard FG, Bekele D, Chan GJ. Development of risk prediction models for preterm delivery in a rural setting in Ethiopia. J Glob Health 2023; 13:04051. [PMID: 37224519 DOI: 10.7189/jogh.13.04051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
Background Preterm birth complications are the leading causes of death among children under five years. However, the inability to accurately identify pregnancies at high risk of preterm delivery is a key practical challenge, especially in resource-constrained settings with limited availability of biomarker assessment. Methods We evaluated whether risk of preterm delivery can be predicted using available data from a pregnancy and birth cohort in Amhara region, Ethiopia. All participants were enrolled in the cohort between December 2018 and March 2020. The study outcome was preterm delivery, defined as any delivery occurring before week 37 of gestation regardless of vital status of the foetus or neonate. A range of sociodemographic, clinical, environmental, and pregnancy-related factors were considered as potential inputs. We used Cox and accelerated failure time models, alongside decision tree ensembles to predict risk of preterm delivery. We estimated model discrimination using the area-under-the-curve (AUC) and simulated the conditional distributions of cervical length (CL) and foetal fibronectin (FFN) to ascertain whether they could improve model performance. Results We included 2493 pregnancies; among them, 138 women were censored due to loss-to-follow-up before delivery. Overall, predictive performance of models was poor. The AUC was highest for the tree ensemble classifier (0.60, 95% confidence interval = 0.57-0.63). When models were calibrated so that 90% of women who experienced a preterm delivery were classified as high risk, at least 75% of those classified as high risk did not experience the outcome. The simulation of CL and FFN distributions did not significantly improve models' performance. Conclusions Prediction of preterm delivery remains a major challenge. In resource-limited settings, predicting high-risk deliveries would not only save lives, but also inform resource allocation. It may not be possible to accurately predict risk of preterm delivery without investing in novel technologies to identify genetic factors, immunological biomarkers, or the expression of specific proteins.
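The finding that most women flagged as high risk did not deliver preterm is a statement about positive predictive value (PPV), which depends on prevalence as well as on sensitivity and specificity. The sketch below shows the relationship. Only the 90% sensitivity comes from the abstract; the specificity of 0.30 and the prevalence of 0.10 are hypothetical placeholders chosen for illustration, and the function name is an assumption.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of high-risk classifications that are true preterm deliveries."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no cases would be classified as high risk")
    return true_pos / (true_pos + false_pos)


# Sensitivity fixed at 0.90 as in the abstract; the specificity and
# prevalence below are hypothetical values, not study results.
ppv = positive_predictive_value(0.90, 0.30, 0.10)
print(f"PPV = {ppv:.3f}, false alarms among high risk = {1 - ppv:.3f}")
```

With these placeholder inputs the PPV is 0.125, so most high-risk classifications would be false alarms, which is the pattern the abstract describes when it reports that at least 75% of flagged women did not experience the outcome.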
Affiliation(s)
- Clara Pons-Duran
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Bryan Wilder
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Bezawit Mesfin Hunegnaw
  - Department of Pediatrics and Child Health, St. Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
- Sebastien Haneuse
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Frederick Gb Goddard
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Delayehu Bekele
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Obstetrics and Gynecology, St. Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
- Grace J Chan
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Pediatrics and Child Health, St. Paul's Hospital Millennium Medical College, Addis Ababa, Ethiopia
  - Division of Medical Critical Care, Boston Children's Hospital, Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA