1
Silva AB, Rocha EDS, Lorenzato JF, Endo PT. Evaluating how different balancing data techniques impact on prediction of premature birth using machine learning models. PLoS One 2025; 20:e0316574. [PMID: 40173408 PMCID: PMC11964454 DOI: 10.1371/journal.pone.0316574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Accepted: 12/12/2024] [Indexed: 04/04/2025] Open
Abstract
Premature birth, defined as birth before 37 weeks of gestation, is a significant global health issue and the main cause of neonatal deaths. In this work, we evaluate machine learning models for predicting premature birth using Brazilian sociodemographic and obstetric data, focusing on the challenge of data imbalance, a common problem that can lead to biased predictions. We evaluate five data balancing techniques: Undersampling, Oversampling, and three Hybridsampling configurations in which the minority class was increased by factors of 2, 3, and 4. The machine learning models, including Decision Tree, Random Forest, and AdaBoost, are trained and evaluated on a dataset of over 483,000 cases. The Hybridsampling approach yielded an accuracy of 70%, a recall of 64%, and a precision of 74% with the Decision Tree model. Results show that Hybridsampling techniques significantly improve model performance compared to Undersampling and Oversampling, highlighting the importance of proper data balancing in predictive models for preterm birth. This work is particularly relevant for the Brazilian Unified Health System (SUS). By improving the accuracy of premature birth predictions, our models could help healthcare providers identify at-risk pregnancies earlier, allowing for timely interventions. Such integration could enhance maternal and neonatal care, reduce the incidence of preterm births, and potentially decrease neonatal mortality, especially in underserved regions.
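The hybrid sampling idea in the abstract above can be illustrated with a minimal sketch. The abstract only says that the minority class was increased by a factor of 2, 3, or 4; the assumption here (not confirmed by the source) is that the majority class is then randomly undersampled to the same size. The function name `hybrid_sample` and its parameters are hypothetical.

```python
import random


def hybrid_sample(majority, minority, factor, seed=0):
    """Oversample the minority class by `factor`, then undersample the majority to match.

    Assumption: the majority class is cut down to the size of the enlarged
    minority class. The abstract does not specify this step.
    """
    rng = random.Random(seed)
    # Target size of each class after balancing, capped by the majority size.
    target = min(len(minority) * factor, len(majority))
    # Oversampling: duplicate randomly chosen minority records (with replacement).
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    new_minority = list(minority) + extra
    # Undersampling: keep a random subset of the majority (without replacement).
    new_majority = rng.sample(list(majority), target)
    return new_majority, new_minority


# Toy data: 100 majority records, 10 minority records, factor 3.
maj = list(range(100))
mino = list(range(100, 110))
balanced_majority, balanced_minority = hybrid_sample(maj, mino, factor=3)
print(len(balanced_majority), len(balanced_minority))
```

With a factor of 1 this reduces to plain undersampling, and with a factor large enough to reach the majority size it reduces to plain oversampling, which is one way to see the three hybrid configurations as intermediate points between the two.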
2
Liu Y, Liu J, Shen H. Machine learning model-based preterm birth prediction and clinical nomogram: A big retrospective cohort study. Int J Gynaecol Obstet 2025; 169:332-340. [PMID: 39552525 DOI: 10.1002/ijgo.16036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2024] [Revised: 10/23/2024] [Accepted: 10/29/2024] [Indexed: 11/19/2024]
Abstract
OBJECTIVE This study sought to develop a multifactorial predictive model for preterm birth risk, with the goal of supporting clinical practitioners in early prevention. METHODS This retrospective cohort study utilized 2022 and 2018 National Vital Statistics System (NVSS) birth data, with the 2022 cohort arbitrarily split into training (70%) and internal validation (30%) subsets, and the 2018 cohort used for external validation. Four machine learning algorithms (logistic regression, adaptive lasso regression, bootstrap forest, and boosted trees) identified features associated with preterm birth. The study then integrated the consensus features identified across the four models to construct a logistic regression-based preterm birth prediction nomogram. To evaluate the model's efficacy, calibration, receiver operating characteristic (ROC), and decision curve analyses were applied to both the internal and external validation sets. RESULTS The study included 2 567 040 mother-infant pairs from the 2022 cohort and 2 688 568 mother-infant pairs from the 2018 cohort. All four machine learning models demonstrated high accuracy (area under the curve [AUC] >0.7) in predicting preterm birth, and the internal validation results indicated good model generalizability. Feature selection identified nine common risk factors associated with preterm birth. The prediction nomogram based on these nine common features achieved AUCs of 0.701, 0.702, and 0.704 in the training, internal validation, and external validation sets, respectively. The calibration curves showed good agreement, and the decision curve analysis confirmed the model's net clinical benefits. CONCLUSION This study developed a reliable preterm birth prediction tool using large-scale birth cohort data, addressing the lack of external validation in existing preterm birth prediction models.
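For readers unfamiliar with the AUC figures quoted above, the following sketch computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. It is a generic illustration, not the study's own code; the function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), fine for small illustrations.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = preterm, 0 = term; scores are predicted risks.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

Under this reading, an AUC near 0.70 means the model ranks a preterm case above a term case in roughly 7 of 10 such pairs.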
Affiliation(s)
- Ya Liu
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory and State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, China
- Jiangling Liu
  - State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory and State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, China
- Heqing Shen
  - Department of Obstetrics, Women and Children's Hospital, School of Medicine, Xiamen University, Xiamen, China
3
Kloska A, Harmoza A, Kloska SM, Marciniak T, Sadowska-Krawczenko I. Predicting preterm birth using machine learning methods. Sci Rep 2025; 15:5683. [PMID: 39956843 PMCID: PMC11830770 DOI: 10.1038/s41598-025-89905-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2024] [Accepted: 02/10/2025] [Indexed: 02/18/2025] Open
Abstract
Preterm birth is a significant public health concern, given its correlation with neonatal mortality and morbidity. The aetiology of preterm birth is complex and multifactorial. The objective of this study was to develop and compare machine learning models for predicting the risk of preterm birth. Data were collected from 50 patients in a maternity ward, with an analysis performed based on the timing of delivery (preterm vs. term). The applicability of XGBoost, CatBoost, logistic regression, support vector machines (SVM), and decision trees for predicting preterm delivery was evaluated through training. The linear SVM with boosted parameters demonstrated the highest performance, achieving an accuracy of 82%, precision of 83%, recall of 86%, and an F1-score of 84%. The logistic regression model, also boosted, demonstrated comparable performance to the linear SVM, with similar accuracy (80%), precision (82%), recall (82%), and F1-score (82%). The performance of other models, including decision trees and more complex algorithms, was inferior, which is likely attributable to the limited dataset and the number of parameters involved. In particular, machine learning models, most notably the linear SVM, can be effectively employed to assess the risk of preterm birth. The findings indicate that the linear SVM model exhibits the greatest efficacy among the tested models.
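The F1-scores reported above follow from the stated precision and recall, since F1 is their harmonic mean. The short sketch below recomputes them; the function name `f1_score` is a hypothetical helper, and the inputs are the rounded percentages from the abstract, so agreement is only expected to two decimal places.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures taken from the abstract.
print(round(f1_score(0.83, 0.86), 2))  # linear SVM: 0.84
print(round(f1_score(0.82, 0.82), 2))  # logistic regression: 0.82
```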
Affiliation(s)
- Anna Kloska
  - Faculty of Medicine, Bydgoszcz University of Science and Technology, 85796, Bydgoszcz, Poland
- Alicja Harmoza
  - Faculty of Medicine, The Ludwik Rydygier Collegium Medicum, 85067, Bydgoszcz, Poland
- Sylwester M Kloska
  - Faculty of Medicine, Bydgoszcz University of Science and Technology, 85796, Bydgoszcz, Poland
- Tomasz Marciniak
  - Faculty of Telecommunications, Computer Science and Electrical Engineering, Bydgoszcz University of Science and Technology, 85796, Bydgoszcz, Poland
4
Holloway IW, Wu ESC, Boka C, Young N, Hong C, Fuentes K, Kärkkäinen K, Beikzadeh M, Avendaño A, Jauregui JC, Zhang A, Sevillano L, Fyfe C, Brisbin CD, Beltran RM, Cordero L, Parsons JT, Sarrafzadeh M. Novel Machine Learning HIV Intervention for Sexual and Gender Minority Young People Who Have Sex With Men (uTECH): Protocol for a Randomized Comparison Trial. JMIR Res Protoc 2024; 13:e58448. [PMID: 39163591 PMCID: PMC11372318 DOI: 10.2196/58448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 06/20/2024] [Accepted: 06/24/2024] [Indexed: 08/22/2024] Open
Abstract
BACKGROUND Sexual and gender minority (SGM) young people are disproportionately affected by HIV in the United States, and substance use is a major driver of new infections. People who use web-based venues to meet sex partners are more likely to report substance use, sexual risk behaviors, and sexually transmitted infections. To our knowledge, no machine learning (ML) interventions have been developed that use web-based and digital technologies to inform and personalize HIV and substance use prevention efforts for SGM young people. OBJECTIVE This study aims to test the acceptability, appropriateness, and feasibility of the uTECH intervention, a SMS text messaging intervention using an ML algorithm to promote HIV prevention and substance use harm reduction among SGM people aged 18 to 29 years who have sex with men. This intervention will be compared to the Young Men's Health Project (YMHP) alone, an existing Centers for Disease Control and Prevention best evidence intervention for young SGM people, which consists of 4 motivational interviewing-based counseling sessions. The YMHP condition will receive YMHP sessions and will be compared to the uTECH+YMHP condition, which includes YMHP sessions as well as uTECH SMS text messages. METHODS In a study funded by the National Institutes of Health, we will recruit and enroll SGM participants (aged 18-29 years) in the United States (N=330) to participate in a 12-month, 2-arm randomized comparison trial. All participants will receive 4 counseling sessions conducted over Zoom (Zoom Video Communications, Inc) with a master's-level social worker. Participants in the uTECH+YMHP condition will receive curated SMS text messages informed by an ML algorithm that seek to promote HIV and substance use risk reduction strategies as well as undergoing YMHP counseling. We hypothesize that the uTECH+YMHP intervention will be considered acceptable, appropriate, and feasible to most participants. 
We also hypothesize that participants in the combined condition will experience enhanced and more durable reductions in substance use and sexual risk behaviors compared to participants receiving YMHP alone. Appropriate statistical methods, models, and procedures will be selected to evaluate primary hypotheses and behavioral health outcomes in both intervention conditions using an α<.05 significance level, including comparison tests, tests of fixed effects, and growth curve modeling. RESULTS This study was funded in August 2019. As of June 2024, all participants have been enrolled. Data analysis has commenced, and expected results will be published in the fall of 2025. CONCLUSIONS This study aims to develop and test the acceptability, appropriateness, and feasibility of uTECH, a novel approach to reduce HIV risk and substance use among SGM young adults. TRIAL REGISTRATION ClinicalTrials.gov NCT04710901; https://clinicaltrials.gov/study/NCT04710901. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) DERR1-10.2196/58448.
Affiliation(s)
- Ian W Holloway
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Elizabeth S C Wu
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Callisto Boka
  - Department of Epidemiology, UCLA Fielding School of Public Health, University of California, Los Angeles, Los Angeles, CA, United States
- Nina Young
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Chenglin Hong
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Kimberly Fuentes
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Kimmo Kärkkäinen
  - Department of Computer Science, UCLA Samueli School Of Engineering, University of California, Los Angeles, Los Angeles, CA, United States
- Mehrab Beikzadeh
  - Department of Computer Science, UCLA Samueli School Of Engineering, University of California, Los Angeles, Los Angeles, CA, United States
- Alexandra Avendaño
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Juan C Jauregui
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Aileen Zhang
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Lalaine Sevillano
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
  - School of Social Work, Portland State University, Portland, OR, United States
- Colin Fyfe
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Cal D Brisbin
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Raiza M Beltran
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Luisita Cordero
  - Department of Social Welfare, UCLA Luskin School of Public Affairs, University of California, Los Angeles, Los Angeles, CA, United States
- Majid Sarrafzadeh
  - Department of Computer Science, UCLA Samueli School Of Engineering, University of California, Los Angeles, Los Angeles, CA, United States
5
Mirzaei A, Hiller BC, Stelzer IA, Thiele K, Tan Y, Becker M. Computational Approaches for Connecting Maternal Stress to Preterm Birth. Clin Perinatol 2024; 51:345-360. [PMID: 38705645 DOI: 10.1016/j.clp.2024.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2024]
Abstract
Multiple studies have hinted at a complex connection between maternal stress and preterm birth (PTB). This article describes the potential of computational methods to provide new insights into this relationship. For this, we outline existing approaches for stress assessments and various data modalities available for profiling stress responses, and review studies that sought either to establish a connection between stress and PTB or to predict PTB based on stress-related factors. Finally, we summarize the challenges of computational methods, highlighting potential future research directions within this field.
Affiliation(s)
- Amin Mirzaei
  - Department of Computer Science and Electrical Engineering, Institute for Visual and Analytic Computing, Universität Rostock, Albert-Einstein-Straße 22, 18059 Rostock, Germany
- Bjarne C Hiller
  - Department of Computer Science and Electrical Engineering, Institute for Visual and Analytic Computing, Universität Rostock, Albert-Einstein-Straße 22, 18059 Rostock, Germany
- Ina A Stelzer
  - Department of Pathology, University of California San Diego, GPL/CMM-West, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Kristin Thiele
  - Division for Experimental Feto-Maternal Medicine, Department of Obstetrics and Fetal Medicine, University Medical Center Hamburg-Eppendorf, Center for Obstetrics and Pediatrics, Martinistrasse 52, 20246 Hamburg, Germany
- Yuqi Tan
  - Department of Microbiology and Immunology, Stanford University School of Medicine, CSSR3220, 269 Campus Drive, Stanford, CA 94305, USA
- Martin Becker
  - Department of Computer Science and Electrical Engineering, Institute for Visual and Analytic Computing, Universität Rostock, Albert-Einstein-Straße 22, 18059 Rostock, Germany
6
Khan I, Khare BK. Exploring the potential of machine learning in gynecological care: a review. Arch Gynecol Obstet 2024; 309:2347-2365. [PMID: 38625543 DOI: 10.1007/s00404-024-07479-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 03/10/2024] [Indexed: 04/17/2024]
Abstract
Gynecological health remains a critical aspect of women's overall well-being, with profound implications for maternal and reproductive outcomes. This comprehensive review synthesizes the current state of knowledge on four pivotal aspects of gynecological health: preterm birth, breast cancer, cervical cancer, and infertility treatment. Machine learning (ML) has emerged as a transformative technology with the potential to revolutionize gynecology and women's healthcare. ML and deep learning (DL) methods, both subsets of artificial intelligence (AI), have aided in detecting complex patterns in huge datasets and in using such patterns to make predictions. This paper investigates how ML algorithms are employed in gynecology to tackle crucial issues pertaining to women's health. It also investigates the integration of ultrasound technology with AI during the initial, intermediate, and final stages of pregnancy, and delves into the diverse applications of AI throughout each trimester. The review provides an overview of ML models, introduces natural language processing (NLP) concepts, including ChatGPT, and discusses the clinical applications of AI in gynecology. Finally, the paper outlines the challenges in utilizing ML within the field of gynecology.
Affiliation(s)
- Imran Khan
  - Harcourt Butler Technical University, Kanpur, India
7
Kassahun EA, Gebreyesus SH, Tesfamariam K, Endris BS, Roro MA, Getnet Y, Hassen HY, Brusselaers N, Coenen S. Development and validation of a simplified risk prediction model for preterm birth: a prospective cohort study in rural Ethiopia. Sci Rep 2024; 14:4845. [PMID: 38418507 PMCID: PMC10901814 DOI: 10.1038/s41598-024-55627-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 02/26/2024] [Indexed: 03/01/2024] Open
Abstract
Preterm birth is one of the most common obstetric complications in low- and middle-income countries, where access to advanced diagnostic tests and imaging is limited. Therefore, we developed and validated a simplified risk prediction tool to predict preterm birth based on easily applicable and routinely collected characteristics of pregnant women in the primary care setting. We fitted a logistic regression model to data collected from 481 pregnant women. Model accuracy was evaluated through discrimination (measured by the area under the Receiver Operating Characteristic curve; AUC) and calibration (via calibration graphs and the Hosmer-Lemeshow goodness of fit test). Internal validation was performed using a bootstrapping technique. A simplified risk score was developed, and the cut-off point was determined using the Youden index to classify pregnant women as at high or low risk of preterm birth. The incidence of preterm birth was 19.5% (95% CI: 16.2, 23.3) of pregnancies. The final prediction model incorporated mid-upper arm circumference, gravidity, history of abortion, antenatal care, comorbidity, intimate partner violence, and anemia as predictors of preterm birth. The AUC of the model was 0.687 (95% CI: 0.62, 0.75). The calibration plot demonstrated good calibration, with a p-value of 0.713 for the Hosmer-Lemeshow goodness of fit test. The model can identify pregnant women at high risk of preterm birth. It is applicable in daily clinical practice and could contribute to improving the health of women and newborns in primary care settings with limited resources. Healthcare providers in rural areas could use this prediction model to improve clinical decision-making and reduce obstetric complications.
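The Youden index mentioned above selects the cut-off that maximizes J = sensitivity + specificity - 1. The sketch below shows that selection on toy risk scores; it is a generic illustration rather than the study's procedure, and the function name `youden_cutoff`, the "score >= threshold means high risk" convention, and the data are all assumptions.

```python
def youden_cutoff(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is classified as high risk when its score is >= threshold.
    On ties in J, the lowest such threshold is kept.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Toy risk scores (hypothetical): 1 = preterm, 0 = term.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [1, 2, 3, 4, 5, 6, 7, 8]
threshold, j = youden_cutoff(labels, scores)
print(threshold, round(j, 2))  # 5 0.8
```

In the toy data, a cut-off of 5 catches all three positive cases while misclassifying one of five negatives, giving J = 1.0 + 0.8 - 1 = 0.8.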
Affiliation(s)
- Eskeziaw Abebe Kassahun
  - Department of Family Medicine & Population Health, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Seifu Hagos Gebreyesus
  - Department of Nutrition and Dietetics, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Kokeb Tesfamariam
  - Department of Food Technology, Safety, and Health, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
- Bilal Shikur Endris
  - Department of Nutrition and Dietetics, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Meselech Assegid Roro
  - Department of Reproductive Health and Health Service Management, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Yalemwork Getnet
  - Department of Nutrition and Dietetics, School of Public Health, Addis Ababa University, Addis Ababa, Ethiopia
- Hamid Yimam Hassen
  - Department of Family Medicine & Population Health, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Nele Brusselaers
  - Global Health Institute, Department of Family Medicine & Population Health, Antwerp University, Antwerp, Belgium
  - Centre for Translational Microbiome Research, Department of Microbiology, Tumour and Cell Biology, Karolinska Institute, Stockholm, Sweden
- Samuel Coenen
  - Centre for General Practice, Department of Family Medicine & Population Health, Faculty of Medicine and Health Sciences, University of Antwerp, 2000, Antwerp, Belgium
8
Gondane P, Kumbhakarn S, Maity P, Kapat K. Recent Advances and Challenges in the Early Diagnosis and Treatment of Preterm Labor. Bioengineering (Basel) 2024; 11:161. [PMID: 38391647 PMCID: PMC10886370 DOI: 10.3390/bioengineering11020161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 01/30/2024] [Accepted: 02/04/2024] [Indexed: 02/24/2024] Open
Abstract
Preterm birth (PTB) is the primary cause of neonatal mortality and long-term disabilities. The unknown mechanism behind PTB makes diagnosis difficult, yet early detection is necessary for controlling and averting related consequences. The primary focus of this work is to provide an overview of the known risk factors associated with preterm labor and the conventional and advanced procedures for early detection of PTB, including multi-omics and artificial intelligence/machine learning (AI/ML)-based approaches. It also discusses the principles of detecting various proteomic biomarkers based on lateral flow immunoassay and microfluidic chips, along with the commercially available point-of-care testing (POCT) devices and associated challenges. After briefly outlining the therapeutic and preventive measures for PTB, this review concludes with an outlook.
Affiliation(s)
- Prashil Gondane
  - Department of Medical Devices, National Institute of Pharmaceutical Education and Research Kolkata, 168, Maniktala Main Road, Kankurgachi, Kolkata 700054, India
- Sakshi Kumbhakarn
  - Department of Medical Devices, National Institute of Pharmaceutical Education and Research Kolkata, 168, Maniktala Main Road, Kankurgachi, Kolkata 700054, India
- Pritiprasanna Maity
  - Department of Regenerative Medicine and Cell Biology, Medical University of South Carolina, Charleston, SC 29425, USA
- Kausik Kapat
  - Department of Medical Devices, National Institute of Pharmaceutical Education and Research Kolkata, 168, Maniktala Main Road, Kankurgachi, Kolkata 700054, India
9
Shimoga Narayana Rao K, Asha V. An automatic classification approach for preterm delivery detection based on deep learning. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
10
Artificial intelligence in the diagnosis of necrotising enterocolitis in newborns. Pediatr Res 2023; 93:376-381. [PMID: 36195629 DOI: 10.1038/s41390-022-02322-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/22/2022] [Accepted: 09/03/2022] [Indexed: 11/09/2022]
Abstract
Necrotising enterocolitis (NEC) is one of the most common diseases in neonates and predominantly affects premature or very-low-birth-weight infants. Diagnosis is difficult and must be made within hours of the first symptom onset for the best therapeutic effect. Artificial intelligence (AI) may play a significant role in NEC diagnosis. A literature search on the use of AI in the diagnosis of NEC was performed. Four databases (PubMed, Embase, arXiv, and IEEE Xplore) were searched with the appropriate MeSH terms. The search yielded 118 publications, which were reduced to 8 after screening and checking for eligibility. Of the eight, five used classic machine learning (ML), and three were on the topic of deep ML. Most publications showed promising results. However, no publications with evident clinical benefits were found. Datasets used for training and testing AI systems were small and typically came from a single institution. The potential of AI to improve the diagnosis of NEC is evident. The body of literature on this topic is scarce, and more research in this area is needed, especially with a focus on clinical utility. Cross-institutional data for the training and testing of AI algorithms are required to make progress in this area. IMPACT: Only a few publications on the use of AI in NEC diagnosis are available, although they offer some evidence that AI may be helpful in NEC diagnosis. AI requires large, multicentre, and multimodal datasets of high quality for model training and testing. Published results in the literature are based on data from single institutions and, as such, have limited generalisability. Large multicentre studies evaluating broad datasets are needed to evaluate the true potential of AI in diagnosing NEC in a clinical setting.
11
Nieto-del-Amor F, Prats-Boluda G, Garcia-Casado J, Diaz-Martinez A, Diago-Almela VJ, Monfort-Ortiz R, Hao D, Ye-Lin Y. Combination of Feature Selection and Resampling Methods to Predict Preterm Birth Based on Electrohysterographic Signals from Imbalance Data. SENSORS 2022; 22:s22145098. [PMID: 35890778 PMCID: PMC9319575 DOI: 10.3390/s22145098] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/13/2022] [Revised: 07/01/2022] [Accepted: 07/05/2022] [Indexed: 02/01/2023]
Abstract
Due to its high sensitivity, electrohysterography (EHG) has emerged as an alternative technique for predicting preterm labor. The main obstacle in designing preterm labor prediction models is the inherent preterm/term imbalance ratio, which can give rise to relatively low performance. Numerous studies obtained promising preterm labor prediction results using the synthetic minority oversampling technique. However, these studies generally overestimate mathematical models’ real generalization capacity by generating synthetic data before splitting the dataset, leaking information between the training and testing partitions and thus reducing the complexity of the classification task. In this work, we analyzed the effect of combining feature selection and resampling methods to overcome the class imbalance problem for predicting preterm labor by EHG. We assessed undersampling, oversampling, and hybrid methods applied to the training and validation dataset during feature selection by genetic algorithm, and analyzed the resampling effect on training data after obtaining the optimized feature subset. The best strategy consisted of undersampling the majority class of the validation dataset to 1:1 during feature selection, without subsequent resampling of the training data, achieving an AUC of 94.5 ± 4.6%, average precision of 84.5 ± 11.7%, maximum F1-score of 79.6 ± 13.8%, and recall of 89.8 ± 12.1%. Our results outperformed the techniques currently used in clinical practice, suggesting the EHG could be used to predict preterm labor in clinics.
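The leakage problem described above (resampling before splitting) can be avoided by partitioning first and resampling only the training partition. The sketch below shows that ordering with simple random undersampling to 1:1; it does not reproduce the authors' genetic-algorithm feature selection or their validation-set strategy, and the function name `split_then_undersample` and its parameters are hypothetical.

```python
import random


def split_then_undersample(labels, test_fraction=0.3, seed=0):
    """Split record indices into train/test, then balance ONLY the training part.

    Returns (balanced_train_indices, test_indices). The test partition keeps
    its natural class ratio and shares no record with the training partition.
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    # Resampling happens after the split, so nothing derived from test
    # records can influence the training data.
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    balanced_train = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced_train)
    return balanced_train, test_idx


# Toy labels: 90 term (0) and 10 preterm (1) recordings.
labels = [0] * 90 + [1] * 10
train_idx, test_idx = split_then_undersample(labels)
n_pos = sum(labels[i] for i in train_idx)
print(n_pos, len(train_idx) - n_pos, len(test_idx))
```

The same ordering applies to synthetic oversampling: any method that generates new minority samples from existing ones should see only training records.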
Affiliation(s)
- Félix Nieto-del-Amor
  - Centro de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, 46022 Valencia, Spain
- Gema Prats-Boluda
  - Centro de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, 46022 Valencia, Spain
  - Corresponding author
- Javier Garcia-Casado
  - Centro de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, 46022 Valencia, Spain
- Alba Diaz-Martinez
  - Centro de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, 46022 Valencia, Spain
- Rogelio Monfort-Ortiz
  - Servicio de Obstetricia, H.U.P. La Fe, 46026 Valencia, Spain
- Dongmei Hao
  - Faculty of Environment and Life, Beijing University of Technology, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing 100124, China
- Yiyao Ye-Lin
  - Centro de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, 46022 Valencia, Spain
12
Xie F, Khadka N, Fassett MJ, Chiu VY, Avila CC, Shi J, Yeh M, Kawatkar A, Mensah NA, Sacks DA, Getahun D. Identifying Preterm Labor Evaluation Visits and Extraction of Cervical Length Measures from Electronic Health Records Within a large Integrated Healthcare System (Preprint). JMIR Med Inform 2022; 10:e37896. [PMID: 36066930 PMCID: PMC9490529 DOI: 10.2196/37896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 07/15/2022] [Accepted: 08/12/2022] [Indexed: 11/25/2022] Open
Abstract
Background Preterm birth (PTB) represents a significant public health problem in the United States and throughout the world. Accurate identification of preterm labor (PTL) evaluation visits is the first step in conducting PTB-related research. Objective We aimed to develop a validated computerized algorithm to identify PTL evaluation visits and extract cervical length (CL) measures from electronic health records (EHRs) within a large integrated health care system. Methods We used data extracted from the EHRs at Kaiser Permanente Southern California between 2009 and 2020. First, we identified triage and hospital encounters with fetal fibronectin (fFN) tests, transvaginal ultrasound (TVUS) procedures, PTL medications, or PTL diagnosis codes within 24 0/7 to 34 6/7 gestational weeks. Second, clinical notes associated with triage and hospital encounters within 24 0/7 to 34 6/7 gestational weeks were extracted from EHRs. A computerized algorithm and an automated process were developed and refined through multiple iterations of chart review and adjudication to search for the following PTL indicators: fFN tests, TVUS procedures, abdominal pain, uterine contractions, PTL medications, and descriptions of PTL evaluations. An additional process was constructed to extract the CLs from the corresponding clinical notes of these identified PTL evaluation visits. Results A total of 441,673 live birth pregnancies were identified between 2009 and 2020. Of these, 103,139 pregnancies (23.35%) had documented PTL evaluation visits identified by the computerized algorithm. The proportion of pregnancies with PTL evaluation visits slightly decreased from 24.41% (2009) to 17.42% (2020). Of the first 103,139 PTL visits, 19,439 (18.85%) and 44,423 (43.97%) had an fFN test and a TVUS, respectively.
The percentage of first PTL visits with an fFN test decreased from 18.06% at 24 0/7 gestational weeks to 2.32% at 34 6/7 gestational weeks, and the percentage with a TVUS decreased from 54.67% at 24 0/7 gestational weeks to 12.05% at 34 6/7 gestational weeks. The mean (SD) of the CL was 3.66 (0.99) cm with a mean range of 3.61-3.69 cm that remained stable across the study period. Of the pregnancies with PTL evaluation visits, the rate of PTB remained stable over time (20,399, 19.78%). Validation of the computerized algorithms against 100 randomly selected records from these potential PTL visits showed positive predictive values of 97%, 94.44%, 100%, and 96.43% for the PTL evaluation visits, fFN tests, TVUS, and CL, respectively, along with sensitivity values of 100%, 90%, and 90%, and specificity values of 98.8%, 100%, and 98.6% for the fFN test, TVUS, and CL, respectively. Conclusions The developed computerized algorithm effectively identified PTL evaluation visits and extracted the corresponding CL measures from the EHRs. Validation of this algorithm showed a high level of accuracy. This computerized algorithm can be used for conducting PTL- or PTB-related pharmacoepidemiologic studies and patient care reviews.
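The validation figures in this abstract (positive predictive value, sensitivity, specificity) are all ratios of chart-review confusion counts. The following is a minimal Python sketch of those three definitions, not the authors' code. The class name `ConfusionCounts` and the counts in `toy` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts comparing algorithm output with chart review (hypothetical)."""
    tp: int  # algorithm flagged the item and chart review confirmed it
    fp: int  # algorithm flagged the item but chart review did not confirm it
    fn: int  # algorithm missed an item that chart review found
    tn: int  # neither the algorithm nor chart review found the item


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value: share of flagged items that are correct."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of truly present items that the algorithm flagged."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of truly absent items that the algorithm left unflagged."""
    return c.tn / (c.tn + c.fp)


# Toy counts for illustration only; they are NOT the study's data.
toy = ConfusionCounts(tp=18, fp=2, fn=2, tn=78)

if __name__ == "__main__":
    print(f"PPV={ppv(toy):.3f} "
          f"sensitivity={sensitivity(toy):.3f} "
          f"specificity={specificity(toy):.3f}")
```

Note that a PPV needs only the flagged records, whereas sensitivity and specificity also need records the algorithm did not flag.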
Affiliation(s)
- Fagen Xie
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Nehaa Khadka
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Michael J Fassett
  - Department of Obstetrics & Gynecology, Kaiser Permanente West Los Angeles Medical Center, Los Angeles, CA, United States
  - Department of Clinical Science, Kaiser Permanente Bernard J Tyson School of Medicine, Pasadena, CA, United States
- Vicki Y Chiu
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Chantal C Avila
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Jiaxiao Shi
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Meiyu Yeh
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Aniket Kawatkar
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Nana A Mensah
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- David A Sacks
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Darios Getahun
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
  - Department of Health Systems Science, Kaiser Permanente Bernard J. Tyson School of Medicine, Pasadena, CA, United States
|
13
|
|
14
|
Sharifi-Heris Z, Laitala J, Airola A, Rahmani AM, Bender M. Machine learning modeling for preterm birth prediction using health record: A systematic review (Preprint). JMIR Med Inform 2021; 10:e33875. [PMID: 35442214 PMCID: PMC9069277 DOI: 10.2196/33875] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/27/2021] [Revised: 01/29/2022] [Accepted: 02/26/2022] [Indexed: 11/24/2022]
Abstract
Background Preterm birth (PTB), a common pregnancy complication, is responsible for 35% of the 3.1 million pregnancy-related deaths each year and significantly affects around 15 million children annually worldwide. Conventional approaches to predict PTB lack reliable predictive power, leaving >50% of cases undetected. Recently, machine learning (ML) models have shown potential as an appropriate complementary approach for PTB prediction using health records (HRs). Objective This study aimed to systematically review the literature concerned with PTB prediction using HR data and the ML approach. Methods This systematic review was conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement. A comprehensive search was performed in 7 bibliographic databases until May 15, 2021. The quality of the studies was assessed, and descriptive information, including descriptive characteristics of the data, ML modeling processes, and model performance, was extracted and reported. Results A total of 732 papers were screened through title and abstract. Of these 732 studies, 23 (3.1%) were screened by full text, resulting in 13 (1.8%) papers that met the inclusion criteria. The sample size varied from a minimum value of 274 to a maximum of 1,400,000. The time length for which data were extracted varied from 1 to 11 years, and the oldest and newest data were related to 1988 and 2018, respectively. Population, data set, and ML models’ characteristics were assessed, and the performance of the model was often reported based on metrics such as accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve. Conclusions Various ML models used for different HR data indicated potential for PTB prediction. 
However, the evaluation metrics, software and packages used, data size and type, selected features, and, importantly, the data management method often remain unjustified, threatening the reliability, performance, and internal or external validity of the model. To understand how useful ML is in closing the existing gap, future studies should also compare it with a conventional method on the same data set.
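The review notes that model performance was often reported as accuracy and area under the receiver operating characteristic curve (AUC), among other metrics. As a hedged illustration of why the two can disagree, the sketch below computes both on the same scores. It is plain Python and is not drawn from any reviewed study. The function names, the 0.5 threshold, and the toy labels and scores are all assumptions.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of cases classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data for illustration only: 1 = preterm, 0 = term (hypothetical).
toy_labels = [1, 1, 0, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.1, 0.7]

if __name__ == "__main__":
    print(f"AUC={roc_auc(toy_labels, toy_scores):.3f} "
          f"accuracy={accuracy(toy_labels, toy_scores):.3f}")
```

AUC depends only on how the scores rank cases, while accuracy depends on the chosen threshold, so a study reporting only one of them leaves part of the picture unstated.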
Affiliation(s)
- Zahra Sharifi-Heris
  - Sue & Bill Gross School of Nursing, University of California, Irvine, CA, United States
- Juho Laitala
  - Department of Computing, University of Turku, Turku, Finland
- Antti Airola
  - Department of Computing, University of Turku, Turku, Finland
- Amir M Rahmani
  - Sue & Bill Gross School of Nursing, University of California, Irvine, CA, United States
- Miriam Bender
  - Sue & Bill Gross School of Nursing, University of California, Irvine, CA, United States
|