1
Li Q, Alfonso YN, Wolfson C, Aziz KB, Creanga AA. Leveraging Machine Learning to Predict and Assess Disparities in Severe Maternal Morbidity in Maryland. Healthcare (Basel) 2025; 13:284. [PMID: 39942473 PMCID: PMC11817442 DOI: 10.3390/healthcare13030284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2024] [Revised: 01/23/2025] [Accepted: 01/27/2025] [Indexed: 02/16/2025]
Abstract
BACKGROUND: Severe maternal morbidity (SMM) is increasing in the United States. The main objective of this study is to test the use of machine learning (ML) techniques to develop models for predicting SMM during delivery hospitalizations in Maryland. Secondarily, we examine disparities in SMM by key sociodemographic characteristics.
METHODS: We used the linked State Inpatient Database (SID) and the American Hospital Association (AHA) Annual Survey data from Maryland for 2016-2019 (N = 261,226 delivery hospitalizations). We first estimated relative risks for SMM across key sociodemographic factors (e.g., race, income, insurance, and primary language). Then, we fitted LASSO and, for comparison, Logit models with 75 and 18 features. The selection of SMM features was based on clinical expert opinion, a literature review, statistical significance, and computational resource constraints. Various model performance metrics, including the area under the receiver operating characteristic curve (AUC), accuracy, precision, and recall, were computed to compare predictive performance.
RESULTS: During 2016-2019, 76 per 10,000 deliveries (1976 of 261,226) were in patients who experienced an SMM event. The Logit model with a full list of 75 features achieved an AUC of 0.71 in the validation dataset, which marginally decreased to 0.69 in the reduced model with 18 features. The LASSO algorithm with the same 18 features demonstrated slightly superior predictive performance and an AUC of 0.80. We found significant disparities in SMM among patients living in low-income areas, with public insurance, and who were non-Hispanic Black or non-English speakers.
CONCLUSION: Our results demonstrate the feasibility of utilizing ML and administrative hospital discharge data for SMM prediction. The low recall score is a limitation across all models we compared, signifying that the algorithms struggle with identifying all SMM cases. This study identified substantial disparities in SMM across various sociodemographic factors. Addressing these disparities requires multifaceted interventions that include improving access to quality care, enhancing cultural competence among healthcare providers, and implementing policies that help mitigate social determinants of health.
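The headline rate and the recall limitation described in this abstract come down to simple ratios. The sketch below is illustrative only: the function names and the confusion-matrix counts are hypothetical and do not come from the study; only the figures 1976 and 261,226 are taken from the abstract.

```python
def rate_per_10k(events: int, total: int) -> float:
    """Return the event rate per 10,000 units."""
    return 10_000 * events / total


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Figures from the abstract: 1976 SMM events in 261,226 delivery hospitalizations.
smm_rate = rate_per_10k(1976, 261_226)
print(round(smm_rate))  # 76

# Hypothetical counts (NOT from the study): a model that flags few patients
# can be moderately precise while still missing most cases of a rare outcome.
precision, recall = precision_recall(tp=20, fp=30, fn=180)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Recall is the share of true SMM cases that a model flags, so a low value means many cases go undetected even when the AUC looks acceptable.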
Affiliation(s)
- Qingfeng Li
- Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Y. Natalia Alfonso
- Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Carrie Wolfson
- Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Khyzer B. Aziz
- Johns Hopkins Children’s Center, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
- Andreea A. Creanga
- Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Department of Gynecology and Obstetrics, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
2
Clapp MA, Li S, James KE, Reiff ES, Little SE, McCoy TH, Perlis RH, Kaimal AJ. Development of a Practical Prediction Model for Adverse Neonatal Outcomes at the Start of the Second Stage of Labor. Obstet Gynecol 2025; 145:73-81. [PMID: 39481108 DOI: 10.1097/aog.0000000000005776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Accepted: 09/26/2024] [Indexed: 11/02/2024]
Abstract
OBJECTIVE: To develop a prediction model for adverse neonatal outcomes using electronic fetal monitoring (EFM) interpretation data and other relevant clinical information known at the start of the second stage of labor.
METHODS: This was a retrospective cohort study of individuals who labored and delivered at two academic medical centers between July 2016 and June 2020. Individuals were included if they had a singleton gestation at term (more than 37 weeks of gestation), a vertex-presenting, nonanomalous fetus, and planned vaginal delivery and reached the start of the second stage of labor. The primary outcome was a composite of severe adverse neonatal outcomes. We developed and compared three modeling approaches to predict the primary outcome using factors related to EFM data (as interpreted and entered in structured data fields in the electronic health record by the bedside nurse), maternal comorbidities, and labor characteristics: traditional logistic regression, LASSO (least absolute shrinkage and selection operator), and extreme gradient boosting. Model discrimination and calibration were compared. Predicted probabilities were stratified into risk groups to facilitate clinical interpretation, and positive predictive values for adverse neonatal outcomes were calculated for each.
RESULTS: A total of 22,454 patients were included: 14,820 in the training set and 7,634 in the test set. The composite adverse neonatal outcome occurred in 3.2% of deliveries. Of the three modeling methods compared, the logistic regression model had the highest discrimination (0.690, 95% CI, 0.656-0.724) and was well calibrated. When stratified into risk groups (no increased risk, higher risk, and highest risk), the rates of the composite adverse neonatal outcome were 2.6% (95% CI, 2.3-3.1%), 6.7% (95% CI, 4.6-9.6%), and 10.3% (95% CI, 7.6-13.8%), respectively. Factors with the strongest associations with the composite adverse neonatal outcome included the presence of meconium (adjusted odds ratio [aOR] 2.10, 95% CI, 1.68-2.62), fetal tachycardia within the 2 hours preceding the start of the second stage (aOR 1.94, 95% CI, 1.03-3.65), and number of prior deliveries (aOR 0.77, 95% CI, 0.60-0.99).
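The risk-group step described in this abstract (binning predicted probabilities, then reporting the observed outcome rate per bin) can be sketched as follows. This is a minimal illustration under stated assumptions: the cut points, function names, and toy data are hypothetical, since the abstract does not report the thresholds the authors used.

```python
from collections import defaultdict

# Hypothetical cut points on predicted probability (NOT reported in the abstract),
# listed from the highest cutoff down.
THRESHOLDS = [(0.10, "highest risk"), (0.05, "higher risk")]


def risk_group(prob: float) -> str:
    """Assign a predicted probability to the first group whose cutoff it meets."""
    for cutoff, label in THRESHOLDS:
        if prob >= cutoff:
            return label
    return "no increased risk"


def observed_rates(probs, outcomes):
    """Return the observed outcome rate within each risk group.

    Within a group, this rate is the positive predictive value of
    being assigned to that group.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [events, total]
    for p, y in zip(probs, outcomes):
        group = counts[risk_group(p)]
        group[0] += y
        group[1] += 1
    return {label: events / total for label, (events, total) in counts.items()}


# Toy predictions and outcomes, invented for illustration only.
probs = [0.01, 0.02, 0.03, 0.04, 0.06, 0.07, 0.12, 0.20]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
rates = observed_rates(probs, outcomes)
for label, rate in rates.items():
    print(f"{label}: {rate:.2f}")
```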
Affiliation(s)
- Mark A Clapp
- Department of Obstetrics and Gynecology, the Center for Quantitative Health, and the Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, the Department of Obstetrics and Gynecology, Brigham and Women's Hospital, and the Department of Obstetrics and Gynecology, Beth Israel Deaconess Medical Center, Boston, Massachusetts; and the Department of Obstetrics and Gynecology, University of South Florida, Tampa, Florida
3
Joshi A. Big data and AI for gender equality in health: bias is a big challenge. Front Big Data 2024; 7:1436019. [PMID: 39479339 PMCID: PMC11521869 DOI: 10.3389/fdata.2024.1436019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Accepted: 09/30/2024] [Indexed: 11/02/2024]
Abstract
Artificial intelligence and machine learning are rapidly evolving fields that have the potential to transform women's health by improving diagnostic accuracy, personalizing treatment plans, and building predictive models of disease progression leading to preventive care. Three categories of women's health issues are discussed where machine learning can facilitate accessible, affordable, personalized, and evidence-based healthcare. In this perspective, the promise of big data and machine learning applications in the context of women's health is elaborated first. Despite these promises, machine learning applications are not widely adopted in clinical care due to many issues, including ethical concerns, patient privacy, informed consent, algorithmic biases, data quality and availability, and education and training of health care professionals. In the medical field, discrimination against women has a long history. Machine learning implicitly carries biases in the data. Thus, although machine learning has the potential to improve some aspects of women's health, it can also reinforce sex and gender biases. Blindly integrating advanced machine learning tools without properly understanding and correcting for socio-cultural sex- and gender-biased practices and policies is therefore unlikely to result in sex and gender equality in health.
Affiliation(s)
- Anagha Joshi
- Computational Biology Unit, Department of Clinical Science, University of Bergen, Bergen, Norway
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, IIT Madras, Chennai, India
- Center for Integrative Biology and Systems Medicine, Wadhwani School of Data Science & Artificial Intelligence, IIT Madras, Chennai, India
4
Clapp MA, Kim E, James KE, Perlis RH, Kaimal AJ, McCoy TH, Easter SR. Comparison of Natural Language Processing of Clinical Notes With a Validated Risk-Stratification Tool to Predict Severe Maternal Morbidity. JAMA Netw Open 2022; 5:e2234924. [PMID: 36197662 PMCID: PMC9535539 DOI: 10.1001/jamanetworkopen.2022.34924] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/19/2022]
Abstract
IMPORTANCE: Risk-stratification tools are routinely used in obstetrics to assist care teams in assessing and communicating risk associated with delivery. Electronic health record data and machine learning methods may offer a novel opportunity to improve and automate risk assessment.
OBJECTIVE: To compare the predictive performance of natural language processing (NLP) of clinician documentation with that of a previously validated tool to identify individuals at high risk for maternal morbidity.
DESIGN, SETTING, AND PARTICIPANTS: This retrospective diagnostic study was conducted at Brigham and Women's Hospital and Massachusetts General Hospital, Boston, Massachusetts, and included individuals admitted for delivery at the former institution from July 1, 2016, to February 29, 2020. A subset of these encounters (admissions from February to December 2018) was part of a previous prospective validation study of the Obstetric Comorbidity Index (OB-CMI), a comorbidity-weighted score to stratify risk of severe maternal morbidity (SMM).
EXPOSURES: Natural language processing of clinician documentation and OB-CMI scores.
MAIN OUTCOMES AND MEASURES: Natural language processing of clinician-authored admission notes was used to predict SMM in individuals delivering at the same institution but not included in the prospective OB-CMI study. The NLP model was then compared with the OB-CMI in the subset with a known OB-CMI score. Model discrimination between the 2 approaches was compared using the DeLong test. Sensitivity and positive predictive value for the identification of individuals at highest risk were prioritized as the characteristics of interest.
RESULTS: This study included 19 794 individuals; 4034 (20.4%) were included in the original prospective validation study of the OB-CMI (testing set), and the remaining 15 760 (79.6%) composed the training set. Mean (SD) age was 32.3 (5.2) years in the testing cohort and 32.2 (5.2) years in the training cohort. A total of 115 individuals in the testing cohort (2.9%) and 468 in the training cohort (3.0%) experienced SMM. The NLP model was built from a pruned vocabulary of 2783 unique words that occurred within the 15 760 admission notes from individuals in the training set. The area under the receiver operating characteristic curve of the NLP-based model for the prediction of SMM was 0.76 (95% CI, 0.72-0.81) and was comparable with that of the OB-CMI model (0.74; 95% CI, 0.70-0.79) in the testing set (P = .53). Sensitivity (NLP, 28.7%; OB-CMI, 24.4%) and positive predictive value (NLP, 19.4%; OB-CMI, 17.6%) were comparable between the NLP and OB-CMI high-risk designations for the prediction of SMM.
CONCLUSIONS AND RELEVANCE: In this study, the NLP method and a validated risk-stratification tool had a similar ability to identify patients at high risk of SMM. Future prospective research is needed to validate the NLP approach in clinical practice and determine whether it could augment or replace tools requiring manual user input.
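The three performance measures reported in this abstract (area under the ROC curve, sensitivity, and positive predictive value) can be computed from scores and labels as sketched below. This is a hedged illustration, not the authors' code: the function names, the 0.5 threshold, and the toy data are assumptions, and the DeLong test used to compare the two AUCs is not implemented here.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive scores higher
    than a randomly chosen negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_ppv(scores, labels, threshold):
    """Return (sensitivity, PPV) when scores >= threshold are flagged high risk."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, ppv


# Toy scores and outcomes, invented for illustration only.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

model_auc = auc(scores, labels)
sens, ppv = sensitivity_ppv(scores, labels, threshold=0.5)
print(f"AUC={model_auc:.3f}, sensitivity={sens:.3f}, PPV={ppv:.3f}")
```

The pairwise definition of AUC is quadratic in sample size, which is fine for a small example; a rank-based formulation would be preferable for cohorts of the size reported here.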
Affiliation(s)
- Mark A. Clapp
- Department of Obstetrics and Gynecology, Massachusetts General Hospital, Boston
- Ellen Kim
- Department of Radiation Oncology, Brigham and Women’s Hospital, Boston, Massachusetts
- Kaitlyn E. James
- Department of Obstetrics and Gynecology, Massachusetts General Hospital, Boston
- Roy H. Perlis
- Center for Quantitative Health, Massachusetts General Hospital, Boston
- Department of Psychiatry, Massachusetts General Hospital, Boston
- Anjali J. Kaimal
- Department of Obstetrics and Gynecology, Massachusetts General Hospital, Boston
- Department of Population Medicine, Harvard Medical School, Boston, Massachusetts
- Thomas H. McCoy
- Center for Quantitative Health, Massachusetts General Hospital, Boston
- Department of Psychiatry, Massachusetts General Hospital, Boston
- Sarah Rae Easter
- Department of Obstetrics and Gynecology, Brigham and Women’s Hospital, Boston, Massachusetts
5
Natural language processing of admission notes to predict severe maternal morbidity during the delivery encounter. Am J Obstet Gynecol 2022; 227:511.e1-511.e8. [PMID: 35430230 DOI: 10.1016/j.ajog.2022.04.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/27/2021] [Revised: 03/31/2022] [Accepted: 04/09/2022] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Severe maternal morbidity and mortality remain public health priorities in the United States, given their high rates relative to other high-income countries and the notable racial and ethnic disparities that exist. In general, accurate risk stratification methods are needed to help patients, providers, hospitals, and health systems plan for and potentially avert adverse outcomes.
OBJECTIVE: Our objective was to understand if machine learning methods with natural language processing of history and physical notes could identify a group of patients at high risk of maternal morbidity on admission for delivery without relying on any additional patient information (eg, demographics and diagnosis codes).
STUDY DESIGN: This was a retrospective study of people admitted for delivery at 2 hospitals (hospitals A and B) in a single healthcare system between July 1, 2016, and June 30, 2020. The primary outcome was severe maternal morbidity, as defined by the Centers for Disease Control and Prevention; furthermore, we examined nontransfusion severe maternal morbidity. Clinician documents designated as history and physical notes were extracted from the electronic health record for processing and analysis. A bag-of-words approach was used for this natural language processing analysis: each history and physical note was converted into a matrix of counts of the individual words (or phrases) that occurred within the document. Least absolute shrinkage and selection operator (LASSO) models were used to generate prediction probabilities for severe maternal morbidity and nontransfusion severe maternal morbidity for each note. Model discrimination was assessed via the area under the receiver operating characteristic curve. Discrimination was compared between models using the DeLong test. Calibration plots were generated to assess model calibration. Moreover, the natural language processing models with history and physical note texts were compared with validated obstetrical comorbidity risk scores based on diagnosis codes.
RESULTS: There were 13,572 delivery encounters with history and physical notes from hospital A, split between training (Atrain, n=10,250) and testing (Atest, n=3,322) datasets for model derivation and internal validation. There were 23,397 delivery encounters with history and physical notes from hospital B (Bvalid) used for external validation. For the outcome of severe maternal morbidity, the natural language processing model had an area under the receiver operating characteristic curve of 0.67 (95% confidence interval, 0.63-0.72) and 0.72 (95% confidence interval, 0.70-0.74) in the Atest and Bvalid datasets, respectively. For the outcome of nontransfusion severe maternal morbidity, the area under the receiver operating characteristic curve was 0.72 (95% confidence interval, 0.65-0.80) and 0.76 (95% confidence interval, 0.73-0.79) in the Atest and Bvalid datasets, respectively. The calibration plots demonstrated the bag-of-words model's ability to distinguish a group of individuals at a substantially higher risk of severe maternal morbidity and nontransfusion severe maternal morbidity, notably those in the top deciles of predicted risk. Areas under the receiver operating characteristic curve in the natural language processing-based models were similar to those generated using a validated, retrospectively derived, diagnosis code-based comorbidity score.
CONCLUSION: In this practical application of machine learning, we demonstrated the capabilities of natural language processing for the prediction of severe maternal morbidity based on provider documentation inherently generated at the time of admission. This work should serve as a catalyst for providers, hospitals, and electronic health record systems to explore ways that artificial intelligence can be incorporated into clinical practice and evaluated rigorously for its ability to improve health.
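The bag-of-words step described in this abstract, where each note becomes a row of word counts, can be sketched as below. This is a minimal illustration under stated assumptions: the tokenizer, the function names, and the two example notes are hypothetical and are not drawn from the study, which may also have counted phrases and pruned the vocabulary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(notes):
    """Convert notes into (vocabulary, count matrix).

    Row i of the matrix holds, for each vocabulary word in sorted order,
    the number of times that word occurs in notes[i].
    """
    counts = [Counter(tokenize(note)) for note in notes]
    vocab = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocab] for c in counts]
    return vocab, matrix


# Invented example notes (NOT real clinical text).
notes = [
    "History of hypertension",
    "No history of bleeding, no hypertension",
]
vocab, matrix = bag_of_words(notes)
print(vocab)
for row in matrix:
    print(row)
```

A count matrix of this kind is sparse and high-dimensional, which is one reason an L1-penalized (LASSO) model is a natural fit: the penalty drives most word coefficients to exactly zero.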
6
Critical Care in Obstetrics. Best Pract Res Clin Anaesthesiol 2022; 36:209-225. [DOI: 10.1016/j.bpa.2022.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Accepted: 02/02/2022] [Indexed: 11/20/2022]
7
Clapp MA, McCoy TH. The potential of big data for obstetrics discovery. Curr Opin Endocrinol Diabetes Obes 2021; 28:553-557. [PMID: 34709211 DOI: 10.1097/med.0000000000000679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Abstract
PURPOSE OF REVIEW: The purpose of this article is to introduce the concept of 'Big Data' and review its potential to advance scientific discovery in obstetrics.
RECENT FINDINGS: Big Data is now ubiquitous in medicine, being used in many specialties to understand the pathophysiology, risk factors, and treatment for many diseases. Big Data analyses often employ machine learning methods to understand the complex relationships that may exist within these sources. We review the basic principles of supervised and unsupervised machine learning methods, including deep learning. We highlight how these methods have been used to study genetic risk factors for preterm birth, interpret electronic fetal heart rate tracings, and predict adverse maternal and neonatal outcomes during pregnancy and delivery. Despite its promise, there are challenges with using Big Data, including data integrity, generalizability (namely the concerns about perpetuating inequalities), and confidentiality.
SUMMARY: The combination of new data and enhanced methods presents a synergistic opportunity to explore the complex relationships common to human illness and medical practice, including obstetrics. With prediction as a primary objective instead of the more familiar goals of hypothesis testing, these analytic methods can capture multifaceted, rare, and nuanced relationships between exposures and outcomes that exist within these large data sets.
Affiliation(s)
- Mark A Clapp
- Department of Obstetrics and Gynecology
- Center for Quantitative Health, Massachusetts General Hospital
- Harvard Medical School, Boston, Massachusetts, USA
- Thomas H McCoy
- Center for Quantitative Health, Massachusetts General Hospital
- Harvard Medical School, Boston, Massachusetts, USA