1
Wu L, Xu J, Tong W. PERform: assessing model performance with predictivity and explainability readiness formula. J Environ Sci Health C Toxicol Carcinog 2024:1-16. [PMID: 38619534 DOI: 10.1080/26896583.2024.2340391] [Indexed: 04/16/2024]
Abstract
In the rapidly evolving field of artificial intelligence (AI), explainability has traditionally been assessed in a post-modeling process and is often subjective. In contrast, many quantitative metrics are routinely used to assess a model's performance. We proposed a unified formula named PERForm, which incorporates explainability as a weight into existing statistical metrics to provide an integrated, quantitative measure of both predictivity and explainability to guide model selection, application, and evaluation. PERForm was designed as a generic formula and can be applied to any data type. We applied PERForm to a range of diverse datasets, including DILIst, Tox21, and three MAQC-II benchmark datasets, using various modeling algorithms to predict a total of 73 distinct endpoints. For example, AdaBoost exhibited superior performance in DILIst prediction (PERForm AUC of 0.129 for AdaBoost versus 0 for linear regression), whereas linear regression outperformed other models in the majority of Tox21 endpoints (average PERForm AUC of 0.301 for linear regression versus 0.283 for AdaBoost). This research marks a significant step toward comprehensively evaluating the utility of an AI model to advance transparency and interpretability, where the tradeoff between a model's performance and its interpretability can have profound implications.
Affiliation(s)
- Leihong Wu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR, USA
2
Hildesheim FE, Ophey A, Zumbansen A, Funck T, Schuster T, Jamison KW, Kuceyeski A, Thiel A. Predicting Language Function Post-Stroke: A Model-Based Structural Connectivity Approach. Neurorehabil Neural Repair 2024:15459683241245410. [PMID: 38602161 DOI: 10.1177/15459683241245410] [Indexed: 04/12/2024]
Abstract
BACKGROUND: The prediction of post-stroke language function is essential for the development of individualized treatment plans based on the personal recovery potential of aphasic stroke patients.
OBJECTIVE: To establish a framework for integrating information on connectivity disruption of the language network, based on routinely collected clinical magnetic resonance (MR) images, into Random Forest modeling to predict post-stroke language function.
METHODS: Language function was assessed in 76 stroke patients from the Non-Invasive Repeated Therapeutic Stimulation for Aphasia Recovery trial, using the Token Test (TT), Boston Naming Test (BNT), and Semantic Verbal Fluency (sVF) Test as primary outcome measures. Individual infarct masks were superimposed onto a diffusion tensor imaging tractogram reference set to calculate Change in Connectivity scores of language-relevant gray matter regions as estimates of structural connectivity disruption. Multivariable Random Forest models were derived to predict language function.
RESULTS: Random Forest models explained a moderate to high amount of variance at baseline and follow-up for the TT (62.7% and 76.2%), BNT (47.0% and 84.3%), and sVF (52.2% and 61.1%). Initial language function and non-verbal cognitive ability were the most important variables to predict language function. Connectivity disruption explained additional variance, resulting in a prediction error increase of up to 12.8% with variable omission. Left middle temporal gyrus (12.8%) and supramarginal gyrus (9.8%) were identified as among the most important network nodes.
CONCLUSION: Connectivity disruption of the language network adds predictive value beyond lesion volume, initial language function, and non-verbal cognitive ability. Obtaining information on connectivity disruption based on routine clinical MR images constitutes a significant advancement toward practical clinical application.
Affiliation(s)
- Franziska E Hildesheim
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, QC, Canada
  - Department of Neurology & Neurosurgery, McGill University, Montréal, QC, Canada
  - Canadian Platform for Trials in Non-Invasive Brain Stimulation (CanStim), Montréal, QC, Canada
- Anja Ophey
  - Department of Medical Psychology | Neuropsychology and Gender Studies, Center for Neuropsychological Diagnostics and Intervention, University Hospital Cologne, Medical Faculty of the University of Cologne, Cologne, Germany
- Anna Zumbansen
  - School of Rehabilitation Sciences, University of Ottawa, Ottawa, ON, Canada
  - Music and Health Research Institute, University of Ottawa, Ottawa, ON, Canada
- Thomas Funck
  - Institute of Neurosciences and Medicine INM-1, Research Centre Jülich, Jülich, Germany
- Tibor Schuster
  - Department of Family Medicine, McGill University, Montréal, QC, Canada
- Keith W Jamison
  - Department of Radiology, Weill Cornell Medicine, New York, NY, USA
- Amy Kuceyeski
  - Department of Radiology, Weill Cornell Medicine, New York, NY, USA
- Alexander Thiel
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, QC, Canada
  - Department of Neurology & Neurosurgery, McGill University, Montréal, QC, Canada
  - Canadian Platform for Trials in Non-Invasive Brain Stimulation (CanStim), Montréal, QC, Canada
3
Yland JJ, Zad Z, Wang TR, Wesselink AK, Jiang T, Hatch EE, Paschalidis IC, Wise LA. Predictive models of miscarriage based on data from a preconception cohort study. Fertil Steril 2024:S0015-0282(24)00235-8. [PMID: 38604264 DOI: 10.1016/j.fertnstert.2024.04.007] [Received: 12/05/2023] [Revised: 03/30/2024] [Accepted: 04/01/2024] [Indexed: 04/13/2024]
Abstract
OBJECTIVE: To use self-reported preconception data to derive models that predict risk of miscarriage.
DESIGN: Prospective preconception cohort study.
SUBJECTS: Study participants were female, aged 21-45 years, residents of the United States or Canada, and attempting spontaneous pregnancy at enrollment during 2013-2022. Participants were followed for up to 12 months of pregnancy attempts; those who conceived were followed through pregnancy and postpartum. We restricted analyses to participants who conceived during the study period.
EXPOSURE: On baseline and follow-up questionnaires completed every 8 weeks until pregnancy, we collected self-reported data on sociodemographic factors, reproductive history, lifestyle, anthropometrics, diet, medical history, and male partner characteristics. We included 160 potential predictor variables in our models.
MAIN OUTCOME MEASURES: The primary outcome was miscarriage, defined as pregnancy loss before 20 weeks' gestation. We followed participants from their first positive pregnancy test until miscarriage or a censoring event (induced abortion, ectopic pregnancy, loss to follow-up, or 20 weeks' gestation), whichever occurred first. We fit both survival and static models, using Cox proportional hazards models, logistic regression, support vector machines, Gradient Boosted Trees, and Random Forest algorithms. We evaluated model performance using the concordance index (survival models) and the weighted-F1 score (static models).
RESULTS: Among 8,720 participants who conceived, 20.4% reported miscarriage. In multivariable models, the strongest predictors of miscarriage were female age, history of miscarriage, and male partner age. The weighted-F1 score ranged from 73% to 89% for static models and the concordance index ranged from 53% to 56% for survival models, indicating better discrimination (i.e., ability of the model to discriminate between individuals with and without miscarriage) for the static models than for the survival models. No appreciable differences were observed across strata of miscarriage history or among models restricted to ≥8 weeks' gestation.
CONCLUSION: Our findings suggest that miscarriage is not easily predicted based on preconception lifestyle characteristics, and that advancing age and history of miscarriage are the most important predictors of incident miscarriage.
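As a reading aid for the weighted-F1 score named in the abstract above, the following is a minimal sketch of the standard definition (per-class F1 averaged with weights proportional to class support). The function name and the toy labels are illustrative assumptions and do not come from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1      # weight each class by its support
    return score


# Toy, made-up labels: four negatives, one positive, and a predictor
# that always answers "negative".
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because the weights follow class support, the majority class contributes most of the score when labels are imbalanced, which is worth keeping in mind when comparing weighted-F1 values with a concordance index.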
Affiliation(s)
- Jennifer J Yland
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Zahra Zad
  - Hariri Institute for Computing and Computational Science & Engineering, Boston University, Boston, MA, USA; Department of Electrical and Computer Engineering, Division of Systems Engineering, Boston University, Boston, MA, USA
- Tanran R Wang
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Amelia K Wesselink
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Tammy Jiang
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Elizabeth E Hatch
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Ioannis Ch Paschalidis
  - Hariri Institute for Computing and Computational Science & Engineering, Boston University, Boston, MA, USA; Department of Electrical and Computer Engineering, Division of Systems Engineering, Boston University, Boston, MA, USA; Department of Biomedical Engineering, Boston University, Boston, MA, USA
- Lauren A Wise
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
4
Li N, Li YL, Shao JM, Wang CH, Li SB, Jiang Y. Optimizing early neurological deterioration prediction in acute ischemic stroke patients following intravenous thrombolysis: a LASSO regression model approach. Front Neurosci 2024; 18:1390117. [PMID: 38633265 PMCID: PMC11022961 DOI: 10.3389/fnins.2024.1390117] [Received: 02/22/2024] [Accepted: 03/15/2024] [Indexed: 04/19/2024] Open
Abstract
Background: Acute ischemic stroke (AIS) remains a leading cause of disability and mortality globally among adults. Despite Intravenous Thrombolysis (IVT) with recombinant tissue plasminogen activator (rt-PA) emerging as the standard treatment for AIS, approximately 6-40% of patients undergoing IVT experience Early Neurological Deterioration (END), significantly impacting treatment efficacy and patient prognosis.
Objective: This study aimed to develop and validate a predictive model for END in AIS patients post rt-PA administration using the Least Absolute Shrinkage and Selection Operator (LASSO) regression approach.
Methods: In this retrospective cohort study, data from 531 AIS patients treated with intravenous alteplase across two hospitals were analyzed. LASSO regression was employed to identify significant predictors of END, leading to the construction of a multivariate predictive model.
Results: Six key predictors significantly associated with END were identified through LASSO regression analysis: previous stroke history, Body Mass Index (BMI), age, Onset to Treatment Time (OTT), lymphocyte count, and glucose levels. A predictive nomogram incorporating these factors was developed, effectively estimating the probability of END post-IVT. The model demonstrated robust predictive performance, with an Area Under the Curve (AUC) of 0.867 in the training set and 0.880 in the validation set.
Conclusion: The LASSO regression-based predictive model accurately identifies critical risk factors leading to END in AIS patients following IVT. This model facilitates timely identification of high-risk patients by clinicians, enabling more personalized treatment strategies and optimizing patient management and outcomes.
Affiliation(s)
- Ning Li
  - Department of Neurology, Affiliated Hospital of Hebei University, Baoding, China
- Ying-Lei Li
  - Department of Emergency Medicine, Baoding No.1 Central Hospital, Baoding, China
- Jia-Min Shao
  - Department of Neurology, Affiliated Hospital of Hebei University, Baoding, China
- Chu-Han Wang
  - Department of Neurology, Affiliated Hospital of Hebei University, Baoding, China
- Si-Bo Li
  - Department of Neurology, Affiliated Hospital of Hebei University, Baoding, China
- Ye Jiang
  - Department of Neurology, Affiliated Hospital of Hebei University, Baoding, China
5
Mermans F, De Baets H, García-Timermans C, Teughels W, Boon N. Unlocking the mechanism of action: a cost-effective flow cytometry approach for accelerating antimicrobial drug development. Microbiol Spectr 2024; 12:e0393123. [PMID: 38483479 DOI: 10.1128/spectrum.03931-23] [Received: 11/14/2023] [Accepted: 03/04/2024] [Indexed: 04/06/2024] Open
Abstract
Antimicrobial resistance is one of the greatest challenges to global health. While the development of new antimicrobials can combat resistance, low profitability reduces the number of new compounds brought to market. Elucidating the mechanism of action is crucial for developing new antimicrobials. This can become expensive as there are no universally applicable pipelines. Phenotypic heterogeneity of microbial populations resulting from antimicrobial treatment can be captured through flow cytometric fingerprinting. Since antimicrobials are classified into limited groups, the mechanism of action of known compounds can be used for predictive modeling. We demonstrate a cost-effective flow cytometry approach for determining the mechanism of action of new compounds. Cultures of Actinomyces viscosus and Fusobacterium nucleatum were treated with different antimicrobials and measured by flow cytometry. A Gaussian mixture mask was applied over the data to construct phenotypic fingerprints. Fingerprints were used to assess statistical differences between mechanism of action groups and to train random forest classifiers. Classifiers were then used to predict the mechanism of action of cephalothin. Statistical differences were found among the different mechanism of action groups. Pairwise comparison showed statistical differences for 35 out of 45 pairs for A. viscosus and for 32 out of 45 pairs for F. nucleatum after 3.5 h of treatment. The best-performing random forest classifier yielded a Matthews correlation coefficient of 0.92 and the mechanism of action of cephalothin could be successfully predicted. These findings suggest that flow cytometry can be a cheap and fast alternative for determining the mechanism of action of new antimicrobials.
IMPORTANCE: In the context of the emerging threat of antimicrobial resistance, the development of novel antimicrobials is a commonly employed strategy to combat resistance. Elucidating the mechanism of action of novel compounds is crucial in this development but can become expensive, as no universally applicable pipelines currently exist. We present a novel flow cytometry-based approach capable of determining the mechanism of action swiftly and cost-effectively. The workflow aims to accelerate drug discovery and could help facilitate a more targeted approach for antimicrobial treatment of patients.
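For readers unfamiliar with the Matthews correlation coefficient reported above, here is a minimal sketch of its binary form. The study classifies several mechanism-of-action groups, so the authors presumably used a multi-class variant; the two-class simplification, the function name, and the toy labels are assumptions made only for illustration.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Binary MCC from confusion-matrix counts (labels are 0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is reported as 0 when a margin is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy, made-up labels: one positive sample is missed.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0]
print(round(matthews_corrcoef_binary(y_true, y_pred), 3))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect agreement), which is why a value of 0.92 is read as strong classifier performance.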
Affiliation(s)
- Fabian Mermans
  - Center for Microbial Ecology and Technology, Faculty of Bioscience Engineering, Ghent University, Gent, Belgium
  - Department of Oral Health Sciences, KU Leuven & Dentistry (Periodontology), University Hospitals Leuven, Leuven, Belgium
- Hanna De Baets
  - Center for Microbial Ecology and Technology, Faculty of Bioscience Engineering, Ghent University, Gent, Belgium
- Cristina García-Timermans
  - Center for Microbial Ecology and Technology, Faculty of Bioscience Engineering, Ghent University, Gent, Belgium
- Wim Teughels
  - Department of Oral Health Sciences, KU Leuven & Dentistry (Periodontology), University Hospitals Leuven, Leuven, Belgium
- Nico Boon
  - Center for Microbial Ecology and Technology, Faculty of Bioscience Engineering, Ghent University, Gent, Belgium
6
Noviyanti F, Mochida M, Kawasaki S. Predictive modeling of Salmonella spp. growth behavior in cooked and raw chicken samples: Real-time PCR quantification approach and model assessment in different handling scenarios. J Food Sci 2024; 89:2410-2422. [PMID: 38465765 DOI: 10.1111/1750-3841.17020] [Received: 09/28/2023] [Revised: 01/15/2024] [Accepted: 02/18/2024] [Indexed: 03/12/2024]
Abstract
The increasing prevalence of Salmonella contamination in poultry meat emphasizes the importance of suitable predictive microbiological models for estimating Salmonella growth behavior. This study was conducted to evaluate the potential of chicken juice as a model system to predict the behavior of Salmonella spp. in cooked and raw chicken products and to assess its ability to predict cross-contamination scenarios. A cocktail of four Salmonella serovars was inoculated into chicken juice, sliced chicken, ground chicken, and chicken patties, with subsequent incubation at 10, 15, 20, and 25°C for 39 h. The number of Salmonella spp. in each sample was determined using real-time polymerase chain reaction. Growth curves were fitted to the primary Baranyi and Roberts model to obtain growth parameters. Interactions between temperature and growth parameters were described using the secondary Ratkowsky's square root model. The predictive results generated by the chicken juice model were compared with those obtained from other chicken meat models. Furthermore, the parameters of the chicken juice model were used to predict Salmonella spp. numbers in six worst-case cross-contamination scenarios. Performance of the chicken juice model was evaluated using the acceptable prediction zone from -1.0 (fail-safe) to 0.5 (fail-dangerous) log. The chicken juice model accurately predicted all observed data points within the acceptable range, with the distribution of residuals being wider near the fail-safe zone (75%) than near the fail-dangerous zone (25%). This study offers valuable insights into a novel approach for modeling Salmonella growth in chicken meat products, with implications for food safety through the development of strategic interventions.
PRACTICAL APPLICATION: The findings of this study have important implications in the food industry, as chicken juice could be a useful tool for predicting Salmonella behavior in different chicken products and thus reducing the risk of foodborne illnesses through the development of strategic interventions. However, it is important to recognize that some modifications to the chicken juice model will be necessary to accurately mimic all real-life conditions, as multiple factors, particularly those related to food processing, can vary between different products.
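The two modeling ideas named in the abstract above can be sketched briefly. The block below assumes the common form of the Ratkowsky square-root model, sqrt(mu_max) = b (T - T_min), and checks residuals against the acceptable prediction zone of -1.0 to 0.5 log. The parameter values, the data, and the sign convention (residual = observed minus predicted, so over-prediction counts as fail-safe) are illustrative assumptions, not values from the study.

```python
def ratkowsky_mu_max(temp_c, b, t_min):
    """Maximum growth rate from sqrt(mu_max) = b * (T - T_min).

    b and t_min are fitted parameters; growth is taken as zero at or
    below the notional minimum temperature t_min.
    """
    if temp_c <= t_min:
        return 0.0
    return (b * (temp_c - t_min)) ** 2


def fraction_in_apz(observed, predicted, low=-1.0, high=0.5):
    """Share of residuals (observed - predicted, in log units) inside the zone."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    inside = sum(1 for r in residuals if low <= r <= high)
    return inside / len(residuals)


# Hypothetical parameters and log counts, for illustration only.
for temp in (10, 15, 20, 25):
    print(temp, round(ratkowsky_mu_max(temp, b=0.03, t_min=5.0), 4))

observed = [3.0, 4.2, 5.1, 6.0]
predicted = [3.4, 4.0, 5.9, 7.5]
print(fraction_in_apz(observed, predicted))
```

The asymmetric zone reflects a food-safety preference: over-predicting growth by up to 1 log is tolerated more readily than under-predicting it.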
Affiliation(s)
- Fia Noviyanti
  - Division of Food Quality and Safety Research, Institute of Food Research, National Agriculture and Food Research Organization, Tsukuba, Japan
- Mari Mochida
  - Division of Food Quality and Safety Research, Institute of Food Research, National Agriculture and Food Research Organization, Tsukuba, Japan
- Susumu Kawasaki
  - Division of Food Quality and Safety Research, Institute of Food Research, National Agriculture and Food Research Organization, Tsukuba, Japan
7
Khodabandehlou H, Rashedi M, Wang T, Tulsyan A, Schorner G, Garvin C, Undey C. Cell culture product quality attribute prediction using convolutional neural networks and Raman spectroscopy. Biotechnol Bioeng 2024; 121:1231-1243. [PMID: 38284180 DOI: 10.1002/bit.28646] [Received: 07/31/2023] [Revised: 11/14/2023] [Accepted: 12/19/2023] [Indexed: 01/30/2024]
Abstract
Advanced process control in the biopharmaceutical industry often lacks real-time measurements due to resource constraints. Raman spectroscopy and Partial Least Squares (PLS) models are often used to monitor bioprocess cultures in real time. Although PLS models are easy to train, their accuracy suffers when they are used to predict quality attributes for cell lines other than those they were trained on. To address this issue, a deep convolutional neural network (CNN) is proposed for offline modeling of metabolites using Raman spectroscopy. By utilizing asymmetric least squares smoothing to adjust Raman spectra baselines, a generic training data set is created by amalgamating spectra from various cell lines and operating conditions. These spectra, combined with their derivatives, form a two-dimensional model input. The CNN model is developed and validated for predicting different quality variables against measurements from various continuous and fed-batch experimental runs. Validation results confirm that the deep CNN model is an accurate generic model of the process to predict real-time quality attributes, even in experimental runs not included in the training data. This model is robust and versatile, requiring no recalibration when deployed at different sites to monitor various cell lines and experimental runs.
Affiliation(s)
- Hamid Khodabandehlou
  - Digital Integration & Predictive Technologies, Process Development Department, Amgen Inc., Thousand Oaks, California, USA
- Mohammad Rashedi
  - Digital Integration & Predictive Technologies, Process Development Department, Amgen Inc., Thousand Oaks, California, USA
- Tony Wang
  - Digital Integration & Predictive Technologies, Process Development Department, Amgen Inc., West Greenwich, Rhode Island, USA
- Aditya Tulsyan
  - Digital Integration & Predictive Technologies, Process Development Department, Amgen Inc., West Greenwich, Rhode Island, USA
- Gregg Schorner
  - Digital Integration & Predictive Technologies, Process Development Department, Amgen Inc., West Greenwich, Rhode Island, USA
- Christopher Garvin
  - Digital Integration & Predictive Technologies, Process Development Department, Amgen Inc., West Greenwich, Rhode Island, USA
- Cenk Undey
  - Digital Integration & Predictive Technologies, Process Development Department, Amgen Inc., Thousand Oaks, California, USA
8
Heneghan JA, Walker SB, Fawcett A, Bennett TD, Dziorny AC, Sanchez-Pinto LN, Farris RW, Winter MC, Badke C, Martin B, Brown SR, McCrory MC, Ness-Cochinwala M, Rogerson C, Baloglu O, Harwayne-Gidansky I, Hudkins MR, Kamaleswaran R, Gangadharan S, Tripathi S, Mendonca EA, Markovitz BP, Mayampurath A, Spaeder MC. The Pediatric Data Science and Analytics Subgroup of the Pediatric Acute Lung Injury and Sepsis Investigators Network: Use of Supervised Machine Learning Applications in Pediatric Critical Care Medicine Research. Pediatr Crit Care Med 2024; 25:364-374. [PMID: 38059732 PMCID: PMC10994770 DOI: 10.1097/pcc.0000000000003425] [Indexed: 12/08/2023]
Abstract
OBJECTIVE: Perform a scoping review of supervised machine learning in pediatric critical care to identify published applications, methodologies, and implementation frequency to inform best practices for the development, validation, and reporting of predictive models in pediatric critical care.
DESIGN: Scoping review and expert opinion.
SETTING: We queried CINAHL Plus with Full Text (EBSCO), Cochrane Library (Wiley), Embase (Elsevier), Ovid Medline, and PubMed for articles published between 2000 and 2022 related to machine learning concepts and pediatric critical illness. Articles were excluded if the majority of patients were adults or neonates, if unsupervised machine learning was the primary methodology, or if information related to the development, validation, and/or implementation of the model was not reported. Article selection and data extraction were performed using dual review in the Covidence tool, with discrepancies resolved by consensus.
SUBJECTS: Articles reporting on the development, validation, or implementation of supervised machine learning models in the field of pediatric critical care medicine.
INTERVENTIONS: None.
MEASUREMENTS AND MAIN RESULTS: Of 5075 identified studies, 141 articles were included. Studies were primarily (57%) performed at a single site. The majority took place in the United States (70%). Most were retrospective observational cohort studies. More than three-quarters of the articles were published between 2018 and 2022. The most common algorithms included logistic regression and random forest. Predicted events were most commonly death, transfer to ICU, and sepsis. Only 14% of articles reported external validation, and only a single model was implemented at publication. Reporting of validation methods, performance assessments, and implementation varied widely. Follow-up with authors suggests that implementation remains uncommon after model publication.
CONCLUSIONS: Publication of supervised machine learning models to address clinical challenges in pediatric critical care medicine has increased dramatically in the last 5 years. While these approaches have the potential to benefit children with critical illness, the literature demonstrates incomplete reporting, absence of external validation, and infrequent clinical implementation.
Affiliation(s)
- Julia A. Heneghan
  - Division of Pediatric Critical Care, University of Minnesota Masonic Children’s Hospital; Minneapolis, MN
- Sarah B. Walker
  - Department of Pediatrics (Critical Care), Northwestern University Feinberg School of Medicine and Ann & Robert H. Lurie Children’s Hospital of Chicago; Chicago, IL
- Andrea Fawcett
  - Department of Clinical and Organizational Development; Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL
- Tellen D. Bennett
  - Departments of Biomedical Informatics and Pediatrics (Critical Care Medicine), University of Colorado School of Medicine; Aurora, CO
- Adam C. Dziorny
  - Department of Pediatrics, University of Rochester; Rochester, NY
- L. Nelson Sanchez-Pinto
  - Department of Pediatrics (Critical Care) and Preventive Medicine (Health & Biomedical Informatics), Northwestern University Feinberg School of Medicine and Ann & Robert H. Lurie Children’s Hospital of Chicago; Chicago, IL
- Reid W.D. Farris
  - Department of Pediatrics, University of Washington and Seattle Children’s Hospital; Seattle, WA
- Meredith C. Winter
  - Department of Anesthesiology Critical Care Medicine, Children’s Hospital Los Angeles and Keck School of Medicine, University of Southern California; Los Angeles, CA
- Colleen Badke
  - Department of Pediatrics (Critical Care), Northwestern University Feinberg School of Medicine and Ann & Robert H. Lurie Children’s Hospital of Chicago; Chicago, IL
- Blake Martin
  - Departments of Biomedical Informatics and Pediatrics (Critical Care Medicine), University of Colorado School of Medicine; Aurora, CO
- Stephanie R. Brown
  - Section of Pediatric Critical Care, Oklahoma Children’s Hospital and Department of Pediatrics, University of Oklahoma Health Sciences Center, Oklahoma City, OK
- Michael C. McCrory
  - Department of Anesthesiology, Wake Forest University School of Medicine; Winston Salem, NC
- Colin Rogerson
  - Division of Critical Care, Department of Pediatrics, Indiana University; Indianapolis, IN
- Orkun Baloglu
  - Pediatric Critical Care Medicine and Pediatric Cardiology, Cleveland Clinic Children’s Center for Artificial Intelligence (C4AI), Cleveland Clinic; Cleveland, OH
- Matthew R. Hudkins
  - Division of Pediatric Critical Care, Department of Pediatrics, Oregon Health & Science University; Portland, OR
- Rishikesan Kamaleswaran
  - Departments of Biomedical Informatics and Pediatrics, Emory University School of Medicine; Department of Biomedical Engineering, Georgia Institute of Technology; Atlanta, GA
- Sandeep Gangadharan
  - Department of Pediatrics, Mount Sinai Icahn School of Medicine; New York, NY
- Sandeep Tripathi
  - Department of Pediatrics, University of Illinois College of Medicine at Peoria/OSF HealthCare, Children’s Hospital of Illinois; Peoria, IL
- Eneida A. Mendonca
  - Division of Biomedical Informatics, Department of Pediatrics, Cincinnati Children’s Hospital Medical Center and University of Cincinnati; Cincinnati, OH
- Barry P. Markovitz
  - Division of Pediatric Critical Care, Department of Pediatrics, University of Utah Spencer F Eccles School of Medicine, Intermountain Primary Children’s Hospital; Salt Lake City, UT
- Anoop Mayampurath
  - Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison; Madison, WI
- Michael C. Spaeder
  - Department of Pediatrics, University of Virginia School of Medicine, Charlottesville, VA
9
Li F, Li F, Zhao D, Lu H. Predictors of cancer-specific survival and overall survival among patients aged ≥60 years with lung adenocarcinoma using the SEER database. J Int Med Res 2024; 52:3000605241240993. [PMID: 38606733 PMCID: PMC11015783 DOI: 10.1177/03000605241240993] [Received: 12/09/2023] [Accepted: 03/04/2024] [Indexed: 04/13/2024] Open
Abstract
OBJECTIVE: We developed a simple, rapid predictive model to evaluate the prognosis of older patients with lung adenocarcinoma.
METHODS: Demographic characteristics and clinical information of patients with lung adenocarcinoma aged ≥60 years were retrospectively analyzed using Surveillance, Epidemiology, and End Results (SEER) data. We built nomograms of overall survival and cancer-specific survival using Cox single-factor and multi-factor regression. We used the C-index, calibration curve, receiver operating characteristic (ROC) curves, and decision curve analysis (DCA) to evaluate performance of the nomograms.
RESULTS: We included 14,117 patients, divided into a training set and validation set. We used the chi-square test to compare baseline data between groups and found no significant differences. We used Cox regression analysis to screen out independent prognostic factors affecting survival time and used these factors to construct the nomogram. The ROC curve, calibration curve, C-index, and DCA curve were used to verify the model. The final results showed that our predictive model had good predictive ability, and showed better predictive ability compared with tumor-node-metastasis (TNM) staging. We also achieved good results using data from our center for external verification.
CONCLUSION: The present nomogram could accurately predict prognosis in older patients with lung adenocarcinoma.
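The C-index used above to evaluate the nomograms can be illustrated with a minimal sketch of Harrell's concordance index. This simplified version ignores tied survival times and is not the authors' implementation; the function name and the toy data are assumptions for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] is truthy) strictly before subject j's recorded time.
    Higher risk_scores should correspond to shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties in predicted risk count as half
    return concordant / comparable if comparable else float("nan")


# Toy, made-up follow-up data: the third subject is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, events, risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which nomogram and TNM-staging discrimination are typically compared.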
Affiliation(s)
- Feiyang Li
- Ward 2, Department of Medical Oncology, Lixin People’s Hospital of Bozhou City, Anhui Province, China
| | - Fang Li
- Ward 1, Department of Medical Oncology, Affiliated Hospital of Qinghai University, Qinghai Province, China
| | - Dong Zhao
- Ward 2, Department of Medical Oncology, Lixin People’s Hospital of Bozhou City, Anhui Province, China
| | - Haowei Lu
- Ward 2, Department of Medical Oncology, Lixin People’s Hospital of Bozhou City, Anhui Province, China
|
10
|
Serajian R, Sun JQ, Cobian-Iñiguez J, Ehsani R. Predictive Neural Network Modeling for Almond Harvest Dust Control. Sensors (Basel) 2024; 24:2136. [PMID: 38610348 PMCID: PMC11014124 DOI: 10.3390/s24072136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 03/12/2024] [Accepted: 03/26/2024] [Indexed: 04/14/2024]
Abstract
This study introduces a neural network-based approach to predict dust emissions, specifically PM2.5 particles, during almond harvesting in California. Using a feedforward neural network (FNN), this research predicted PM2.5 emissions by analyzing key operational parameters of an advanced almond harvester. Preprocessing steps like outlier removal and normalization were employed to refine the dataset for training. The network's architecture was designed with two hidden layers and optimized using tanh activation and MSE loss functions through the Adam algorithm, striking a balance between model complexity and predictive accuracy. The model was trained on extensive field data from an almond pickup system, including variables like brush speed, angular velocity, and harvester forward speed. The results demonstrate a notable predictive accuracy of the FNN model, with a mean squared error (MSE) of 0.02 and a mean absolute error (MAE) of 0.01, indicating high precision in forecasting PM2.5 levels. By integrating machine learning with agricultural practices, this research provides a significant tool for environmental management in almond production, offering a method to reduce harmful emissions while maintaining operational efficiency. This model presents a solution for the almond industry and sets a precedent for applying predictive analytics in sustainable agriculture.
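The architecture described here (two hidden layers, tanh activation, MSE loss) can be sketched in a few lines. The layer width of 8, the random weights, and the three normalized inputs standing in for brush speed, angular velocity, and forward speed are assumptions for illustration; training with Adam is omitted, so this shows only the forward pass and the two error metrics, not the authors' trained model.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create one dense layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply weights and biases to x, then an optional activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def forward(x, layers):
    """Two tanh hidden layers followed by a linear output unit."""
    h1 = dense(x, layers[0], math.tanh)
    h2 = dense(h1, layers[1], math.tanh)
    return dense(h2, layers[2])[0]


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


rng = random.Random(0)
# Hypothetical sizes: 3 inputs -> 8 -> 8 -> 1 output (PM2.5, normalized).
layers = [init_layer(3, 8, rng), init_layer(8, 8, rng), init_layer(8, 1, rng)]
X = [[0.2, 0.5, 0.1], [0.9, 0.4, 0.7], [0.5, 0.5, 0.5]]  # made-up inputs
y = [0.1, 0.6, 0.3]                                       # made-up targets
preds = [forward(x, layers) for x in X]
print(mse(y, preds), mae(y, preds))
```

Because the mean absolute error can never exceed the square root of the MSE, the two reported values can be sanity-checked against each other whenever both are given.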
Affiliation(s)
- Reza Serajian
- Department of Mechanical Engineering, University of California Merced, 5200 N. Lake Road, Merced, CA 95343, USA
|
11
|
Suresh K, Görg C, Ghosh D. Model-agnostic explanations for survival prediction models. Stat Med 2024. [PMID: 38530157 DOI: 10.1002/sim.10057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 02/13/2024] [Accepted: 02/26/2024] [Indexed: 03/27/2024]
Abstract
Advanced machine learning methods capable of capturing complex and nonlinear relationships can be used in biomedical research to accurately predict time-to-event outcomes. However, these methods have been criticized as "black boxes" that are not interpretable and thus are difficult to trust in making important clinical decisions. Explainable machine learning proposes the use of model-agnostic explainers that can be applied to predictions from any complex model. These explainers describe how a patient's characteristics are contributing to their prediction, and thus provide insight into how the model is arriving at that prediction. The specific application of these explainers to survival prediction models can be used to obtain explanations for (i) survival predictions at particular follow-up times, and (ii) a patient's overall predicted survival curve. Here, we present a model-agnostic approach for obtaining these explanations from any survival prediction model. We extend the local interpretable model-agnostic explainer framework for classification outcomes to survival prediction models. Using simulated data, we assess the performance of the proposed approaches under various settings. We illustrate application of the new methodology using prostate cancer data.
Affiliation(s)
- Krithika Suresh
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Department of Biostatistics and Informatics, University of Colorado, Aurora, Colorado, USA
| | - Carsten Görg
- Department of Biostatistics and Informatics, University of Colorado, Aurora, Colorado, USA
| | - Debashis Ghosh
- Department of Biostatistics and Informatics, University of Colorado, Aurora, Colorado, USA
|
12
|
Wang DR, Jamshidi S, Han R, Edwards JD, McClung AM, McCouch SR. Positive effects of public breeding on US rice yields under future climate scenarios. Proc Natl Acad Sci U S A 2024; 121:e2309969121. [PMID: 38498708 PMCID: PMC10990131 DOI: 10.1073/pnas.2309969121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 02/02/2024] [Indexed: 03/20/2024]
Abstract
In this study, we model and predict rice yields by integrating molecular marker variation, varietal productivity, and climate, focusing on the Southern U.S. rice-growing region. This region spans the states of Arkansas, Louisiana, Texas, Mississippi, and Missouri and accounts for 85% of total U.S. rice production. By digitizing and combining four decades of county-level variety acreage data (1970 to 2015) with varietal information from genotyping-by-sequencing data, we estimate annual historical county-level allele frequencies. These allele frequencies are used together with county-level weather and yield data to develop ten machine learning models for yield prediction. A two-layer meta-learner ensemble model that combines all ten methods is externally evaluated against observations from historical Uniform Regional Rice Nursery trials (1980 to 2018) conducted in the same states. Finally, the ensemble model is used with forecasted weather from the Coupled Model Intercomparison Project across the 110 rice-growing counties to predict production in the coming decades for Composite Variety Groups assembled based on year of release, breeding program, and several breeding trends. Results indicate positive effects over time of public breeding on rice resilience to future climates, and potential reasons are discussed.
Affiliation(s)
- Diane R. Wang
- Department of Agronomy, Purdue University, West Lafayette, IN 47901
| | - Sajad Jamshidi
- Department of Agronomy, Purdue University, West Lafayette, IN 47901
| | - Rongkui Han
- Department of Plant Sciences, University of California, Davis, CA 95616
| | - Jeremy D. Edwards
- Dale Bumpers National Rice Research Center, United States Department of Agriculture - Agricultural Research Service, Stuttgart, AR 72160
| | - Anna M. McClung
- Dale Bumpers National Rice Research Center, United States Department of Agriculture - Agricultural Research Service, Stuttgart, AR 72160
| | - Susan R. McCouch
- Section of Plant Breeding and Genetics, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853
|
13
|
Wen F, Liu R, Garcia Y Garcia A, Ye H, Lu L, Qimuge E, Sun Z, Nie C, Han X, Zhang Y. Study on the prediction method of grasshopper occurrence risk in Inner Mongolia based on the maximum entropy model during the growing period. J Econ Entomol 2024:toae036. [PMID: 38493360 DOI: 10.1093/jee/toae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 02/03/2024] [Accepted: 02/15/2024] [Indexed: 03/18/2024]
Abstract
Grasshoppers represent a significant biological challenge in Inner Mongolia's grasslands, severely affecting the region's animal husbandry. Thus, dynamic monitoring of grasshopper infestation risk is crucial for sustainable livestock farming. This study employed the Maxent model, along with remote sensing data, to forecast Oedaleus decorus asiaticus occurrence during the growing season, using grasshopper suitability habitats as a base. The Maxent model's predictive accuracy was high, with an AUC of 0.966. The most influential environmental variables for grasshopper distribution were suitable habitat data (34.27%), the temperature-vegetation dryness index during the spawning period (18.81%), and various other meteorological and vegetation factors. The risk index model was applied to calculate the grasshopper distribution across different risk levels for the years 2019-2022. The data indicated that the level 1 risk area primarily spans central, eastern, and southwestern Inner Mongolia. By examining the variable weights, the primary drivers of risk level fluctuation from 2019 to 2022 were identified as accumulated precipitation and land surface temperature anomalies during the overwintering period. This study offers valuable insights for future O. decorus asiaticus monitoring in Inner Mongolia.
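The AUC used to summarize the Maxent model's accuracy can be read as the probability that a randomly chosen presence site scores higher than a randomly chosen background site. The sketch below computes it by direct pair counting; the labels and suitability scores are invented for illustration and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels -- 1 for presence (positive), 0 for absence/background (negative)
    scores -- model suitability scores, higher means more suitable
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical sites: three with grasshopper occurrence, three without.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

On this reading, the reported AUC of 0.966 would mean that roughly 97 of every 100 such presence/background pairs are ordered correctly.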
Affiliation(s)
- Fu Wen
- International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
- College of Water Resources Science and Engineering, Taiyuan University of Technology, Taiyuan 030024, China
| | - Ronghao Liu
- College of Water Resources Science and Engineering, Taiyuan University of Technology, Taiyuan 030024, China
| | - Axel Garcia Y Garcia
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Southwest Research and Outreach Center, University of Minnesota, Lamberton, MN 56152, USA
| | - Huichun Ye
- International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
- Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
- Key Laboratory of Earth Observation of Hainan Province, Hainan Aerospace Information Research Institute, Sanya 572029, China
| | - Longhui Lu
- International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
- Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
| | - Eerdeng Qimuge
- Grassland Workstation of Xilingol League, Xilinhot 026000, China
| | | | - Chaojia Nie
- International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
- Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
| | - Xuemei Han
- International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
- School of Geology and Mining Engineering, Xinjiang University, Urumqi 830046, China
| | - Yue Zhang
- International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
- College of Water Resources Science and Engineering, Taiyuan University of Technology, Taiyuan 030024, China
|
14
|
Hatef E, Chang HY, Richards TM, Kitchen C, Budaraju J, Foroughmand I, Lasser EC, Weiner JP. Development of a Social Risk Score in the Electronic Health Record to Identify Social Needs Among Underserved Populations: Retrospective Study. JMIR Form Res 2024; 8:e54732. [PMID: 38470477 DOI: 10.2196/54732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/02/2024] [Accepted: 02/08/2024] [Indexed: 03/13/2024]
Abstract
BACKGROUND Patients with unmet social needs and social determinants of health (SDOH) challenges continue to face a disproportionate risk of increased prevalence of disease, health care use, higher health care costs, and worse outcomes. Some existing predictive models have used the available data on social needs and SDOH challenges to predict health-related social needs or the need for various social service referrals. Despite these one-off efforts, the work to date suggests that many technical and organizational challenges must be surmounted before SDOH-integrated solutions can be implemented on an ongoing, wide-scale basis within most US-based health care organizations. OBJECTIVE We aimed to retrieve available information in the electronic health record (EHR) relevant to the identification of persons with social needs and to develop a social risk score for use within clinical practice to better identify patients at risk of having future social needs. METHODS We conducted a retrospective study using EHR data (2016-2021) and data from the US Census American Community Survey. We developed a prospective model using current year-1 risk factors to predict future year-2 outcomes within four 2-year cohorts. Predictors of interest included demographics, previous health care use, comorbidity, previously identified social needs, and neighborhood characteristics as reflected by the area deprivation index. The outcome variable was a binary indicator reflecting the likelihood of the presence of a patient with social needs. We applied a generalized estimating equation approach, adjusting for patient-level risk factors, the possible effect of geographically clustered data, and the effect of multiple visits for each patient. 
RESULTS The study population of 1,852,228 patients included middle-aged (mean age range 53.76-55.95 years), White (range 324,279/510,770, 63.49% to 290,688/448,666, 64.79%), and female (range 314,741/510,770, 61.62% to 278,488/448,666, 62.07%) patients from neighborhoods with high socioeconomic status (mean area deprivation index percentile range 28.76-30.31). Between 8.28% (37,137/448,666) and 11.55% (52,037/450,426) of patients across the study cohorts had at least 1 social need documented in their EHR, with safety issues and economic challenges (ie, financial resource strain, employment, and food insecurity) being the most common documented social needs (87,152/1,852,228, 4.71% and 58,242/1,852,228, 3.14% of overall patients, respectively). The model had an area under the curve of 0.702 (95% CI 0.699-0.705) in predicting prospective social needs in the overall study population. Previous social needs (odds ratio 3.285, 95% CI 3.237-3.335) and emergency department visits (odds ratio 1.659, 95% CI 1.634-1.684) were the strongest predictors of future social needs. CONCLUSIONS Our model provides an opportunity to make use of available EHR data to help identify patients with high social needs. Our proposed social risk score could help identify the subset of patients who would most benefit from further social needs screening and data collection to avoid potentially more burdensome primary data collection on all patients in a target population of interest.
Affiliation(s)
- Elham Hatef
- Division of General Internal Medicine, Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Hsien-Yen Chang
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Thomas M Richards
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Christopher Kitchen
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Janya Budaraju
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Iman Foroughmand
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Elyse C Lasser
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
| | - Jonathan P Weiner
- Center for Population Health Information Technology, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States
|
15
|
Odeyemi YE, Lal A, Barreto EF, LeMahieu AM, Yadav H, Gajic O, Schulte P. Early machine learning prediction of hospitalized patients at low risk of respiratory deterioration or mortality in community-acquired pneumonia: Derivation and validation of a multivariable model. Biomol Biomed 2024; 24:337-345. [PMID: 37795970 PMCID: PMC10950343 DOI: 10.17305/bb.2023.9754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 09/26/2023] [Accepted: 10/04/2023] [Indexed: 10/06/2023]
Abstract
Current prognostic tools for pneumonia predominantly focus on mortality, often neglecting other crucial outcomes such as the need for advanced respiratory support. The objective of this study was to develop and validate a tool that predicts the early risk of non-occurrence of respiratory deterioration or mortality. We conducted a single-center, retrospective cohort study involving hospitalized adult patients with community-acquired pneumonia (CAP) and acute hypoxic respiratory failure from January 2009 to December 2019 (n = 4379). We used gradient boosting machine (GBM) learning to create a model that estimates the likelihood of patients requiring advanced respiratory support (high flow nasal cannula [HFNC], non-invasive mechanical ventilation [NIMV], and invasive mechanical ventilation [IMV]) or facing mortality during hospitalization. This model used readily available demographic, physiologic, and laboratory data, sourced from electronic health records and obtained within the first six hours of admission. Out of the cohort, 1104 patients (25.2%) either required advanced respiratory support or died during their hospital stay. Our predictive model displayed superior discrimination and higher sensitivity (cross-validation C-statistic = 0.71; specificity = 0.56; sensitivity = 0.72) compared to the pneumonia severity index (PSI) (C-statistic = 0.65; specificity = 0.91; sensitivity = 0.24; P value < 0.001), while maintaining a negative predictive value (NPV) of approximately 0.85. These data demonstrate that our machine learning model predicted the non-occurrence of respiratory deterioration or mortality among hospitalized CAP patients more accurately than the PSI. The enhanced sensitivity of this model holds potential for reliably excluding low-risk patients from pneumonia clinical trials.
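The relationship between the reported sensitivity, specificity, outcome rate, and NPV can be checked with Bayes' rule. The sketch below plugs in the figures quoted in the abstract (25.2% outcome rate, and each tool's sensitivity and specificity); it assumes all of these refer to the same population and operating point, which the abstract does not state explicitly, so the result is an approximate consistency check rather than a reproduction of the authors' analysis.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """NPV = true negatives / (true negatives + false negatives),
    expressed as population fractions."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


prevalence = 0.252  # share of patients with deterioration or death

# Machine learning model: sensitivity 0.72, specificity 0.56
npv_model = negative_predictive_value(0.72, 0.56, prevalence)

# Pneumonia severity index: sensitivity 0.24, specificity 0.91
npv_psi = negative_predictive_value(0.24, 0.91, prevalence)

print(round(npv_model, 3), round(npv_psi, 3))
```

Under these assumptions the model's NPV comes out near 0.86, in line with the "approximately 0.85" stated above, while the PSI's lower sensitivity yields a lower NPV at the same outcome rate.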
Affiliation(s)
- Yewande E Odeyemi
- Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, MN, United States
| | - Amos Lal
- Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, MN, United States
| | - Erin F Barreto
- Department of Pharmacy, Mayo Clinic, Rochester, MN, United States
| | - Allison M LeMahieu
- Division of Clinical Trials and Biostatistics, Mayo Clinic, Rochester, MN, United States
| | - Hemang Yadav
- Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, MN, United States
| | - Ognjen Gajic
- Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, MN, United States
| | - Phillip Schulte
- Division of Clinical Trials and Biostatistics, Mayo Clinic, Rochester, MN, United States
|
16
|
Xiao Y, Xiao L, Zhang Y, Xu X, Guan X, Guo Y, Shen Y, Lei X, Dou Y, Yu J. Prediction of tumor lysis syndrome in childhood acute lymphoblastic leukemia based on machine learning models: a retrospective study. Front Oncol 2024; 14:1337295. [PMID: 38515564 PMCID: PMC10955075 DOI: 10.3389/fonc.2024.1337295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Accepted: 02/19/2024] [Indexed: 03/23/2024]
Abstract
Background Tumor lysis syndrome (TLS) often occurs early after induction chemotherapy for acute lymphoblastic leukemia (ALL) and can rapidly progress. This study aimed to construct a machine learning model to predict the risk of TLS using clinical indicators at the time of ALL diagnosis. Methods This observational cohort study was conducted at the National Clinical Research Center for Child Health and Disorders. Data were collected from pediatric ALL patients diagnosed between December 2008 and December 2021. Four machine learning models were constructed using the Least Absolute Shrinkage and Selection Operator (LASSO) to select key clinical indicators for model construction. Results The study included 2,243 pediatric ALL patients, and the occurrence of TLS was 8.87%. A total of 33 indicators with missing values ≤30% were collected, and 12 risk factors were selected through LASSO regression analysis. The CatBoost model with the best performance after feature screening was selected to predict the TLS of ALL patients. The CatBoost model had an AUC of 0.832 and an accuracy of 0.758. The risk factors most associated with TLS were the absence of potassium, phosphorus, aspartate transaminase (AST), white blood cell count (WBC), and urea levels. Conclusion We developed the first TLS prediction model for pediatric ALL to assist clinicians in risk stratification at diagnosis and in developing personalized treatment protocols. This study is registered on the China Clinical Trials Registry platform (ChiCTR2200060616). Clinical trial registration https://www.chictr.org.cn/, identifier ChiCTR2200060616.
Affiliation(s)
- Yao Xiao
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
| | - Li Xiao
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
| | - Yang Zhang
- College of Medical Informatics, Chongqing Medical University, Chongqing, China
| | - Ximing Xu
- Big Data Engineering Center for Children’s Medical Care, Children’s Hospital of Chongqing Medical University, Chongqing, China
| | - Xianmin Guan
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
| | - Yuxia Guo
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
| | - Yali Shen
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
| | - XiaoYing Lei
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
| | - Ying Dou
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
| | - Jie Yu
- Department of Hematology and Oncology, Children’s Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Chongqing Key Laboratory of Pediatrics, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
|
17
|
Hozo I, Guyatt G, Djulbegovic B. Decision curve analysis based on summary data. J Eval Clin Pract 2024; 30:281-289. [PMID: 38044860 DOI: 10.1111/jep.13945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 11/16/2023] [Accepted: 11/20/2023] [Indexed: 12/05/2023]
Abstract
BACKGROUND To realize the potential of precision medicine, predictive models should be integrated within the framework of decision analysis, such as the decision curve analysis (DCA). To date, its application has required individual patient data (IPD) that are often unavailable. Performing DCA using aggregate data without requiring IPD may advance the goals of precision medicine. METHODS We present a statistical framework demonstrating that DCA can be conducted by using only the mean and standard deviation (SD) from the raw probabilities of the predictive model. We tested our theoretical framework by performing extensive simulations and comparing the aggregate-based DCA with IPD DCA. The latter was conducted using IPD from four predictive models that employed logistic regression, Cox or competing risk time-to-event modeling including (a) statins for primary prevention of cardiovascular disease (n = 4859), (b) hospice referral for terminally ill patients (n = 9104), (c) use of thromboprophylaxis for preventing venous thromboembolism in patients with cancer (n = 425) and (d) prevention of sinusoidal obstruction syndrome after hematopoietic cell transplantation (SCT) (n = 80). RESULTS Simulations assuming perfect calibration showed that regardless of which probability distributions informed the predictive models, the differences in DCA were negligible. Similarly, for the adequately powered models, the results of DCA based on the summary data were similar to IPD-derived DCA. The inherent instability of the predictive models, based on the smaller sample sizes, resulted in a somewhat larger discrepancy between aggregate and IPD-based DCA. CONCLUSIONS DCA informed by adequately powered and well-calibrated models using only summary statistical estimates (mean and SD) approximates well models using IPD. Use of aggregate data will facilitate broader integration of predictive with decision modeling toward the goals of individualized decision-making.
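For orientation, the conventional IPD-based decision curve analysis that this abstract uses as its comparator rests on the net benefit at a threshold probability pt: true positives per patient minus false positives per patient weighted by pt / (1 - pt). The sketch below implements that standard definition on an invented ten-patient data set; it is not the summary-data method proposed by the authors, whose details are not given in the abstract.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical individual patient data: outcomes and model probabilities.
y = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
p = [0.8, 0.3, 0.6, 0.1, 0.4, 0.7, 0.2, 0.5, 0.05, 0.35]

nb_model = net_benefit(y, p, threshold=0.5)
nb_all = net_benefit_treat_all(y, threshold=0.5)
print(nb_model, nb_all)  # "treat none" has net benefit 0 by definition
```

A decision curve is this calculation repeated over a range of thresholds, with the model preferred wherever its net benefit exceeds both the treat-all and treat-none strategies.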
Affiliation(s)
- Iztok Hozo
- Department of Mathematics, Indiana University Northwest, Gary, Indiana, USA
| | - Gordon Guyatt
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Benjamin Djulbegovic
- Department of Medicine, Division of Medical Hematology and Oncology, Medical University of South Carolina, Charleston, South Carolina, USA
|
18
|
Michelson AP, Oh I, Gupta A, Puri V, Kreisel D, Gelman AE, Nava R, Witt CA, Byers DE, Halverson L, Vazquez-Guillamet R, Payne PRO, Hachem RR. Developing machine learning models to predict primary graft dysfunction after lung transplantation. Am J Transplant 2024; 24:458-467. [PMID: 37468109 DOI: 10.1016/j.ajt.2023.07.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/23/2022] [Revised: 06/21/2023] [Accepted: 07/04/2023] [Indexed: 07/21/2023]
Abstract
Primary graft dysfunction (PGD) is the leading cause of morbidity and mortality in the first 30 days after lung transplantation. Risk factors for the development of PGD include donor and recipient characteristics, but how multiple variables interact to impact the development of PGD and how clinicians should consider these in making decisions about donor acceptance remain unclear. This was a single-center retrospective cohort study to develop and evaluate machine learning pipelines to predict the development of PGD grade 3 within the first 72 hours of transplantation using donor and recipient variables that are known at the time of donor offer acceptance. Among 576 bilateral lung recipients, 173 (30%) developed PGD grade 3. The cohort underwent a 75% to 25% train-test split, and lasso regression was used to identify 11 variables for model development. A K-nearest neighbors model showing the best calibration and performance with relatively small confidence intervals was selected as the final predictive model, with an area under the receiver operating characteristic curve of 0.65. Machine learning models can predict the risk for development of PGD grade 3 based on data available at the time of donor offer acceptance. This may improve donor-recipient matching and donor utilization in the future.
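The pipeline outlined here (a 75%/25% train-test split followed by a K-nearest neighbors classifier) can be sketched as follows. The synthetic cohort of 576 rows with two features, the roughly 30% event rate, the seed values, and k = 15 are all assumptions for illustration; the study used 11 lasso-selected clinical variables that the abstract does not list, and its choice of k is not reported.

```python
import math
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out the requested fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict_proba(train, x, k=5):
    """Predicted risk = share of positive labels among the k nearest rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    return sum(label for _, label in nearest) / k


# Synthetic stand-in cohort: (features, label) with about 30% positives.
rng = random.Random(42)
cohort = []
for _ in range(576):
    label = 1 if rng.random() < 0.30 else 0
    shift = 1.0 if label else 0.0
    features = (rng.gauss(shift, 1.0), rng.gauss(0.5 * shift, 1.0))
    cohort.append((features, label))

train, test = train_test_split(cohort, test_fraction=0.25)
probs = [knn_predict_proba(train, x, k=15) for x, _ in test]
print(len(train), len(test))
```

Because a K-nearest neighbors prediction is a local event frequency, its outputs are directly interpretable as risks, which may be one reason calibration featured in the model selection described above.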
Affiliation(s)
- Andrew P Michelson
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Washington University School of Medicine, Saint Louis, Missouri, USA; Institute for Informatics, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Inez Oh
- Institute for Informatics, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Aditi Gupta
- Institute for Informatics, Washington University School of Medicine, Saint Louis, Missouri, USA; Division of Biostatistics, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Varun Puri
- Division of Cardiothoracic Surgery, Department of Surgery, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Daniel Kreisel
- Division of Cardiothoracic Surgery, Department of Surgery, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Andrew E Gelman
- Division of Cardiothoracic Surgery, Department of Surgery, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Ruben Nava
- Division of Cardiothoracic Surgery, Department of Surgery, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Chad A Witt
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Derek E Byers
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Laura Halverson
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Rodrigo Vazquez-Guillamet
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Philip R O Payne
- Institute for Informatics, Washington University School of Medicine, Saint Louis, Missouri, USA
| | - Ramsey R Hachem
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Washington University School of Medicine, Saint Louis, Missouri, USA.
| |
|
19
|
Rahman MK, Williams RB, Ajulo S, Levent G, Loneragan GH, Awosile B. Predictive Modeling of Phenotypic Antimicrobial Susceptibility of Selected Beta-Lactam Antimicrobials from Beta-Lactamase Resistance Genes. Antibiotics (Basel) 2024; 13:224. [PMID: 38534659 DOI: 10.3390/antibiotics13030224] [Received: 01/30/2024] [Revised: 02/23/2024] [Accepted: 02/26/2024] [Indexed: 03/28/2024] Open
Abstract
The outcome of bacterial infection management relies on prompt diagnosis and effective treatment, but conventional antimicrobial susceptibility testing can be slow and labor-intensive. Therefore, this study aims to predict phenotypic antimicrobial susceptibility of selected beta-lactam antimicrobials in the bacteria of the family Enterobacteriaceae from different beta-lactamase resistance genotypes. Using human datasets extracted from the Antimicrobial Testing Leadership and Surveillance (ATLAS) program conducted by Pfizer and retail meat datasets from the National Antimicrobial Resistance Monitoring System for Enteric Bacteria (NARMS), we used a robust or weighted least squares multivariable linear regression modeling framework to explore the relationship between antimicrobial susceptibility data of beta-lactam antimicrobials and different types of beta-lactamase resistance genes. In humans, in the presence of the blaCTX-M-1, blaCTX-M-2, blaCTX-M-8/25, and blaCTX-M-9 groups, MICs of cephalosporins significantly increased by values between 0.34-3.07 μg/mL; however, the MICs of carbapenem significantly decreased by values between 0.81-0.87 μg/mL. In the presence of carbapenemase genes (blaKPC, blaNDM, blaIMP, and blaVIM), the MICs of cephalosporin antimicrobials significantly increased by values between 1.06-5.77 μg/mL, while the MICs of carbapenem antimicrobials significantly increased by values between 5.39-67.38 μg/mL. In retail meat, MIC of ceftriaxone increased significantly in the presence of blaCMY-2, blaCTX-M-1, blaCTX-M-55, blaCTX-M-65, and blaSHV-2 by 55.16 μg/mL, 222.70 μg/mL, 250.81 μg/mL, 204.89 μg/mL, and 31.51 μg/mL respectively. MIC of cefoxitin increased significantly in the presence of blaCTX-M-65 and blaTEM-1 by 1.57 μg/mL and 1.04 μg/mL respectively. In the presence of blaCMY-2, MIC of cefoxitin increased by an average of 8.66 μg/mL over 17 years. Compared to E. coli isolates, MIC of cefoxitin in Salmonella enterica isolates decreased significantly by 0.67 μg/mL. On the other hand, MIC of ceftiofur increased in the presence of blaCTX-M-1, blaCTX-M-65, blaSHV-2, and blaTEM-1 by 8.82 μg/mL, 9.11 μg/mL, 8.18 μg/mL, and 1.04 μg/mL respectively. In the presence of blaCMY-2, MIC of ceftiofur increased by an average of 10.20 μg/mL over 14 years. The ability to predict antimicrobial susceptibility of beta-lactam antimicrobials directly from beta-lactamase resistance genes may help reduce the reliance on routine phenotypic testing with higher turnaround times in the diagnosis, treatment, and surveillance of antimicrobial-resistant bacteria of the family Enterobacteriaceae.
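To illustrate how a regression coefficient can be read as a change in MIC when a gene is present, the sketch below fits a weighted least squares line with a single binary predictor. It is a deliberate simplification under stated assumptions: the study used a multivariable framework, and the function name `weighted_least_squares`, the weights, and the toy MIC values here are all hypothetical.

```python
def weighted_least_squares(x, y, w):
    """Fit y = intercept + slope * x by minimizing sum(w_i * residual_i ** 2)."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("predictor has no weighted variance")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical isolates: gene absent (0) or present (1), MIC in ug/mL.
gene = [0, 0, 0, 1, 1, 1]
mic = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
weights = [1.0] * 6

intercept, slope = weighted_least_squares(gene, mic, weights)
print(intercept, slope)  # slope = estimated MIC change when the gene is present
```

With a binary predictor, the slope equals the difference between the weighted mean MIC of isolates carrying the gene and those without it, which is the sense in which the abstract reports increases or decreases in μg/mL.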
Affiliation(s)
- Md Kaisar Rahman
- School of Veterinary Medicine, Texas Tech University, Amarillo, TX 79106, USA
| | - Ryan B Williams
- School of Veterinary Medicine, Texas Tech University, Amarillo, TX 79106, USA
| | - Samuel Ajulo
- School of Veterinary Medicine, Texas Tech University, Amarillo, TX 79106, USA
| | - Gizem Levent
- School of Veterinary Medicine, Texas Tech University, Amarillo, TX 79106, USA
| | - Guy H Loneragan
- School of Veterinary Medicine, Texas Tech University, Amarillo, TX 79106, USA
| | - Babafela Awosile
- School of Veterinary Medicine, Texas Tech University, Amarillo, TX 79106, USA
| |
|
20
|
May SB, Giordano TP, Gottlieb A. Generalizable pipeline for constructing HIV risk prediction models across electronic health record systems. J Am Med Inform Assoc 2024; 31:666-673. [PMID: 37990631 PMCID: PMC10873846 DOI: 10.1093/jamia/ocad217] [Received: 01/20/2023] [Revised: 09/25/2023] [Accepted: 10/31/2023] [Indexed: 11/23/2023] Open
Abstract
OBJECTIVE The HIV epidemic remains a significant public health issue in the United States. HIV risk prediction models could be beneficial for reducing HIV transmission by helping clinicians identify patients at high risk for infection and refer them for testing. This would facilitate initiation of treatment for those unaware of their status and pre-exposure prophylaxis for those uninfected but at high risk. Existing HIV risk prediction algorithms rely on manual construction of features and are limited in their application across diverse electronic health record systems. Furthermore, the accuracy of these models in predicting HIV in females has thus far been limited. MATERIALS AND METHODS We devised a pipeline for automatic construction of prediction models based on automatic feature engineering to predict HIV risk and tested our pipeline on a local electronic health records system and a national claims dataset. We also compared the performance of general models to female-specific models. RESULTS Our models obtain similarly good performance on both health record datasets despite differences in represented populations and data availability (AUC = 0.87). Furthermore, our general models obtain good performance on females but are also improved by constructing female-specific models (AUC between 0.81 and 0.86 across datasets). DISCUSSION AND CONCLUSIONS We demonstrated that flexible construction of prediction models performs well on HIV risk prediction across diverse health records systems and performs as well in predicting HIV risk in females, making deployment of such models into existing health care systems tangible.
Affiliation(s)
- Sarah B May
- Center for Precision Health, McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, United States
- Dan L Duncan Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX 77030, United States
| | - Thomas P Giordano
- Section of Infectious Diseases, Department of Medicine, Baylor College of Medicine, Houston, TX 77030, United States
- Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey VA Medical Center, Houston, TX 77021, United States
| | - Assaf Gottlieb
- Center for Precision Health, McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, United States
| |
|
21
|
Walsh C, Stallard-Olivera E, Fierer N. Nine (not so simple) steps: a practical guide to using machine learning in microbial ecology. mBio 2024; 15:e0205023. [PMID: 38126787 PMCID: PMC10865974 DOI: 10.1128/mbio.02050-23] [Indexed: 12/23/2023] Open
Abstract
Due to the complex nature of microbiome data, the field of microbial ecology has many current and potential uses for machine learning (ML) modeling. With the increased use of predictive ML models across many disciplines, including microbial ecology, there is extensive published information on the specific ML algorithms available and how those algorithms have been applied. Thus, our goal is not to summarize the breadth of ML models available or compare their performances. Rather, our goal is to provide more concrete and actionable information to guide microbial ecologists in how to select, run, and interpret ML algorithms to predict the taxa or genes associated with particular sample categories or environmental gradients of interest. Such microbial data often have unique characteristics that require careful consideration of how to apply ML models and how to interpret the associated results. This review is intended for practicing microbial ecologists who may be unfamiliar with some of the intricacies of ML models. We provide examples and discuss common opportunities and pitfalls specific to applying ML models to the types of data sets most frequently collected by microbial ecologists.
Affiliation(s)
- Corinne Walsh
- Cooperative Institute of Research in Environmental Sciences, CU Boulder, Boulder, Colorado, USA
- Ecology and Evolutionary Biology Department, CU Boulder, Boulder, Colorado, USA
| | - Elías Stallard-Olivera
- Cooperative Institute of Research in Environmental Sciences, CU Boulder, Boulder, Colorado, USA
- Ecology and Evolutionary Biology Department, CU Boulder, Boulder, Colorado, USA
| | - Noah Fierer
- Cooperative Institute of Research in Environmental Sciences, CU Boulder, Boulder, Colorado, USA
- Ecology and Evolutionary Biology Department, CU Boulder, Boulder, Colorado, USA
| |
|
22
|
Wang Y, Fan Y, Zhupanska OI. Challenges and Future Recommendations for Lightning Strike Damage Assessments of Composites: Laboratory Testing and Predictive Modeling. Materials (Basel) 2024; 17:744. [PMID: 38591613 PMCID: PMC10856118 DOI: 10.3390/ma17030744] [Received: 12/20/2023] [Revised: 01/29/2024] [Accepted: 01/31/2024] [Indexed: 04/10/2024]
Abstract
Lightning strike events pose significant challenges to the structural integrity and performance of composite materials, particularly in aerospace, wind turbine blade, and infrastructure applications. Through a meticulous examination of the state-of-the-art methodologies of laboratory testing and damage predictive modeling, this review elucidates the role of simulated lightning strike tests in providing inputs required for damage modeling and experimental data for model validations. In addition, this review provides a holistic understanding of what is there, what are current issues, and what is still missing in both lightning strike testing and modeling to enable a robust and high-fidelity predictive capability, and challenges and future recommendations are also presented. The insights gleaned from this review are poised to catalyze advancements in the safety, reliability, and durability of composite materials under lightning strike conditions, as well as to facilitate the development of innovative lightning damage mitigation strategies.
Affiliation(s)
- Yeqing Wang
- Department of Mechanical and Aerospace Engineering, Syracuse University, Syracuse, NY 13244, USA
| | - Yin Fan
- School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Olesya I. Zhupanska
- Department of Aerospace and Mechanical Engineering, University of Arizona, Tucson, AZ 85721, USA
| |
|
23
|
Sanjel S, Colee J, Barocco RL, Dufault NS, Tillman BL, Punja ZK, Seepaul R, Small IM. Environmental Factors Influencing Stem Rot Development in Peanut: Predictors and Action Thresholds for Disease Management. Phytopathology 2024; 114:393-404. [PMID: 37581435 DOI: 10.1094/phyto-05-23-0164-r] [Indexed: 08/16/2023]
Abstract
Peanuts grown in tropical, subtropical, and temperate regions are susceptible to stem rot, which is a soilborne disease caused by Athelia rolfsii. Due to the lack of reliable environmental-based scheduling recommendations, stem rot control relies heavily on fungicides that are applied at predetermined intervals. We conducted inoculated field experiments for six site-years in North Florida to examine the relationship between germination of A. rolfsii sclerotia (the inoculum), stem rot symptom development in the peanut crop, and environmental factors such as soil temperature (ST), soil moisture, relative humidity (RH), precipitation, evapotranspiration, and solar radiation. Window-pane analysis with hourly and daily environmental data for 5- to 28-day periods before each disease assessment was used to select model predictors through correlation analysis, regularized regression, and exhaustive feature selection. Our results indicated that within-canopy ST (at 0.05 m belowground) and RH (at 0.15 m aboveground) were the most important environmental variables that influenced the progress of mycelial activity in susceptible peanut crops. Decision tree analysis resulted in an easy-to-interpret one-variable model (adjusted R² = 0.51, Akaike information criterion [AIC] = 324, root average square error [RASE] = 14.21) or two-variable model (adjusted R² = 0.61, AIC = 306, RASE = 10.95) that provided an action threshold for various disease scenarios based on number of hours of canopy RH above 90% and ST between 25 and 35°C in a 14-day window. Coupling an existing preseason risk index for stem rot, such as Peanut Rx, with the environmentally based predictors identified in this study would be a logical next step to optimize stem rot management. Copyright © 2024 The Author(s). This is an open access article distributed under the CC BY 4.0 International license.
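The two predictors named in the abstract (hours of canopy RH above 90% and hours of ST between 25 and 35°C within a 14-day window) can be computed from hourly records as in the rough sketch below. The abstract does not state the action thresholds themselves or whether the two conditions are counted jointly, so the separate counts, the function names, and the threshold arguments are assumptions made for illustration only.

```python
HOURS_PER_WINDOW = 14 * 24  # 14-day window of hourly readings


def window_hours(readings):
    """Count favorable hours in the most recent 14-day window.

    readings: list of (relative_humidity_percent, soil_temp_celsius) tuples,
    one per hour, oldest first. Returns (rh_hours, st_hours).
    """
    window = readings[-HOURS_PER_WINDOW:]
    rh_hours = sum(1 for rh, _ in window if rh > 90.0)
    st_hours = sum(1 for _, st in window if 25.0 <= st <= 35.0)
    return rh_hours, st_hours


def action_needed(readings, rh_threshold, st_threshold):
    """Hypothetical rule: act when both hour counts reach their thresholds."""
    rh_hours, st_hours = window_hours(readings)
    return rh_hours >= rh_threshold and st_hours >= st_threshold


# Toy record (hypothetical): 100 humid, warm hours followed by 300 dry, cool hours.
readings = [(95.0, 28.0)] * 100 + [(80.0, 20.0)] * 300
print(window_hours(readings))
```

Only the last 336 hours are considered, so older favorable periods drop out of the count as the window moves forward.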
Affiliation(s)
- Santosh Sanjel
- North Florida Research and Education Center, University of Florida, Quincy, FL, U.S.A
- Plant Pathology Department, University of Florida, Gainesville, FL, U.S.A
| | - James Colee
- IFAS Statistical Consulting Unit, University of Florida, Gainesville, FL, U.S.A
| | - Rebecca L Barocco
- North Florida Research and Education Center, University of Florida, Quincy, FL, U.S.A
- Plant Pathology Department, University of Florida, Gainesville, FL, U.S.A
| | - Nicholas S Dufault
- Plant Pathology Department, University of Florida, Gainesville, FL, U.S.A
| | - Barry L Tillman
- North Florida Research and Education Center, University of Florida, Marianna, FL, U.S.A
- Agronomy Department, University of Florida, Gainesville, FL, U.S.A
| | - Zamir K Punja
- Department of Biological Sciences, Simon Fraser University, Burnaby, BC, Canada
| | - Ramdeo Seepaul
- North Florida Research and Education Center, University of Florida, Quincy, FL, U.S.A
- Agronomy Department, University of Florida, Gainesville, FL, U.S.A
| | - Ian M Small
- North Florida Research and Education Center, University of Florida, Quincy, FL, U.S.A
- Plant Pathology Department, University of Florida, Gainesville, FL, U.S.A
| |
|
24
|
Xia L, Hantrakun V, Teparrukkul P, Wongsuvan G, Kaewarpai T, Dulsuk A, Day NPJ, Lemaitre RN, Chantratita N, Limmathurotsakul D, Shojaie A, Gharib SA, West TE. Plasma Metabolomics Reveals Distinct Biological and Diagnostic Signatures for Melioidosis. Am J Respir Crit Care Med 2024; 209:288-298. [PMID: 37812796 PMCID: PMC10840774 DOI: 10.1164/rccm.202207-1349oc] [Received: 07/18/2022] [Accepted: 10/09/2023] [Indexed: 10/11/2023] Open
Abstract
Rationale: The global burden of sepsis is greatest in low-resource settings. Melioidosis, infection with the gram-negative bacterium Burkholderia pseudomallei, is a frequent cause of fatal sepsis in endemic tropical regions such as Southeast Asia. Objectives: To investigate whether plasma metabolomics would identify biological pathways specific to melioidosis and yield clinically meaningful biomarkers. Methods: Using a comprehensive approach, differential enrichment of plasma metabolites and pathways was systematically evaluated in individuals selected from a prospective cohort of patients hospitalized in rural Thailand with infection. Statistical and bioinformatics methods were used to distinguish metabolomic features and processes specific to patients with melioidosis and between fatal and nonfatal cases. Measurements and Main Results: Metabolomic profiling and pathway enrichment analysis of plasma samples from patients with melioidosis (n = 175) and nonmelioidosis infections (n = 75) revealed a distinct immuno-metabolic state among patients with melioidosis, as suggested by excessive tryptophan catabolism in the kynurenine pathway and significantly increased levels of sphingomyelins and ceramide species. We derived a 12-metabolite classifier to distinguish melioidosis from other infections, yielding an area under the receiver operating characteristic curve of 0.87 in a second validation set of patients. Melioidosis nonsurvivors (n = 94) had a significantly disturbed metabolome compared with survivors (n = 81), with increased leucine, isoleucine, and valine metabolism, and elevated circulating free fatty acids and acylcarnitines. A limited eight-metabolite panel showed promise as an early prognosticator of mortality in melioidosis. Conclusions: Melioidosis induces a distinct metabolomic state that can be examined to distinguish underlying pathophysiological mechanisms associated with death. A 12-metabolite signature accurately differentiates melioidosis from other infections and may have diagnostic applications.
Affiliation(s)
- Lu Xia
- Department of Biostatistics
| | | | - Prapit Teparrukkul
- Department of Internal Medicine, Sunpasitthiprasong Hospital, Ubon Ratchathani, Thailand; and
| | | | | | - Adul Dulsuk
- Department of Microbiology and Immunology, and
| | - Nicholas P. J. Day
- Mahidol Oxford Tropical Medicine Research Unit
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
| | | | - Narisara Chantratita
- Mahidol Oxford Tropical Medicine Research Unit
- Department of Microbiology and Immunology, and
| | - Direk Limmathurotsakul
- Mahidol Oxford Tropical Medicine Research Unit
- Department of Tropical Hygiene, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
| | | | - Sina A. Gharib
- Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, and
| | - T. Eoin West
- Division of Pulmonary, Critical Care, and Sleep Medicine, Department of Medicine, and
- Department of Global Health, University of Washington, Seattle, Washington
| |
|
25
|
Shi X, Yang L, Bai W, Jing L, Qin L. Evaluating early lymphocyte-to-monocyte ratio as a predictive biomarker for delirium in older adult patients with sepsis: insights from a retrospective cohort analysis. Front Med (Lausanne) 2024; 11:1342568. [PMID: 38357643 PMCID: PMC10864594 DOI: 10.3389/fmed.2024.1342568] [Received: 11/22/2023] [Accepted: 01/15/2024] [Indexed: 02/16/2024] Open
Abstract
Background This study aims to explore the value of the Lymphocyte-to-Monocyte Ratio (LMR) in predicting delirium among older adult patients with sepsis. Methods Retrospective data were obtained from the MIMIC-IV database in accordance with the STROBE guidelines. Patients aged 65 and above, meeting the Sepsis 3.0 criteria, were selected for this study. Delirium was assessed using the Confusion Assessment Method for the ICU (CAM-ICU). Demographic information, comorbid conditions, severity of illness scores, vital sign measurements, and laboratory test results were meticulously extracted. The prognostic utility of the Lymphocyte-to-Monocyte Ratio (LMR) in predicting delirium was assessed through logistic regression models, which were carefully adjusted for potential confounding factors. Results In the studied cohort of 32,971 sepsis patients, 2,327 were identified as meeting the inclusion criteria. The incidence of delirium within this subgroup was observed to be 55%. A univariate analysis revealed a statistically significant inverse correlation between the Lymphocyte-to-Monocyte Ratio (LMR) and the risk of delirium (p < 0.001). Subsequent multivariate analysis, which accounted for comorbidities and illness severity scores, substantiated the role of LMR as a significant predictive marker. An optimized model, achieving the lowest Akaike Information Criterion (AIC), incorporated 17 variables and continued to demonstrate LMR as a significant prognostic factor (p < 0.01). Analysis of the Receiver Operating Characteristic (ROC) curve indicated a significant enhancement in the Area Under the Curve (AUC) upon the inclusion of LMR (p = 0.035). Conclusion The Lymphocyte-to-Monocyte Ratio (LMR) serves as a significant, independent prognostic indicator for the occurrence of delirium in older adult patients with sepsis. Integrating LMR into existing predictive models markedly improves the identification of patients at elevated risk, thereby informing and potentially guiding early intervention strategies.
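Selecting the model with the lowest Akaike Information Criterion, as described above, trades goodness of fit against the number of fitted parameters through AIC = 2k - 2 ln(L). The sketch below shows that comparison in its simplest form; the candidate names, log-likelihoods, and parameter counts are hypothetical placeholders, not values from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return the name of the candidate with the lowest AIC.

    candidates: dict mapping a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical fitted models (illustrative numbers only).
candidates = {
    "without_LMR": (-1200.0, 16),
    "with_LMR": (-1190.0, 17),
}
for name, (loglik, k) in candidates.items():
    print(name, aic(loglik, k))
print(select_by_aic(candidates))
```

In this toy comparison the extra parameter is justified because it improves the log-likelihood by more than one unit, which is the break-even point implied by the formula.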
Affiliation(s)
| | | | | | | | - Lijie Qin
- Department of Emergency, Henan Provincial People’s Hospital, Zhengzhou University People’s Hospital, Zhengzhou, China
| |
|
26
|
Galadima H, Anson-Dwamena R, Johnson A, Bello G, Adunlin G, Blando J. Machine Learning as a Tool for Early Detection: A Focus on Late-Stage Colorectal Cancer across Socioeconomic Spectrums. Cancers (Basel) 2024; 16:540. [PMID: 38339293 PMCID: PMC10854986 DOI: 10.3390/cancers16030540] [Received: 12/30/2023] [Revised: 01/19/2024] [Accepted: 01/23/2024] [Indexed: 02/12/2024] Open
Abstract
PURPOSE To assess the efficacy of various machine learning (ML) algorithms in predicting late-stage colorectal cancer (CRC) diagnoses against the backdrop of socio-economic and regional healthcare disparities. METHODS An innovative theoretical framework was developed to integrate individual- and census tract-level social determinants of health (SDOH) with sociodemographic factors. A comparative analysis of the ML models was conducted using key performance metrics such as AUC-ROC to evaluate their predictive accuracy. Spatio-temporal analysis was used to identify disparities in late-stage CRC diagnosis probabilities. RESULTS Gradient boosting emerged as the superior model, with the top predictors for late-stage CRC diagnosis being anatomic site, year of diagnosis, age, proximity to superfund sites, and primary payer. Spatio-temporal clusters highlighted geographic areas with a statistically significant high probability of late-stage diagnoses, emphasizing the need for targeted healthcare interventions. CONCLUSIONS This research underlines the potential of ML in enhancing the prognostic predictions in oncology, particularly in CRC. The gradient boosting model, with its robust performance, holds promise for deployment in healthcare systems to aid early detection and formulate localized cancer prevention strategies. The study's methodology demonstrates a significant step toward utilizing AI in public health to mitigate disparities and improve cancer care outcomes.
Affiliation(s)
- Hadiza Galadima
- School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
| | - Rexford Anson-Dwamena
- School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
| | - Ashley Johnson
- School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
| | - Ghalib Bello
- Department of Environmental Medicine & Public Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Georges Adunlin
- Department of Pharmaceutical, Social and Administrative Sciences, Samford University, Birmingham, AL 35229, USA
| | - James Blando
- School of Community and Environmental Health, Old Dominion University, Norfolk, VA 23529, USA
| |
|
27
|
Patel M, Liu XC, Yang K, Tassone C, Escott B, Thometz J. 3D Back Contour Metrics in Predicting Idiopathic Scoliosis Progression: Retrospective Cohort Analysis, Case Series Report and Proof of Concept. Children (Basel) 2024; 11:159. [PMID: 38397270 PMCID: PMC10886742 DOI: 10.3390/children11020159] [Received: 12/22/2023] [Revised: 01/14/2024] [Accepted: 01/24/2024] [Indexed: 02/25/2024]
Abstract
Adolescent idiopathic scoliosis (AIS) is a 3D spinal deformity commonly characterized by serial radiographs. Patients with AIS may have increased average radiation exposure compared to unaffected patients and thus may be at modestly increased cancer risk. To minimize lifetime radiation exposure, alternative imaging modalities such as surface topography are being explored. Surface topography (ST) uses a camera to map anatomic landmarks of the spine and contours of the back to create software-generated spine models. ST has previously shown good correlation to radiographic measures. In this study, we sought to use ST in the creation of a risk stratification model. A total of 38 patients met the inclusion criteria for curve progression prediction. Scoliotic curves were classified as progressing, stabilized, or improving, and a predictive model was created using proportional odds logistic modeling. The results showed that surface topography was able to moderately appraise scoliosis curvatures when compared to radiographs. The predictive model, using demographic and surface topography measurements, was able to account for 86.9% of the variability in the future Cobb angle. Additionally, curve progression, stabilization, or improvement was classified correctly in 27 of 38 cases (71%). These results provide a basis for the creation of a clinical tool in the tracking and prediction of scoliosis progression in order to reduce the number of X-rays required.
Affiliation(s)
- Milan Patel
- Department of Orthopedic Surgery, Children’s Wisconsin, Medical College of Wisconsin, Greenfield, WI 53227, USA
| | - Xue-Cheng Liu
- Department of Orthopedic Surgery, Children’s Wisconsin, Medical College of Wisconsin, Greenfield, WI 53227, USA
| | - Kai Yang
- Division of Biostatistics, Institute for Health and Equity, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Channing Tassone
- Department of Orthopedic Surgery, Children’s Wisconsin, Medical College of Wisconsin, Greenfield, WI 53227, USA
| | - Benjamin Escott
- Department of Orthopedic Surgery, Children’s Wisconsin, Medical College of Wisconsin, Greenfield, WI 53227, USA
| | - John Thometz
- Department of Orthopedic Surgery, Children’s Wisconsin, Medical College of Wisconsin, Greenfield, WI 53227, USA
| |
|
28
|
Kelbauskas L, Legutki JB, Woodbury NW. Highly heterogenous humoral immune response in Lyme disease patients revealed by broad machine learning-assisted antibody binding profiling with random peptide arrays. Front Immunol 2024; 15:1335446. [PMID: 38318184 PMCID: PMC10838964 DOI: 10.3389/fimmu.2024.1335446] [Received: 11/08/2023] [Accepted: 01/03/2024] [Indexed: 02/07/2024] Open
Abstract
Introduction Lyme disease (LD), a rapidly growing public health problem in the US, represents a formidable challenge due to the lack of detailed understanding about how the human immune system responds to its pathogen, the Borrelia burgdorferi bacterium. Despite significant advances in gaining deeper insight into the mechanisms the pathogen uses to evade immune response, substantial gaps remain. As a result, molecular tools for diagnosing the disease are lacking, and the currently available tests perform poorly. High interpersonal variability in immune response, combined with the ability of the pathogen to use a number of immune evasive tactics, has been implicated as an underlying factor in the limited test performance. Methods This study was designed to perform a broad profiling of the entire repertoire of circulating antibodies in human sera at the single-individual level using planar arrays of short linear peptides with random sequences. The peptides sparsely but uniformly sample the entire combinatorial sequence space of peptides of the same length, which allows profiling of the humoral immune response to B. burgdorferi infection and comparison with other diseases of etiology similar to LD and with healthy controls. Results The study revealed substantial variability in antibody binding profiles between individual LD patients, even to the same antigen (VlsE protein), and strong similarity between individuals diagnosed with Lyme disease and healthy controls from areas endemic to LD, suggesting a high prevalence of seropositivity among healthy controls in endemic areas. Discussion This work demonstrates the utility of the approach as a valuable analytical tool for agnostic profiling of humoral immune response to a pathogen.
Affiliation(s)
- L Kelbauskas
- Biodesign Institute, Arizona State University, Tempe, AZ, United States
- Biomorph Technologies, Chandler, AZ, United States
| | - J B Legutki
- Biodesign Institute, Arizona State University, Tempe, AZ, United States
- Biomorph Technologies, Chandler, AZ, United States
| | - N W Woodbury
- Biodesign Institute, Arizona State University, Tempe, AZ, United States
| |
|
29
|
Golob JL, Oskotsky TT, Tang AS, Roldan A, Chung V, Ha CWY, Wong RJ, Flynn KJ, Parraga-Leo A, Wibrand C, Minot SS, Oskotsky B, Andreoletti G, Kosti I, Bletz J, Nelson A, Gao J, Wei Z, Chen G, Tang ZZ, Novielli P, Romano D, Pantaleo E, Amoroso N, Monaco A, Vacca M, De Angelis M, Bellotti R, Tangaro S, Kuntzleman A, Bigcraft I, Techtmann S, Bae D, Kim E, Jeon J, Joe S, Theis KR, Ng S, Lee YS, Diaz-Gimeno P, Bennett PR, MacIntyre DA, Stolovitzky G, Lynch SV, Albrecht J, Gomez-Lopez N, Romero R, Stevenson DK, Aghaeepour N, Tarca AL, Costello JC, Sirota M. Microbiome preterm birth DREAM challenge: Crowdsourcing machine learning approaches to advance preterm birth research. Cell Rep Med 2024; 5:101350. [PMID: 38134931 PMCID: PMC10829755 DOI: 10.1016/j.xcrm.2023.101350] [Received: 03/28/2023] [Revised: 09/15/2023] [Accepted: 12/01/2023] [Indexed: 12/24/2023]
Abstract
Every year, 11% of infants are born preterm, with significant health consequences, and the vaginal microbiome is a risk factor for preterm birth. We crowdsource models to predict (1) preterm birth (PTB; <37 weeks) or (2) early preterm birth (ePTB; <32 weeks) from 9 vaginal microbiome studies representing 3,578 samples from 1,268 pregnant individuals, aggregated from public raw data via phylogenetic harmonization. The predictive models are validated on two independent unpublished datasets representing 331 samples from 148 pregnant individuals. The top-performing models (among 148 and 121 submissions from 318 teams) achieve area under the receiver operating characteristic curve (AUROC) scores of 0.69 and 0.87 for predicting PTB and ePTB, respectively. Alpha diversity, VALENCIA community state types, and composition are important features in the top-performing models, most of which are tree-based methods. This work is a model for translating microbiome data into clinically relevant predictive models and for better understanding preterm birth.
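The AUROC scores quoted in this abstract can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition, not the challenge's actual scoring code; the function name `auroc` and the toy labels and scores are illustrative assumptions.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = preterm, 0 = term
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(auroc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for illustration; production code would normally use a rank-based implementation.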
Affiliation(s)
- Jonathan L Golob
- Division of Infectious Disease, Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA; March of Dimes Prematurity Research Center at the University of California San Francisco, San Francisco, CA, USA.
| | - Tomiko T Oskotsky
- March of Dimes Prematurity Research Center at the University of California San Francisco, San Francisco, CA, USA; Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA.
| | - Alice S Tang
- March of Dimes Prematurity Research Center at the University of California San Francisco, San Francisco, CA, USA; Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Alennie Roldan
- March of Dimes Prematurity Research Center at the University of California San Francisco, San Francisco, CA, USA; Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | | | - Connie W Y Ha
- Benioff Center for Microbiome Medicine, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Ronald J Wong
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA; March of Dimes Prematurity Research Center at Stanford University, Stanford, CA, USA
| | | | - Antonio Parraga-Leo
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, Obstetrics and Gynaecology, Universidad de Valencia, Valencia, Spain; IVIRMA Global Research Alliance, IVI Foundation, Instituto de Investigación Sanitaria La Fe (IIS La Fe), Valencia, Spain
| | - Camilla Wibrand
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Samuel S Minot
- Data Core, Shared Resources, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Boris Oskotsky
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA
| | - Gaia Andreoletti
- March of Dimes Prematurity Research Center at the University of California San Francisco, San Francisco, CA, USA; Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | - Idit Kosti
- March of Dimes Prematurity Research Center at the University of California San Francisco, San Francisco, CA, USA; Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
| | | | | | - Jifan Gao
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Zhoujingpeng Wei
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Guanhua Chen
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Zheng-Zheng Tang
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Pierfrancesco Novielli
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy; Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
| | - Donato Romano
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy; Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
| | - Ester Pantaleo
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy; Dipartimento Interateneo di Fisica "M. Merlin", Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Nicola Amoroso
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy; Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Alfonso Monaco
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy; Dipartimento Interateneo di Fisica "M. Merlin", Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Mirco Vacca
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Maria De Angelis
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Roberto Bellotti
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy; Dipartimento Interateneo di Fisica "M. Merlin", Università degli Studi di Bari Aldo Moro, Bari, Italy
| | - Sabina Tangaro
- Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy; Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy
| | - Abigail Kuntzleman
- Department of Biological Sciences, Michigan Technological University, Houghton, MI, USA
| | - Isaac Bigcraft
- Department of Biological Sciences, Michigan Technological University, Houghton, MI, USA
| | - Stephen Techtmann
- Department of Biological Sciences, Michigan Technological University, Houghton, MI, USA
| | - Daehun Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
| | - Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, Republic of Korea
| | - Jongbum Jeon
- Korea Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, Republic of Korea
| | - Soobok Joe
- Korea Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, Republic of Korea
| | - Kevin R Theis
- Department of Biochemistry, Microbiology and Immunology, Wayne State University, Detroit, MI, USA
| | - Sherrianne Ng
- Imperial College Parturition Research Group, Division of the Institute of Reproductive and Developmental Biology, Imperial College London, London, UK; March of Dimes Prematurity Research Centre at Imperial College London, London, UK
| | - Yun S Lee
- Imperial College Parturition Research Group, Division of the Institute of Reproductive and Developmental Biology, Imperial College London, London, UK; March of Dimes Prematurity Research Centre at Imperial College London, London, UK
| | - Patricia Diaz-Gimeno
- IVIRMA Global Research Alliance, IVI Foundation, Instituto de Investigación Sanitaria La Fe (IIS La Fe), Valencia, Spain
| | - Phillip R Bennett
- Imperial College Parturition Research Group, Division of the Institute of Reproductive and Developmental Biology, Imperial College London, London, UK; March of Dimes Prematurity Research Centre at Imperial College London, London, UK
| | - David A MacIntyre
- Imperial College Parturition Research Group, Division of the Institute of Reproductive and Developmental Biology, Imperial College London, London, UK; March of Dimes Prematurity Research Centre at Imperial College London, London, UK
| | - Gustavo Stolovitzky
- Center for Computational Biology and Bioinformatics, Columbia University, New York, NY, USA; Thomas J. Watson Research Center, IBM, Yorktown Heights, NY, USA; Sema4, Stamford, CT, USA
| | - Susan V Lynch
- Benioff Center for Microbiome Medicine, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA; Division of Gastroenterology, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | | | - Nardhy Gomez-Lopez
- Department of Biochemistry, Microbiology and Immunology, Wayne State University, Detroit, MI, USA; Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI, USA
| | - Roberto Romero
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI, USA; Department of Obstetrics and Gynecology, University of Michigan, Ann Arbor, MI, USA; Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, USA; Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA; Detroit Medical Center, Detroit, MI, USA; Department of Obstetrics and Gynecology, Florida International University, Miami, FL, USA
| | - David K Stevenson
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA; Center for Academic Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Nima Aghaeepour
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA; Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA; Department of Biomedical Data Sciences, Stanford University School of Medicine, Stanford, CA, USA
| | - Adi L Tarca
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI, USA; Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA; Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI, USA; Department of Computer Science, Wayne State University College of Engineering, Detroit, MI, USA
| | - James C Costello
- Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Marina Sirota
- March of Dimes Prematurity Research Center at the University of California San Francisco, San Francisco, CA, USA; Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA.
| |
|
30
|
Yang Y, Madanian S, Parry D. Enhancing Health Equity by Predicting Missed Appointments in Health Care: Machine Learning Study. JMIR Med Inform 2024; 12:e48273. [PMID: 38214974 PMCID: PMC10818230 DOI: 10.2196/48273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 11/07/2023] [Accepted: 12/04/2023] [Indexed: 01/13/2024] Open
Abstract
BACKGROUND The phenomenon of patients missing booked appointments without canceling them, known as Did Not Show (DNS), Did Not Attend (DNA), or Failed To Attend (FTA), has a detrimental effect on patients' health and results in massive health care resource wastage. OBJECTIVE Our objective was to develop machine learning (ML) models and evaluate their performance in predicting the likelihood of DNS for hospital outpatient appointments at the MidCentral District Health Board (MDHB) in New Zealand. METHODS We sourced 5 years of MDHB outpatient records (a total of 1,080,566 outpatient visits) to build the ML prediction models. We developed 3 ML models using logistic regression, random forest, and Extreme Gradient Boosting (XGBoost). Subsequently, 10-fold cross-validation and hyperparameter tuning were deployed to minimize model bias and boost the algorithms' prediction strength. All models were evaluated on accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). RESULTS Based on 5 years of MDHB data, the best prediction classifier was XGBoost, with an area under the curve (AUC) of 0.92, sensitivity of 0.83, and specificity of 0.85. The patients' DNS history, age, ethnicity, and appointment lead time significantly contributed to DNS prediction. An ML system trained on a large data set can produce useful levels of DNS prediction. CONCLUSIONS This research is one of the first published studies that use ML technologies to assist with DNS management in New Zealand. It is a proof of concept and could be used to benchmark DNS predictions for the MDHB and other district health boards. We encourage conducting additional qualitative research to investigate the root cause of DNS issues and potential solutions. Addressing DNS using better strategies can potentially result in better utilization of health care resources and improved health equity.
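Sensitivity and specificity, two of the metrics reported above, follow directly from the confusion-matrix counts. The sketch below shows the arithmetic on made-up labels; the function name and the toy data are assumptions for illustration and do not reproduce the study's pipeline.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = DNS
    (missed appointment) and 0 = attended."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # missed, predicted missed
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed, predicted attended
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # attended, predicted attended
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # attended, predicted missed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 3 missed and 5 attended appointments
sens, spec = sensitivity_specificity(
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
)
print(round(sens, 3), round(spec, 3))
```

Because DNS is typically the minority class, reporting both values alongside AUROC gives a clearer picture than accuracy alone.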
Affiliation(s)
- Yi Yang
- Auckland University of Technology, Auckland, New Zealand
| | | | | |
|
31
|
Li Y, Zhou W, Wang H, Yang J, Li X. The risk factors and predictive modeling of mortality in patients with mental disorders combined with severe pneumonia. Front Psychiatry 2024; 14:1300740. [PMID: 38274425 PMCID: PMC10808291 DOI: 10.3389/fpsyt.2023.1300740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 12/18/2023] [Indexed: 01/27/2024] Open
Abstract
Background We explored clinical characteristics and risk factors for mortality in patients with mental disorders combined with severe pneumonia and developed predictive models. Methods We retrospectively analyzed the data of 161 patients with mental disorders combined with severe pneumonia in the intensive care unit (ICU) of a psychiatric hospital from May 2020 to February 2023, divided them into two groups according to whether they died, and compared their basic characteristics, laboratory results, and treatments. We analyzed the risk factors for death using logistic regression, established a prediction model, and drew a dynamic nomogram based on the results of the regression analysis. Results The non-survivor group and the survivor group of patients with mental disorders combined with severe pneumonia differed statistically in age, type of primary mental illness, whether they were intubated, whether they had previously been bedridden for a long period, Montreal Cognitive Assessment (MoCA) score, procalcitonin (PCT), albumin (ALB), and hemoglobin (Hb), among other variables. Logistic regression analysis revealed the following: MoCA scale (OR = 0.932, 95% CI: 0.872-0.997), age (OR = 1.077, 95% CI: 1.029-1.128), PCT (OR = 1.078, 95% CI: 10.006-10.155), ALB (OR = 0.971, 95% CI: 0.893-1.056), Hb (OR = 0.971, 95% CI: 0.942-0.986) were statistically significant. The ROC curve showed that the model predicted patient death with an area under the curve (AUC) of 0.827, a sensitivity of 73.4%, and a specificity of 80.4%. Conclusion Low MoCA score, age, PCT, and low Hb are independent risk factors for death in patients with mental disorders with severe pneumonia, and the prediction model constructed using these factors showed good predictive efficacy.
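Odds ratios such as those listed above are conventionally obtained by exponentiating logistic regression coefficients, with a Wald-type 95% confidence interval computed on the log-odds scale. The sketch below illustrates that relationship only; whether the authors used a Wald interval is an assumption, and the standard error in the example is hypothetical, not a value from the paper.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient (log-odds per unit) and its
    standard error into an odds ratio with a Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Coefficient chosen so that OR = 1.077 (the reported OR for age);
# the standard error 0.0235 is a hypothetical value for illustration.
beta_age = math.log(1.077)
odds_ratio, lower, upper = odds_ratio_with_ci(beta_age, 0.0235)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at the 5% level under this approximation.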
Affiliation(s)
- Yaolin Li
- Department of Respiratory and Critical Care Medicine, The Third People's Hospital of Chengdu, Affiliated Hospital of Southwest Jiaotong University, Chengdu, China
| | - Weiguo Zhou
- Department of Critical Care Medicine, Chengdu Fourth People's Hospital, Chengdu, China
| | - Huiqin Wang
- The Affiliated Women's and Children's Hospital, School of Medicine, UESTC, Chengdu, China
| | - Jing Yang
- Department of Critical Care Medicine, Chengdu Fourth People's Hospital, Chengdu, China
| | - Xiayahu Li
- Department of Critical Care Medicine, Chengdu Second People's Hospital, Chengdu, China
| |
|
32
|
He J, Liang G, Yu H, Lin C, Shen W. Evaluating the predictive significance of systemic immune-inflammatory index and tumor markers in lung cancer patients with bone metastases. Front Oncol 2024; 13:1338809. [PMID: 38264753 PMCID: PMC10805270 DOI: 10.3389/fonc.2023.1338809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 12/13/2023] [Indexed: 01/25/2024] Open
Abstract
Objective This study aims to develop a predictive model for identifying lung cancer patients at elevated risk for bone metastases, utilizing the Systemic Immune-Inflammatory Index (SII) and various tumor markers. This model is expected to facilitate timely and effective therapeutic interventions, especially in the context of the growing significance of immunotherapy for lung cancer treatment. Methods A retrospective analysis was conducted on 324 lung cancer patients treated between January 2019 and January 2021. After applying the inclusion criteria, 241 patients were selected, 56 of whom had bone metastases. The cohort was divided into a training group (169 patients) and a validation group (72 patients) at a 7:3 ratio. Lasso regression was employed to identify critical variables, followed by logistic regression to construct a nomogram model for predicting bone metastases. The model's validity was assessed through internal and external evaluations using the Concordance Index (C-index) and Receiver Operating Characteristic (ROC) curve. Results The study identified several factors influencing bone metastasis in lung cancer: the SII, Carcinoembryonic Antigen (CEA), Neuron Specific Enolase (NSE), Cyfra21-1, and Neutrophil-to-Lymphocyte Ratio (NLR). These factors were incorporated into the nomogram model, which demonstrated high validation accuracy, with C-index scores of 0.936 for internal and 0.924 for external validation. Conclusion The research successfully developed an intuitive and accurate nomogram prediction model utilizing clinical indicators to predict the risk of bone metastases in lung cancer patients. This tool can be instrumental in aiding clinicians in developing personalized treatment plans, thereby optimizing patient outcomes in lung cancer care.
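The abstract does not spell out how SII and NLR are computed. As a hedged illustration, the sketch below uses the definitions commonly given in the literature (SII = platelets × neutrophils / lymphocytes; NLR = neutrophils / lymphocytes, with counts in 10^9 cells/L); that the authors used exactly these formulas is an assumption, and the input values are made up.

```python
def systemic_immune_inflammation_index(platelets, neutrophils, lymphocytes):
    """SII = platelet count x neutrophil count / lymphocyte count.
    All counts are expected in the same unit (e.g. 10^9 cells/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets * neutrophils / lymphocytes


def neutrophil_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR = neutrophil count / lymphocyte count."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


# Hypothetical blood counts in 10^9 cells/L
print(systemic_immune_inflammation_index(250.0, 4.0, 2.0))
print(neutrophil_lymphocyte_ratio(4.0, 2.0))
```

Both indices share the neutrophil-to-lymphocyte term, so they are correlated by construction; this is one reason a penalized selection step such as Lasso is useful before fitting the final logistic model.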
Affiliation(s)
| | | | | | | | - Weiyu Shen
- Department of Thoracic Surgery, Ningbo Medical Center Lihuili Hospital, Ningbo, Zhejiang, China
| |
|
33
|
Zhang L, Richter LR, Kim T, Hripcsak G. Evaluating and Improving the Performance and Racial Fairness of Algorithms for GFR Estimation. medRxiv 2024:2024.01.07.24300943. [PMID: 38260285 PMCID: PMC10802656 DOI: 10.1101/2024.01.07.24300943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Data-driven clinical prediction algorithms are used widely by clinicians. Understanding what factors can impact the performance and fairness of data-driven algorithms is an important step towards achieving equitable healthcare. To investigate the impact of modeling choices on algorithmic performance and fairness, we use a case study in which we build a prediction algorithm for estimating glomerular filtration rate (GFR) based on the patient's electronic health record (EHR). We compare three distinct approaches for estimating GFR: CKD-EPI equations, epidemiological models, and EHR-based models. For epidemiological models and EHR-based models, four machine learning models of varying computational complexity (i.e., linear regression, support vector machine, random forest regression, and neural network) were compared. Performance metrics included root mean squared error (RMSE), median difference, and the proportion of GFR estimates within 30% of the measured GFR value (P30). Differential performance between the non-African American and African American groups was used to assess algorithmic fairness with respect to race. Our study showed that the race variable had a negligible effect on error, accuracy, and differential performance. Furthermore, including more relevant clinical features (e.g., common comorbidities of chronic kidney disease) and using more complex machine learning models, namely random forest regression, significantly lowered the estimation error of GFR. However, the difference in performance between African American and non-African American patients did not decrease: the estimation error for African American patients remained consistently higher than that for non-African American patients, indicating that more objective patient characteristics should be discovered and included to improve algorithm performance.
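Two of the performance metrics named above, RMSE and P30, can be written in a few lines. This is a minimal sketch on made-up numbers rather than the authors' evaluation code; the function names and the example values are illustrative assumptions. Computing the same metrics separately for each group and comparing them gives the kind of differential performance the abstract describes.

```python
import math


def rmse(measured, estimated):
    """Root mean squared error between measured and estimated GFR."""
    squared = [(e - m) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def p30(measured, estimated):
    """Proportion of estimates within 30% of the measured GFR value."""
    within = [abs(e - m) <= 0.3 * m for m, e in zip(measured, estimated)]
    return sum(within) / len(within)


# Hypothetical measured and estimated GFR values (mL/min/1.73 m^2)
measured_gfr = [100.0, 60.0, 90.0]
estimated_gfr = [110.0, 30.0, 95.0]
print(round(rmse(measured_gfr, estimated_gfr), 3))
print(round(p30(measured_gfr, estimated_gfr), 3))
```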
Affiliation(s)
- Linying Zhang
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Institute for Informatics, Data Science, and Biostatistics, Washington University in St. Louis, St. Louis, MO, USA
| | - Lauren R Richter
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
| | - Tevin Kim
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
| | - George Hripcsak
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
| |
|
34
|
Bekbolatova M, Mayer J, Ong CW, Toma M. Transformative Potential of AI in Healthcare: Definitions, Applications, and Navigating the Ethical Landscape and Public Perspectives. Healthcare (Basel) 2024; 12:125. [PMID: 38255014 PMCID: PMC10815906 DOI: 10.3390/healthcare12020125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 12/27/2023] [Accepted: 01/02/2024] [Indexed: 01/24/2024] Open
Abstract
Artificial intelligence (AI) has emerged as a crucial tool in healthcare with the primary aim of improving patient outcomes and optimizing healthcare delivery. By harnessing machine learning algorithms, natural language processing, and computer vision, AI enables the analysis of complex medical data. The integration of AI into healthcare systems aims to support clinicians, personalize patient care, and enhance population health, all while addressing the challenges posed by rising costs and limited resources. As a subdivision of computer science, AI focuses on the development of advanced algorithms capable of performing complex tasks that were once reliant on human intelligence. The ultimate goal is to achieve human-level performance with improved efficiency and accuracy in problem-solving and task execution, thereby reducing the need for human intervention. Various industries, including engineering, media/entertainment, finance, and education, have already reaped significant benefits by incorporating AI systems into their operations. Notably, the healthcare sector has witnessed rapid growth in the utilization of AI technology. Nevertheless, there remains untapped potential for AI to truly revolutionize the industry. It is important to note that despite concerns about job displacement, AI in healthcare should not be viewed as a threat to human workers. Instead, AI systems are designed to augment and support healthcare professionals, freeing up their time to focus on more complex and critical tasks. By automating routine and repetitive tasks, AI can alleviate the burden on healthcare professionals, allowing them to dedicate more attention to patient care and meaningful interactions. However, legal and ethical challenges must be addressed when embracing AI technology in medicine, alongside comprehensive public education to ensure widespread acceptance.
Affiliation(s)
- Molly Bekbolatova
- Department of Osteopathic Manipulative Medicine, College of Osteopathic Medicine, New York Institute of Technology, Old Westbury, NY 11568, USA; (M.B.); (J.M.)
| | - Jonathan Mayer
- Department of Osteopathic Manipulative Medicine, College of Osteopathic Medicine, New York Institute of Technology, Old Westbury, NY 11568, USA; (M.B.); (J.M.)
| | - Chi Wei Ong
- School of Chemistry, Chemical Engineering, and Biotechnology, Nanyang Technological University, 62 Nanyang Drive, Singapore 637459, Singapore
| | - Milan Toma
- Department of Osteopathic Manipulative Medicine, College of Osteopathic Medicine, New York Institute of Technology, Old Westbury, NY 11568, USA; (M.B.); (J.M.)
| |
|
35
|
Li Z, Huang R, Xia M, Patterson TA, Hong H. Fingerprinting Interactions between Proteins and Ligands for Facilitating Machine Learning in Drug Discovery. Biomolecules 2024; 14:72. [PMID: 38254672 PMCID: PMC10813698 DOI: 10.3390/biom14010072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/26/2023] [Accepted: 12/28/2023] [Indexed: 01/24/2024] Open
Abstract
Molecular recognition is fundamental in biology, underpinning intricate processes through specific protein-ligand interactions. This understanding is pivotal in drug discovery, yet traditional experimental methods face limitations in exploring the vast chemical space. Computational approaches, notably quantitative structure-activity/property relationship analysis, have gained prominence. Molecular fingerprints encode molecular structures and serve as property profiles, which are essential in drug discovery. While two-dimensional (2D) fingerprints are commonly used, three-dimensional (3D) structural interaction fingerprints offer enhanced structural features specific to target proteins. Machine learning models trained on interaction fingerprints enable precise binding prediction. Recent focus has shifted to structure-based predictive modeling, with machine-learning scoring functions excelling due to feature engineering guided by key interactions. Notably, 3D interaction fingerprints are gaining ground due to their robustness. Various structural interaction fingerprints have been developed and used in drug discovery, each with unique capabilities. This review recapitulates the developed structural interaction fingerprints and provides two case studies to illustrate the power of interaction fingerprint-driven machine learning. The first elucidates structure-activity relationships in β2 adrenoceptor ligands, demonstrating the ability to differentiate agonists and antagonists. The second employs a retrosynthesis-based pre-trained molecular representation to predict protein-ligand dissociation rates, offering insights into binding kinetics. Despite remarkable progress, challenges persist in interpreting complex machine learning models built on 3D fingerprints, emphasizing the need for strategies to make predictions interpretable. Binding site plasticity and induced fit effects pose additional complexities. Interaction fingerprints are promising but require continued research to harness their full potential.
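To make the idea of an interaction fingerprint concrete, the sketch below represents each protein-ligand complex as a set of "on" bits, where a bit is a (residue, interaction type) pair, and compares two complexes with the Tanimoto coefficient. This is a toy illustration under stated assumptions: the review does not prescribe this encoding or this similarity measure, and the residue labels and interaction types are hypothetical examples.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints stored as sets
    of on-bits. Two empty fingerprints are treated as having similarity 0.0,
    a convention chosen here for simplicity."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


# Hypothetical interaction fingerprints for two ligands in the same pocket
ligand_1 = {("ASP113", "ionic"), ("SER203", "hbond"), ("PHE290", "aromatic")}
ligand_2 = {("ASP113", "ionic"), ("PHE290", "aromatic"), ("ASN312", "hbond")}
print(tanimoto(ligand_1, ligand_2))
```

In a machine learning setting the same sets would usually be flattened into fixed-length binary vectors, one position per possible (residue, interaction type) pair, so that every complex yields a feature vector of equal length.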
Affiliation(s)
- Zoe Li
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA; (Z.L.); (T.A.P.)
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD 20892, USA; (R.H.); (M.X.)
| | - Menghang Xia
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD 20892, USA; (R.H.); (M.X.)
| | - Tucker A. Patterson
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA; (Z.L.); (T.A.P.)
| | - Huixiao Hong
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA; (Z.L.); (T.A.P.)
| |
|
36
|
Berman ME, Lowentritt JE. Chronic kidney disease and value-based care: Lessons from innovation, iteration, and ideation in primary care. Hemodial Int 2024; 28:6-16. [PMID: 37936554 DOI: 10.1111/hdi.13126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 10/15/2023] [Accepted: 10/16/2023] [Indexed: 11/09/2023]
Abstract
Value-based primary care has reduced health care costs, improved the quality of rendered care, and enhanced the patient experience. Value-based care emphasizes prevention, outreach, follow-up, patient engagement, and comprehensive, whole-person health. Primary care Accountable Care Organizations have leveraged technology-enabled workflows, practice transformation, and cutting-edge data and analytics to achieve success. These efforts are increasingly aided by predictive modeling used in the context of patient identification and prioritization algorithms. Value-based kidney care programs can glean salient takeaways from successful value-based primary care methods and models. The kidney care community is experiencing unprecedented transformation as novel payer programs and financial models burgeon. The authors contend these efforts can be accelerated by the adoption of techniques honed in value-based primary care. To optimize value-based kidney care, though, nephrology thought leaders must transcend the archetype of value-based primary care. To do so, the nephrology community must: (1) impel behavioral change among fee-for-service adherents; (2) harness emerging policy, guidelines, and quality measures; (3) adopt innovative tools, technologies, and therapies. By aggregating lessons from value-based primary care, and by leveraging novel methodologies and approaches, the kidney care community will be better equipped to achieve the quadruple aim for kidney care.
|
37
|
Ratnayake I, Pepper S, Anderson A, Alsup A, Mudaranthakam DP. An R Shiny Application (SDOH) for Predictive Modeling Using Regional Social Determinants of Health Survey Responses. Int J Soc Determinants Health Health Serv 2024; 54:21-27. [PMID: 37697462 PMCID: PMC10797831 DOI: 10.1177/27551938231201011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 07/06/2023] [Accepted: 08/01/2023] [Indexed: 09/13/2023]
Abstract
Social determinants of health (SDoH) surveys are data sets that provide useful health-related information about individuals and communities. This study aims to develop a user-friendly web application that allows clinicians to get a predictive insight into the social needs of their patients before their in-patient visits using SDoH survey data to provide an improved and personalized service. The study used a longitudinal survey that consisted of 108,563 patient responses to 12 questions. Questions were designed to have a binary outcome as the response and the patient's most recent responses for each of these questions were modeled independently by incorporating explanatory variables. Multiple classification and regression techniques were used, including logistic regression, Bayesian generalized linear model, extreme gradient boosting, gradient boosting, neural networks, and random forests. Based on the area under the curve values, gradient boosting models provided the highest precision values. Finally, the models were incorporated into an R Shiny application, enabling users to predict and compare the impact of SDoH on patients' lives. The tool is freely hosted online by the University of Kansas Medical Center's Department of Biostatistics and Data Science. The supporting materials for the application are publicly accessible on GitHub.
Affiliation(s)
- Isuru Ratnayake
- Department of Biostatistics & Data Science, The University of Kansas Medical Center, Kansas City, KS, USA
- Sam Pepper
- Department of Biostatistics & Data Science, The University of Kansas Medical Center, Kansas City, KS, USA
- Aliyah Anderson
- Department of Biostatistics & Data Science, The University of Kansas Medical Center, Kansas City, KS, USA
- Alexander Alsup
- PULM Pulmonary and Critical Care Medicine, The University of Kansas Medical Center, Kansas City, KS, USA
38
Bandoli G, Coles C, Kable J, Jones KL, Wertelecki W, Yevtushok L, Zymak-Zakutnya N, Granovska I, Plotka L, Chambers C. Predicting fetal alcohol spectrum disorders in preschool-aged children from early life factors. Alcohol Clin Exp Res (Hoboken) 2024; 48:122-131. [PMID: 38206285 PMCID: PMC10786333 DOI: 10.1111/acer.15233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 11/01/2023] [Accepted: 11/14/2023] [Indexed: 01/12/2024]
Abstract
BACKGROUND Early life factors, including parental sociodemographic characteristics, pregnancy exposures, and physical and neurodevelopmental features measured in infancy are associated with fetal alcohol spectrum disorders (FASD). The objective of this study was to evaluate the performance of a classifier model for diagnosing FASD in preschool-aged children from pregnancy and infancy-related characteristics. METHODS We analyzed a prospective pregnancy cohort in Western Ukraine enrolled between 2008 and 2014. Maternal and paternal sociodemographic factors, maternal prenatal alcohol use and smoking behaviors, reproductive characteristics, birth outcomes, infant alcohol-related dysmorphic and physical features, and infant neurodevelopmental outcomes were used to predict FASD. Data were split into separate training (80%: n = 245) and test (20%: n = 58; 11 FASD, 47 no FASD) datasets. Training data were balanced using data augmentation through a synthetic minority oversampling technique. Four classifier models (random forest, extreme gradient boosting [XGBoost], logistic regression [full model] and backward stepwise logistic regression) were evaluated for accuracy, sensitivity, and specificity in the hold-out sample. RESULTS Of 306 children evaluated for FASD, 61 had a diagnosis. Random forest models had the highest sensitivity (0.54), with accuracy of 0.86 (95% CI: 0.74, 0.94) in hold-out data. Boosted gradient models performed similarly, however, sensitivity was less than 50%. The full logistic regression model performed poorly (sensitivity = 0.18 and accuracy = 0.65), while stepwise logistic regression performed similarly to the boosted gradient model but with lower specificity. In a hold-out sample, the best performing algorithm correctly classified six of 11 children with FASD, and 44 of 47 children without FASD. 
CONCLUSIONS As early identification and treatment optimize outcomes of children with FASD, classifier models from early life characteristics show promise in predicting FASD. Models may be improved through the inclusion of physiologic markers of prenatal alcohol exposure and should be tested in different samples.
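The hold-out figures in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that rebuilds the confusion matrix of the best-performing model from the counts given in the abstract (6 of 11 children with FASD and 44 of 47 children without FASD correctly classified). The helper name `binary_metrics` is hypothetical and is not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of true cases that were detected
    specificity = tn / (tn + fp)          # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts taken from the abstract's hold-out sample (n = 58).
tp, fn = 6, 11 - 6      # children with FASD: detected, missed
tn, fp = 44, 47 - 44    # children without FASD: correctly cleared, falsely flagged

sens, spec, acc = binary_metrics(tp, fn, tn, fp)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

The computed sensitivity (about 0.545) and accuracy (about 0.862) are consistent, up to rounding, with the reported 0.54 and 0.86. The specificity of roughly 0.936 is implied by the counts rather than stated in the abstract.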
Affiliation(s)
- Wladimir Wertelecki
- Department of Pediatrics, University of California San Diego
- OMNI-Net Ukraine Birth Defects Program
- Lyubov Yevtushok
- OMNI-Net Ukraine Birth Defects Program
- Rivne Regional Medical Diagnostic Center, Rivne, Ukraine
- Lviv National Medical University, Lviv, Ukraine
- Natalya Zymak-Zakutnya
- OMNI-Net Ukraine Birth Defects Program
- Khmelnytsky Perinatal Center, Khmelnytsky, Ukraine
- Iryna Granovska
- OMNI-Net Ukraine Birth Defects Program
- Rivne Regional Medical Diagnostic Center, Rivne, Ukraine
- Larysa Plotka
- OMNI-Net Ukraine Birth Defects Program
- Rivne Regional Medical Diagnostic Center, Rivne, Ukraine
39
Ogwel B, Mzazi V, Nyawanda BO, Otieno G, Omore R. Predictive modeling for infectious diarrheal disease in pediatric populations: A systematic review. Learn Health Syst 2024; 8:e10382. [PMID: 38249852 PMCID: PMC10797570 DOI: 10.1002/lrh2.10382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Revised: 07/09/2023] [Accepted: 07/17/2023] [Indexed: 01/23/2024] Open
Abstract
Introduction Diarrhea is still a significant global public health problem. There is currently no systematic evaluation of the modeling areas and approaches to predict diarrheal illness outcomes. This paper reviews existing research efforts in predictive modeling of infectious diarrheal illness in pediatric populations. Methods We conducted a systematic review via a PubMed search for the period 1990-2021. A comprehensive search query was developed through an iterative process and literature on predictive modeling of diarrhea was retrieved. The following filters were applied to the search results: human subjects, English language, and children (birth to 18 years). We carried out a narrative synthesis of the included publications. Results Our literature search returned 2671 articles. After manual evaluation, 38 of these articles were included in this review. The most common research topics among the studies were disease forecasts, 14 (36.8%); vaccine-related predictions, 9 (23.7%); and disease/pathogen detection, 5 (13.2%). The majority of these studies, 28 (73.7%), were published between 2011 and 2020. The most common technique used in the modeling was machine learning, 12 (31.6%), with various algorithms used for the prediction tasks. With the change in the landscape of diarrheal etiology after rotavirus vaccine introduction, many open areas (disease forecasts, disease detection, and strain dynamics) remain for pathogen-specific predictive models among etiological agents that have emerged as important. Additionally, the outcomes of diarrheal illness remain under-researched. We also observed a lack of consistency in the reporting of results of prediction models despite the available guidelines, highlighting the need for common data standards and adherence to guidelines on reporting of predictive models for biomedical research.
Conclusions Our review identified knowledge gaps and opportunities in predictive modeling for diarrheal illness, and limitations in existing attempts whilst advancing some precursory thoughts on how to address them, aiming to invigorate future research efforts in this sphere.
Affiliation(s)
- Billy Ogwel
- Kenya Medical Research Institute, Center for Global Health Research (KEMRI‐CGHR), Kisumu, Kenya
- Department of Information Systems, University of South Africa, Pretoria, South Africa
- Vincent Mzazi
- Department of Information Systems, University of South Africa, Pretoria, South Africa
- Bryan O. Nyawanda
- Kenya Medical Research Institute, Center for Global Health Research (KEMRI‐CGHR), Kisumu, Kenya
- Gabriel Otieno
- Department of Computing, United States International University, Nairobi, Kenya
- Richard Omore
- Kenya Medical Research Institute, Center for Global Health Research (KEMRI‐CGHR), Kisumu, Kenya
40
Daluwatte C, Dvaretskaya M, Ekhtiari S, Hayat P, Montmerle M, Mathur S, Macina D. Development of an algorithm for finding pertussis episodes in a population-based electronic health record database. Hum Vaccin Immunother 2023; 19:2209455. [PMID: 37171155 PMCID: PMC10184588 DOI: 10.1080/21645515.2023.2209455] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/13/2023] Open
Abstract
While tetanus-diphtheria-acellular pertussis (Tdap) vaccines for adolescents and adults were licensed in 2005 and immunization strategies proposed, the burden of pertussis in this population remains under-recognized mainly due to atypical disease presentation, undermining efforts to optimize protection through vaccination. We developed a machine learning algorithm to identify undiagnosed/misdiagnosed pertussis episodes in patients diagnosed with acute respiratory disease (ARD) using signs, diseases and symptoms from clinician notes and demographic information within electronic health-care records (Optum Humedica repository [2007-2019]). We used two patient cohorts aged ≥11 years to develop the model: a positive pertussis cohort (4,515 episodes in 4,316 patients) and a negative pertussis (ARD) cohort (4,573,445 episodes and patients), defined using ICD 9/10 codes. To improve contrast between positive pertussis and negative pertussis (ARD) episodes, only episodes with ≥7 symptoms were selected. LightGBM was used as the machine learning model for pertussis episode identification. Model validity was determined using laboratory-confirmed pertussis positive and negative cohorts. Model explainability was obtained using the Shapley additive explanations method. The predictive performance was as follows: area under the precision-recall curve, 0.24 (SD, 7 × 10⁻³); recall, 0.72 (SD, 4 × 10⁻³); precision, 0.012 (SD, 1 × 10⁻³); and specificity, 0.94 (SD, 7 × 10⁻³). The model applied to laboratory-confirmed positive and negative pertussis episodes had a specificity of 0.846. Predictive probability for pertussis increased with presence of whooping cough, whoop, and post-tussive vomiting in clinician notes, but decreased with gastrointestinal bleeding, sepsis, pulmonary symptoms, and fever. In conclusion, machine learning can help identify pertussis episodes among those diagnosed with ARD.
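The very low precision reported here, alongside good recall and specificity, is what extreme class imbalance would be expected to produce. The Python sketch below derives the precision implied by the reported recall and specificity at a given prevalence. It assumes, purely for illustration, that the evaluation prevalence equals the ratio of the two cohort sizes quoted in the abstract; the abstract notes that episodes were further filtered to those with at least 7 symptoms, so the true evaluation prevalence may differ. The function name `precision_from_rates` is hypothetical.

```python
def precision_from_rates(recall, specificity, prevalence):
    """Precision implied by recall, specificity, and the positive-class prevalence."""
    true_pos = recall * prevalence                          # expected true-positive fraction
    false_pos = (1.0 - specificity) * (1.0 - prevalence)    # expected false-positive fraction
    return true_pos / (true_pos + false_pos)


# Cohort sizes quoted in the abstract (assumed here to approximate prevalence).
positives = 4_515
negatives = 4_573_445
prevalence = positives / (positives + negatives)

precision = precision_from_rates(recall=0.72, specificity=0.94, prevalence=prevalence)
print(f"prevalence={prevalence:.5f} implied precision={precision:.4f}")
```

Under that assumption the implied precision is close to 0.012, in line with the reported value: even a 6% false-positive rate overwhelms the true positives when fewer than 1 in 1,000 episodes is positive.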
Affiliation(s)
- Sachin Mathur
- Digital R&D, Sanofi US Services, Inc, Cambridge, MA, USA
- Denis Macina
- Global Medical, PPH Franchise, Sanofi, Lyon, France
41
Macina D, Mathur S, Dvaretskaya M, Ekhtiari S, Hayat P, Montmerle M, Daluwatte C. Estimating the pertussis burden in adolescents and adults in the United States between 2007 and 2019. Hum Vaccin Immunother 2023; 19:2208514. [PMID: 37171153 PMCID: PMC10184607 DOI: 10.1080/21645515.2023.2208514] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/13/2023] Open
Abstract
We developed a machine learning algorithm to identify undiagnosed pertussis episodes in adolescent and adult patients with reported acute respiratory disease (ARD) using clinician notes in an electronic healthcare record (EHR) database. Here, we utilized the algorithm to better estimate the overall pertussis incidence within the Optum Humedica clinical repository from 1 January 2007 through 31 December 2019. The incidence of diagnosed pertussis episodes was 1-5 per 100,000 annually, consistent with data registered by the US Centers for Disease Control and Prevention (CDC) over the same time period. Among 18,573,496 ARD episodes assessed, 1,053,946 were identified (i.e. algorithm-identified) as likely undiagnosed pertussis episodes. Accounting for these undiagnosed pertussis episodes increased the estimated pertussis incidence by 110-fold on average (34-474 per 100,000 annually). Risk factors for pertussis episodes (diagnosed and algorithm-identified) included asthma (Odds ratio [OR] 2.14; 2.12-2.16), immunodeficiency (OR 1.85; 1.78-1.91), chronic obstructive pulmonary disease (OR 1.63; 1.61-1.65), obesity (OR 1.44; 1.43-1.45), Crohn's disease (OR 1.39; 1.33-1.45), diabetes type 1 (OR 1.21; 1.17-1.24) and type 2 (OR 1.12; 1.1-1.13). Of note, all these risk factors, except Crohn's disease, increased the likelihood of severe pertussis. In conclusion, the incidence of pertussis in the adolescent and adult population in the USA is likely substantial, but considerably under-recognized, highlighting the need for improved clinical awareness of the disease and for improved control strategies in this population. These results will help better inform public health vaccination and booster programs, particularly in those with underlying comorbidities.
Affiliation(s)
- Denis Macina
- Global Medical, PPH Franchise, Sanofi, Lyon, France
- Sachin Mathur
- Digital R&D, Sanofi US Services, Inc, Cambridge, MA, USA
42
Ćirković A, Katz T. Exploring the Potential of ChatGPT-4 in Predicting Refractive Surgery Categorizations: Comparative Study. JMIR Form Res 2023; 7:e51798. [PMID: 38153777 PMCID: PMC10784977 DOI: 10.2196/51798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 11/01/2023] [Accepted: 12/04/2023] [Indexed: 12/29/2023] Open
Abstract
BACKGROUND Refractive surgery research aims to optimally precategorize patients by their suitability for various types of surgery. Recent advances have led to the development of artificial intelligence-powered algorithms, including machine learning approaches, to assess risks and enhance workflow. Large language models (LLMs) like ChatGPT-4 (OpenAI LP) have emerged as potential general artificial intelligence tools that can assist across various disciplines, possibly including refractive surgery decision-making. However, their actual capabilities in precategorizing refractive surgery patients based on real-world parameters remain unexplored. OBJECTIVE This exploratory study aimed to validate ChatGPT-4's capabilities in precategorizing refractive surgery patients based on commonly used clinical parameters. The goal was to assess whether ChatGPT-4's performance when categorizing batch inputs is comparable to those made by a refractive surgeon. A simple binary set of categories (patient suitable for laser refractive surgery or not) as well as a more detailed set were compared. METHODS Data from 100 consecutive patients from a refractive clinic were anonymized and analyzed. Parameters included age, sex, manifest refraction, visual acuity, and various corneal measurements and indices from Scheimpflug imaging. This study compared ChatGPT-4's performance with a clinician's categorizations using Cohen κ coefficient, a chi-square test, a confusion matrix, accuracy, precision, recall, F1-score, and receiver operating characteristic area under the curve. RESULTS A statistically significant noncoincidental accordance was found between ChatGPT-4 and the clinician's categorizations with a Cohen κ coefficient of 0.399 for 6 categories (95% CI 0.256-0.537) and 0.610 for binary categorization (95% CI 0.372-0.792). The model showed temporal instability and response variability, however. 
The chi-square test on 6 categories indicated an association between the 2 raters' distributions (χ²(5)=94.7, P<.001). Here, the accuracy was 0.68, precision 0.75, recall 0.68, and F1-score 0.70. For 2 categories, the accuracy was 0.88, precision 0.88, recall 0.88, F1-score 0.88, and area under the curve 0.79. CONCLUSIONS This study revealed that ChatGPT-4 exhibits potential as a precategorization tool in refractive surgery, showing promising agreement with clinician categorizations. However, its main limitations include, among others, dependency on solely one human rater, small sample size, the instability and variability of ChatGPT's (OpenAI LP) output between iterations and nontransparency of the underlying models. The results encourage further exploration into the application of LLMs like ChatGPT-4 in health care, particularly in decision-making processes that require understanding vast clinical data. Future research should focus on defining the model's accuracy with prompt and vignette standardization, detecting confounding factors, and comparing to other versions of ChatGPT-4 and other LLMs to pave the way for larger-scale validation and real-world implementation.
Affiliation(s)
- Toam Katz
- Department of Ophthalmology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
43
Ghanem M, Ghaith AK, El-Hajj VG, Bhandarkar A, de Giorgio A, Elmi-Terander A, Bydon M. Limitations in Evaluating Machine Learning Models for Imbalanced Binary Outcome Classification in Spine Surgery: A Systematic Review. Brain Sci 2023; 13:1723. [PMID: 38137171 PMCID: PMC10741524 DOI: 10.3390/brainsci13121723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 12/12/2023] [Accepted: 12/15/2023] [Indexed: 12/24/2023] Open
Abstract
Clinical prediction models for spine surgery applications are on the rise, with an increasing reliance on machine learning (ML) and deep learning (DL). Many of the predicted outcomes are uncommon; therefore, to ensure the models' effectiveness in clinical practice it is crucial to properly evaluate them. This systematic review aims to identify and evaluate current research-based ML and DL models applied for spine surgery, specifically those predicting binary outcomes with a focus on their evaluation metrics. Overall, 60 papers were included, and the findings were reported according to the PRISMA guidelines. A total of 13 papers focused on lengths of stay (LOS), 12 on readmissions, 12 on non-home discharge, 6 on mortality, and 5 on reoperations. The target outcomes exhibited data imbalances ranging from 0.44% to 42.4%. A total of 59 papers reported the model's area under the receiver operating characteristic (AUROC), 28 mentioned accuracies, 33 provided sensitivity, 29 discussed specificity, 28 addressed positive predictive value (PPV), 24 included the negative predictive value (NPV), 25 indicated the Brier score with 10 providing a null model Brier, and 8 detailed the F1 score. Additionally, data visualization varied among the included papers. This review discusses the use of appropriate evaluation schemes in ML and identifies several common errors and potential bias sources in the literature. Embracing these recommendations as the field advances may facilitate the integration of reliable and effective ML models in clinical settings.
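One reason the review tracks whether a null-model Brier score was reported is that the raw Brier score depends heavily on how rare the outcome is. As a minimal sketch in Python, the snippet below computes the Brier score of a model that always predicts the event rate, which works out to p(1 - p) for prevalence p, and the accuracy of always predicting the majority class. It uses the two extremes of imbalance quoted in the abstract (0.44% and 42.4%); the function names are hypothetical.

```python
def null_brier(prevalence):
    """Brier score of a constant prediction equal to the event rate.

    With event rate p: p * (1 - p)**2 + (1 - p) * p**2 simplifies to p * (1 - p).
    """
    return prevalence * (1.0 - prevalence)


def majority_accuracy(prevalence):
    """Accuracy obtained by always predicting the more common class."""
    return max(prevalence, 1.0 - prevalence)


# The two extremes of outcome imbalance reported in the review.
for prev in (0.0044, 0.424):
    print(f"prevalence={prev:.4f} "
          f"null Brier={null_brier(prev):.5f} "
          f"majority-class accuracy={majority_accuracy(prev):.4f}")
```

At 0.44% prevalence an uninformative model already reaches a Brier score near 0.004 and an accuracy above 99%, so neither figure is interpretable for a fitted model without the corresponding null-model baseline.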
Affiliation(s)
- Marc Ghanem
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
- Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
- School of Medicine, Lebanese American University, Byblos 4504, Lebanon
- Abdul Karim Ghaith
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
- Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
- Victor Gabriel El-Hajj
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
- Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
- Department of Clinical Neuroscience, Karolinska Institutet, 17177 Stockholm, Sweden
- Archis Bhandarkar
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
- Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
- Andrea de Giorgio
- Artificial Engineering, Via del Rione Sirignano, 80121 Naples, Italy
- Adrian Elmi-Terander
- Department of Clinical Neuroscience, Karolinska Institutet, 17177 Stockholm, Sweden
- Department of Surgical Sciences, Uppsala University, 75236 Uppsala, Sweden
- Mohamad Bydon
- Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
- Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
44
Villanueva P, Yang J, Radmer L, Liang X, Leung T, Ikuma K, Swanner ED, Howe A, Lee J. One-Week-Ahead Prediction of Cyanobacterial Harmful Algal Blooms in Iowa Lakes. Environ Sci Technol 2023; 57:20636-20646. [PMID: 38011382 DOI: 10.1021/acs.est.3c07764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Cyanobacterial harmful algal blooms (CyanoHABs) pose serious risks to inland water resources. Despite advancements in our understanding of associated environmental factors and modeling efforts, predicting CyanoHABs remains challenging. Leveraging an integrated water quality data collection effort in Iowa lakes, this study aimed to identify factors associated with hazardous microcystin levels and develop one-week-ahead predictive classification models. Using water samples from 38 Iowa lakes collected between 2018 and 2021, feature selection was conducted considering both linear and nonlinear properties. Subsequently, we developed three model types (Neural Network, XGBoost, and Logistic Regression) with different sampling strategies using the nine selected variables (mcyA_M, TKN, % hay/pasture, pH, mcyA_M:16S, % developed, DOC, dewpoint temperature, and ortho-P). Evaluation metrics demonstrated the strong performance of the Neural Network with oversampling (ROC-AUC 0.940, accuracy 0.861, sensitivity 0.857, specificity 0.857, LR+ 5.993, and 1/LR- 5.993), as well as the XGBoost with downsampling (ROC-AUC 0.944, accuracy 0.831, sensitivity 0.928, specificity 0.833, LR+ 5.557, and 1/LR- 11.569). This study exhibited the intricacies of modeling with limited data and class imbalances, underscoring the importance of continuous monitoring and data collection to improve predictive accuracy. Also, the methodologies employed can serve as meaningful references for researchers tackling similar challenges in diverse environments.
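The likelihood ratios quoted in this abstract follow directly from the reported sensitivity and specificity. As a small sketch in Python, the snippet below applies the standard definitions, LR+ = sensitivity / (1 - specificity) and 1/LR- = specificity / (1 - sensitivity), to the two models' reported values. The function name `likelihood_ratios` and the dictionary labels are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, 1/LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)       # how much a positive result raises the odds
    inv_lr_neg = specificity / (1.0 - sensitivity)   # how much a negative result lowers the odds
    return lr_pos, inv_lr_neg


# Sensitivity and specificity as reported in the abstract.
reported = {
    "Neural Network, oversampling": (0.857, 0.857),
    "XGBoost, downsampling": (0.928, 0.833),
}

for name, (sens, spec) in reported.items():
    lr_pos, inv_lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+={lr_pos:.3f} 1/LR-={inv_lr_neg:.3f}")
```

The results reproduce the abstract's figures to three decimals (5.993 and 5.993 for the neural network; 5.557 and 11.569 for XGBoost), which also shows why the neural network's two ratios coincide: its sensitivity and specificity are equal.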
Affiliation(s)
- Paul Villanueva
- Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa 50011, United States
- Jihoon Yang
- Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa 50011, United States
- Lorien Radmer
- Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa 50011, United States
- Xuewei Liang
- Department of Civil, Construction and Environmental Engineering, Iowa State University, Ames, Iowa 50011, United States
- Tania Leung
- Department of Geological and Atmospheric Sciences, Iowa State University, Ames, Iowa 50011, United States
- Kaoru Ikuma
- Department of Civil, Construction and Environmental Engineering, Iowa State University, Ames, Iowa 50011, United States
- Elizabeth D Swanner
- Department of Geological and Atmospheric Sciences, Iowa State University, Ames, Iowa 50011, United States
- Adina Howe
- Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa 50011, United States
- Jaejin Lee
- Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa 50011, United States
45
Gorham TJ, Tumin D, Groner J, Allen E, Retzke J, Hersey S, Liu SB, Macias C, Alachraf K, Smith AW, Blount T, Wall B, Crickmore K, Wooten WI, Jamison SD, Rust S. Predicting emergency department visits among children with asthma in two academic medical systems. J Asthma 2023; 60:2137-2144. [PMID: 37318283 DOI: 10.1080/02770903.2023.2225603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 06/11/2023] [Indexed: 06/16/2023]
Abstract
Objective: To develop and validate a predictive algorithm that identifies pediatric patients at risk of asthma-related emergencies, and to test whether algorithm performance can be improved in an external site via local retraining. Methods: In a retrospective cohort at the first site, data from 26 008 patients with asthma aged 2-18 years (2012-2017) were used to develop a lasso-regularized logistic regression model predicting emergency department visits for asthma within one year of a primary care encounter, known as the Asthma Emergency Risk (AER) score. Internal validation was conducted on 8634 patient encounters from 2018. External validation of the AER score was conducted using 1313 pediatric patient encounters from a second site during 2018. The AER score components were then reweighted using logistic regression using data from the second site to improve local model performance. Prediction intervals (PI) were constructed via 10 000 bootstrapped samples. Results: At the first site, the AER score had a cross-validated area under the receiver operating characteristic curve (AUROC) of 0.768 (95% PI: 0.745-0.790) during model training and an AUROC of 0.769 in the 2018 internal validation dataset (p = 0.959). When applied without modification to the second site, the AER score had an AUROC of 0.684 (95% PI: 0.624-0.742). After local refitting, the cross-validated AUROC improved to 0.737 (95% PI: 0.676-0.794; p = 0.037 as compared to initial AUROC). Conclusions: The AER score demonstrated strong internal validity, but external validity was dependent on reweighting model components to reflect local data characteristics at the external site.
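The abstract reports AUROC intervals built from bootstrapped samples but does not say which bootstrap variant was used. The following Python sketch illustrates one common choice, a percentile bootstrap, on purely synthetic scores; the data, the function names, and the smaller number of resamples (500 rather than the study's 10 000) are all assumptions made for illustration and do not reproduce the study's analysis.

```python
import random


def auroc(labels, scores):
    """AUROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_interval(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (an assumed method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample encounters with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:                          # skip resamples missing a class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: about 20% positives whose scores are shifted upward.
data_rng = random.Random(1)
labels = [1 if data_rng.random() < 0.2 else 0 for _ in range(200)]
scores = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auroc(labels, scores)
lower, upper = bootstrap_interval(labels, scores)
print(f"AUROC={point:.3f} 95% interval=({lower:.3f}, {upper:.3f})")
```

With only 1313 encounters at the second site, an interval of this kind is naturally wider than at the development site, which matches the wider range reported for the external validation.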
Affiliation(s)
- Tyler J Gorham
- Information Technology Research & Innovation, Nationwide Children's Hospital, Columbus, OH, USA
- Dmitry Tumin
- Department of Pediatrics, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Judith Groner
- Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Elizabeth Allen
- Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Jessica Retzke
- Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Stephen Hersey
- Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Swan Bee Liu
- Information Technology Research & Innovation, Nationwide Children's Hospital, Columbus, OH, USA
- Charlie Macias
- Quality Improvement Services, Nationwide Children's Hospital, Columbus, OH, USA
- Kamel Alachraf
- Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Aimee W Smith
- Department of Psychology, East Carolina University, Greenville, NC, USA
- William I Wooten
- Department of Pediatrics, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Shaundreal D Jamison
- Department of Pediatrics, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Steve Rust
- Information Technology Research & Innovation, Nationwide Children's Hospital, Columbus, OH, USA
46
Girwar SAM, Fiocco M, Sutch SP, Numans ME, Bruijnzeels MA. Validating and Improving Adjusted Clinical Group's Future Hospitalization and High-Cost Prediction Models for Dutch Primary Care. Popul Health Manag 2023; 26:430-437. [PMID: 37917048 DOI: 10.1089/pop.2023.0162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023] Open
Abstract
The rise in health care costs, caused by older and more complex patient populations, requires Population Health Management approaches including risk stratification. With risk stratification, patients are assigned individual risk scores based on medical records. These patient stratifications focus on future high costs and expensive care utilization such as hospitalization, for which different models exist. With this study, the research team validated the accuracy of risk prediction scores for future hospitalization and high health care costs, calculated by the Adjusted Clinical Group (ACG)'s risk stratification models, using Dutch primary health care data registries. In addition, they aimed to adjust the US-based predictive models for Dutch primary care. The statistical validity of the existing models was assessed. In addition, the underlying prediction models were trained on 95,262 patients' data from the Zoetermeer region and externally validated on data of 48,780 patients from Zeist, Nijkerk, and Urk. Information on age, sex, number of general practitioner visits, International Classification of Primary Care coded information on the diagnosis, and Anatomical Therapeutic Chemical Classification coded information on the prescribed medications was incorporated in the model. C-statistics were used to validate the discriminatory ability of the models. Calibrating ability was assessed by visual inspection of calibration plots. Adjustment of the hospitalization model based on Dutch data improved C-statistics from 0.69 to 0.75, whereas adjustment of the high-cost model improved C-statistics from 0.78 to 0.85, indicating good discrimination of the models. The models also showed good calibration. In conclusion, the local adjustments of the ACG prediction models show great potential for use in Dutch primary care.
Affiliation(s)
- Shelley-Ann M Girwar
- Department of Public Health and Primary Care, Health Campus The Hague, Leiden University Medical Center, The Hague, The Netherlands
- Marta Fiocco
- Mathematical Institute, Leiden University, Leiden, The Netherlands
- Medical Statistics Department of Biomedical Data Science, Leiden University Medical Center, Leiden, The Netherlands
- Trial and Data Center, Princess Maxima Center for Pediatric Oncology, Utrecht, The Netherlands
- Stephen P Sutch
- Department of Public Health and Primary Care, Health Campus The Hague, Leiden University Medical Center, The Hague, The Netherlands
- Department of Health Policy and Management, Bloomberg School of Public Health Johns Hopkins University, Baltimore Maryland, USA
- Mattijs E Numans
- Department of Public Health and Primary Care, Health Campus The Hague, Leiden University Medical Center, The Hague, The Netherlands
- Marc A Bruijnzeels
- Department of Public Health and Primary Care, Health Campus The Hague, Leiden University Medical Center, The Hague, The Netherlands
- Jan van Es Instituut, Ede, The Netherlands
| |
Collapse
|
47
|
Cuevas-Nunez M, Pan A, Sangalli L, Haering HJ, Mitchell JC. Leveraging machine learning to create user-friendly models to mitigate appointment failure at dental school clinics. J Dent Educ 2023; 87:1735-1745. [PMID: 37786254 DOI: 10.1002/jdd.13375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 08/04/2023] [Accepted: 08/26/2023] [Indexed: 10/04/2023]
Abstract
PURPOSE/OBJECTIVES This study had a twofold aim. The first aim was to develop an efficient machine learning (ML) model using data from a dental school clinic (DSC) electronic health record (EHR). This model identified patients with a high likelihood of failing an appointment and provided a user-friendly system with a rating score that would alert clinicians and administrators to patients at high risk of no-show appointments. The second aim was to use ML modeling to identify key factors that contributed to patient no-show appointments. METHODS Using de-identified data from a DSC EHR, eight ML algorithms were evaluated: simple decision tree, bagging regressor classifier, random forest classifier, gradient boosted regression, AdaBoost regression, XGBoost regression, neural network, and logistic regression classifier. The performance of each model was assessed using a confusion matrix at different probability threshold levels; precision, recall, and predicted accuracy at each threshold; the receiver-operating characteristic (ROC) curve and area under the curve (AUC); and the F1 score. RESULTS The ML models agreed on a probability score threshold of 0.20-0.25, with the Bagging classifier performing best with an F1 score of 0.41 and an AUC of 0.76. Results showed a strong correlation between appointment failure and appointment confirmation, patient's age, number of visits before the appointment, total number of prior failed appointments, appointment lead time, and the patient's total number of medical alerts. CONCLUSIONS Altogether, the implementation of this user-friendly ML model can improve DSC workflow, benefiting dental students' learning outcomes and optimizing personalized patient care.
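The threshold-based evaluation described in this abstract can be sketched as follows: predicted probabilities are turned into no-show/show predictions at each candidate threshold, a confusion matrix is tallied, and the threshold with the highest F1 score is kept. All function names and the example data are hypothetical; this is only a sketch of the general technique, not the authors' pipeline.

```python
def confusion_counts(probs, labels, threshold):
    """Return (tp, fp, fn, tn) when predicting 'no-show' for probs >= threshold."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_at_threshold(probs, labels, threshold):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    tp, fp, fn, _ = confusion_counts(probs, labels, threshold)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_threshold(probs, labels, thresholds):
    """Pick the candidate threshold with the highest F1 (first one wins on ties)."""
    return max(thresholds, key=lambda t: f1_at_threshold(probs, labels, t))


# Made-up predicted no-show probabilities and true outcomes (1 = failed appointment).
example_probs = [0.05, 0.10, 0.22, 0.30, 0.15, 0.40, 0.08, 0.27]
example_labels = [0, 0, 1, 1, 0, 1, 0, 0]
example_thresholds = [0.10, 0.20, 0.25, 0.50]

for t in example_thresholds:
    print(t, confusion_counts(example_probs, example_labels, t),
          round(f1_at_threshold(example_probs, example_labels, t), 3))
print("best:", best_threshold(example_probs, example_labels, example_thresholds))
```

With rare events such as missed appointments, the F1-optimal threshold often sits well below 0.5, which is consistent with the 0.20-0.25 range reported above.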
Affiliation(s)
- Maria Cuevas-Nunez
  - College of Dental Medicine-Illinois, Midwestern University, Downers Grove, Illinois, USA
- Allen Pan
  - Midwestern University, Downers Grove, Illinois, USA
- Linda Sangalli
  - College of Dental Medicine-Illinois, Midwestern University, Downers Grove, Illinois, USA
- Harold J Haering
  - College of Dental Medicine-Illinois, Midwestern University, Downers Grove, Illinois, USA
- John C Mitchell
  - College of Dental Medicine-Illinois, Midwestern University, Downers Grove, Illinois, USA
  - College of Dental Medicine-Arizona, Midwestern University, Glendale, Arizona, USA
|
48
|
Bin Sumaida A, Shanbhag NM, Balaraj KS, Puratchipithan R, Hasnain SM, El-Koha O, Hussain A, Binz T, Rajendran VT, Nair RKR, Jaafar NH, Saleh M, Al Qawasmeh K. Understanding the Radiation Dose Variability in Nasopharyngeal Cancer: An Organs-at-Risk Approach. Cureus 2023; 15:e49882. [PMID: 38053989 PMCID: PMC10694485 DOI: 10.7759/cureus.49882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/03/2023] [Indexed: 12/07/2023] Open
Abstract
Objective This study aims to thoroughly assess the radiation dose distribution to critical organs in patients with nasopharyngeal carcinoma, focusing on the correlation between the radiation dosages for the various organs at risk (OARs). Methods We analysed a dataset comprising 38 nasopharyngeal carcinoma patients, focusing on radiation dosages measured in Gray (Gy) and volumetric data in cubic centimetres (cc) of critical organs, including the lens, brainstem, spinal cord, optic nerve, optic chiasm, and cochlea. A detailed exploratory data analysis approach encompassed univariate, bivariate, and multivariate techniques. Results Our analysis revealed several key findings. The mean and median values across various dose measurements were closely aligned, indicating symmetrical distributions with minimal skewness. The histograms further corroborated this, showing evenly distributed dose values across different anatomical regions. The correlation matrix highlighted varying degrees of interrelationships between the doses, with some showing strong correlations while others exhibited minimal or no correlation. The 3D scatter plot provided a view of the multi-dimensional dose relationships, with a specific focus on the spinal cord, lens, and brainstem doses. The bivariate scatter plots revealed symmetrical distributions between the right and left lens doses and more complex relationships involving the brainstem and spinal cord, illustrating the intricacies of dose distribution in radiation therapy. Conclusion Our findings reveal distinct radiation exposure patterns to OARs in nasopharyngeal carcinoma. This research emphasises the need for tailored radiation therapy planning to achieve optimal clinical outcomes while safeguarding vital organs.
Affiliation(s)
- Nandan M Shanbhag
  - Oncology/Palliative Care, Tawam Hospital, Al Ain, ARE
  - Oncology/Radiation Oncology, Tawam Hospital, Al Ain, ARE
  - Internal Medicine, United Arab Emirates University, Al Ain, ARE
- Theresa Binz
  - Radiotherapy Technology, Tawam Hospital, Al Ain, ARE
- Noor H Jaafar
  - Radiotherapy Technology, Tawam Hospital, Al Ain, ARE
|
49
|
Rivera A, Ponce P, Mata O, Molina A, Meier A. Local Weather Station Design and Development for Cost-Effective Environmental Monitoring and Real-Time Data Sharing. Sensors (Basel) 2023; 23:9060. [PMID: 38005448 PMCID: PMC10675263 DOI: 10.3390/s23229060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 10/26/2023] [Accepted: 11/03/2023] [Indexed: 11/26/2023]
Abstract
Current weather monitoring systems often remain out of reach for small-scale users and local communities due to their high costs and complexity. This paper addresses this issue by introducing a cost-effective, easy-to-use local weather station. Utilizing low-cost sensors, this weather station makes environmental monitoring more accessible and user-friendly, particularly for those with limited resources. It offers efficient in situ measurements of various environmental parameters, such as temperature, relative humidity, atmospheric pressure, carbon dioxide concentration, and particulate matter, including PM1, PM2.5, and PM10. The findings demonstrate the station's capability to monitor these variables remotely and provide forecasts with a high degree of accuracy, displaying an error margin of just 0.67%. The station's use of the Autoregressive Integrated Moving Average (ARIMA) model enables short-term, reliable forecasts crucial for applications in agriculture, transportation, and air quality monitoring. Furthermore, the weather station's open-source nature significantly enhances environmental monitoring accessibility for smaller users and encourages broader public data sharing. With this approach, crucial in addressing climate change challenges, the station empowers communities to make informed decisions based on real-time data. In designing and developing this low-cost, efficient monitoring system, this work provides a valuable blueprint for future advancements in environmental technologies, emphasizing sustainability. The proposed automatic weather station not only offers an economical solution for environmental monitoring but also features a user-friendly interface for seamless data communication between the sensor platform and end users. This system ensures the transmission of data through various web-based platforms, catering to users with diverse technical backgrounds. Finally, by leveraging historical data through the ARIMA model, the station enhances its utility in providing short-term forecasts and supporting critical decision-making processes across different sectors.
Affiliation(s)
- Antonio Rivera
  - Institute of Advanced Materials for Sustainable Manufacturing, Tecnologico de Monterrey, Monterrey 14380, Mexico
- Pedro Ponce
  - Institute of Advanced Materials for Sustainable Manufacturing, Tecnologico de Monterrey, Monterrey 14380, Mexico
- Omar Mata
  - Institute of Advanced Materials for Sustainable Manufacturing, Tecnologico de Monterrey, Monterrey 14380, Mexico
- Arturo Molina
  - Institute of Advanced Materials for Sustainable Manufacturing, Tecnologico de Monterrey, Monterrey 14380, Mexico
- Alan Meier
  - Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
|
50
|
Heckner MK, Cieslik EC, Paas Oliveros LK, Eickhoff SB, Patil KR, Langner R. Predicting executive functioning from brain networks: modality specificity and age effects. Cereb Cortex 2023; 33:10997-11009. [PMID: 37782935 PMCID: PMC10646699 DOI: 10.1093/cercor/bhad338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 08/25/2023] [Accepted: 08/26/2023] [Indexed: 10/04/2023] Open
Abstract
Healthy aging is associated with structural and functional network changes in the brain, which have been linked to deterioration in executive functioning (EF), while their neural implementation at the individual level remains unclear. As the biomarker potential of individual resting-state functional connectivity (RSFC) patterns has been questioned, we investigated to what degree individual EF abilities can be predicted from the gray-matter volume (GMV), regional homogeneity, fractional amplitude of low-frequency fluctuations (fALFF), and RSFC within EF-related, perceptuo-motor, and whole-brain networks in young and old adults. We examined whether the differences in out-of-sample prediction accuracy were modality-specific and depended on age or task-demand levels. Both uni- and multivariate analysis frameworks revealed overall low prediction accuracies and moderate-to-weak brain-behavior associations (R² < 0.07, r < 0.28), further challenging the idea of finding meaningful markers for individual EF performance with the metrics used. Regional GMV, well linked to overall atrophy, carried the strongest information about individual EF differences in older adults, whereas fALFF, measuring functional variability, did so for younger adults. Our study calls for future research that analyzes more global properties of the brain and different task states and applies adaptive behavioral testing to yield sensitive predictors for young and older adults, respectively.
Affiliation(s)
- Marisa K Heckner
  - Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich, 52425 Jülich, Germany
  - Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, 40204 Düsseldorf, Germany
- Edna C Cieslik
  - Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich, 52425 Jülich, Germany
  - Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, 40204 Düsseldorf, Germany
- Lya K Paas Oliveros
  - Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich, 52425 Jülich, Germany
  - Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, 40204 Düsseldorf, Germany
- Simon B Eickhoff
  - Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich, 52425 Jülich, Germany
  - Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, 40204 Düsseldorf, Germany
- Kaustubh R Patil
  - Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich, 52425 Jülich, Germany
  - Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, 40204 Düsseldorf, Germany
- Robert Langner
  - Institute of Neuroscience and Medicine (INM-7: Brain and Behaviour), Research Centre Jülich, 52425 Jülich, Germany
  - Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, 40204 Düsseldorf, Germany
|