Reference Citation Analysis

For: Ngufor C, Van Houten H, Caffo BS, Shah ND, McCoy RG. Mixed effect machine learning: A framework for predicting longitudinal change in hemoglobin A1c. J Biomed Inform 2018;89:56-67. [PMID: 30189255 DOI: 10.1016/j.jbi.2018.09.001] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 05/11/2018] [Revised: 08/28/2018] [Accepted: 09/02/2018] [Indexed: 11/26/2022]



Cited by Other Article(s)

Zhang C, Yu X, Zhang B. Assessment of supervised longitudinal learning methods: Insights from predicting low birth weight and very low birth weight using prenatal ultrasound measurements. Comput Biol Med 2024;182:109084. [PMID: 39250874 DOI: 10.1016/j.compbiomed.2024.109084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Revised: 08/17/2024] [Accepted: 08/28/2024] [Indexed: 09/11/2024]

Abstract

BACKGROUND

This study aimed to assess the efficacy of various supervised longitudinal learning approaches, comparing traditional statistical models and machine learning algorithms for prediction with longitudinal data. The primary objectives were to evaluate the predictive performance of different supervised longitudinal learning methods for low birth weight (LBW) and very low birth weight (VLBW) based on prenatal ultrasound measurements. Additionally, the study sought to extract interpretable risk features for disease prediction.

METHODS

The evaluation involved benchmarking the performance of longitudinal models against conventional machine learning methods. Classification accuracy for LBW and VLBW at birth, as well as prediction accuracy for birth weight using prenatal sonographic ultrasound measurements, were assessed.

RESULTS

Among the learning approaches we investigated in this study, the longitudinal machine learning approach, specifically, the mixed effect random forest (MERF), delivered the overall best performance in predicting birthweights and classifying LBW/VLBW disease status.

CONCLUSION

The MERF combined the power of advanced machine learning algorithms to accommodate the inherent within-individual dependence in the observed data, delivering satisfactory performance in predicting the birthweight and classifying LBW/VLBW disease status. The study emphasized the importance of incorporating previous ultrasound measurements and considering correlations between repeated measurements for accurate prediction. The interpretable trees algorithm used for risk feature extraction proved reliable and applicable to other learning algorithms. These findings underscored the potential of longitudinal learning methods in improving birth weight prediction and highlighted the relevance of consistent risk features in line with established literature.


Velez-Arce A, Huang K, Li MM, Lin X, Gao W, Fu T, Kellis M, Pentelute BL, Zitnik M. TDC-2: Multimodal Foundation for Therapeutic Science. bioRxiv: The Preprint Server for Biology 2024:2024.06.12.598655. [PMID: 38948789 PMCID: PMC11212894 DOI: 10.1101/2024.06.12.598655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]

Abstract

Therapeutics Data Commons (tdcommons.ai) is an open science initiative with unified datasets, AI models, and benchmarks to support research across therapeutic modalities and drug discovery and development stages. The Commons 2.0 (TDC-2) is a comprehensive overhaul of Therapeutics Data Commons to catalyze research in multimodal models for drug discovery by unifying single-cell biology of diseases, biochemistry of molecules, and effects of drugs through multimodal datasets, AI-powered API endpoints, new multimodal tasks and model frameworks, and comprehensive benchmarks. TDC-2 introduces over 1,000 multimodal datasets spanning approximately 85 million cells, pre-calculated embeddings from 5 state-of-the-art single-cell models, and a biomedical knowledge graph. TDC-2 drastically expands the coverage of ML tasks across therapeutic pipelines and 10+ new modalities, spanning but not limited to single-cell gene expression data, clinical trial data, peptide sequence data, peptidomimetics protein-peptide interaction data regarding newly discovered ligands derived from AS-MS spectroscopy, novel 3D structural data for proteins, and cell-type-specific protein-protein interaction networks at single-cell resolution. TDC-2 introduces multimodal data access under an API-first design using the model-view-controller paradigm. TDC-2 introduces 7 novel ML tasks with fine-grained biological contexts: contextualized drug-target identification, single-cell chemical/genetic perturbation response prediction, protein-peptide binding affinity prediction task, and clinical trial outcome prediction task, which introduce antigen-processing-pathway-specific, cell-type-specific, peptide-specific, and patient-specific biological contexts. TDC-2 also releases benchmarks evaluating 15+ state-of-the-art models across 5+ new learning tasks evaluating models on diverse biological contexts and sampling approaches. Among these, TDC-2 provides the first benchmark for context-specific learning. TDC-2, to our knowledge, is also the first to introduce a protein-peptide binding interaction benchmark.


Jiang C, Thompson M, Wallace M. Estimating dynamic treatment regimes for ordinal outcomes with household interference: Application in household smoking cessation. Stat Methods Med Res 2024;33:981-995. [PMID: 38623615 PMCID: PMC11334379 DOI: 10.1177/09622802241242313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]

Eghbali-Zarch M, Masoud S. Application of machine learning in affordable and accessible insulin management for type 1 and 2 diabetes: A comprehensive review. Artif Intell Med 2024;151:102868. [PMID: 38632030 DOI: 10.1016/j.artmed.2024.102868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 03/03/2024] [Accepted: 04/03/2024] [Indexed: 04/19/2024]

Naik K, Goyal RK, Foschini L, Chak CW, Thielscher C, Zhu H, Lu J, Lehár J, Pacanowski MA, Terranova N, Mehta N, Korsbo N, Fakhouri T, Liu Q, Gobburu J. Current Status and Future Directions: The Application of Artificial Intelligence/Machine Learning for Precision Medicine. Clin Pharmacol Ther 2024;115:673-686. [PMID: 38103204 DOI: 10.1002/cpt.3152] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/16/2023] [Accepted: 11/28/2023] [Indexed: 12/18/2023]

Somé NH, Noormohammadpour P, Lange S. The use of machine learning on administrative and survey data to predict suicidal thoughts and behaviors: a systematic review. Front Psychiatry 2024;15:1291362. [PMID: 38501090 PMCID: PMC10944962 DOI: 10.3389/fpsyt.2024.1291362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Accepted: 02/12/2024] [Indexed: 03/20/2024]

Abstract

Background

Machine learning is a promising tool in the area of suicide prevention due to its ability to combine the effects of multiple risk factors and complex interactions. The power of machine learning has led to an influx of studies on suicide prediction, as well as a few recent reviews. Our study distinguished between data sources and reported the most important predictors of suicide outcomes identified in the literature.

Objective

Our study aimed to identify studies that applied machine learning techniques to administrative and survey data, summarize performance metrics reported in those studies, and enumerate the important risk factors of suicidal thoughts and behaviors identified.

Methods

A systematic literature search of PubMed, Medline, Embase, PsycINFO, Web of Science, Cumulative Index to Nursing and Allied Health Literature (CINAHL), and Allied and Complementary Medicine Database (AMED) to identify all studies that have used machine learning to predict suicidal thoughts and behaviors using administrative and survey data was performed. The search was conducted for articles published between January 1, 2019 and May 11, 2022. In addition, all articles identified in three recently published systematic reviews (the last of which included studies up until January 1, 2019) were retained if they met our inclusion criteria. The predictive power of machine learning methods in predicting suicidal thoughts and behaviors was explored using box plots to summarize the distribution of the area under the receiver operating characteristic curve (AUC) values by machine learning method and suicide outcome (i.e., suicidal thoughts, suicide attempt, and death by suicide). Mean AUCs with 95% confidence intervals (CIs) were computed for each suicide outcome by study design, data source, total sample size, sample size of cases, and machine learning methods employed. The most important risk factors were listed.

Results

The search strategy identified 2,200 unique records, of which 104 articles met the inclusion criteria. Machine learning algorithms achieved good prediction of suicidal thoughts and behaviors (i.e., an AUC between 0.80 and 0.89); however, their predictive power appears to differ across suicide outcomes. The boosting algorithms achieved good prediction of suicidal thoughts, death by suicide, and all suicide outcomes combined, while neural network algorithms achieved good prediction of suicide attempts. The risk factors for suicidal thoughts and behaviors differed depending on the data source and the population under study.

Conclusion

The predictive utility of machine learning for suicidal thoughts and behaviors largely depends on the approach used. The findings of the current review should prove helpful in preparing future machine learning models using administrative and survey data.

Systematic review registration

https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42022333454, identifier CRD42022333454.


McCoy RG, Faust L, Heien HC, Patel S, Caffo B, Ngufor C. Longitudinal trajectories of glycemic control among U.S. Adults with newly diagnosed diabetes. Diabetes Res Clin Pract 2023;205:110989. [PMID: 37918637 PMCID: PMC10842883 DOI: 10.1016/j.diabres.2023.110989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 09/27/2023] [Accepted: 10/31/2023] [Indexed: 11/04/2023]

Salditt M, Nestler S. Parametric and nonparametric propensity score estimation in multilevel observational studies. Stat Med 2023;42:4147-4176. [PMID: 37532119 DOI: 10.1002/sim.9852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Revised: 05/16/2023] [Accepted: 07/10/2023] [Indexed: 08/04/2023]

Giuffrè M, Shung DL. Harnessing the power of synthetic data in healthcare: innovation, application, and privacy. NPJ Digit Med 2023;6:186. [PMID: 37813960 PMCID: PMC10562365 DOI: 10.1038/s41746-023-00927-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 04/15/2023] [Accepted: 09/14/2023] [Indexed: 10/11/2023]

Salditt M, Humberg S, Nestler S. Gradient Tree Boosting for Hierarchical Data. Multivariate Behavioral Research 2023;58:911-937. [PMID: 36602080 DOI: 10.1080/00273171.2022.2146638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]

Mangino AA, Bolin JH, Finch WH. Fixed Effects or Mixed Effects Classifiers? Evidence From Simulated and Archival Data. Educational and Psychological Measurement 2023;83:710-739. [PMID: 37398843 PMCID: PMC10311958 DOI: 10.1177/00131644221108180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]

Langworthy B, Wu Y, Wang M. An overview of propensity score matching methods for clustered data. Stat Methods Med Res 2023;32:641-655. [PMID: 36426585 PMCID: PMC10119899 DOI: 10.1177/09622802221133556] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 11/27/2022]

Pham K, Ray AW, Fernstrum AJ, Alfahmy A, Ray S, Hijaz AK, Ju M, Sheyn D. Development of a machine learning-based predictive model for prediction of success or failure of medical management for benign prostatic hyperplasia. Neurourol Urodyn 2023;42:707-717. [PMID: 36826466 DOI: 10.1002/nau.25162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 01/24/2023] [Accepted: 02/11/2023] [Indexed: 02/25/2023]

Hu J, Szymczak S. A review on longitudinal data analysis with random forest. Brief Bioinform 2023;24:6991123. [PMID: 36653905 PMCID: PMC10025446 DOI: 10.1093/bib/bbad002] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Received: 08/31/2022] [Revised: 12/12/2022] [Accepted: 12/31/2022] [Indexed: 01/20/2023]

Mosquera-Lopez C, Ramsey KL, Roquemen-Echeverri V, Jacobs PG. Modeling risk of hypoglycemia during and following physical activity in people with type 1 diabetes using explainable mixed-effects machine learning. Comput Biol Med 2023;155:106670. [PMID: 36803791 DOI: 10.1016/j.compbiomed.2023.106670] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/29/2022] [Revised: 01/19/2023] [Accepted: 02/10/2023] [Indexed: 02/13/2023]

Abstract

BACKGROUND

Physical activity (PA) can cause increased hypoglycemia (glucose <70 mg/dL) risk in people with type 1 diabetes (T1D). We modeled the probability of hypoglycemia during and up to 24 h following PA and identified key factors associated with hypoglycemia risk.

METHODS

We leveraged a free-living dataset from Tidepool comprising glucose measurements, insulin doses, and PA data from 50 individuals with T1D (6448 sessions) for training and validating machine learning models. We also used data from the T1Dexi pilot study, which contains glucose management and PA data from 20 individuals with T1D (139 sessions), to assess the accuracy of the best performing model on an independent test dataset. We used mixed-effects logistic regression (MELR) and mixed-effects random forest (MERF) to model hypoglycemia risk around PA. We identified risk factors associated with hypoglycemia using odds ratio and partial dependence analysis for the MELR and MERF models, respectively. Prediction accuracy was measured using the area under the receiver operating characteristic curve (AUROC).

RESULTS

The analysis identified risk factors significantly associated with hypoglycemia during and following PA in both MELR and MERF models including glucose and body exposure to insulin at the start of PA, low blood glucose index 24 h prior to PA, and PA intensity and timing. Both models showed overall hypoglycemia risk peaking 1 h after PA and again 5-10 h after PA, which is consistent with the hypoglycemia risk pattern observed in the training dataset. Time following PA impacted hypoglycemia risk differently across different PA types. Accuracy of hypoglycemia prediction using the fixed effects of the MERF model was highest when predicting hypoglycemia during the first hour following the start of PA (validation AUROC = 0.83 and testing AUROC = 0.86) and decreased when predicting hypoglycemia in the 24 h after PA (validation AUROC = 0.66 and testing AUROC = 0.68).

CONCLUSION

Hypoglycemia risk after the start of PA can be modeled using mixed-effects machine learning to identify key risk factors that may be used within decision support and insulin delivery systems. We published the population-level MERF model online for others to use.
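
As an illustration of the AUROC metric reported in this abstract, the following is a minimal sketch of the rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): 1 = hypoglycemia event, 0 = no event.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
print(auroc(labels, scores))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a sort-based formulation would be preferable for large datasets.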


Cakar S, Yavuz FG. Hybrid statistical and machine learning modeling of cognitive neuroscience data. J Appl Stat 2023;51:1076-1097. [PMID: 38628450 PMCID: PMC11018039 DOI: 10.1080/02664763.2023.2176834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 01/31/2023] [Indexed: 02/18/2023]

Synthetic data in health care: A narrative review. PLOS Digital Health 2023;2:e0000082. [PMID: 36812604 PMCID: PMC9931305 DOI: 10.1371/journal.pdig.0000082] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 06/29/2022] [Accepted: 12/06/2022] [Indexed: 01/09/2023]

Lou YS, Lin CS, Fang WH, Lee CC, Wang CH, Lin C. Development and validation of a dynamic deep learning algorithm using electrocardiogram to predict dyskalaemias in patients with multiple visits. European Heart Journal. Digital Health 2022;4:22-32. [PMID: 36743876 PMCID: PMC9890087 DOI: 10.1093/ehjdh/ztac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 10/26/2022] [Indexed: 11/23/2022]

Hu L, Ji J, Liu H, Ennis R. A Flexible Approach for Assessing Heterogeneity of Causal Treatment Effects on Patient Survival Using Large Datasets with Clustered Observations. International Journal of Environmental Research and Public Health 2022;19:14903. [PMID: 36429621 PMCID: PMC9690785 DOI: 10.3390/ijerph192214903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 11/08/2022] [Accepted: 11/09/2022] [Indexed: 06/16/2023]

Fuh-Ngwa V, Zhou Y, Melton PE, van der Mei I, Charlesworth JC, Lin X, Zarghami A, Broadley SA, Ponsonby AL, Simpson-Yap S, Lechner-Scott J, Taylor BV. Ensemble machine learning identifies genetic loci associated with future worsening of disability in people with multiple sclerosis. Sci Rep 2022;12:19291. [PMID: 36369345 PMCID: PMC9652373 DOI: 10.1038/s41598-022-23685-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 11/03/2022] [Indexed: 11/13/2022]

Affiliation(s)

Valery Fuh-Ngwa: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia
Yuan Zhou: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia
Phillip E. Melton: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia
Ingrid van der Mei: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia
Jac C. Charlesworth: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia
Xin Lin: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia
Amin Zarghami: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia
Simon A. Broadley: Menzies Health Institute Queensland and School of Medicine, Griffith University Gold Coast, G40 Griffith Health Centre, QLD 4222, Australia
Anne-Louise Ponsonby: Developing Brain Division, The Florey Institute for Neuroscience and Mental Health, Royal Children’s Hospital, University of Melbourne Murdoch Children’s Research Institute, Parkville, VIC 3052 Australia
Steve Simpson-Yap: Neuroepidemiology Unit, Melbourne School of Population & Global Health, The University of Melbourne, Melbourne, VIC 3053 Australia
Jeannette Lechner-Scott: Department of Neurology, Hunter Medical Research Institute, Hunter New England Health, University of Newcastle, Callaghan, NSW 2310 Australia
Bruce V. Taylor: Menzies Institute for Medical Research, University of Tasmania, 17 Liverpool St, Hobart, TAS 7000 Australia


Sokhansanj BA, Rosen GL. Predicting COVID-19 disease severity from SARS-CoV-2 spike protein sequence by mixed effects machine learning. Comput Biol Med 2022;149:105969. [PMID: 36041271 PMCID: PMC9384346 DOI: 10.1016/j.compbiomed.2022.105969] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/06/2022] [Revised: 07/11/2022] [Accepted: 08/13/2022] [Indexed: 11/17/2022]

Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification. Stat Method Appl-Ger 2022. [DOI: 10.1007/s10260-022-00658-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]

Shazly SA, Borah BJ, Ngufor CG, Torbenson VE, Theiler RN, Famuyide AO. Impact of labor characteristics on maternal and neonatal outcomes of labor: A machine-learning model. PLoS One 2022;17:e0273178. [PMID: 35994474 PMCID: PMC9394788 DOI: 10.1371/journal.pone.0273178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Accepted: 08/01/2022] [Indexed: 11/18/2022]

Abstract

Introduction

Since Friedman’s seminal publication on laboring women, numerous publications have sought to define normal labor progress. However, there is paucity of data on contemporary labor cervicometry incorporating both maternal and neonatal outcomes. The objective of this study is to establish intrapartum prediction models of unfavorable labor outcomes using machine-learning algorithms.

Materials and methods

Consortium on Safe Labor is a large database consisting of pregnancy and labor characteristics from 12 medical centers in the United States. Outcomes, including maternal and neonatal outcomes, were retrospectively collected. We defined primary outcome as the composite of following unfavorable outcomes: cesarean delivery in active labor, postpartum hemorrhage, intra-amniotic infection, shoulder dystocia, neonatal morbidity, and mortality. Clinical and obstetric parameters at admission and during labor progression were used to build machine-learning risk-prediction models based on the gradient boosting algorithm.

Results

Of 228,438 delivery episodes, 66,586 were eligible for this study. Mean maternal age was 26.95 ± 6.48 years, mean parity was 0.92 ± 1.23, and mean gestational age was 39.35 ± 1.13 weeks. Unfavorable labor outcome was reported in 14,439 (21.68%) deliveries. Starting at a cervical dilation of 4 cm, the area under receiver operating characteristics curve (AUC) of prediction models increased from 0.75 (95% confidence interval, 0.75–0.75) to 0.89 (95% confidence interval, 0.89–0.90) at a dilation of 10 cm. Baseline labor risk score was above 35% in patients with unfavorable outcomes compared to women with favorable outcomes, whose score was below 25%.

Conclusion

Labor risk score is a machine-learning–based score that provides individualized and dynamic alternatives to conventional labor charts. It predicts composite of adverse birth, maternal, and neonatal outcomes as labor progresses. Therefore, it can be deployed in clinical practice to monitor labor progress in real time and support clinical decisions.

Bi Q, Kuang Z, Haihong E, Song M, Tan L, Tang X, Liu X. Research on early warning of renal damage in hypertensive patients based on the stacking strategy. BMC Med Inform Decis Mak 2022;22:212. [PMID: 35945608 PMCID: PMC9361646 DOI: 10.1186/s12911-022-01889-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2020] [Accepted: 03/31/2022] [Indexed: 11/26/2022]

Abstract

Background

Among the problems caused by hypertension, early renal damage is often ignored. It cannot be diagnosed until the condition is severe and irreversible damage has occurred. We therefore screened and explored risk factors for early renal damage in hypertensive patients and established an early-warning model of renal damage based on data mining, with the aim of achieving early diagnosis of renal damage in hypertensive patients.

Methods

With the aid of an electronic information management system for hypertensive out-patients, we collected 513 cases of original, untreated hypertensive patients. We recorded their demographic data, ambulatory blood pressure parameters, blood routine index, and blood biochemical index to establish the clinical database. We then screened risk factors for early renal damage through feature engineering and used Random Forest, Extra-Trees, and XGBoost to build separate early-warning models. Finally, we built a new model by model fusion based on the Stacking strategy. We used cross-validation to evaluate the stability and reliability of each model and to determine the best risk assessment model.

Results

In descending order of importance, the features selected by feature engineering are the drop rate of systolic blood pressure at night, the red blood cell distribution width, blood pressure circadian rhythm, the average diastolic blood pressure at daytime, body surface area, smoking, age, and HDL. The average precision of the two-dimensional fusion model based on the Stacking strategy is 0.89685 with full features and 0.93824 with selected features, a substantial improvement.

Conclusions

Through feature engineering and risk factor analysis, we select the drop rate of systolic blood pressure at night, the red blood cell distribution width, blood pressure circadian rhythm, and the average diastolic blood pressure at daytime as early-warning factors of early renal damage in patients with hypertension. On this basis, the two-dimensional fusion model based on the Stacking strategy has a better effect than the single model, which can be used for risk assessment of early renal damage in hypertensive patients.


Baurley JW, Claus ED, Witkiewitz K, McMahan CS. A Bayesian mixed effects support vector machine for learning and predicting daily substance use disorder patterns. The American Journal of Drug and Alcohol Abuse 2022;48:413-421. [PMID: 35196194 DOI: 10.1080/00952990.2021.2024839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Revised: 12/27/2021] [Accepted: 12/29/2021] [Indexed: 06/14/2023]

Marsch LA, Chen CH, Adams SR, Asyyed A, Does MB, Hassanpour S, Hichborn E, Jackson-Morris M, Jacobson NC, Jones HK, Kotz D, Lambert-Harris CA, Li Z, McLeman B, Mishra V, Stanger C, Subramaniam G, Wu W, Campbell CI. The Feasibility and Utility of Harnessing Digital Health to Understand Clinical Trajectories in Medication Treatment for Opioid Use Disorder: D-TECT Study Design and Methodological Considerations. Front Psychiatry 2022;13:871916. [PMID: 35573377 PMCID: PMC9098973 DOI: 10.3389/fpsyt.2022.871916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Accepted: 03/22/2022] [Indexed: 11/13/2022]

Abstract

Introduction

Across the U.S., the prevalence of opioid use disorder (OUD) and the rates of opioid overdoses have risen precipitously in recent years. Several effective medications for OUD (MOUD) exist and have been shown to be life-saving. A large volume of research has identified a confluence of factors that predict attrition and continued substance use during substance use disorder treatment. However, much of this literature has examined a small set of potential moderators or mediators of outcomes in MOUD treatment and may lead to over-simplified accounts of treatment non-adherence. Digital health methodologies offer great promise for capturing intensive, longitudinal ecologically-valid data from individuals in MOUD treatment to extend our understanding of factors that impact treatment engagement and outcomes.

Methods

This paper describes the protocol (including the study design and methodological considerations) from a novel study supported by the National Drug Abuse Treatment Clinical Trials Network at the National Institute on Drug Abuse (NIDA). This study (D-TECT) primarily seeks to evaluate the feasibility of collecting ecological momentary assessment (EMA), smartphone and smartwatch sensor data, and social media data among patients in outpatient MOUD treatment. It secondarily seeks to examine the utility of EMA, digital sensing, and social media data (separately and compared to one another) in predicting MOUD treatment retention, opioid use events, and medication adherence [as captured in electronic health records (EHR) and EMA data]. To our knowledge, this is the first project to include all three sources of digitally derived data (EMA, digital sensing, and social media) in understanding the clinical trajectories of patients in MOUD treatment. These multiple data streams will allow us to understand the relative and combined utility of collecting digital data from these diverse data sources. The inclusion of EHR data allows us to focus on the utility of digital health data in predicting objectively measured clinical outcomes.

Discussion

Results may be useful in elucidating novel relations between digital data sources and OUD treatment outcomes. They may also inform approaches to enhancing outcomes measurement in clinical trials by allowing for the assessment of dynamic interactions between individuals' daily lives and their MOUD treatment response.

Clinical Trial Registration

Identifier: NCT04535583.

Affiliation(s)

Lisa A. Marsch Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Ching-Hua Chen Center for Computational Health, International Business Machines (IBM) Research, Yorktown Heights, NY, United States
Sara R. Adams Division of Research Kaiser Permanente Northern California, Oakland, CA, United States
Asma Asyyed The Permanente Medical Group, Northern California, Addiction Medicine and Recovery Services, Oakland, CA, United States
Monique B. Does Division of Research Kaiser Permanente Northern California, Oakland, CA, United States
Saeed Hassanpour Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Emily Hichborn Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Melanie Jackson-Morris Division of Research Kaiser Permanente Northern California, Oakland, CA, United States
Nicholas C. Jacobson Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Heather K. Jones Division of Research Kaiser Permanente Northern California, Oakland, CA, United States
David Kotz Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States Department of Computer Science, Dartmouth College, Hanover, NH, United States
Chantal A. Lambert-Harris Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Zhiguo Li Center for Computational Health, International Business Machines (IBM) Research, Yorktown Heights, NY, United States
Bethany McLeman Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Varun Mishra Khoury College of Computer Sciences, Northeastern University, Boston, MA, United States Department of Health Sciences, Bouvé College of Health Sciences, Northeastern University, Boston, MA, United States
Catherine Stanger Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Geetha Subramaniam Center for Clinical Trials Network, National Institute on Drug Abuse, Bethesda, MD, United States
Weiyi Wu Center for Technology and Behavioral Health, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States
Cynthia I. Campbell Division of Research Kaiser Permanente Northern California, Oakland, CA, United States Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, San Francisco, CA, United States

Mousavi A, Zare H, Asadian A, Mohammadzadeh M. Factors Affecting the Product Life Cycle of Generic Medicines. IRANIAN JOURNAL OF PHARMACEUTICAL RESEARCH 2022;21:e127039. [PMID: 36060917 PMCID: PMC9420220 DOI: 10.5812/ijpr-127039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/28/2021] [Revised: 01/24/2022] [Accepted: 02/13/2022] [Indexed: 11/26/2022]

Bayesian Nonlinear Models for Repeated Measurement Data: An Overview, Implementation, and Applications. MATHEMATICS 2022. [DOI: 10.3390/math10060898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]

Cao X, Yang G, Jin X, He L, Li X, Zheng Z, Liu Z, Wu C. A Machine Learning-Based Aging Measure Among Middle-Aged and Older Chinese Adults: The China Health and Retirement Longitudinal Study. Front Med (Lausanne) 2021;8:698851. [PMID: 34926482 PMCID: PMC8671693 DOI: 10.3389/fmed.2021.698851] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2021] [Accepted: 10/28/2021] [Indexed: 11/13/2022] Open

Shishegar R, Cox T, Rolls D, Bourgeat P, Doré V, Lamb F, Robertson J, Laws SM, Porter T, Fripp J, Tosun D, Maruff P, Savage G, Rowe CC, Masters CL, Weiner MW, Villemagne VL, Burnham SC. Using imputation to provide harmonized longitudinal measures of cognition across AIBL and ADNI. Sci Rep 2021;11:23788. [PMID: 34893624 PMCID: PMC8664816 DOI: 10.1038/s41598-021-02827-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2021] [Accepted: 11/12/2021] [Indexed: 12/12/2022] Open

Affiliation(s)

Rosita Shishegar The Australian e-Health Research Centre, CSIRO, Melbourne, Australia; School of Psychological Sciences and Turner Institute for Brain and Mental Health, Monash University, Melbourne, Australia
Timothy Cox The Australian e-Health Research Centre, CSIRO, Melbourne, Australia
David Rolls The Australian e-Health Research Centre, CSIRO, Melbourne, Australia
Pierrick Bourgeat The Australian e-Health Research Centre, CSIRO, Melbourne, Australia
Vincent Doré The Australian e-Health Research Centre, CSIRO, Melbourne, Australia; Department of Molecular Imaging and Therapy, Austin Health, Heidelberg, VIC, Australia
Fiona Lamb Department of Molecular Imaging and Therapy, Austin Health, Heidelberg, VIC, Australia
Joanne Robertson Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Parkville, VIC, Australia
Simon M Laws Centre for Precision Health, Edith Cowan University, Joondalup, WA, Australia; Collaborative Genomics and Translation Group, School of Medical and Health Sciences, Edith Cowan University, Joondalup, WA, Australia; School of Pharmacy and Biomedical Sciences, Faculty of Health Sciences, Curtin Health Innovation Research Institute, Curtin University, Bentley, WA, Australia
Tenielle Porter Centre for Precision Health, Edith Cowan University, Joondalup, WA, Australia; Collaborative Genomics and Translation Group, School of Medical and Health Sciences, Edith Cowan University, Joondalup, WA, Australia; School of Pharmacy and Biomedical Sciences, Faculty of Health Sciences, Curtin Health Innovation Research Institute, Curtin University, Bentley, WA, Australia
Jurgen Fripp The Australian e-Health Research Centre, CSIRO, Melbourne, Australia
Duygu Tosun Department of Radiology and Biomedical Imaging, University of California-San Francisco, San Francisco, CA, USA
Paul Maruff Cogstate Ltd., Melbourne, VIC, Australia
Greg Savage Department of Psychology, Macquarie University, Sydney, NSW, Australia
Christopher C Rowe Department of Molecular Imaging and Therapy, Austin Health, Heidelberg, VIC, Australia; Department of Medicine, The University of Melbourne, Parkville, VIC, 3052, Australia
Colin L Masters Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Parkville, VIC, Australia
Michael W Weiner Department of Radiology and Biomedical Imaging, University of California-San Francisco, San Francisco, CA, USA
Victor L Villemagne Department of Molecular Imaging and Therapy, Austin Health, Heidelberg, VIC, Australia; Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
Samantha C Burnham The Australian e-Health Research Centre, CSIRO, Melbourne, Australia

Mangino AA, Finch WH. Prediction With Mixed Effects Models: A Monte Carlo Simulation Study. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 2021;81:1118-1142. [PMID: 34565818 PMCID: PMC8451021 DOI: 10.1177/0013164421992818] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]

Birk N, Matsuzaki M, Fung TT, Li Y, Batis C, Stampfer MJ, Deitchler M, Willett WC, Fawzi WW, Bromage S, Kinra S, Bhupathiraju SN, Lake E. Exploration of Machine Learning and Statistical Techniques in Development of a Low-Cost Screening Method Featuring the Global Diet Quality Score for Detecting Prediabetes in Rural India. J Nutr 2021;151:110S-118S. [PMID: 34689190 PMCID: PMC8542097 DOI: 10.1093/jn/nxab281] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2021] [Revised: 07/26/2021] [Accepted: 08/02/2021] [Indexed: 12/03/2022] Open

Abstract

BACKGROUND

The prevalence of type 2 diabetes has increased substantially in India over the past 3 decades. Undiagnosed diabetes presents a public health challenge, especially in rural areas, where access to laboratory testing for diagnosis may not be readily available.

OBJECTIVES

The present work explores the use of several machine learning and statistical methods in the development of a predictive tool to screen for prediabetes using survey data from an FFQ to compute the Global Diet Quality Score (GDQS).

METHODS

The outcome variable prediabetes status (yes/no) used throughout this study was determined based upon a fasting blood glucose measurement ≥100 mg/dL. The algorithms utilized included the generalized linear model (GLM), random forest, least absolute shrinkage and selection operator (LASSO), elastic net (EN), and generalized linear mixed model (GLMM) with family unit as a (cluster) random (intercept) effect to account for intrafamily correlation. Model performance was assessed on held-out test data, and comparisons made with respect to area under the receiver operating characteristic curve (AUC), sensitivity, and specificity.
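As a minimal sketch of the evaluation described above, the following pure-Python example labels the outcome from a fasting blood glucose threshold of 100 mg/dL and computes sensitivity, specificity, and AUC on a held-out set. The data values, the 0.5 classification cutoff, and all function names are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def label_prediabetes(fasting_glucose_mg_dl, threshold=100.0):
    """Return 1 if fasting glucose is at or above the threshold, else 0."""
    return int(fasting_glucose_mg_dl >= threshold)


def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out test data (illustrative only).
glucose = [92, 105, 99, 130, 100, 88]              # fasting glucose in mg/dL
y_true = [label_prediabetes(g) for g in glucose]    # outcome labels
scores = [0.20, 0.70, 0.55, 0.90, 0.40, 0.10]       # predicted risk from some model
y_pred = [int(s >= 0.5) for s in scores]            # assumed 0.5 classification cutoff

sens, spec = sensitivity_specificity(y_true, y_pred)
model_auc = auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={model_auc:.2f}")
```

The pairwise AUC definition is quadratic in sample size. It is used here for transparency rather than efficiency.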

RESULTS

The GLMM, GLM, LASSO, and random forest modeling techniques each performed quite well (AUCs >0.70) and included the GDQS food groups and age, among other predictors. The fully adjusted GLMM, which included a random intercept for family unit, achieved slightly superior results (AUC of 0.72) in classifying the prediabetes outcome in these cluster-correlated data.

CONCLUSIONS

The models presented in the current work show promise in identifying individuals at risk of developing diabetes, although further studies are necessary to assess other potentially impactful predictors, as well as the consistency and generalizability of model performance. In addition, future studies to examine the utility of the GDQS in screening for other noncommunicable diseases are recommended.

Affiliation(s)

Nick Birk Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, University of London, London, United Kingdom
Mika Matsuzaki Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Teresa T Fung Nutrition Department, Simmons University, Boston, MA, USA
Yanping Li Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
Carolina Batis CONACYT—Health and Nutrition Research Center, National Institute of Public Health, Cuernavaca, Mexico
Meir J Stampfer Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Megan Deitchler Intake—Center for Dietary Assessment, FHI Solutions, Washington, DC, USA
Walter C Willett Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Wafaie W Fawzi Department of Global Health and Population, Harvard TH Chan School of Public Health, Boston, MA, USA
Sabri Bromage Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA
Sanjay Kinra Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, University of London, London, United Kingdom
Shilpa N Bhupathiraju Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Erin Lake Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA

Nguyen P, Ohnmacht AJ, Galhoz A, Büttner M, Theis F, Menden MP. Künstliche Intelligenz und maschinelles Lernen in der Diabetesforschung. DIABETOLOGE 2021. [DOI: 10.1007/s11428-021-00817-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Cai C, Tafti AP, Ngufor C, Zhang P, Xiao P, Dai M, Liu H, Noseworthy P, Chen M, Friedman PA, Cha YM. Using ensemble of ensemble machine learning methods to predict outcomes of cardiac resynchronization. J Cardiovasc Electrophysiol 2021;32:2504-2514. [PMID: 34260141 DOI: 10.1111/jce.15171] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/24/2021] [Revised: 05/08/2021] [Accepted: 06/14/2021] [Indexed: 11/29/2022]

Speiser JL. A random forest method with feature selection for developing medical prediction models with clustered and longitudinal data. J Biomed Inform 2021;117:103763. [PMID: 33781921 PMCID: PMC8131242 DOI: 10.1016/j.jbi.2021.103763] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2020] [Revised: 03/03/2021] [Accepted: 03/23/2021] [Indexed: 12/22/2022]

Abstract

BACKGROUND

Machine learning methodologies are gaining popularity for developing medical prediction models for datasets with a large number of predictors, particularly in the setting of clustered and longitudinal data. Binary Mixed Model (BiMM) forest is a promising machine learning algorithm which may be applied to develop prediction models for clustered and longitudinal binary outcomes. Although machine learning methods for clustered and longitudinal data such as BiMM forest exist, feature selection has not been analyzed via data simulations. Feature selection improves the practicality and ease of use of prediction models for clinicians by reducing the burden of data collection. Thus, feature selection procedures are not only beneficial, but are often necessary for development of medical prediction models. In this study, we aim to assess feature selection within the BiMM forest setting for modeling clustered and longitudinal binary outcomes.

METHODS

We conducted a simulation study to compare BiMM forest with feature selection (backward elimination or stepwise selection) to standard generalized linear mixed model feature selection methods (shrinkage and backward elimination). We also evaluated feature selection methods to develop models predicting mobility disability in older adults using the Health, Aging and Body Composition Study dataset as an example utilization of the proposed methodology.

RESULTS

BiMM forest with backward elimination generally offered higher computational efficiency, similar or higher predictive performance (accuracy and area under the receiver operating curve), and similar or higher ability to identify correct features compared to linear methods for the different simulated scenarios. For predicting mobility disability in older adults, methods generally performed similarly in terms of accuracy, area under the receiver operating curve, and specificity; however, BiMM forest with backward elimination had the highest sensitivity.

CONCLUSIONS

This study is novel because it is the first investigation of feature selection for developing random forest prediction models for clustered and longitudinal binary outcomes. Results from the simulation study reveal that BiMM forest with backward elimination has the highest accuracy (performance and identification of correct features) and lowest computation time compared to other feature selection methods in some scenarios and similar performance in other scenarios. Many informatics datasets have clustered and longitudinal outcomes and results from this study suggest that BiMM forest with backward elimination may be beneficial for developing medical prediction models.

Mofrad SA, Lundervold A, Lundervold AS. A predictive framework based on brain volume trajectories enabling early detection of Alzheimer's disease. Comput Med Imaging Graph 2021;90:101910. [PMID: 33862355 DOI: 10.1016/j.compmedimag.2021.101910] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2020] [Revised: 02/12/2021] [Accepted: 03/26/2021] [Indexed: 10/21/2022]

Gujral H, Sinha A. Association between exposure to airborne pollutants and COVID-19 in Los Angeles, United States with ensemble-based dynamic emission model. ENVIRONMENTAL RESEARCH 2021;194:110704. [PMID: 33417905 PMCID: PMC7836725 DOI: 10.1016/j.envres.2020.110704] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/02/2020] [Revised: 12/13/2020] [Accepted: 12/29/2020] [Indexed: 05/09/2023]

Abstract

This study aims to find the association between short-term exposure to air pollutants, such as particulate matter and ground-level ozone, and SARS-CoV-2 confirmed cases. Generalized linear models (GLM), a typical choice for ecological modeling, have well-established limitations. These limitations include a priori assumptions, inability to handle multicollinearity, and treating differential effects as fixed effects. We propose an Ensemble-based Dynamic Emission Model (EDEM) to address these limitations. EDEM is developed at the intersection of network science and ensemble learning, i.e., a specialized approach of machine learning. Generalized Additive Model (GAM), i.e., a variant of GLM, and EDEM are tested in Los Angeles and Ventura counties of California, which is one of the biggest SARS-CoV-2 clusters in the US. GAM indicates that a 1 μg/m³, 1 μg/m³, and 1 ppm increase (lag 0-7) in PM 2.5, PM 10, and O3 is associated with 4.51% (CI: 7.01 to -2.00) decrease, 1.62% (CI: 2.23 to -1.022) decrease, and 4.66% (CI: 0.85 to 8.47) increase in daily SARS-CoV-2 cases, respectively. Subsequent increments in lag resulted in a negative association between pollutants and SARS-CoV-2 cases. EDEM results in an R² score of 90.96% and 79.16% on training and testing datasets, respectively. EDEM confirmed the negative association between particulates and SARS-CoV-2 cases, whereas O3 shows a positive association; however, the positive association observed through GAM is not statistically significant. In addition, the county-level analysis of pollutant concentration interactions suggests that increased emissions from other counties positively affect SARS-CoV-2 cases in adjoining counties as well. The results reiterate the significance of uniformly adhering to air pollution mitigation strategies, especially related to ground-level ozone.

Lucas TCD. A translucent box: interpretable machine learning in ecology. ECOL MONOGR 2020. [DOI: 10.1002/ecm.1422] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Functional data analysis and prediction tools for continuous glucose-monitoring studies. J Clin Transl Sci 2020;5:e51. [PMID: 33948272 PMCID: PMC8057494 DOI: 10.1017/cts.2020.545] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open

Silva KD, Lee WK, Forbes A, Demmer RT, Barton C, Enticott J. Use and performance of machine learning models for type 2 diabetes prediction in community settings: A systematic review and meta-analysis. Int J Med Inform 2020;143:104268. [PMID: 32950874 DOI: 10.1016/j.ijmedinf.2020.104268] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2020] [Revised: 08/30/2020] [Accepted: 09/02/2020] [Indexed: 12/11/2022]

Ljubic B, Hai AA, Stanojevic M, Diaz W, Polimac D, Pavlovski M, Obradovic Z. Predicting complications of diabetes mellitus using advanced machine learning algorithms. J Am Med Inform Assoc 2020;27:1343-1351. [PMID: 32869093 PMCID: PMC7647294 DOI: 10.1093/jamia/ocaa120] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2020] [Revised: 05/17/2020] [Accepted: 05/28/2020] [Indexed: 12/14/2022] Open

Abstract

OBJECTIVE

We sought to predict if patients with type 2 diabetes mellitus (DM2) would develop 10 selected complications. Accurate prediction of complications could help with more targeted measures that would prevent or slow down their development.

MATERIALS AND METHODS

Experiments were conducted on the Healthcare Cost and Utilization Project State Inpatient Databases of California for the period of 2003 to 2011. Recurrent neural network (RNN) long short-term memory (LSTM) and RNN gated recurrent unit (GRU) deep learning methods were designed and compared with random forest and multilayer perceptron traditional models. Prediction accuracy for the selected complications was compared across 3 settings corresponding to the minimum number of hospitalizations between diabetes diagnosis and the diagnosis of complications.

RESULTS

The diagnosis domain was used for experiments. The best results were achieved with the RNN GRU model, followed by the RNN LSTM model. The prediction accuracy achieved with the RNN GRU model was between 73% (myocardial infarction) and 83% (chronic ischemic heart disease), while the accuracy of traditional models was between 66% and 76%.

DISCUSSION

The number of hospitalizations was an important factor for the prediction accuracy. Experiments with 4 hospitalizations achieved significantly better accuracy than with 2 hospitalizations. To achieve improved accuracy, deep learning models required training on at least 1000 patients, and accuracy dropped significantly if training datasets contained 500 patients. The prediction accuracy of complications decreases over time. Considering individual complications, the best accuracy was achieved on depressive disorder and chronic ischemic heart disease.

CONCLUSIONS

The RNN GRU model was the best choice for electronic medical record type of data, based on the achieved results.

Ngufor C, Caraballo PJ, O’Byrne TJ, Chen D, Shah ND, Pruinelli L, Steinbach M, Simon G. Development and Validation of a Risk Stratification Model Using Disease Severity Hierarchy for Mortality or Major Cardiovascular Event. JAMA Netw Open 2020;3:e208270. [PMID: 32678448 PMCID: PMC7368174 DOI: 10.1001/jamanetworkopen.2020.8270] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open

Abstract

IMPORTANCE

Clinical domain knowledge about diseases and their comorbidities, severity, treatment pathways, and outcomes can facilitate diagnosis, enhance preventive strategies, and help create smart evidence-based practice guidelines.

OBJECTIVE

To introduce a new representation of patient data called disease severity hierarchy that leverages domain knowledge in a nested fashion to create subpopulations that share increasing amounts of clinical details suitable for risk prediction.

DESIGN, SETTING, AND PARTICIPANTS

This retrospective cohort study included 51 969 patients aged 45 to 85 years, with 10 674 patients who received primary care at the Mayo Clinic between January 2004 and December 2015 in the training cohort and 41 295 patients who received primary care at Fairview Health Services from January 2010 to December 2017 in the validation cohort. Data were analyzed from May 2018 to December 2019.

MAIN OUTCOMES AND MEASURES

Several binary classification measures, including the area under the receiver operating characteristic curve (AUC), Gini score, sensitivity, and positive predictive value, were used to evaluate models predicting all-cause mortality and major cardiovascular events at ages 60, 65, 75, and 80 years.
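The abstract does not define the Gini score. As a hedged illustration, the sketch below assumes the common convention for binary classifiers in which the Gini score is a rescaling of the AUC, Gini = 2 × AUC − 1, so that a chance-level model scores 0 and a perfect model scores 1. The function name and the example value are hypothetical.

```python
def gini_from_auc(auc):
    """Gini score under the assumed convention Gini = 2 * AUC - 1."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie in [0, 1]")
    return 2.0 * auc - 1.0


# Illustrative values only: chance-level, intermediate, and perfect discrimination.
for example_auc in (0.5, 0.75, 1.0):
    print(example_auc, gini_from_auc(example_auc))
```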

RESULTS

The mean (SD) age and proportions of women and white individuals were 59.4 (10.8) years, 6324 (59.3%) and 9804 (91.9%), respectively, in the training cohort and 57.4 (7.9) years, 21 975 (53.1%), and 37 653 (91.2%), respectively, in the validation cohort. During follow-up, 945 patients (8.9%) in the training cohort died, while 787 (7.4%) had major cardiovascular events. Models using the new representation achieved AUCs for predicting death in the training cohort at ages 60, 65, 75, and 80 years of 0.96 (95% CI, 0.94-0.97), 0.96 (95% CI, 0.95-0.98), 0.97 (95% CI, 0.96-0.98), and 0.98 (95% CI, 0.98-0.99), respectively, while standard methods achieved modest AUCs of 0.67 (95% CI, 0.55-0.80), 0.66 (95% CI, 0.56-0.79), 0.64 (95% CI, 0.57-0.71), and 0.63 (95% CI, 0.54-0.70), respectively.

CONCLUSIONS AND RELEVANCE

In this study, the proposed patient data representation accurately predicted the age at which a patient was at risk of dying or developing major cardiovascular events substantially better than standard methods. The representation uses known relationships contained in electronic health records to capture disease severity in a natural and clinically meaningful way. Furthermore, it is expressive and interpretable. This novel patient representation can help to support critical decision-making, develop smart guidelines, and enhance health care and disease management by helping to identify patients with high risk.

Gubbi S, Hamet P, Tremblay J, Koch CA, Hannah-Shmouni F. Artificial Intelligence and Machine Learning in Endocrinology and Metabolism: The Dawn of a New Era. Front Endocrinol (Lausanne) 2019;10:185. [PMID: 30984108 PMCID: PMC6448412 DOI: 10.3389/fendo.2019.00185] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/05/2018] [Accepted: 03/06/2019] [Indexed: 12/22/2022] Open