1.
Huang Y, Xu J, Fan Z, Hu Y, He X, Chen A, Liu Y, Yin R, Guo J, DeKosky ST, Jaffee M, Zhou M, Su C, Wang F, Guo Y, Bian J. Identifying Alzheimer's Disease Progression Subphenotypes via a Graph-based Framework using Electronic Health Records. Research Square 2025:rs.3.rs-6257332. [PMID: 40297697 PMCID: PMC12036456 DOI: 10.21203/rs.3.rs-6257332/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/30/2025]
Abstract
Purpose: Understanding the heterogeneity of neurodegeneration in Alzheimer's disease (AD) development, as well as identifying AD progression pathways, is vital for enhancing diagnosis, treatment, prognosis, and prevention strategies. This study aimed to identify disease progression subphenotypes in patients with mild cognitive impairment (MCI) and AD using electronic health records (EHRs). Methods: We identified patients with MCI and AD from the EHRs of the OneFlorida+ Clinical Research Consortium. We proposed an outcome-oriented graph neural network-based model to identify progression pathways from MCI to AD. Results: Of the 2,525 included patients, 61.66% were female, and the mean age was 76 years. In this cohort, 64.83% were Non-Hispanic White (NHW), 16.48% were Non-Hispanic Black (NHB), and 2.53% were of other races. Additionally, there were 274 Hispanic patients, accounting for 10.85% of the total patient population. The average duration from the first MCI diagnosis to the transition to AD was 891 days. We identified four progression subphenotypes, each with distinct characteristics. The average progression times from MCI to AD varied among these subphenotypes, ranging from 805 to 1,236 days. Conclusion: The findings suggest that AD does not follow uniform transitions of disease states but rather exhibits heterogeneous progression pathways. Our proposed framework holds the potential to identify AD progression subphenotypes, providing valuable and explainable insights into the development of the disease.
2.
Venkatesh SS, Ganjgahi H, Palmer DS, Coley K, Linchangco GV, Hui Q, Wilson P, Ho YL, Cho K, Arumäe K, Wittemans LBL, Nellåker C, Vainik U, Sun YV, Holmes C, Lindgren CM, Nicholson G. Characterising the genetic architecture of changes in adiposity during adulthood using electronic health records. Nat Commun 2024; 15:5801. [PMID: 38987242 PMCID: PMC11237142 DOI: 10.1038/s41467-024-49998-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 06/25/2024] [Indexed: 07/12/2024]
Abstract
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 24.5 million primary-care health records in over 740,000 individuals in the UK Biobank, Million Veteran Program USA, and Estonian Biobank, to discover and validate the genetic architecture of adiposity trajectories. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI by 14%. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (APOE missense variant). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology of quantitative traits in adulthood.
Affiliation(s)
- Samvida S Venkatesh
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Habib Ganjgahi
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
  - Department of Statistics, University of Oxford, Oxford, UK
- Duncan S Palmer
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
  - Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, UK
- Kayesha Coley
  - Department of Population Health Sciences, University of Leicester, Leicester, UK
- Gregorio V Linchangco
  - Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA, USA
  - Atlanta VA Health Care System, Decatur, GA, USA
- Qin Hui
  - Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA, USA
  - Atlanta VA Health Care System, Decatur, GA, USA
- Peter Wilson
  - Atlanta VA Health Care System, Decatur, GA, USA
  - Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA
- Yuk-Lam Ho
  - Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Veterans Affairs Boston Healthcare System, Boston, MA, USA
- Kelly Cho
  - Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Veterans Affairs Boston Healthcare System, Boston, MA, USA
  - Division of Aging, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Kadri Arumäe
  - Institute of Psychology, Faculty of Social Sciences, University of Tartu, Tartu, Estonia
- Laura B L Wittemans
  - Novo Nordisk Research Centre Oxford, Oxford, UK
  - Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, Oxford, UK
- Christoffer Nellåker
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
  - Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, Oxford, UK
- Uku Vainik
  - Institute of Psychology, Faculty of Social Sciences, University of Tartu, Tartu, Estonia
  - Estonian Genome Centre, Institute of Genomics, Faculty of Science and Technology, University of Tartu, Tartu, Estonia
  - Department of Neurology and Neurosurgery, Faculty of Medicine and Health Sciences, University of McGill, Montreal, Canada
- Yan V Sun
  - Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA, USA
  - Atlanta VA Health Care System, Decatur, GA, USA
- Chris Holmes
  - Department of Statistics, University of Oxford, Oxford, UK
  - Nuffield Department of Medicine, Medical Sciences Division, University of Oxford, Oxford, UK
  - The Alan Turing Institute, London, UK
- Cecilia M Lindgren
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
  - Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, Oxford, UK
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
3.
Wen J, Hou J, Bonzel CL, Zhao Y, Castro VM, Gainer VS, Weisenfeld D, Cai T, Ho YL, Panickan VA, Costa L, Hong C, Gaziano JM, Liao KP, Lu J, Cho K, Cai T. LATTE: Label-efficient incident phenotyping from longitudinal electronic health records. Patterns (New York, N.Y.) 2024; 5:100906. [PMID: 38264714 PMCID: PMC10801250 DOI: 10.1016/j.patter.2023.100906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 09/06/2023] [Accepted: 12/01/2023] [Indexed: 01/25/2024]
Abstract
Electronic health record (EHR) data are increasingly used to support real-world evidence studies but are limited by the lack of precise timings of clinical events. Here, we propose a label-efficient incident phenotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. By leveraging the pre-trained semantic embeddings, LATTE selects predictive features and compresses their information into longitudinal visit embeddings through visit attention learning. LATTE models the sequential dependency between the target event and visit embeddings to derive the timings. To improve label efficiency, LATTE constructs longitudinal silver-standard labels from unlabeled patients to perform semi-supervised training. LATTE is evaluated on the onset of type 2 diabetes, heart failure, and relapses of multiple sclerosis. LATTE consistently achieves substantial improvements over benchmark methods while providing high prediction interpretability. The event timings are shown to help discover risk factors of heart failure among patients with rheumatoid arthritis.
Affiliation(s)
- Jun Wen
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
- Jue Hou
  - University of Minnesota, Minneapolis, MN, USA
- Clara-Lea Bonzel
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
- Tianrun Cai
  - VA Boston Healthcare System, Boston, MA, USA
  - Mass General Brigham, Boston, MA, USA
- Yuk-Lam Ho
  - VA Boston Healthcare System, Boston, MA, USA
- Vidul A. Panickan
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
- J. Michael Gaziano
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
  - Brigham and Women’s Hospital, Boston, MA, USA
- Katherine P. Liao
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
  - Brigham and Women’s Hospital, Boston, MA, USA
- Junwei Lu
  - VA Boston Healthcare System, Boston, MA, USA
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Kelly Cho
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
  - Brigham and Women’s Hospital, Boston, MA, USA
- Tianxi Cai
  - Harvard Medical School, Boston, MA, USA
  - VA Boston Healthcare System, Boston, MA, USA
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
4.
Xu J, Yin R, Huang Y, Gao H, Wu Y, Guo J, Smith GE, DeKosky ST, Wang F, Guo Y, Bian J. Identification of Outcome-Oriented Progression Subtypes from Mild Cognitive Impairment to Alzheimer's Disease Using Electronic Health Records. medRxiv: the preprint server for health sciences 2023:2023.07.27.23293270. [PMID: 37577594 PMCID: PMC10418300 DOI: 10.1101/2023.07.27.23293270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Alzheimer's disease (AD) is a complex, heterogeneous neurodegenerative disease that requires an in-depth understanding of its progression pathways and contributing factors to develop effective risk stratification and prevention strategies. In this study, we proposed an outcome-oriented model to identify progression pathways from mild cognitive impairment (MCI) to AD using electronic health records (EHRs) from the OneFlorida+ Clinical Research Consortium. To achieve this, we employed a long short-term memory (LSTM) network to extract relevant information from the sequential records of each patient. Hierarchical agglomerative clustering was then applied to the learned representations to group patients by progression subtype. Our approach identified multiple progression pathways, each of which represented distinct patterns of disease progression from MCI to AD. These pathways can serve as a valuable resource for researchers to understand the factors influencing AD progression and to develop personalized interventions to delay or prevent the onset of the disease.
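As a rough illustration of the clustering step this abstract describes, the sketch below applies a naive agglomerative procedure to fixed-length vectors. It is not the authors' implementation: the abstract does not state the linkage criterion or distance, so average linkage with Euclidean distance is an assumption, the function name `agglomerative_clusters` is hypothetical, and the "embeddings" are synthetic stand-ins for learned LSTM representations.

```python
import math
import random


def agglomerative_clusters(points, n_clusters):
    """Naive average-linkage agglomerative clustering (assumed linkage).

    Starts with every point in its own cluster and repeatedly merges the
    two closest clusters until n_clusters remain. Returns one label per point.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_dist(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge y into x
        del clusters[y]                          # y > x, so x stays valid

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Synthetic stand-in for learned patient representations (hypothetical data):
# two well-separated groups of 4-dimensional vectors.
random.seed(0)
fake_embeddings = (
    [[random.gauss(0.0, 0.1) for _ in range(4)] for _ in range(6)]
    + [[random.gauss(5.0, 0.1) for _ in range(4)] for _ in range(6)]
)
labels = agglomerative_clusters(fake_embeddings, n_clusters=2)
```

In practice a library routine would replace the quadratic search above, and the number of clusters would be chosen from the data rather than fixed in advance.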
5.
Eskofier BM, Klucken J. Predictive Models for Health Deterioration: Understanding Disease Pathways for Personalized Medicine. Annu Rev Biomed Eng 2023; 25:131-156. [PMID: 36854259 DOI: 10.1146/annurev-bioeng-110220-030247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
Artificial intelligence (AI) and machine learning (ML) methods are currently widely employed in medicine and healthcare. A PubMed search returns more than 100,000 articles on these topics published between 2018 and 2022 alone. Notwithstanding several recent reviews in various subfields of AI and ML in medicine, we have yet to see a comprehensive review around the methods' use in longitudinal analysis and prediction of an individual patient's health status within a personalized disease pathway. This review seeks to fill that gap. After an overview of the AI and ML methods employed in this field and of specific medical applications of models of this type, the review discusses the strengths and limitations of current studies and looks ahead to future strands of research in this field. We aim to enable interested readers to gain a detailed impression of the research currently available and accordingly plan future work around predictive models for deterioration in health status.
Affiliation(s)
- Bjoern M Eskofier
  - Machine Learning and Data Analytics Lab, Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Jochen Klucken
  - Digital Medicine Group, Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Belvaux, Luxembourg
  - Digital Medicine Group, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
  - Centre Hospitalier de Luxembourg, Luxembourg City, Luxembourg
6.
Venkatesh SS, Ganjgahi H, Palmer DS, Coley K, Wittemans LBL, Nellaker C, Holmes C, Lindgren CM, Nicholson G. The genetic architecture of changes in adiposity during adulthood. medRxiv: the preprint server for health sciences 2023:2023.01.09.23284364. [PMID: 36711652 PMCID: PMC9882550 DOI: 10.1101/2023.01.09.23284364] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/13/2023]
Abstract
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 1.5 million primary-care health records in over 177,000 individuals in UK Biobank to study the genetic architecture of weight-change. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (a missense variant in APOE). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI, and higher in women than in men. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology driving quantitative trait values in adulthood.
Affiliation(s)
- Samvida S. Venkatesh
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, UK
  - Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
- Duncan S. Palmer
  - Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
  - Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
- Kayesha Coley
  - Department of Population Health Sciences, University of Leicester, UK
- Laura B. L. Wittemans
  - Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
  - Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
- Christoffer Nellaker
  - Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
  - Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
- Chris Holmes
  - Department of Statistics, University of Oxford, UK
  - Nuffield Department of Medicine, Medical Sciences Division, University of Oxford, UK
  - The Alan Turing Institute, London, UK
- Cecilia M. Lindgren
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, UK
  - Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
  - Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
  - Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
7.
Xu J, Hu Y, Liu H, Mi W, Li G, Guo J, Feng Y. A Novel Multivariable Time Series Prediction Model for Acute Kidney Injury in General Hospitalization. Int J Med Inform 2022; 161:104729. [DOI: 10.1016/j.ijmedinf.2022.104729] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2021] [Revised: 01/28/2022] [Accepted: 02/21/2022] [Indexed: 10/19/2022]
8.
Putzel P, Do H, Boyd A, Zhong H, Smyth P. Dynamic Survival Analysis for EHR Data with Personalized Parametric Distributions. Proceedings of Machine Learning Research 2021; 149:648-673. [PMID: 35425906 PMCID: PMC9006243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The widespread availability of high-dimensional electronic healthcare record (EHR) datasets has led to significant interest in using such data to derive clinical insights and make risk predictions. More specifically, techniques from machine learning are being increasingly applied to the problem of dynamic survival analysis, where updated time-to-event risk predictions are learned as a function of the full covariate trajectory from EHR datasets. EHR data presents unique challenges in the context of dynamic survival analysis, involving a variety of decisions about data representation, modeling, interpretability, and clinically meaningful evaluation. In this paper we propose a new approach to dynamic survival analysis which addresses some of these challenges. Our modeling approach is based on learning a global parametric distribution to represent population characteristics and then dynamically locating individuals on the time-axis of this distribution conditioned on their histories. For evaluation we also propose a new version of the dynamic C-Index for clinically meaningful evaluation of dynamic survival models. To validate our approach we conduct dynamic risk prediction on three real-world datasets, involving COVID-19 severe outcomes, cardiovascular disease (CVD) onset, and primary biliary cirrhosis (PBC) time-to-transplant. We find that our proposed modeling approach is competitive with other well-known statistical and machine learning approaches for dynamic risk prediction, while offering potential advantages in terms of interpretability of predictions at the individual level.
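For readers unfamiliar with the C-index mentioned in this abstract, the sketch below computes the standard static concordance index (Harrell's C) for right-censored data. It is background only: the authors' new dynamic variant is not specified in the abstract and is not reproduced here, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Standard (static) concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event and
    time_i < time_j. The pair is concordant when the subject with the
    earlier event was assigned the higher risk; ties in risk count as 0.5.
    This is NOT the dynamic variant proposed in the paper.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): event times, event indicators (1 = observed,
# 0 = censored), and predicted risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.5, 0.1])
reversed_ = concordance_index(times, events, [0.1, 0.5, 0.7, 0.9])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means every comparable pair is ranked in the wrong order.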
Affiliation(s)
- Preston Putzel
  - Department of Computer Science, University of California, Irvine, CA, USA
- Hyungrok Do
  - Department of Population Health, NYU Grossman School of Medicine, New York, NY, USA
- Alex Boyd
  - Department of Statistics, University of California, Irvine, CA, USA
- Hua Zhong
  - Department of Population Health, NYU Grossman School of Medicine, New York, NY, USA
- Padhraic Smyth
  - Department of Computer Science, University of California, Irvine, CA, USA
9.
Abstract
Our ability to unravel the mysteries of human health and disease has changed dramatically over the past 2 decades. Decoding health and disease has been facilitated by the recent availability of high-throughput genomics and multi-omics analyses and the companion tools of advanced informatics and computational science. Understanding of the human genome and its influence on phenotype continues to advance through genotyping large populations and using “light phenotyping” approaches in combination with smaller subsets of the population being evaluated using “deep phenotyping” approaches. Using our capability to integrate and jointly analyze genomic data with other multi-omic data, the knowledge of genotype-phenotype relationships and associated genetic pathways and functions is being advanced. Understanding genotype-phenotype relationships that discriminate human health from disease is speculated to facilitate predictive, precision health care and change modes of health care delivery. The American Association for Dental Research Fall Focused Symposium assembled experts to discuss how studies of genotype-phenotype relationships are illuminating the pathophysiology of craniofacial diseases and developmental biology. Although the breadth of the topic did not allow all areas of dental, oral, and craniofacial research to be addressed (e.g., cancer), the importance and power of integrating genomic, phenomic, and other -omic data are illustrated using a variety of examples. The 8 Fall Focused talks presented different methodological approaches for ascertaining study populations and evaluating population variance and phenotyping approaches. These advances are reviewed in this summary.
Affiliation(s)
- J T Wright
  - Adams School of Dentistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- M C Herzberg
  - Department of Diagnostic and Biological Sciences, School of Dentistry, University of Minnesota, Minneapolis, MN, USA