Pan J, Fan Z, Smith GE, Guo Y, Bian J, Xu J. Federated learning with multi-cohort real-world data for predicting the progression from mild cognitive impairment to Alzheimer's disease. Alzheimers Dement 2025;21:e70128. [PMID: 40219846 PMCID: PMC11992589 DOI: 10.1002/alz.70128]
[Received: 01/08/2025] [Revised: 03/03/2025] [Accepted: 03/03/2025] [Indexed: 04/14/2025]
Abstract
INTRODUCTION
We aimed to assess the feasibility of using federated learning (FL) on routinely collected electronic health records (EHRs) from multiple health-care institutions to predict the progression from mild cognitive impairment (MCI) to Alzheimer's disease (AD).
METHODS
We analyzed EHR data from the OneFlorida+ consortium, simulating six sites, and used a long short-term memory (LSTM) model with a federated averaging (FedAvg) algorithm. A personalized FL approach was used to address between-site heterogeneity. Model performance was assessed using the area under the receiver operating characteristic curve (AUC) and feature importance techniques.
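The aggregation step of federated averaging can be illustrated with a minimal sketch. It assumes each site returns its locally trained parameters as a dictionary of flat lists, and that sites are weighted by their number of training samples, as in the standard FedAvg formulation; the function name `fedavg`, the parameter names, and the toy values are hypothetical and are not taken from the study, which trained LSTM models.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def fedavg(site_params: List[Params], site_sizes: List[int]) -> Params:
    """Average per-site parameters, weighting each site by its sample count."""
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in site_params[0]:
        length = len(site_params[0][name])
        averaged[name] = [
            # Weighted sum over sites for the i-th entry of this parameter.
            sum(p[name][i] * n for p, n in zip(site_params, site_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Toy example with two sites (values are illustrative only).
site_params = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
site_sizes = [100, 300]
global_params = fedavg(site_params, site_sizes)
```

In a full training loop, the averaged parameters would be sent back to every site for another round of local training; only parameters, not patient records, leave a site.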
RESULTS
Of 44,899 MCI patients, 6,391 progressed to AD. FL models achieved a 6% improvement in AUC compared to local models. Key predictive features included body mass index, vitamin B12, blood pressure, and others.
DISCUSSION
FL showed promise in predicting AD progression by integrating heterogeneous data across multiple institutions while preserving privacy. Despite limitations, it offers potential for future clinical applications.
HIGHLIGHTS
We applied long short-term memory and federated learning (FL) to predict mild cognitive impairment to Alzheimer's disease progression using electronic health record data from multiple institutions.
FL improved prediction performance, with a 6% increase in area under the receiver operating characteristic curve compared to local models.
We identified key predictive features, such as body mass index, vitamin B12, and blood pressure.
FL shows effectiveness in handling data heterogeneity across multiple sites while ensuring data privacy.
Personalized and pooled FL models generally performed better than global and local models.