1
Chen SF, Lee SE, Sadaei HJ, Park JB, Khattab A, Chen JF, Henegar C, Wineinger NE, Muse ED, Torkamani A. Meta-prediction of coronary artery disease risk. Nat Med 2025:10.1038/s41591-025-03648-0. [PMID: 40240837 DOI: 10.1038/s41591-025-03648-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 03/07/2025] [Indexed: 04/18/2025]
Abstract
Coronary artery disease (CAD) is a leading cause of morbidity and mortality worldwide, and accurately predicting individual risk is critical for prevention. Here we aimed to integrate unmodifiable risk factors, such as age and genetics, with modifiable risk factors, such as clinical and biometric measurements, into a meta-prediction framework that produces actionable and personalized risk estimates. In the initial development of the model, ~2,000 predictive features were considered, including demographic data, lifestyle factors, physical measurements, laboratory tests, medication usage, diagnoses and genetics. To power our meta-prediction approach, we stratified the UK Biobank into two primary cohorts: first, a prevalent CAD cohort used to train predictive models for cross-sectional prediction at baseline and prospective estimation of contributing risk factor levels and diagnoses (baseline models) and, second, an incident CAD cohort using, in part, these baseline models as meta-features to train a final CAD incident risk prediction model. The resultant 10-year incident CAD risk model, composed of 15 derived meta-features with multiple embedded polygenic risk scores, achieves an area under the curve of 0.84. In an independent test cohort from the All of Us research program, this model achieved an area under the curve of 0.81 for predicting 10-year incident CAD risk, outperforming standard clinical scores and previously developed integrative models. Moreover, this framework enables the generation of individualized risk reduction profiles by quantifying the potential impact of standard clinical interventions. Notably, genetic risk influences the extent to which these interventions reduce overall CAD risk, allowing for tailored prevention strategies.
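The area under the curve quoted above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `auc` and the toy scores and labels are made up for illustration.

```python
def auc(scores, labels):
    """Rank-based AUC: P(case score > non-case score); ties count as 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): predicted 10-year risk and observed incident CAD.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(auc(toy_scores, toy_labels))
```

The pairwise loop is quadratic in sample size, so it is only suitable for small illustrations; biobank-scale evaluation would use a sort-based implementation.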
Affiliation(s)
- Shang-Fu Chen
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
- Sang Eun Lee
  - Department of Cardiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Hossein Javedani Sadaei
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
- Jun-Bean Park
  - Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea
  - Cardiovascular Center, Seoul National University Hospital, Seoul, Republic of Korea
- Ahmed Khattab
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
- Jei-Fu Chen
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Corneliu Henegar
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
- Nathan E Wineinger
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
- Evan D Muse
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
  - Scripps Clinic, La Jolla, CA, USA
- Ali Torkamani
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
2
Shi Y, Xiang Y, Ye Y, He T, Sham PC, So HC. A framework for detecting causal effects of risk factors at an individual level based on principles of Mendelian randomisation: applications to modelling individualised effects of lipids on coronary artery disease. EBioMedicine 2025; 113:105616. [PMID: 40020258 PMCID: PMC11919333 DOI: 10.1016/j.ebiom.2025.105616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 01/30/2025] [Accepted: 02/10/2025] [Indexed: 03/16/2025] Open
Abstract
BACKGROUND Mendelian Randomisation (MR) has been widely used to study the causal effects of risk factors. However, almost all MR studies concentrate on the population's average causal effects. With the advent of precision medicine, the individualised treatment effect (ITE) is often of greater interest. For instance, certain risk factors may pose a higher risk to some individuals than others, and the benefits of treatments may vary across individuals. This study proposes a framework for estimating individualised causal effects in large-scale observational studies where unobserved confounding factors may be present. METHODS We propose a framework (MR-ITE) that expands the scope of MR from estimating average causal effects to individualised causal effects. We present several approaches for estimating ITEs within this MR framework, grounded primarily in the principles of the "R-learner". To evaluate the presence of causal effect heterogeneity, we also propose two permutation testing methods. We employed polygenic risk scores (PRS) as instruments and proposed methods to improve the accuracy of ITE estimates by removing potentially pleiotropic single nucleotide polymorphisms (SNPs). The validity of our approach was substantiated through comprehensive simulations. The proposed framework also allows the identification of important effect modifiers contributing to individualised differences in treatment effects. We applied our framework to study the individualised causal effects of various lipid traits, including low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), triglycerides (TG), and total cholesterol (TC), on the risk of coronary artery disease (CAD) based on the UK-Biobank (UKBB). We also studied the ITE of C-reactive protein (CRP) and insulin-like growth factor 1 (IGF-1) on CAD as secondary analyses.
FINDINGS Simulation studies demonstrated that MR-ITE outperformed traditional causal forest approaches in identifying ITEs when unobserved confounders were present. The integration of the contamination mixture (ConMix) approach to remove invalid pleiotropic SNPs further enhanced MR-ITE's performance. In real-world applications, we identified positive causal associations between CAD and several factors (LDL-C, TC, and IGF-1 levels). Our permutation tests revealed significant heterogeneity in these causal associations across individuals. Using Shapley value analysis, we identified the top effect modifiers contributing to this heterogeneity. INTERPRETATION We introduced a new framework, MR-ITE, capable of inferring individualised causal effects in observational studies based on the MR approach, utilizing PRS as instruments. MR-ITE extends the application of MR from estimating the average treatment effect to individualised treatment effects. Our real-world application of MR-ITE underscores the importance of identifying ITEs in the context of precision medicine. FUNDING This work was supported partially by a National Natural Science Foundation of China grant (NSFC; grant number 81971706), the KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and The Chinese University of Hong Kong, China, and the Lo Kwee Seong Biomedical Research Fund from The Chinese University of Hong Kong.
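For context, the population-average estimate that conventional MR produces with a single instrument (for example a PRS) is the Wald ratio: the instrument-outcome association divided by the instrument-exposure association. The sketch below illustrates only that baseline and is not the MR-ITE method; the function name `wald_ratio` and all numbers are hypothetical, and the standard error uses the common first-order approximation that ignores uncertainty in the exposure association.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Average causal effect estimate from one instrument (Wald ratio).

    beta_exposure: instrument -> exposure association (e.g. PRS on LDL-C)
    beta_outcome:  instrument -> outcome association (e.g. PRS on CAD log-odds)
    se_outcome:    standard error of beta_outcome
    """
    if beta_exposure == 0:
        raise ValueError("instrument must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: treats beta_exposure as known without error.
    se = se_outcome / abs(beta_exposure)
    return estimate, se


# Hypothetical summary statistics, for illustration only.
est, se = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(f"causal estimate = {est:.3f}, SE = {se:.3f}")
```

A single ratio like this yields one number for the whole population, which is the limitation the abstract's individualised framework is intended to address.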
Affiliation(s)
- Yujia Shi
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Yong Xiang
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Yuxin Ye
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Tingwei He
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Pak-Chung Sham
  - Department of Psychiatry, University of Hong Kong, Hong Kong SAR, China
- Hon-Cheong So
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong, China; KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and the Chinese University of Hong Kong, China; Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China; CUHK Shenzhen Research Institute, Shenzhen, China; Margaret K.L. Cheung Research Centre for Management of Parkinsonism, The Chinese University of Hong Kong, Shatin, Hong Kong, China; Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China; Hong Kong Branch of the Chinese Academy of Sciences Center for Excellence in Animal Evolution and Genetics, The Chinese University of Hong Kong, Hong Kong SAR, China
3
Manikpurage HD, Ricard J, Houessou U, Bourgault J, Gagnon E, Gobeil É, Girard A, Li Z, Eslami A, Mathieu P, Bossé Y, Arsenault BJ, Thériault S. Association of genetically predicted levels of circulating blood lipids with coronary artery disease incidence. Atherosclerosis 2025; 401:119083. [PMID: 39674127 DOI: 10.1016/j.atherosclerosis.2024.119083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 11/27/2024] [Accepted: 12/03/2024] [Indexed: 12/16/2024]
Abstract
BACKGROUND AND AIMS Estimating the genetic risk of coronary artery disease (CAD) is now possible by aggregating data from genome-wide association studies (GWAS) into polygenic risk scores (PRS). Combining multiple PRS for specific circulating blood lipids could improve risk prediction. Here, we sought to evaluate the performance of PRS derived from CAD and blood lipids GWAS to predict the incidence of CAD. METHODS This study included individuals aged between 40 and 69 from UK Biobank. We conducted GWAS for blood lipids measured by nuclear magnetic resonance in individuals without lipid-lowering treatments (n = 73,915). Summary statistics were used to derive PRS in the remaining participants (n = 318,051). A PRS for CAD (PRS_CAD) was derived using the CARDIoGRAMplusC4D GWAS. Hazard ratios (HR) for CAD (n = 9017 out of 301,576; median follow-up: 12.6 years) were calculated per standard deviation increase in each PRS. Models' discrimination capacity and goodness-of-fit were evaluated. RESULTS Out of 30 PRS, 27 were significantly associated with the incidence of CAD (p < 0.0017). The optimal combination of PRS included PRS for CAD, VLDL-C, total cholesterol and triglycerides. Discriminative capacities were significantly increased in the model including PRS_CAD and clinical risk factors (CRF) (C-statistic = 0.778 [0.773-0.782]) compared to the model with CRF only (C-statistic = 0.755 [0.751-0.760], difference = 0.022 [0.020-0.025]). Although the C-statistic remained similar when independent lipids PRS were added to the model with PRS_CAD and CRF (C-statistic = 0.778 [0.773-0.783]), the goodness-of-fit was significantly increased (chi-square test statistic = 20.18, p = 1.56e-04). CONCLUSIONS Although independently associated with CAD incidence, blood lipids PRS provide modest improvement in the predictive performance when added to PRS_CAD.
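The goodness-of-fit comparison above reports a chi-square statistic and a p-value but not the degrees of freedom. As a rough consistency check, the sketch below evaluates the chi-square upper tail under the assumption of 3 degrees of freedom (one per added lipid PRS: VLDL-C, total cholesterol, triglycerides); that choice is an inference from the abstract, not something it states. The function name is hypothetical, and the closed form used here holds only for 3 degrees of freedom.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form for df = 3 only:
        erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Statistic taken from the abstract; df = 3 is an assumption.
p_value = chi2_sf_df3(20.18)
print(f"p = {p_value:.3g}")
```

Under this assumption the tail probability lands close to the reported p-value, which suggests the test compared nested models differing by three terms.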
Affiliation(s)
- Hasanga D Manikpurage
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Jasmin Ricard
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Ursula Houessou
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Jérôme Bourgault
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Eloi Gagnon
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Émilie Gobeil
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Arnaud Girard
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Zhonglin Li
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada
- Aida Eslami
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada; Department of Social and Preventive Medicine, Faculty of Medicine, Université Laval, Québec, (QC), Canada
- Patrick Mathieu
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada; Department of Surgery, Faculty of Medicine, Université Laval, Québec, (QC), Canada
- Yohan Bossé
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada; Department of Molecular Medicine, Faculty of Medicine, Université Laval, Québec, (QC), Canada
- Benoit J Arsenault
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada; Department of Medicine, Faculty of Medicine, Université Laval, Québec, (QC), Canada
- Sébastien Thériault
  - Centre de Recherche de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec - Université Laval, Québec, (QC), Canada; Department of Molecular Biology, Medical Biochemistry and Pathology, Faculty of Medicine, Université Laval, Québec, (QC), Canada
4
Salenius K, Väljä N, Thusberg S, Iris F, Ladd-Acosta C, Roos C, Nykter M, Fasano A, Autio R, Lin J. Exploring autism spectrum disorder and co-occurring trait associations to elucidate multivariate genetic mechanisms and insights. BMC Psychiatry 2024; 24:934. [PMID: 39696186 DOI: 10.1186/s12888-024-06392-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Accepted: 12/08/2024] [Indexed: 12/20/2024] Open
Abstract
BACKGROUND Autism spectrum disorder (ASD) is a partially heritable neurodevelopmental trait, and people with ASD may also have other co-occurring traits such as ADHD, anxiety disorders, depression, mental health issues, learning difficulty, physical health traits and communication challenges. The concomitant development of ASD and other neurological traits is assumed to result from a complex interplay between genetics and the environment. However, only a limited number of studies have performed multivariate genome-wide association studies (GWAS) for ASD. METHODS We conducted the largest multivariate GWAS to date on ASD and 8 ASD co-occurring traits (ADHD, ADHD childhood, anxiety and stress-related disorders (ASRD), bipolar disorder (BIP), disruptive behaviour disorder (DBD), educational attainment (EA), major depression (MDD), and schizophrenia (SCZ)) using summary statistics from leading studies. Multivariate associations and central traits were further identified. Subsequently, colocalization and Mendelian randomization (MR) analyses were performed on the associations identified with the central traits containing ASD. To further validate our findings, pathway and quantitative trait loci (QTL) resources as well as an independent dataset of whole-genome sequence data from 112 individuals (45 probands) in the GEMMA project were utilized. RESULTS Multivariate GWAS resulted in 637 significant associations (p < 5e-8), among which 322 are reported for the first time for any trait. 37 SNPs were identified to contain ASD and one or more traits in their central trait set, including variants mapped to known SFARI ASD genes MAPT, CADPS and NEGR1 as well as novel ASD genes KANSL1, NSF and NTM, associated with immune response, synaptic transmission, and neurite growth respectively. Mendelian randomization analyses found that genetic liability for ADHD childhood, ASRD and DBD has causal effects on the risk of ASD while genetic liability for ASD has causal effects on the risk of ADHD, ADHD childhood, BIP, EA, MDD and SCZ.
Frequency differences of SNPs found in NTM and CADPS genes, respectively associated with neurite growth and neural/endocrine calcium regulation, were found between GEMMA ASD probands and controls. Pathway, QTL and cell type enrichment implicated microbiome, enteric inflammation, and central nervous system enrichments. CONCLUSIONS Our study, combining multivariate GWAS with systematic decomposition, identified novel genetic associations related to ASD and ASD co-occurring driver traits. Statistical tests were applied to discern evidence for shared and interpretable liability between ASD and co-occurring traits. These findings expand upon the current understanding of the complex genetics regulating ASD and reveal insights into neuronal brain disruptions potentially driving development and manifestation.
Affiliation(s)
- Karoliina Salenius
  - Faculty of Medicine and Health Technology, Tampere University and Tays Cancer Centre, Tampere, Finland
- Niina Väljä
  - Faculty of Medicine and Health Technology, Tampere University and Tays Cancer Centre, Tampere, Finland
- Sini Thusberg
  - Faculty of Medicine and Health Technology, Tampere University and Tays Cancer Centre, Tampere, Finland
- Christine Ladd-Acosta
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA
- Matti Nykter
  - Faculty of Medicine and Health Technology, Tampere University and Tays Cancer Centre, Tampere, Finland
  - Foundation for the Finnish Cancer Institute, Helsinki, Finland
- Alessio Fasano
  - European Biomedical Research Institute of Salerno (EBRIS), Salerno, Italy
  - Harvard Medical School, Harvard T.H. Chan School of Public Health, Boston, USA
- Reija Autio
  - Health Sciences, Faculty of Social Sciences, Tampere University, Tampere, Finland
- Jake Lin
  - Faculty of Medicine and Health Technology, Tampere University and Tays Cancer Centre, Tampere, Finland
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
5
Abramowitz SA, Boulier K, Keat K, Cardone KM, Shivakumar M, DePaolo J, Judy R, Kim D, Rader DJ, Ritchie MD, Voight BF, Pasaniuc B, Levin MG, Damrauer SM. Population Performance and Individual Agreement of Coronary Artery Disease Polygenic Risk Scores. medRxiv 2024:2024.07.25.24310931. [PMID: 39108513 PMCID: PMC11302700 DOI: 10.1101/2024.07.25.24310931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2024]
Abstract
Importance: Polygenic risk scores (PRSs) for coronary artery disease (CAD) are a growing clinical and commercial reality. Whether existing scores provide similar individual-level assessments of disease liability is a critical consideration for clinical implementation that remains uncharacterized. Objective: To characterize how reliably CAD PRSs that perform equivalently at the population level predict individual-level risk. Design: Cross-sectional study. Setting: All of Us Research Program (AOU), Penn Medicine Biobank (PMBB), and UCLA ATLAS Precision Health Biobank. Participants: Volunteers of diverse genetic backgrounds enrolled in AOU, PMBB, and UCLA with available electronic health record and genotyping data. Exposures: Polygenic risk for CAD from previously published PRSs and new PRSs developed separately from the testing cohorts. Main Outcomes and Measures: Sets of CAD PRSs that perform population prediction equivalently were identified by comparing calibration and discrimination (Brier score and AUROC) of generalized linear models of prevalent CAD using Bayesian analysis of variance. Among equivalently performing scores, individual-level agreement between risk estimates was tested with intraclass correlation (ICC) and Light's Kappa, measures of inter-rater reliability. Results: 50 PRSs were calculated for 171,095 AOU participants. When included in a model of prevalent CAD, 48 scores had practically equivalent Brier scores and AUROCs (region of practical equivalence = 0.02). Across these scores, 84% of participants had at least one score in both the top and bottom risk quintile. Continuous agreement of individual risk predictions from the 48 scores was poor, with an ICC of 0.351 (95% CI: 0.349, 0.352). Agreement between two statistically equivalent scores was moderate, with an ICC of 0.649 (95% CI: 0.646, 0.652).
Light's Kappa, used to evaluate consistency of assignment to high-risk thresholds, did not exceed 0.56 (interpreted as 'fair') across statistically and practically equivalent scores. Repeating the analysis among 41,193 PMBB and 50,748 UCLA participants yielded different sets of statistically and practically equivalent scores which also lacked strong individual agreement. Conclusions and Relevance: Across three diverse biobanks, CAD PRSs that performed equivalently at the population level produced unreliable individual risk estimates. Approaches to clinical implementation of CAD PRSs must consider the potential for discordant individual risk estimates from otherwise indistinguishable scores.
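The quintile-discordance figure in the Results (participants placed in the top quintile by one score and the bottom quintile by another) can be illustrated with a small sketch. This is a toy reconstruction under stated assumptions, not the study's pipeline: the function names and the three made-up scores are hypothetical, and quintiles are assigned by simple rank within each score.

```python
def quintiles(values):
    """Quintile index per value by rank: 0 = lowest fifth, 4 = highest fifth."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    q = [0] * n
    for rank, i in enumerate(order):
        q[i] = rank * 5 // n
    return q


def discordant_fraction(score_table):
    """Fraction of participants with at least one score in the top quintile
    and at least one score in the bottom quintile.

    score_table: list of scores, each a list with one value per participant.
    """
    per_score = [quintiles(col) for col in score_table]
    n = len(score_table[0])
    hits = 0
    for p in range(n):
        seen = {col[p] for col in per_score}
        if 0 in seen and 4 in seen:
            hits += 1
    return hits / n


# Three hypothetical PRSs for five participants.
toy_scores = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],  # ranks participants in the opposite order
    [1.0, 2.0, 3.0, 4.0, 5.0],
]
print(discordant_fraction(toy_scores))
```

With many scores per person, even modestly correlated scores make it likely that some pair disagrees this sharply, which is why the fraction can be large despite equivalent population-level performance.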
6
Yang X, Sullivan PF, Li B, Fan Z, Ding D, Shu J, Guo Y, Paschou P, Bao J, Shen L, Ritchie MD, Nave G, Platt ML, Li T, Zhu H, Zhao B. Multi-organ imaging-derived polygenic indexes for brain and body health. medRxiv 2024:2023.04.18.23288769. [PMID: 38883759 PMCID: PMC11177904 DOI: 10.1101/2023.04.18.23288769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2024]
Abstract
The UK Biobank (UKB) imaging project is a crucial resource for biomedical research, but is limited to 100,000 participants due to cost and accessibility barriers. Here we used genetic data to predict heritable imaging-derived phenotypes (IDPs) for a larger cohort. We developed and evaluated 4,375 IDP genetic scores (IGS) derived from UKB brain and body images. When applied to UKB participants who were not imaged, IGS revealed links to numerous phenotypes and stratified participants at increased risk for both brain and somatic diseases. For example, IGS identified individuals at higher risk for Alzheimer's disease and multiple sclerosis, offering additional insights beyond traditional polygenic risk scores of these diseases. When applied to independent external cohorts, IGS also stratified those at high disease risk in the All of Us Research Program and the Alzheimer's Disease Neuroimaging Initiative study. Our results demonstrate that, while the UKB imaging cohort is largely healthy and may not be the most enriched for disease risk management, it holds immense potential for stratifying the risk of various brain and body diseases in broader external genetic cohorts.
Affiliation(s)
- Xiaochen Yang
  - Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
- Patrick F. Sullivan
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Bingxuan Li
  - UCLA Samueli School of Engineering, Los Angeles, CA 90095, USA
- Zirui Fan
  - Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Dezheng Ding
  - Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA
- Juan Shu
  - Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
- Yuxin Guo
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Peristera Paschou
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Jingxuan Bao
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA, USA
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Marylyn D. Ritchie
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Gideon Nave
  - Marketing Department, University of Pennsylvania, Philadelphia, PA 19104, USA
- Michael L. Platt
  - Marketing Department, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Tengfei Li
  - Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hongtu Zhu
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Bingxin Zhao
  - Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
  - Applied Mathematics and Computational Science Graduate Group, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Center for AI and Data Science for Integrated Diagnostics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Population Aging Research Center, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania, Philadelphia, PA 19104, USA
7
Jung H, Jung HU, Baek EJ, Kwon SY, Kang JO, Lim JE, Oh B. Integration of risk factor polygenic risk score with disease polygenic risk score for disease prediction. Commun Biol 2024; 7:180. [PMID: 38351177 PMCID: PMC10864389 DOI: 10.1038/s42003-024-05874-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 01/30/2024] [Indexed: 02/16/2024] Open
Abstract
Polygenic risk score (PRS) is useful for capturing an individual's genetic susceptibility. However, previous studies have not fully exploited the potential of the risk factor PRS (RFPRS) for disease prediction. We explored the potential of integrating disease-related RFPRSs with disease PRS to enhance disease prediction performance. We constructed 112 RFPRSs and analyzed the association of RFPRSs with diseases to identify disease-related RFPRSs in 700 diseases, using the UK Biobank dataset. We uncovered 6157 statistically significant associations between 247 diseases and 109 RFPRSs. We estimated the disease PRSs of 70 diseases that exhibited statistically significant heritability to generate RFDiseasemetaPRS (a combined PRS integrating RFPRSs and disease PRS) and compared the prediction performance metrics between RFDiseasemetaPRS and disease PRS. RFDiseasemetaPRS showed better performance for Nagelkerke's pseudo-R², odds ratio (OR) per 1 SD, net reclassification improvement (NRI) values and the difference in R² considering the variance of R² in 31 out of 70 diseases. Additionally, we assessed risk classification between the two models by examining the OR between the top 10% and the remaining 90% of individuals for the 31 diseases; RFDiseasemetaPRS exhibited better R², NRI and OR than disease PRS. These findings highlight the importance of utilizing RFDiseasemetaPRS, which can provide personalized healthcare and tailored prevention strategies.
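The risk-classification comparison above rests on the odds ratio between the top 10% of a score and the remaining 90%. A minimal sketch of that calculation follows, assuming a simple rank-based cut and an unadjusted 2x2 table (the study may have adjusted for covariates); the function name and toy data are hypothetical.

```python
def top_decile_odds_ratio(scores, outcomes):
    """Unadjusted odds ratio of disease: top 10% of scores vs the remaining 90%."""
    n = len(scores)
    k = max(1, n // 10)  # size of the top group
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top = set(order[:k])
    a = sum(1 for i in top if outcomes[i] == 1)  # cases in top group
    b = len(top) - a                             # non-cases in top group
    c = sum(1 for i in range(n) if i not in top and outcomes[i] == 1)  # cases in rest
    d = (n - len(top)) - c                       # non-cases in rest
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: empty cell in the 2x2 table")
    return (a * d) / (b * c)


# Toy cohort of 20 people (hypothetical): scores 1..20, four cases in total.
toy_scores = [float(i) for i in range(1, 21)]
toy_outcomes = [1, 1, 1] + [0] * 15 + [0, 1]
print(top_decile_odds_ratio(toy_scores, toy_outcomes))
```

Running the same function on a disease PRS and on a combined score over the same cohort gives the kind of side-by-side OR comparison the abstract describes.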
Affiliation(s)
- Hyein Jung
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
- Hae-Un Jung
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
- Shin Young Kwon
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
- Ji-One Kang
  - Department of Biochemistry and Molecular Biology, School of Medicine, Kyung Hee University, Seoul, Republic of Korea
- Ji Eun Lim
  - Department of Biochemistry and Molecular Biology, School of Medicine, Kyung Hee University, Seoul, Republic of Korea
- Bermseok Oh
  - Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea
  - Mendel Inc, Seoul, Republic of Korea
  - Department of Biochemistry and Molecular Biology, School of Medicine, Kyung Hee University, Seoul, Republic of Korea