1. van Dijk MT, Talati A, Barrios PG, Crandall AJ, Lugo-Candelas C. Prenatal depression outcomes in the next generation: A critical review of recent DOHaD studies and recommendations for future research. Semin Perinatol 2024; 48:151948. [PMID: 39043475 DOI: 10.1016/j.semperi.2024.151948] [Indexed: 07/25/2024]
Abstract
Prenatal depression, a common pregnancy-related risk with a prevalence of 10-20%, may affect in utero development and socioemotional and neurodevelopmental outcomes in the next generation. Although there is a growing body of work that suggests prenatal depression has an independent and long-lasting effect on offspring outcomes, important questions remain, and findings often do not converge. The present review examines work carried out in the last decade, with an emphasis on studies focusing on mechanisms and leveraging innovative technologies and study designs to fill in gaps in research. Overall, the past decade of research continues to suggest that prenatal depression increases risk for offspring socioemotional problems and may alter early brain development by affecting maternal-fetal physiology during pregnancy. However, important limitations remain: lack of diversity in study samples, inconsistent consideration of potential confounders (e.g., genetics, postnatal depression, parenting), and restriction of examination to narrow time windows and single exposures. On the other hand, exciting work has begun uncovering potential mechanisms underlying transmission, including alterations in mitochondrial functioning, epigenetics, and the prenatal microbiome. We review the evidence to date, identify limitations, and suggest strategies for the next decade of research to detect mechanisms as well as sources of plasticity and resilience to ensure this work translates into meaningful, actionable science that improves the lives of families.
Affiliation(s)
- M T van Dijk
  - Columbia University Irving Medical Center, United States; New York State Psychiatric Institute, United States
- A Talati
  - Columbia University Irving Medical Center, United States; New York State Psychiatric Institute, United States
- A J Crandall
  - Columbia University Irving Medical Center, United States; New York State Psychiatric Institute, United States
- C Lugo-Candelas
  - Columbia University Irving Medical Center, United States; New York State Psychiatric Institute, United States
2. Capalbo A, de Wert G, Mertes H, Klausner L, Coonen E, Spinella F, Van de Velde H, Viville S, Sermon K, Vermeulen N, Lencz T, Carmi S. Screening embryos for polygenic disease risk: a review of epidemiological, clinical, and ethical considerations. Hum Reprod Update 2024; 30:529-557. [PMID: 38805697 PMCID: PMC11369226 DOI: 10.1093/humupd/dmae012] [Received: 01/10/2024] [Revised: 03/25/2024] [Indexed: 05/30/2024]
Abstract
BACKGROUND: The genetic composition of embryos generated by in vitro fertilization (IVF) can be examined with preimplantation genetic testing (PGT). Until recently, PGT was limited to detecting single-gene, high-risk pathogenic variants, large structural variants, and aneuploidy. Recent advances have made genome-wide genotyping of IVF embryos feasible and affordable, raising the possibility of screening embryos for their risk of polygenic diseases such as breast cancer, hypertension, diabetes, or schizophrenia. Despite a heated debate around this new technology, called polygenic embryo screening (PES; also PGT-P), it is already available to IVF patients in some countries. Several articles have studied epidemiological, clinical, and ethical perspectives on PES; however, a comprehensive, principled review of this emerging field is missing.
OBJECTIVE AND RATIONALE: This review has four main goals. First, given the interdisciplinary nature of PES studies, we aim to provide a self-contained educational background about PES to reproductive specialists interested in the subject. Second, we provide a comprehensive and critical review of arguments for and against the introduction of PES, crystallizing and prioritizing the key issues. We also cover the attitudes of IVF patients, clinicians, and the public towards PES. Third, we distinguish between possible future groups of PES patients, highlighting the benefits and harms pertaining to each group. Finally, our review, which is supported by ESHRE, is intended to aid healthcare professionals and policymakers in decision-making regarding whether to introduce PES in the clinic, and if so, how, and to whom.
SEARCH METHODS: We searched for PubMed-indexed articles published between 1/1/2003 and 1/3/2024 using the terms 'polygenic embryo screening', 'polygenic preimplantation', and 'PGT-P'. We limited the review to primary research papers in English whose main focus was PES for medical conditions. We also included papers that did not appear in the search but were deemed relevant.
OUTCOMES: The main theoretical benefit of PES is a reduction in lifetime polygenic disease risk for children born after screening. The magnitude of the risk reduction has been predicted based on statistical modelling, simulations, and sibling pair analyses. Results based on all methods suggest that under the best-case scenario, large relative risk reductions are possible for one or more diseases. However, as these models abstract several practical limitations, the realized benefits may be smaller, particularly due to a limited number of embryos and unclear future accuracy of the risk estimates. PES may negatively impact patients and their future children, as well as society. The main personal harms are an unindicated IVF treatment, a possible reduction in IVF success rates, and patient confusion, incomplete counselling, and choice overload. The main possible societal harms include discarded embryos, an increasing demand for 'designer babies', overemphasis of the genetic determinants of disease, unequal access, and lower utility in people of non-European ancestries. Benefits and harms will vary across the main potential patient groups, comprising patients already requiring IVF, fertile people with a history of a severe polygenic disease, and fertile healthy people. In the United States, the attitudes of IVF patients and the public towards PES seem positive, while healthcare professionals are cautious, sceptical about clinical utility, and concerned about patient counselling.
WIDER IMPLICATIONS: The theoretical potential of PES to reduce risk across multiple polygenic diseases requires further research into its benefits and harms. Given the large number of practical limitations and possible harms, particularly unnecessary IVF treatments and discarded viable embryos, PES should be offered only within a research context before further clarity is achieved regarding its balance of benefits and harms. The gap in attitudes between healthcare professionals and the public needs to be narrowed by expanding public and patient education and providing resources for informative and unbiased genetic counselling.
Affiliation(s)
- Antonio Capalbo
  - Juno Genetics, Department of Reproductive Genetics, Rome, Italy
  - Center for Advanced Studies and Technology (CAST), Department of Medical Genetics, “G. d’Annunzio” University of Chieti-Pescara, Chieti, Italy
- Guido de Wert
  - Department of Health, Ethics & Society, CAPHRI-School for Public Health and Primary Care and GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Heidi Mertes
  - Department of Philosophy and Moral Sciences, Ghent University, Ghent, Belgium
  - Department of Public Health and Primary Care, Ghent University, Ghent, Belgium
- Liraz Klausner
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Edith Coonen
  - Departments of Clinical Genetics and Reproductive Medicine, Maastricht University Medical Centre, Maastricht, The Netherlands
  - School for Oncology and Developmental Biology, GROW, Maastricht University, Maastricht, The Netherlands
- Francesca Spinella
  - Eurofins GENOMA Group Srl, Molecular Genetics Laboratories, Department of Scientific Communication, Rome, Italy
- Hilde Van de Velde
  - Research Group Genetics Reproduction and Development (GRAD), Vrije Universiteit Brussel, Brussel, Belgium
  - Brussels IVF, UZ Brussel, Brussel, Belgium
- Stephane Viville
  - Laboratoire de Génétique Médicale LGM, Institut de Génétique Médicale d’Alsace IGMA, INSERM UMR 1112, Université de Strasbourg, France
  - Laboratoire de Diagnostic Génétique, Unité de Génétique de l’infertilité (UF3472), Hôpitaux Universitaires de Strasbourg, Strasbourg, France
- Karen Sermon
  - Research Group Genetics Reproduction and Development (GRAD), Vrije Universiteit Brussel, Brussel, Belgium
- Todd Lencz
  - Institute of Behavioral Science, Feinstein Institutes for Medical Research, Manhasset, NY, USA
  - Departments of Psychiatry and Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY 11549, USA
- Shai Carmi
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
3. Veller C, Przeworski M, Coop G. Causal interpretations of family GWAS in the presence of heterogeneous effects. Proc Natl Acad Sci U S A 2024; 121:e2401379121. [PMID: 39269774 DOI: 10.1073/pnas.2401379121] [Received: 01/21/2024] [Accepted: 07/26/2024] [Indexed: 09/15/2024]
Abstract
Family-based genome-wide association studies (GWASs) are often claimed to provide an unbiased estimate of the average causal effects (or average treatment effects; ATEs) of alleles, on the basis of an analogy between the random transmission of alleles from parents to children and a randomized controlled trial. We show that this claim does not hold in general. Because Mendelian segregation only randomizes alleles among children of heterozygotes, the effects of alleles in the children of homozygotes are not observable. This feature will matter if an allele has different average effects in the children of homozygotes and heterozygotes, as can arise in the presence of gene-by-environment interactions, gene-by-gene interactions, or differences in linkage disequilibrium patterns. At a single locus, family-based GWAS can be thought of as providing an unbiased estimate of the average effect in the children of heterozygotes (i.e., a local average treatment effect; LATE). This interpretation does not extend to polygenic scores (PGSs), however, because different sets of SNPs are heterozygous in each family. Therefore, other than under specific conditions, the within-family regression slope of a PGS cannot be assumed to provide an unbiased estimate of the LATE for any subset or weighted average of families. In practice, the potential biases of a family-based GWAS are likely smaller than those that can arise from confounding in a standard, population-based GWAS, and so family studies remain important for the dissection of genetic contributions to phenotypic variation. Nonetheless, their causal interpretation is less straightforward than has been widely appreciated.
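The distinction the abstract draws between the ATE and the LATE can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes a single locus, two hypothetical subpopulations in Hardy-Weinberg proportions, and an allele effect that differs between them (for example through a gene-by-environment interaction). The `Group` class, the function names, and all numbers are made up for illustration. Because only transmissions from heterozygous parents are randomized, the within-family estimand is modelled here as an average of group effects weighted by the expected number of heterozygous parents.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical subpopulation with its own allele frequency and allele effect."""
    n_families: int
    allele_freq: float
    effect: float  # assumed causal effect of one allele copy in this group


def het_parent_fraction(p: float) -> float:
    # Assumption: Hardy-Weinberg proportions, so a fraction 2p(1-p)
    # of parents is heterozygous at the locus.
    return 2.0 * p * (1.0 - p)


def average_treatment_effect(groups):
    # ATE: every family counts equally, whatever the parental genotypes.
    total = sum(g.n_families for g in groups)
    return sum(g.n_families * g.effect for g in groups) / total


def family_gwas_estimand(groups):
    # Only heterozygous parents transmit alleles at random, so each group
    # is weighted by its expected number of heterozygous parents.
    weights = [g.n_families * het_parent_fraction(g.allele_freq) for g in groups]
    return sum(w * g.effect for w, g in zip(weights, groups)) / sum(weights)


# Toy numbers: the allele is common where it has an effect, rare where it has none.
groups = [Group(1000, 0.50, 1.0), Group(1000, 0.05, 0.0)]
ate = average_treatment_effect(groups)
late = family_gwas_estimand(groups)
print(f"ATE = {ate:.3f}, family-GWAS estimand = {late:.3f}")
# prints "ATE = 0.500, family-GWAS estimand = 0.840"
```

In this toy setting the within-family quantity overstates the population-wide average effect because the group with more heterozygous parents also has the larger effect. When the effect is the same in both groups, the two quantities coincide.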
Affiliation(s)
- Carl Veller
  - Department of Ecology & Evolution, University of Chicago, Chicago, IL 60637
- Molly Przeworski
  - Department of Biological Sciences, Columbia University, New York, NY 10027
  - Department of Systems Biology, Columbia University, New York, NY 10032
- Graham Coop
  - Center for Population Biology and Department of Evolution and Ecology, University of California, Davis, CA 95616
4. Moreno-Grau S, Vernekar M, Lopez-Pineda A, Mas-Montserrat D, Barrabés M, Quinto-Cortés CD, Moatamed B, Lee MTM, Yu Z, Numakura K, Matsuda Y, Wall JD, Ioannidis AG, Katsanis N, Takano T, Bustamante CD. Polygenic risk score portability for common diseases across genetically diverse populations. Hum Genomics 2024; 18:93. [PMID: 39218908 PMCID: PMC11367857 DOI: 10.1186/s40246-024-00664-y] [Received: 06/15/2024] [Accepted: 08/19/2024] [Indexed: 09/04/2024]
Abstract
BACKGROUND: Polygenic risk scores (PRS) derived from European individuals have reduced portability across global populations, limiting their clinical implementation at worldwide scale. Here, we investigate the performance of a wide range of PRS models across four ancestry groups (Africans, Europeans, East Asians, and South Asians) for 14 conditions of high medical interest.
METHODS: To select the best-performing model per trait, we first compared PRS performances for publicly available scores, and constructed new models using different methods (LDpred2, PRS-CSx and SNPnet). We used 285K European individuals from the UK Biobank (UKBB) for training and 18K, including diverse ancestries, for testing. We then evaluated PRS portability for the best models in Europeans and compared their accuracies with respect to the best PRS per ancestry. Finally, we validated the selected PRS models using an independent set of 8,417 individuals from Biobank of the Americas-Genomelink (BbofA-GL) and performed a PRS-PheWAS.
RESULTS: We confirmed a decay in PRS performances relative to Europeans when the evaluation was conducted using the best-PRS model for Europeans (51.3% for South Asians, 46.6% for East Asians and 39.4% for Africans). We observed an improvement in the PRS performances when specifically selecting ancestry-specific PRS models (phenotype variance increase: 1.62 for Africans, 1.40 for South Asians and 0.96 for East Asians). Additionally, when we selected the optimal model conditional on ancestry for CAD, HDL-C and LDL-C, hypertension, hypothyroidism and T2D, PRS performance for studied populations was more comparable to what was observed in Europeans. Finally, we were able to independently validate tested models for Europeans, and conducted a PRS-PheWAS, identifying cross-trait interplay between cardiometabolic conditions, and between immune-mediated components.
CONCLUSION: Our work comprehensively evaluated PRS accuracy across a wide range of phenotypes, reducing the uncertainty with respect to which PRS model to choose and in which ancestry group. This evaluation has let us identify specific conditions where implementing risk-prioritization strategies could have practical utility across diverse ancestral groups, contributing to democratizing the implementation of PRS.
Affiliation(s)
- Sonia Moreno-Grau
  - Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road, Stanford, CA, 94305, USA
- Manvi Vernekar
  - Genomelink, Inc, 2150 Shattuck Avenue, Berkeley, CA, 94704, USA
- Arturo Lopez-Pineda
  - Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA
  - Amphora Health, Batallon Independencia 80, Morelia, Michoacan, 58260, Mexico
  - Escuela Nacional de Estudios Superiores, Unidad Morelia, Universidad Nacional Autonoma de México, Antigua Carretera a Pátzcuaro No. 8701, Col. Ex Hacienda de San José de la Huerta, Morelia, Michoacán, C.P. 58190, Mexico
- Míriam Barrabés
  - Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA
- Babak Moatamed
  - Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA
- Zhenning Yu
  - Genomelink, Inc, 2150 Shattuck Avenue, Berkeley, CA, 94704, USA
- Yuta Matsuda
  - Genomelink, Inc, 2150 Shattuck Avenue, Berkeley, CA, 94704, USA
- Jeffrey D Wall
  - Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA
- Alexander G Ioannidis
  - Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road, Stanford, CA, 94305, USA
  - University of California Santa Cruz, 1156 High Street, Santa Cruz, CA, 95064, USA
- Tomohiro Takano
  - Genomelink, Inc, 2150 Shattuck Avenue, Berkeley, CA, 94704, USA
  - Japan: Awakens Japan K.K. (Japanese subsidiary of Genomelink, Inc.), 2-11-3, Meguro, Meguro-ku, 1530063, Tokyo, Japan
- Carlos D Bustamante
  - Galatea Bio, Inc, 14350 Commerce Way, Miami Lakes, FL, 33146, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road, Stanford, CA, 94305, USA
5. Jurgens SJ, Wang X, Choi SH, Weng LC, Koyama S, Pirruccello JP, Nguyen T, Smadbeck P, Jang D, Chaffin M, Walsh R, Roselli C, Elliott AL, Wijdeveld LFJM, Biddinger KJ, Kany S, Rämö JT, Natarajan P, Aragam KG, Flannick J, Burtt NP, Bezzina CR, Lubitz SA, Lunetta KL, Ellinor PT. Rare coding variant analysis for human diseases across biobanks and ancestries. Nat Genet 2024; 56:1811-1820. [PMID: 39210047 DOI: 10.1038/s41588-024-01894-5] [Received: 02/26/2023] [Accepted: 08/01/2024] [Indexed: 09/04/2024]
Abstract
Large-scale sequencing has enabled unparalleled opportunities to investigate the role of rare coding variation in human phenotypic variability. Here, we present a pan-ancestry analysis of sequencing data from three large biobanks, including the All of Us research program. Using mixed-effects models, we performed gene-based rare variant testing for 601 diseases across 748,879 individuals, including 155,236 with ancestry dissimilar to European. We identified 363 significant associations, which highlighted core genes for the human disease phenome and identified potential novel associations, including UBR3 for cardiometabolic disease and YLPM1 for psychiatric disease. Pan-ancestry burden testing represented an inclusive and useful approach for discovery in diverse datasets, although we also highlight the importance of ancestry-specific sensitivity analyses in this setting. Finally, we found that effect sizes for rare protein-disrupting variants were concordant between samples similar to European ancestry and other genetic ancestries (βDeming = 0.7-1.0). Our results have implications for multi-ancestry and cross-biobank approaches in sequencing association studies for human disease.
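The concordance figure quoted above is a Deming regression slope, which, unlike ordinary least squares, allows for estimation error in both sets of effect sizes. The sketch below shows the textbook closed-form slope only. It is not the authors' pipeline: the function name `deming_slope`, the default error-variance ratio `delta = 1.0` (orthogonal regression), and the toy effect sizes are all assumptions made for illustration.

```python
import math


def deming_slope(x, y, delta=1.0):
    """Closed-form Deming regression slope of y on x.

    delta is the assumed ratio of the error variance in y to the error
    variance in x; delta = 1.0 corresponds to orthogonal regression.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    s_yy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("slope is undefined when the sample covariance is zero")
    diff = s_yy - delta * s_xx
    return (diff + math.sqrt(diff ** 2 + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)


# Hypothetical per-gene effect sizes in two ancestry groups (made-up values).
beta_group_a = [-0.5, 0.2, 1.0, 1.5]
beta_group_b = [0.8 * b for b in beta_group_a]  # exactly 0.8 times group A
slope = deming_slope(beta_group_a, beta_group_b)
print(f"Deming slope = {slope:.2f}")
```

A slope near 1 would indicate effect sizes of similar magnitude in the two groups. In practice the error-variance ratio would be informed by the standard errors of the effect estimates rather than fixed at 1.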
Affiliation(s)
- Sean J Jurgens
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Experimental Cardiology, Heart Center, Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam UMC location University of Amsterdam, Amsterdam, The Netherlands
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Xin Wang
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Seung Hoan Choi
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Lu-Chen Weng
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Satoshi Koyama
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- James P Pirruccello
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Cardiology, University of California, San Francisco, CA, USA
- Trang Nguyen
  - Metabolism Program, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Patrick Smadbeck
  - Metabolism Program, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Dongkeun Jang
  - Metabolism Program, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Mark Chaffin
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Roddy Walsh
  - Department of Experimental Cardiology, Heart Center, Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam UMC location University of Amsterdam, Amsterdam, The Netherlands
- Carolina Roselli
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Amanda L Elliott
  - Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Psychiatry and Center for Genomic Medicine, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- Leonoor F J M Wijdeveld
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Physiology, Amsterdam UMC location VU, Amsterdam, The Netherlands
- Kiran J Biddinger
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Shinwan Kany
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Cardiology, University Heart and Vascular Center Hamburg-Eppendorf, Hamburg, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany
- Joel T Rämö
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- Pradeep Natarajan
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Krishna G Aragam
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Jason Flannick
  - Metabolism Program, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA
  - Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Noël P Burtt
  - Metabolism Program, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Medical and Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Connie R Bezzina
  - Department of Experimental Cardiology, Heart Center, Amsterdam Cardiovascular Sciences, Heart Failure and Arrhythmias, Amsterdam UMC location University of Amsterdam, Amsterdam, The Netherlands
- Steven A Lubitz
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA
- Kathryn L Lunetta
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
  - NHLBI and Boston University's Framingham Heart Study, Framingham, MA, USA
- Patrick T Ellinor
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA
6. Bigdeli TB, Chatzinakos C, Bendl J, Barr PB, Venkatesh S, Gorman BR, Clarence T, Genovese G, Iyegbe CO, Peterson RE, Kolokotronis SO, Burstein D, Meyers JL, Li Y, Rajeevan N, Sayward F, Cheung KH, DeLisi LE, Kosten TR, Zhao H, Achtyes E, Buckley P, Malaspina D, Lehrer D, Rapaport MH, Braff DL, Pato MT, Fanous AH, Pato CN, Huang GD, Muralidhar S, Gaziano JM, Pyarajan S, Girdhar K, Lee D, Hoffman GE, Aslan M, Fullard JF, Voloudakis G, Harvey PD, Roussos P. Biological Insights from Schizophrenia-associated Loci in Ancestral Populations. medRxiv 2024:2024.08.27.24312631. [PMID: 39252912 PMCID: PMC11383513 DOI: 10.1101/2024.08.27.24312631] [Indexed: 09/11/2024]
Abstract
Large-scale genome-wide association studies of schizophrenia have uncovered hundreds of associated loci but with extremely limited representation of African diaspora populations. We surveyed electronic health records of 200,000 individuals of African ancestry in the Million Veteran and All of Us Research Programs, and, coupled with genotype-level data from four case-control studies, realized a combined sample size of 13,012 affected and 54,266 unaffected persons. Three genome-wide significant signals (near PLXNA4, PMAIP1, and TRPA1) are the first to be independently identified in populations of predominantly African ancestry. Joint analyses of African, European, and East Asian ancestries across 86,981 cases and 303,771 controls yielded 376 distinct autosomal loci, which were refined to 708 putatively causal variants via multi-ancestry fine-mapping. Utilizing single-cell functional genomic data from human brain tissue and two complementary approaches, transcriptome-wide association studies and enhancer-promoter contact mapping, we identified a consensus set of 94 genes across ancestries and pinpointed the specific cell types in which they act. We identified reproducible associations of schizophrenia polygenic risk scores with schizophrenia diagnoses and a range of other mental and physical health problems. Our study addresses a longstanding gap in the generalizability of research findings for schizophrenia across ancestral populations, underlining shared biological underpinnings of schizophrenia across global populations in the presence of broadly divergent risk allele frequencies.
Affiliation(s)
- Tim B Bigdeli
  - VA New York Harbor Healthcare System, Brooklyn, NY
  - Department of Psychiatry and Behavioral Sciences and SUNY Downstate Health Sciences University, Brooklyn, NY
  - Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
  - Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, NY
- Chris Chatzinakos
  - Department of Psychiatry and Behavioral Sciences and SUNY Downstate Health Sciences University, Brooklyn, NY
  - Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
- Jaroslav Bendl
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
- Peter B Barr
  - VA New York Harbor Healthcare System, Brooklyn, NY
  - Department of Psychiatry and Behavioral Sciences and SUNY Downstate Health Sciences University, Brooklyn, NY
  - Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
  - Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, NY
- Sanan Venkatesh
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
  - Center for Precision Medicine and Translational Therapeutics, James J. Peters VA Medical Center, Bronx, NY, USA
- Bryan R Gorman
  - Massachusetts Area Veterans Epidemiology, Research, and Information Center (MAVERIC), Jamaica Plain, MA
- Tereza Clarence
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
- Giulio Genovese
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
  - Harvard Medical School, Boston, MA
- Conrad O Iyegbe
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
- Roseann E Peterson
  - VA New York Harbor Healthcare System, Brooklyn, NY
  - Department of Psychiatry and Behavioral Sciences and SUNY Downstate Health Sciences University, Brooklyn, NY
  - Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
- Sergios-Orestis Kolokotronis
  - Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
  - Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, NY
  - Division of Infectious Diseases, Department of Medicine, College of Medicine, SUNY Downstate Health Sciences University, Brooklyn, NY
  - Department of Cell Biology, College of Medicine, SUNY Downstate Health Sciences University, Brooklyn, NY
- David Burstein
  - Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
  - Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
  - Center for Precision Medicine and Translational Therapeutics, James J. Peters VA Medical Center, Bronx, NY, USA
  - Mental Illness Research, Education and Clinical Center VISN2, James J. Peters VA Medical Center, Bronx, NY, USA
- Jacquelyn L Meyers
  - Department of Psychiatry and Behavioral Sciences and SUNY Downstate Health Sciences University, Brooklyn, NY
  - Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
  - Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, NY
- Yuli Li
  - Clinical Epidemiology Research Center (CERC), VA Connecticut Healthcare System, West Haven, CT
  - Yale University School of Medicine, New Haven, CT
- Nallakkandi Rajeevan
  - Clinical Epidemiology Research Center (CERC), VA Connecticut Healthcare System, West Haven, CT
  - Yale University School of Medicine, New Haven, CT
- Frederick Sayward
  - Clinical Epidemiology Research Center (CERC), VA Connecticut Healthcare System, West Haven, CT
  - Yale University School of Medicine, New Haven, CT
- Kei-Hoi Cheung
  - Clinical Epidemiology Research Center (CERC), VA Connecticut Healthcare System, West Haven, CT
  - Yale University School of Medicine, New Haven, CT
- Lynn E DeLisi
  - Department of Psychiatry, Cambridge Health Alliance, Cambridge, MA
- Thomas R Kosten
  - Michael E. DeBakey VA Medical Center, Houston, TX
  - Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX
- Hongyu Zhao
  - Clinical Epidemiology Research Center (CERC), VA Connecticut Healthcare System, West Haven, CT
  - Yale University School of Medicine, New Haven, CT
- Eric Achtyes
  - Western Michigan University Homer Stryker M.D. School of Medicine, Kalamazoo, MI
- Peter Buckley
  - University of Tennessee Health Science Center in Memphis, TN
- Dolores Malaspina
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
- Douglas Lehrer
  - Department of Psychiatry, Wright State University, Dayton, OH
- Mark H Rapaport
  - Huntsman Mental Health Institute, Department of Psychiatry, University of Utah, Salt Lake City, UT
- David L Braff
  - Department of Psychiatry, University of California, San Diego, CA
  - VA San Diego Healthcare System, San Diego, CA
- Michele T Pato
  - Department of Psychiatry, Robert Wood Johnson Medical School, New Brunswick, NJ
- Ayman H Fanous
  - Department of Psychiatry, University of Arizona College of Medicine-Phoenix, Phoenix, AZ
  - Department of Psychiatry, VA Phoenix Healthcare System, Phoenix, AZ
- Carlos N Pato
  - Department of Psychiatry, Robert Wood Johnson Medical School, New Brunswick, NJ
- Grant D Huang
  - Office of Research and Development, Veterans Health Administration, Washington, DC
| | - Sumitra Muralidhar
- Office of Research and Development, Veterans Health Administration, Washington, DC
| | - J Michael Gaziano
- Massachusetts Area Veterans Epidemiology, Research, and Information Center (MAVERIC), Jamaica Plain, MA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Saiju Pyarajan
- Massachusetts Area Veterans Epidemiology, Research, and Information Center (MAVERIC), Jamaica Plain, MA
| | - Kiran Girdhar
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
| | - Donghoon Lee
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
| | - Gabriel E Hoffman
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
- Center for Precision Medicine and Translational Therapeutics, James J. Peters VA Medical Center, Bronx, NY, USA
- Mental Illness Research, Education and Clinical Center VISN2, James J. Peters VA Medical Center, Bronx, NY, USA
| | - Mihaela Aslan
- Clinical Epidemiology Research Center (CERC), VA Connecticut Healthcare System, West Haven, CT
- Yale University School of Medicine, New Haven, CT
| | - John F Fullard
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
| | - Georgios Voloudakis
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
- Center for Precision Medicine and Translational Therapeutics, James J. Peters VA Medical Center, Bronx, NY, USA
- Mental Illness Research, Education and Clinical Center VISN2, James J. Peters VA Medical Center, Bronx, NY, USA
| | - Philip D Harvey
- Bruce W. Carter Miami Veterans Affairs (VA) Medical Center, Miami, FL
- University of Miami School of Medicine, Miami, FL
| | - Panos Roussos
- Center for Disease Neurogenomics, Icahn School of Medicine at Mount Sinai, NY
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, NY
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, NY
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, NY
- Center for Precision Medicine and Translational Therapeutics, James J. Peters VA Medical Center, Bronx, NY, USA
- Mental Illness Research, Education and Clinical Center VISN2, James J. Peters VA Medical Center, Bronx, NY, USA
7
Ritchie SC, Taylor HJ, Liang Y, Manikpurage HD, Pennells L, Foguet C, Abraham G, Gibson JT, Jiang X, Liu Y, Xu Y, Kim LG, Mahajan A, McCarthy MI, Kaptoge S, Lambert SA, Wood A, Sim X, Collins FS, Denny JC, Danesh J, Butterworth AS, Di Angelantonio E, Inouye M. Integrated clinical risk prediction of type 2 diabetes with a multifactorial polygenic risk score. medRxiv 2024:2024.08.22.24312440. [PMID: 39228710 PMCID: PMC11370520 DOI: 10.1101/2024.08.22.24312440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Combining information from multiple GWASs for a disease and its risk factors has proven a powerful approach for development of polygenic risk scores (PRSs). This may be particularly useful for type 2 diabetes (T2D), a highly polygenic and heterogeneous disease where the additional predictive value of a PRS is unclear. Here, we use a meta-scoring approach to develop a metaPRS for T2D that incorporated genome-wide associations from both European and non-European genetic ancestries and T2D risk factors. We evaluated the performance of this metaPRS and benchmarked it against existing genome-wide PRS in 620,059 participants and 50,572 T2D cases amongst six diverse genetic ancestries from UK Biobank, INTERVAL, the All of Us Research Program, and the Singapore Multi-Ethnic Cohort. We show that our metaPRS was the most powerful PRS for predicting T2D in European population-based cohorts and had comparable performance to the top ancestry-specific PRS, highlighting its transferability. In UK Biobank, we show the metaPRS had stronger predictive power for 10-year risk than all individual risk factors apart from BMI and biomarkers of dysglycemia. The metaPRS modestly improved T2D risk stratification of QDiabetes risk scores for 10-year risk prediction, particularly when prioritising individuals for blood tests of dysglycemia. Overall, we present a highly predictive and transferable PRS for T2D and demonstrate the potential for PRS to incrementally improve T2D risk prediction when incorporated into UK guideline-recommended screening and risk prediction with a clinical risk score.
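As a rough illustration of the meta-scoring idea in this abstract, the sketch below combines several component scores into one by standardizing each and taking a weighted sum. The function name `combine_scores`, the weights, and the toy data are hypothetical. The abstract does not say how the authors derive their weights, so this is not their implementation.

```python
import numpy as np

def combine_scores(component_scores, weights):
    """Weighted sum of column-standardized component scores.

    component_scores: (n_individuals, n_components) array, one column per PRS.
    weights: (n_components,) mixing weights (hypothetical values here).
    """
    scores = np.asarray(component_scores, dtype=float)
    w = np.asarray(weights, dtype=float)
    if scores.ndim != 2 or scores.shape[1] != w.shape[0]:
        raise ValueError("need one weight per component score column")
    # Standardize each component so the weights are comparable across scores.
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    combined = z @ w
    # Re-standardize so the combined score has mean 0 and unit variance.
    return (combined - combined.mean()) / combined.std()

# Toy data: 5 individuals, 3 made-up component scores.
toy = np.array([[0.1, 2.0, -1.0],
                [0.4, 1.5, 0.0],
                [0.2, 3.0, 1.0],
                [0.9, 2.5, 0.5],
                [0.3, 1.0, -0.5]])
meta = combine_scores(toy, weights=[0.5, 0.3, 0.2])
```

In practice the weights would typically be learned in a training cohort rather than fixed by hand, which this sketch does not attempt.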
Affiliation(s)
- Scott C Ritchie
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Henry J Taylor
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Yujian Liang
- Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore, Singapore
| | - Hasanga D Manikpurage
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
| | - Lisa Pennells
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
| | - Carles Foguet
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
| | - Gad Abraham
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Department of Clinical Pathology, University of Melbourne, Parkville, Victoria, Australia
| | - Joel T Gibson
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
| | - Xilin Jiang
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, US
| | - Yang Liu
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
| | - Yu Xu
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Lois G Kim
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK
| | - Anubha Mahajan
- OMNI Human Genetics, Genentech, Inc., 1 DNA Way, South San Francisco, CA 94080, USA
| | - Mark I McCarthy
- OMNI Human Genetics, Genentech, Inc., 1 DNA Way, South San Francisco, CA 94080, USA
| | - Stephen Kaptoge
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
| | - Samuel A Lambert
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Angela Wood
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK
- Cambridge Centre of Artificial Intelligence in Medicine, University of Cambridge, Cambridge, UK
| | - Xueling Sim
- Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore, Singapore
| | - Francis S Collins
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Joshua C Denny
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- All of Us Research Program, National Institutes of Health, Bethesda, MD, USA
| | - John Danesh
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, UK
| | - Adam S Butterworth
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK
| | - Emanuele Di Angelantonio
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- National Institute for Health and Care Research Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, UK
- Health Data Science Research Centre, Human Technopole, Milan, Italy
| | - Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
8
Wang JY, Lin N, Zietz M, Mares J, Narasimhan VM, Rathouz PJ, Harpak A. Three Open Questions in Polygenic Score Portability. bioRxiv 2024:2024.08.20.608703. [PMID: 39229140 PMCID: PMC11370354 DOI: 10.1101/2024.08.20.608703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
A major obstacle hindering the broad adoption of polygenic scores (PGS) is their lack of "portability" to people that differ (in genetic ancestry or other characteristics) from the GWAS samples in which genetic effects were estimated. Here, we use the UK Biobank to measure the change in PGS prediction accuracy as a continuous function of individuals' genome-wide genetic dissimilarity to the GWAS sample ("genetic distance"). Our results highlight three gaps in our understanding of PGS portability. First, prediction accuracy is extremely noisy at the individual level and not well predicted by genetic distance. In fact, variance in prediction accuracy is explained comparably well by socioeconomic measures. Second, trends of portability vary across traits. For several immunity-related traits, prediction accuracy drops near zero quickly even at intermediate levels of genetic distance. This quick drop may reflect GWAS associations being more ancestry-specific in immunity-related traits than in other traits. Third, we show that even qualitative trends of portability can depend on the measure of prediction accuracy used. For instance, for white blood cell count, a measure of prediction accuracy at the individual level (reduction in mean squared error) increases with genetic distance. Together, our results show that portability cannot be understood through global ancestry groupings alone. There are other, understudied factors influencing portability, such as the specifics of the evolution of the trait and its genetic architecture, social context, and the construction of the polygenic score. Addressing these gaps can aid in the development and application of PGS and inform more equitable genomic research.
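One simple way to picture prediction accuracy as a function of genetic distance is to sort individuals by distance, split them into equal-count bins, and compute the squared correlation between score and phenotype in each bin. The sketch below does that on simulated data. Binning is a simplification of the continuous treatment the abstract describes, and the function name `r2_by_distance_bin` and the simulation settings are assumptions made for illustration only.

```python
import numpy as np

def r2_by_distance_bin(pgs, phenotype, distance, n_bins=4):
    """Squared Pearson correlation between PGS and phenotype within
    equal-count bins of genetic distance (bins ordered near to far)."""
    pgs = np.asarray(pgs, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    order = np.argsort(np.asarray(distance, dtype=float))
    results = []
    for idx in np.array_split(order, n_bins):
        r = np.corrcoef(pgs[idx], phenotype[idx])[0, 1]
        results.append(r ** 2)
    return np.array(results)

# Simulated toy data (not from the paper): the PGS effect on the
# phenotype shrinks linearly as genetic distance grows.
rng = np.random.default_rng(0)
n = 4000
distance = rng.uniform(0.0, 1.0, size=n)
pgs = rng.normal(size=n)
phenotype = (1.0 - distance) * pgs + rng.normal(size=n)
r2 = r2_by_distance_bin(pgs, phenotype, distance)
```

A group-level summary like this can hide the individual-level noise the abstract emphasizes, which is one reason the choice of accuracy measure matters.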
Affiliation(s)
- Joyce Y Wang
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
| | - Neeka Lin
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
| | - Michael Zietz
- Department of Biomedical Informatics, Columbia University, New York, NY
| | - Jason Mares
- Department of Neurology, Columbia University, New York, NY
| | - Vagheesh M Narasimhan
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
- Department of Statistics and Data Science, The University of Texas at Austin, Austin, TX
| | - Paul J Rathouz
- Department of Statistics and Data Science, The University of Texas at Austin, Austin, TX
- Department of Population Health, The University of Texas at Austin, Austin, TX
| | - Arbel Harpak
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
- Department of Population Health, The University of Texas at Austin, Austin, TX
9
Wang J, Zhang Z, Lu Z, Mancuso N, Gazal S. Genes with differential expression across ancestries are enriched in ancestry-specific disease effects likely due to gene-by-environment interactions. Am J Hum Genet 2024:S0002-9297(24)00282-9. [PMID: 39191255 DOI: 10.1016/j.ajhg.2024.07.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 07/26/2024] [Accepted: 07/30/2024] [Indexed: 08/29/2024]
Abstract
Multi-ancestry genome-wide association studies (GWASs) have highlighted the existence of variants with ancestry-specific effect sizes. Understanding where and why these ancestry-specific effects occur is fundamental to understanding the genetic basis of human diseases and complex traits. Here, we characterized genes differentially expressed across ancestries (ancDE genes) at the cell-type level by leveraging single-cell RNA-sequencing data in peripheral blood mononuclear cells for 21 individuals with East Asian (EAS) ancestry and 23 individuals with European (EUR) ancestry (172,385 cells); then, we tested whether variants surrounding those genes were enriched in disease variants with ancestry-specific effect sizes by leveraging ancestry-matched GWASs of 31 diseases and complex traits (average n ∼ 90,000 and ∼ 267,000 in EAS and EUR, respectively). We observed that ancDE genes tended to be cell-type specific and enriched in genes interacting with the environment and in variants with ancestry-specific disease effect sizes, which suggests cell-type-specific, gene-by-environment interactions shared between regulatory and disease architectures. Finally, we illustrated how different environments might have led to ancestry-specific myeloid cell leukemia 1 (MCL1) expression in B cells and ancestry-specific allele effect sizes in lymphocyte count GWASs for variants surrounding MCL1. Our results imply that large single-cell and GWAS datasets from diverse ancestries are required to improve our understanding of human diseases.
Affiliation(s)
- Juehan Wang
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Zixuan Zhang
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Zeyun Lu
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Nicholas Mancuso
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Steven Gazal
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
10
DePaolo J, Zhang DY, Damrauer SM. Leveraging genetic data to improve the care of patients with thoracic aortic dilation. Eur Heart J 2024:ehae499. [PMID: 39150456 DOI: 10.1093/eurheartj/ehae499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Affiliation(s)
- John DePaolo
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - David Y Zhang
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Scott M Damrauer
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Division of Vascular Surgery, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Drive, 14th Floor South Perelman Center, Philadelphia, PA 19104, USA
- Cardiovascular Institute, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
- Corporal Michael Crescenz VA Medical Center, Philadelphia, PA 19104, USA
11
Chen T, Zhang H, Mazumder R, Lin X. Fast and scalable ensemble learning method for versatile polygenic risk prediction. Proc Natl Acad Sci U S A 2024; 121:e2403210121. [PMID: 39110727 PMCID: PMC11331062 DOI: 10.1073/pnas.2403210121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 07/11/2024] [Indexed: 08/21/2024]
Abstract
Polygenic risk scores (PRS) enhance population risk stratification and advance personalized medicine, but existing methods face several limitations, encompassing issues related to computational burden, predictive accuracy, and adaptability to a wide range of genetic architectures. To address these issues, we propose Aggregated L0Learn using Summary-level data (ALL-Sum), a fast and scalable ensemble learning method for computing PRS using summary statistics from genome-wide association studies (GWAS). ALL-Sum leverages an L0L2 penalized regression and ensemble learning across tuning parameters to flexibly model traits with diverse genetic architectures. In extensive large-scale simulations across a wide range of polygenicity and GWAS sample sizes, ALL-Sum consistently outperformed popular alternative methods in terms of prediction accuracy, runtime, and memory usage by 10%, 20-fold, and threefold, respectively, and demonstrated robustness to diverse genetic architectures. We validated the performance of ALL-Sum in real data analysis of 11 complex traits using GWAS summary statistics from nine data sources, including the Global Lipids Genetics Consortium, Breast Cancer Association Consortium, and FinnGen Biobank, with validation in the UK Biobank. Our results show that, on average, ALL-Sum obtained PRS with 25% higher accuracy, with 15 times faster computation and half the memory than the current state-of-the-art methods, and had robust performance across a wide range of traits and diseases. Furthermore, our method demonstrates stable prediction when using linkage disequilibrium computed from different data sources. ALL-Sum is available as a user-friendly R software package with publicly available reference data for streamlined analysis.
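For orientation, an L0L2-penalized least-squares problem is commonly written in the generic form below. The symbols (phenotype vector $y$, genotype matrix $X$, effect sizes $\beta$, tuning parameters $\lambda_0$ and $\lambda_2$) are notation chosen here for illustration. The abstract does not give ALL-Sum's exact summary-statistic formulation, so this should be read as the general shape of the penalty rather than the authors' objective.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \frac{1}{2}\,\lVert y - X\beta \rVert_2^2
  \;+\; \lambda_0\,\lVert \beta \rVert_0
  \;+\; \lambda_2\,\lVert \beta \rVert_2^2
```

Here $\lVert \beta \rVert_0$ counts the nonzero entries of $\beta$ and so controls sparsity, while the squared $\ell_2$ term shrinks the retained effects. Ensembling across values of $(\lambda_0, \lambda_2)$, as the abstract describes, avoids committing to a single sparsity level.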
Affiliation(s)
- Tony Chen
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02215
| | - Haoyu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20814
| | - Rahul Mazumder
- Operations Research and Statistics Group, Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02215
- Department of Statistics, Harvard University, Cambridge, MA 02138
12
Pankratov V, Mezzavilla M, Aneli S, Kuznetsov IA, Fusco D, Wilson JF, Metspalu M, Provero P, Pagani L, Marnetto D. Ancestral genetic components are consistently associated with the complex trait landscape in European biobanks. Eur J Hum Genet 2024:10.1038/s41431-024-01678-9. [PMID: 39127804 DOI: 10.1038/s41431-024-01678-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 07/23/2024] [Accepted: 07/25/2024] [Indexed: 08/12/2024]
Abstract
The genetic structure in Europe was mostly shaped by admixture between the Western Hunter-Gatherers, Early European Farmers and Steppe Bronze Age ancestral components. Such structure is regarded as a confounder in GWAS and follow-up studies, and gold-standard methods exist to correct for it. However, it is still poorly understood to what extent these ancestral components contribute to complex trait variation in present-day Europe. In this work we harness the UK Biobank to address this question. By extensive demographic simulations, exploiting data on siblings and incorporating previous results we obtained from the Estonian Biobank, we carefully evaluate the significance and scope of our findings. Heart rate, platelet count, bone mineral density and many other traits show stratification similar to height and pigmentation traits, likely targets of selection and divergence across ancestral groups. We show that the reported ancestry-trait associations are not driven by environmental confounders by confirming our results when using between-sibling differences in ancestry. The consistency of our results across biobanks further supports this and indicates that these genetic predispositions that derive from post-Neolithic admixture events act as a source of variability and as potential confounders in Europe as a whole.
Affiliation(s)
- Vasili Pankratov
- Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, 51010, Tartu, Estonia
| | | | - Serena Aneli
- Department of Public Health Sciences and Pediatrics, University of Turin, 10126, Turin, Italy
| | - Ivan A Kuznetsov
- Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, 51010, Tartu, Estonia
| | - Daniela Fusco
- Department of Neurosciences, University of Turin, 10126, Turin, Italy
| | - James F Wilson
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, Scotland
- MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, Scotland
- Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, Scotland
- Mait Metspalu
- Institute of Genomics, University of Tartu, 51010, Tartu, Estonia
- Paolo Provero
- Department of Neurosciences, University of Turin, 10126, Turin, Italy
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, 20132, Milan, Italy
- Luca Pagani
- Department of Biology, University of Padua, Padua, Italy
- Institute of Genomics, University of Tartu, 51010, Tartu, Estonia
- Davide Marnetto
- Department of Neurosciences, University of Turin, 10126, Turin, Italy.
13
Tsuo K, Shi Z, Ge T, Mandla R, Hou K, Ding Y, Pasaniuc B, Wang Y, Martin AR. All of Us diversity and scale improve polygenic prediction contextually with greatest improvements for under-represented populations. bioRxiv 2024:2024.08.06.606846. [PMID: 39149254 PMCID: PMC11326295 DOI: 10.1101/2024.08.06.606846] [Indexed: 08/17/2024]
Abstract
Recent studies have demonstrated that polygenic risk scores (PRS) trained on multi-ancestry data can improve prediction accuracy in groups historically underrepresented in genomic studies, but the availability of linked health and genetic data from large-scale diverse cohorts representative of a wide spectrum of human diversity remains limited. To address this need, the All of Us research program (AoU) generated whole-genome sequences of 245,388 individuals who collectively reflect the diversity of the USA. Leveraging this resource and another widely used population-scale biobank, the UK Biobank (UKB) with a half million participants, we developed PRS trained on multi-ancestry and multi-biobank data with up to ~750,000 participants for 32 common, complex traits and diseases across a range of genetic architectures. We then compared effects of ancestry, PRS methodology, and genetic architecture on PRS accuracy across a held-out subset of ancestrally diverse AoU participants. Due to the more heterogeneous study design of AoU, we found lower heritability on average compared to UKB (0.075 vs 0.165), which limited the maximal achievable PRS accuracy in AoU. Overall, we found that the increased diversity of AoU significantly improved PRS performance in some participants in AoU, especially underrepresented individuals, across multiple phenotypes. Notably, maximizing sample size by combining discovery data across AoU and UKB is not the optimal approach for predicting some phenotypes in African ancestry populations; rather, using data from only AoU for these traits resulted in the greatest accuracy. This was especially true for less polygenic traits with large ancestry-enriched effects, such as neutrophil count (R²: 0.055 vs. 0.035 using AoU vs. cross-biobank meta-analysis, respectively, e.g., because of DARC). Lastly, we calculated individual-level PRS accuracies rather than grouping by continental ancestry, a critical step towards interpretability in precision medicine. Individualized PRS accuracy decayed linearly as a function of ancestry divergence, but the slope was smaller using multi-ancestry GWAS compared to using European GWAS. Our results highlight the potential of biobanks with more balanced representations of human diversity to facilitate more accurate PRS for the individuals least represented in genomic studies.
Affiliation(s)
- Kristin Tsuo
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Zhuozheng Shi
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Tian Ge
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Center for Precision Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- Ravi Mandla
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Kangcheng Hou
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Yi Ding
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Bogdan Pasaniuc
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Institute for Precision Health, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Ying Wang
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Alicia R Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
14
Liu Y, Meng XH, Wu C, Su KJ, Liu A, Tian Q, Zhao LJ, Qiu C, Luo Z, Gonzalez-Ramirez MI, Shen H, Xiao HM, Deng HW. Variability in performance of genetic-enhanced DXA-BMD prediction models across diverse ethnic and geographic populations: A risk prediction study. PLoS Med 2024; 21:e1004451. [PMID: 39213443 PMCID: PMC11404845 DOI: 10.1371/journal.pmed.1004451] [Received: 03/26/2024] [Revised: 09/16/2024] [Accepted: 07/23/2024] [Indexed: 09/04/2024] Open
Abstract
BACKGROUND Osteoporosis is a major global health issue, weakening bones and increasing fracture risk. Dual-energy X-ray absorptiometry (DXA) is the standard for measuring bone mineral density (BMD) and diagnosing osteoporosis, but its cost and complexity impede widespread adoption for screening. Predictive modeling using genetic and clinical data offers a cost-effective alternative for assessing osteoporosis and fracture risk. This study aims to develop BMD prediction models using data from the UK Biobank (UKBB) and test their performance across different ethnic and geographical populations. METHODS AND FINDINGS We developed BMD prediction models for the femoral neck (FNK) and lumbar spine (SPN) using both genetic variants and clinical factors (such as sex, age, height, and weight) in 17,964 British white individuals from UKBB. Models were based on regression with the least absolute shrinkage and selection operator (LASSO) and were selected based on the coefficient of determination (R²) in a model selection subset of 5,973 individuals from the British white population. These models were tested on 5 UKBB test sets and 12 independent cohorts of diverse ancestries, totaling over 15,000 individuals. Furthermore, we assessed the correlation of predicted BMDs with 10-year fragility fracture risk in a case-control set of 287,183 European white participants without DXA-BMDs in the UKBB. With single-nucleotide polymorphism (SNP) inclusion thresholds at 5×10⁻⁶ and 5×10⁻⁷, the prediction models for FNK-BMD and SPN-BMD achieved the highest R² of 27.70% (95% confidence interval (CI) [27.56%, 27.84%]) and 48.28% (95% CI [48.23%, 48.34%]), respectively. Adding genetic factors improved predictions slightly, explaining an additional 2.3% of variation for FNK-BMD and 3% for SPN-BMD over clinical factors alone. Survival analysis revealed that the predicted FNK-BMD and SPN-BMD were significantly associated with fragility fracture risk in the European white population (P < 0.001). The hazard ratios (HRs) of the predicted FNK-BMD and SPN-BMD were 0.83 (95% CI [0.79, 0.88], corresponding to a 1.44% difference in 10-year absolute risk) and 0.72 (95% CI [0.68, 0.76], corresponding to a 1.64% difference in 10-year absolute risk), respectively, indicating that for every one-standard-deviation increase in predicted BMD, fracture risk decreases by 17% and 28%, respectively. However, the model's performance declined in other ethnic groups and independent cohorts. The limitations of this study include differences in the distribution of clinical factors and the use of only SNPs as genetic factors. CONCLUSIONS In this study, we observed that combining genetic and clinical factors improves BMD prediction compared to clinical factors alone. Adjusting inclusion thresholds for genetic variants (e.g., 5×10⁻⁶ or 5×10⁻⁷) rather than solely considering genome-wide association study (GWAS)-significant variants can enhance the model's explanatory power. The study highlights the need for training models on diverse populations to improve predictive performance across various ethnic and geographical groups.
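The 17% and 28% figures quoted in this abstract follow from the reported hazard ratios as (1 − HR) × 100. A minimal sketch of that arithmetic, using only the two HRs given in the abstract; the function and variable names are illustrative assumptions, not taken from the study:

```python
def percent_risk_reduction(hazard_ratio: float) -> float:
    """Relative reduction in hazard per unit increase in the predictor, in percent."""
    return (1.0 - hazard_ratio) * 100.0

# Hazard ratios per one-SD increase in predicted BMD, as reported in the abstract.
reported_hrs = {"FNK-BMD": 0.83, "SPN-BMD": 0.72}

# Round to whole percentages, matching the precision used in the abstract.
reductions = {site: round(percent_risk_reduction(hr)) for site, hr in reported_hrs.items()}
print(reductions)
```

Note that this is a relative reduction in hazard; the abstract reports the corresponding absolute 10-year risk differences (1.44% and 1.64%) separately.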
Affiliation(s)
- Yong Liu
- Center for System Biology, Data Sciences, and Reproductive Health, School of Basic Medical Science, Central South University, Changsha, Hunan Province, China
- Xiang-He Meng
- Hunan Provincial Key Laboratory of Regional Hereditary Birth Defects Prevention and Control, Changsha Hospital for Maternal & Child Health Care Affiliated to Hunan Normal University, Changsha, Hunan Province, China
- Chong Wu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
- Kuan-Jui Su
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Anqi Liu
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Qing Tian
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Lan-Juan Zhao
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Chuan Qiu
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Zhe Luo
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Martha I Gonzalez-Ramirez
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Hui Shen
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Hong-Mei Xiao
- Center for System Biology, Data Sciences, and Reproductive Health, School of Basic Medical Science, Central South University, Changsha, Hunan Province, China
- Key Laboratory of Biological, Nanotechnology of National Health Commission, Xiangya Hospital, Central South University, Changsha, Hunan Province, China
- Hong-Wen Deng
- Tulane Center of Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, Louisiana, United States of America
15
Crone B, Boyle AP. Enhancing portability of trans-ancestral polygenic risk scores through tissue-specific functional genomic data integration. PLoS Genet 2024; 20:e1011356. [PMID: 39110742 PMCID: PMC11333000 DOI: 10.1371/journal.pgen.1011356] [Received: 03/05/2024] [Revised: 08/19/2024] [Accepted: 06/27/2024] [Indexed: 08/21/2024] Open
Abstract
Portability of trans-ancestral polygenic risk scores is often confounded by differences in linkage disequilibrium and genetic architecture between ancestries. Recent literature has shown that prioritizing GWAS SNPs with functional genomic evidence over strong association signals can improve model portability. We leveraged three RegulomeDB-derived functional regulatory annotations (SURF, TURF, and TLand) to construct polygenic risk models across a set of quantitative and binary traits, highlighting functional mutations tagged by trait-associated tissue annotations. Tissue-specific prioritization by TURF and TLand provides a significant improvement in model accuracy over standard polygenic risk score (PRS) models across all traits. We developed the Trans-ancestral Iterative Tissue Refinement (TITR) algorithm to construct PRS models that prioritize functional mutations across multiple trait-implicated tissues. TITR-constructed PRS models show increased predictive accuracy over single tissue prioritization. This indicates our TITR approach captures a more comprehensive view of regulatory systems across implicated tissues that contribute to variance in trait expression.
Affiliation(s)
- Bradley Crone
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Alan P. Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
16
Cho H, Froelicher D, Dokmai N, Nandi A, Sadhuka S, Hong MM, Berger B. Privacy-Enhancing Technologies in Biomedical Data Science. Annu Rev Biomed Data Sci 2024; 7:317-343. [PMID: 39178425 PMCID: PMC11346580 DOI: 10.1146/annurev-biodatasci-120423-120107] [Indexed: 08/25/2024]
Abstract
The rapidly growing scale and variety of biomedical data repositories raise important privacy concerns. Conventional frameworks for collecting and sharing human subject data offer limited privacy protection, often necessitating the creation of data silos. Privacy-enhancing technologies (PETs) promise to safeguard these data and broaden their usage by providing means to share and analyze sensitive data while protecting privacy. Here, we review prominent PETs and illustrate their role in advancing biomedicine. We describe key use cases of PETs and their latest technical advances and highlight recent applications of PETs in a range of biomedical domains. We conclude by discussing outstanding challenges and social considerations that need to be addressed to facilitate a broader adoption of PETs in biomedical data science.
Affiliation(s)
- Hyunghoon Cho
- Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
- David Froelicher
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Natnatee Dokmai
- Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
- Anupama Nandi
- Department of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
- Shuvom Sadhuka
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Matthew M Hong
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
17
Nagpal S, Gibson G. Dual exposure-by-polygenic score interactions highlight disparities across social groups in the proportion needed to benefit. medRxiv 2024:2024.07.29.24311065. [PMID: 39132477 PMCID: PMC11312673 DOI: 10.1101/2024.07.29.24311065] [Indexed: 08/13/2024]
Abstract
The transferability of polygenic scores across population groups is a major concern with respect to the equitable clinical implementation of genomic medicine. Since genetic associations are identified relative to the population mean, inevitably differences in disease or trait prevalence among social strata influence the relationship between PGS and risk. Here we quantify the magnitude of PGS-by-Exposure (PGS×E) interactions for seven human diseases (coronary artery disease, type 2 diabetes, obesity thresholded to body mass index and to waist-to-hip ratio, inflammatory bowel disease, chronic kidney disease, and asthma) and pairs of 75 exposures in the White-British subset of the UK Biobank study (n=408,801). Across 24,198 PGS×E models, 746 (3.1%) were significant by two criteria, at least three-fold more than expected by chance under each criterion. Predictive accuracy is significantly improved in the high-risk exposures and by including interaction terms with effects as large as those documented for low transferability of PGS across ancestries. The predominant mechanism for PGS×E interactions is shown to be amplification of genetic effects in the presence of adverse exposures such as low polyunsaturated fatty acids, mediators of obesity, and social determinants of ill health. We introduce the notion of the proportion needed to benefit (PNB), which is the cumulative number needed to treat across the range of the PGS, and show that typically this is halved in the 70th to 80th percentile. These findings emphasize how individuals experiencing adverse exposures stand to preferentially benefit from interventions that may reduce risk, and highlight the need for more comprehensive sampling across socioeconomic groups in the performance of genome-wide association studies.
Affiliation(s)
- Sini Nagpal
- Center for Integrative Genomics and School of Biological Sciences, Georgia Institute of Technology Atlanta, GA 30302
- Greg Gibson
- Center for Integrative Genomics and School of Biological Sciences, Georgia Institute of Technology Atlanta, GA 30302
18
Abramowitz SA, Boulier K, Keat K, Cardone KM, Shivakumar M, DePaolo J, Judy R, Kim D, Rader DJ, Ritchie, Voight BF, Pasaniuc B, Levin MG, Damrauer SM. Population Performance and Individual Agreement of Coronary Artery Disease Polygenic Risk Scores. medRxiv 2024:2024.07.25.24310931. [PMID: 39108513 PMCID: PMC11302700 DOI: 10.1101/2024.07.25.24310931] [Indexed: 08/12/2024]
Abstract
Importance: Polygenic risk scores (PRSs) for coronary artery disease (CAD) are a growing clinical and commercial reality. Whether existing scores provide similar individual-level assessments of disease liability is a critical consideration for clinical implementation that remains uncharacterized. Objective: To characterize how reliably CAD PRSs that perform equivalently at the population level predict individual-level risk. Design: Cross-sectional study. Setting: All of Us Research Program (AOU), Penn Medicine Biobank (PMBB), and UCLA ATLAS Precision Health Biobank. Participants: Volunteers of diverse genetic backgrounds enrolled in AOU, PMBB, and UCLA with available electronic health record and genotyping data. Exposures: Polygenic risk for CAD from previously published PRSs and new PRSs developed separately from the testing cohorts. Main Outcomes and Measures: Sets of CAD PRSs that perform population prediction equivalently were identified by comparing calibration and discrimination (Brier score and AUROC) of generalized linear models of prevalent CAD using Bayesian analysis of variance. Among equivalently performing scores, individual-level agreement between risk estimates was tested with intraclass correlation (ICC) and Light's Kappa, measures of inter-rater reliability. Results: 50 PRSs were calculated for 171,095 AOU participants. When included in a model of prevalent CAD, 48 scores had practically equivalent Brier scores and AUROCs (region of practical equivalence = 0.02). Across these scores, 84% of participants had at least one score in both the top and bottom risk quintile. Continuous agreement of individual risk predictions from the 48 scores was poor, with an ICC of 0.351 (95% CI: 0.349, 0.352). Agreement between two statistically equivalent scores was moderate, with an ICC of 0.649 (95% CI: 0.646, 0.652). Light's Kappa, used to evaluate consistency of assignment to high-risk thresholds, did not exceed 0.56 (interpreted as 'fair') across statistically and practically equivalent scores. Repeating the analysis among 41,193 PMBB and 50,748 UCLA participants yielded different sets of statistically and practically equivalent scores, which also lacked strong individual agreement. Conclusions and Relevance: Across three diverse biobanks, CAD PRSs that performed equivalently at the population level produced unreliable individual risk estimates. Approaches to clinical implementation of CAD PRSs must consider the potential for discordant individual risk estimates from otherwise indistinguishable scores.
Affiliation(s)
- Sarah A. Abramowitz
- Department of Surgery, University of Pennsylvania Perelman School of Medicine
- Donald and Barbara Zucker School of Medicine at Hofstra/Northwell
- Kristin Boulier
- Department of Computational Medicine, University of California, Los Angeles
- Karl Keat
- Department of Genetics, University of Pennsylvania Perelman School of Medicine
- Katie M. Cardone
- Department of Genetics, University of Pennsylvania Perelman School of Medicine
- Manu Shivakumar
- Department of Genetics, University of Pennsylvania Perelman School of Medicine
- John DePaolo
- Department of Surgery, University of Pennsylvania Perelman School of Medicine
- Renae Judy
- Department of Surgery, University of Pennsylvania Perelman School of Medicine
- Dokyoon Kim
- Institute of Biomedical Informatics, University of Pennsylvania
- Daniel J. Rader
- Department of Genetics, University of Pennsylvania Perelman School of Medicine
- Ritchie
- Department of Genetics, University of Pennsylvania Perelman School of Medicine
- Benjamin F. Voight
- Department of Genetics, University of Pennsylvania Perelman School of Medicine
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania Perelman School of Medicine
- Institute of Translational Medicine and Therapeutics, University of Pennsylvania Perelman School of Medicine
- Bogdan Pasaniuc
- Department of Computational Medicine, University of California, Los Angeles
- Michael G. Levin
- Cardiovascular Institute, University of Pennsylvania Perelman School of Medicine
- Corporal Michael J. Crescenz VA Medical Center
- Division of Cardiovascular Medicine, University of Pennsylvania Perelman School of Medicine
- Scott M. Damrauer
- Department of Surgery, University of Pennsylvania Perelman School of Medicine
- Department of Genetics, University of Pennsylvania Perelman School of Medicine
- Cardiovascular Institute, University of Pennsylvania Perelman School of Medicine
- Corporal Michael J. Crescenz VA Medical Center
19
Habtewold TD, Wijesiriwardhana P, Biedrzycki RJ, Tekola-Ayele F. Genetic distance and ancestry proportion modify the association between maternal genetic risk score of type 2 diabetes and fetal growth. Hum Genomics 2024; 18:81. [PMID: 39030631 PMCID: PMC11264503 DOI: 10.1186/s40246-024-00645-1] [Received: 02/12/2024] [Accepted: 06/27/2024] [Indexed: 07/21/2024] Open
Abstract
BACKGROUND Maternal genetic risk of type 2 diabetes (T2D) has been associated with fetal growth, but the influence of genetic ancestry is not yet fully understood. We aimed to investigate the influence of genetic distance (GD) and genetic ancestry proportion (GAP) on the association of maternal genetic risk score of T2D (GRST2D) with fetal weight and birthweight. METHODS Multi-ancestral pregnant women (n = 1,837) from the NICHD Fetal Growth Studies - Singletons cohort were included in the current analyses. Fetal weight (in grams, g) was estimated from ultrasound measurements of fetal biometry, and birthweight (g) was measured at delivery. GRST2D was calculated using T2D-associated variants identified in the latest trans-ancestral genome-wide association study and was categorized into quartiles. GD and GAP were estimated using genotype data of four reference populations. GD was categorized into closest, middle, and farthest tertiles, and GAP was categorized as highest, medium, and lowest. Linear regression analyses were performed to test the association of GRST2D with fetal weight and birthweight, adjusted for covariates, in each GD and GAP category. RESULTS Among women with the closest GD from African and Amerindigenous ancestries, the fourth and third GRST2D quartile was significantly associated with 5.18 to 7.48 g (weeks 17-20) and 6.83 to 25.44 g (weeks 19-27) larger fetal weight compared to the first quartile, respectively. Among women with middle GD from European ancestry, the fourth GRST2D quartile was significantly associated with 5.73 to 21.21 g (weeks 18-26) larger fetal weight. Furthermore, among women with middle GD from European and African ancestries, the fourth and second GRST2D quartiles were significantly associated with 117.04 g (95% CI = 23.88-210.20, p = 0.014) and 95.05 g (95% CI = 4.73-185.36, p = 0.039) larger birthweight compared to the first quartile, respectively. 
The absence of significant association among women with the closest GD from East Asian ancestry was complemented by a positive significant association among women with the highest East Asian GAP. CONCLUSIONS The association between maternal GRST2D and fetal growth began in early-second trimester and was influenced by GD and GAP. The results suggest the use of genetic GD and GAP could improve the generalizability of GRS.
Affiliation(s)
- Tesfa Dejenie Habtewold
- Epidemiology Branch, Division of Population Health Research, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, 6710B Rockledge Drive, Bethesda, MD, 20892-7004, USA
- Prabhavi Wijesiriwardhana
- Epidemiology Branch, Division of Population Health Research, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, 6710B Rockledge Drive, Bethesda, MD, 20892-7004, USA
- Richard J Biedrzycki
- Glotech, Inc., contractor for Division of Population Health Research, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, 6710B Rockledge Drive, Bethesda, MD, 20892-7004, USA
- Fasil Tekola-Ayele
- Epidemiology Branch, Division of Population Health Research, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, 6710B Rockledge Drive, Bethesda, MD, 20892-7004, USA.
20
Kanjira SC, Adams MJ, Jiang Y, Tian C, Lewis CM, Kuchenbaecker K, McIntosh AM. Polygenic prediction of major depressive disorder and related traits in African ancestries UK Biobank participants. Mol Psychiatry 2024:10.1038/s41380-024-02662-x. [PMID: 39014000 DOI: 10.1038/s41380-024-02662-x] [Received: 12/16/2023] [Revised: 06/27/2024] [Accepted: 07/03/2024] [Indexed: 07/18/2024]
Abstract
Genome-wide association studies (GWAS) over-represent European ancestries, neglecting all other ancestry groups and low-income nations. Consequently, polygenic risk scores (PRS) predict complex traits more accurately in European than in African Ancestries groups. Very few studies have looked at the transferability of European-derived PRS for behavioural and mental health phenotypes to Africans. We assessed the comparative accuracy of depression PRS trained on European and African Ancestries GWAS to predict major depressive disorder (MDD) and related traits in African ancestry participants from the UK Biobank. UK Biobank participants were selected based on principal component analysis clustering with an African genetic similarity reference population. MDD was assessed with the Composite International Diagnostic Interview (CIDI). PRS were computed with PRSice2 software using either European or African Ancestries GWAS summary statistics. PRS trained on European ancestry samples (246,363 cases) predicted case-control status in Africans of the UK Biobank with similar accuracy (R² = 2%, β = 0.32, empirical p-value = 0.002) to PRS trained on much smaller samples of African Ancestries participants from 23andMe, Inc. (5045 cases, R² = 1.8%, β = 0.28, empirical p-value = 0.008). This suggests that prediction of MDD status from Africans to Africans had greater efficiency relative to discovery sample size than prediction of MDD from Europeans to Africans. Prediction of MDD status in African UK Biobank participants using GWAS findings of likely causal risk factors from European ancestries was non-significant. GWAS of MDD in European ancestries are inefficient for improving polygenic prediction in African samples; MDD studies in Africa are urgently needed.
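One informal way to read "efficiency relative to discovery sample size" is variance explained per discovery case. The sketch below computes that ratio from the R² values and case counts quoted in this abstract. The per-case measure is an assumption made here purely for illustration; it is not a metric the authors report, and the names are hypothetical.

```python
# Figures quoted in the abstract: variance explained (R²) and number of discovery cases.
european = {"r2": 0.020, "cases": 246_363}
african = {"r2": 0.018, "cases": 5_045}

def r2_per_case(gwas: dict) -> float:
    """Illustrative (assumed) efficiency measure: R² divided by discovery case count."""
    return gwas["r2"] / gwas["cases"]

# How many times more R² each discovery case contributes in the African Ancestries GWAS.
ratio = r2_per_case(african) / r2_per_case(european)
print(f"African Ancestries GWAS: about {ratio:.0f}x more R² per discovery case")
```

This ignores control counts, phenotype definitions, and the non-linear relationship between sample size and PRS accuracy, so it is only a rough reading of the comparison.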
Affiliation(s)
- S C Kanjira
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
- Malawi Epidemiology and Intervention Research Unit, Lilongwe, Malawi
- M J Adams
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK
- Y Jiang
- 23andMe Inc, Sunnyvale, CA, USA
- C Tian
- 23andMe Inc, Sunnyvale, CA, USA
- C M Lewis
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- K Kuchenbaecker
- UCL Genetics Institute, University College London, London, UK
- A M McIntosh
- Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK.
- Centre for Genomic and Experimental Medicine, University of Edinburgh, Edinburgh, UK.
21
Tubbs JD, Chen Y, Duan R, Huang H, Ge T. Real-time dynamic polygenic prediction for streaming data. medRxiv 2024:2024.07.12.24310357. [PMID: 39040195 PMCID: PMC11261927 DOI: 10.1101/2024.07.12.24310357] [Indexed: 07/24/2024]
Abstract
Polygenic risk scores (PRSs) are promising tools for advancing precision medicine. However, existing PRS construction methods rely on static summary statistics derived from genome-wide association studies (GWASs), which are often updated at lengthy intervals. As genetic data and health outcomes are continuously being generated at an ever-increasing pace, the current PRS training and deployment paradigm is suboptimal in maximizing the prediction accuracy of PRSs for incoming patients in healthcare settings. Here, we introduce real-time PRS-CS (rtPRS-CS), which enables online, dynamic refinement and calibration of PRS as each new sample is collected, without the need to perform intermediate GWASs. Through extensive simulation studies, we evaluate the performance of rtPRS-CS across various genetic architectures and training sample sizes. Leveraging quantitative traits from the Mass General Brigham Biobank and UK Biobank, we show that rtPRS-CS can integrate massive streaming data to enhance PRS prediction over time. We further apply rtPRS-CS to 22 schizophrenia cohorts in 7 Asian regions, demonstrating the clinical utility of rtPRS-CS in dynamically predicting and stratifying disease risk across diverse genetic ancestries.
Affiliation(s)
- Justin D. Tubbs
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
- Yu Chen
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA
- Department of Medicine, Massachusetts General Hospital, Boston, MA
- Rui Duan
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Hailiang Huang
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA
- Department of Medicine, Massachusetts General Hospital, Boston, MA
- Tian Ge
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA
22
Monti R, Eick L, Hudjashov G, Läll K, Kanoni S, Wolford BN, Wingfield B, Pain O, Wharrie S, Jermy B, McMahon A, Hartonen T, Heyne H, Mars N, Lambert S, Hveem K, Inouye M, van Heel DA, Mägi R, Marttinen P, Ripatti S, Ganna A, Lippert C. Evaluation of polygenic scoring methods in five biobanks shows larger variation between biobanks than methods and finds benefits of ensemble learning. Am J Hum Genet 2024; 111:1431-1447. [PMID: 38908374 PMCID: PMC11267524 DOI: 10.1016/j.ajhg.2024.06.003] [Received: 11/20/2023] [Revised: 05/31/2024] [Accepted: 06/05/2024] [Indexed: 06/24/2024]
Abstract
Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online-results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.
Affiliation(s)
- Remo Monti
- Hasso Plattner Institute, University of Potsdam, Digital Engineering Faculty, Potsdam, Germany; Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association, Berlin Institute for Medical Systems Biology, Berlin, Germany
- Lisa Eick
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Georgi Hudjashov
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Kristi Läll
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Stavroula Kanoni
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Brooke N Wolford
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway
- Benjamin Wingfield
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Oliver Pain
- Maurice Wohl Clinical Neuroscience Institute, Department of Basic and Clinical Neuroscience; Institute of Psychiatry, Psychology and Neuroscience; King's College London, London, UK
- Sophie Wharrie
- Aalto University, Department of Computer Science, Espoo, Finland
- Bradley Jermy
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Aoife McMahon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Tuomo Hartonen
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Henrike Heyne
- Hasso Plattner Institute, University of Potsdam, Digital Engineering Faculty, Potsdam, Germany
- Nina Mars
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Samuel Lambert
- Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway; Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia; British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; British Heart Foundation Cambridge Centre of Research Excellence, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Kristian Hveem
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Faculty of Medicine and Health, Norwegian University of Science and Technology, Trondheim, Norway; Levanger Hospital, Nord-Trøndelag Hospital Trust, Levanger, Norway
- Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia; British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK; Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK; British Heart Foundation Cambridge Centre of Research Excellence, School of Clinical Medicine, University of Cambridge, Cambridge, UK; Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- Reedik Mägi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Pekka Marttinen
- Aalto University, Department of Computer Science, Espoo, Finland
- Samuli Ripatti
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland; Department of Public Health, University of Helsinki, Helsinki, Finland
- Andrea Ganna
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland; Massachusetts General Hospital and Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Christoph Lippert
- Hasso Plattner Institute, University of Potsdam, Digital Engineering Faculty, Potsdam, Germany; Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
23
Hou K, Xu Z, Ding Y, Mandla R, Shi Z, Boulier K, Harpak A, Pasaniuc B. Calibrated prediction intervals for polygenic scores across diverse contexts. Nat Genet 2024; 56:1386-1396. [PMID: 38886587 DOI: 10.1038/s41588-024-01792-w] [Received: 08/03/2023] [Accepted: 05/08/2024] [Indexed: 06/20/2024]
Abstract
Polygenic scores (PGS) have emerged as the tool of choice for genomic prediction in a wide range of fields. We show that PGS performance varies broadly across contexts and biobanks. Contexts such as age, sex and income can impact PGS accuracy with similar magnitudes as genetic ancestry. Here we introduce an approach (CalPred) that models all contexts jointly to produce prediction intervals that vary across contexts to achieve calibration (include the trait with 90% probability), whereas existing methods are miscalibrated. In analyses of 72 traits across large and diverse biobanks (All of Us and UK Biobank), we find that prediction intervals required adjustment by up to 80% for quantitative traits. For disease traits, PGS-based predictions were miscalibrated across socioeconomic contexts such as annual household income levels, further highlighting the need of accounting for context information in PGS-based prediction across diverse populations.
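The calibration criterion in this abstract (a 90% prediction interval should contain the observed trait about 90% of the time, in every context) can be checked empirically. The following is a minimal sketch in plain Python, not the CalPred method itself; the function name `coverage_by_context`, the record layout, and the toy numbers are illustrative assumptions.

```python
from collections import defaultdict


def coverage_by_context(records):
    """Return the fraction of observations inside their interval, per context.

    `records` is an iterable of (context, observed, lower, upper) tuples.
    This layout is a hypothetical convenience, not a format from the paper.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for context, observed, lower, upper in records:
        totals[context] += 1
        if lower <= observed <= upper:
            hits[context] += 1
    return {context: hits[context] / totals[context] for context in totals}


# Made-up toy data: the same interval [0, 2] is used in both contexts.
toy_records = [
    ("context_a", 1.0, 0.0, 2.0),
    ("context_a", 1.5, 0.0, 2.0),
    ("context_a", 0.5, 0.0, 2.0),
    ("context_a", 1.9, 0.0, 2.0),
    ("context_b", 1.0, 0.0, 2.0),
    ("context_b", 3.0, 0.0, 2.0),   # falls outside the interval
    ("context_b", -1.0, 0.0, 2.0),  # falls outside the interval
    ("context_b", 0.2, 0.0, 2.0),
]

coverage = coverage_by_context(toy_records)
print(coverage)
```

In this toy case one fixed interval width over-covers the first context and under-covers the second, which is the kind of context-dependent miscalibration the abstract describes.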
Affiliation(s)
- Kangcheng Hou
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA.
- Ziqi Xu
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
- Yi Ding
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Ravi Mandla
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Zhuozheng Shi
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Kristin Boulier
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Arbel Harpak
- Department of Population Health, The University of Texas at Austin, Austin, TX, USA
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
- Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA.
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.
- Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.
- Institute for Precision Health, University of California Los Angeles, Los Angeles, CA, USA.
24
Vabalas A, Hartonen T, Vartiainen P, Jukarainen S, Viippola E, Rodosthenous RS, Liu A, Hägg S, Perola M, Ganna A. Deep learning-based prediction of one-year mortality in Finland is an accurate but unfair aging marker. Nat Aging 2024; 4:1014-1027. [PMID: 38914859 PMCID: PMC11257968 DOI: 10.1038/s43587-024-00657-5] [Received: 07/24/2023] [Accepted: 05/27/2024] [Indexed: 06/26/2024]
Abstract
Short-term mortality risk, which is indicative of individual frailty, serves as a marker for aging. Previous age clocks focused on predicting either chronological age or longer-term mortality. Aging clocks predicting short-term mortality are lacking and their algorithmic fairness remains unexamined. We developed a deep learning model to predict 1-year mortality using nationwide longitudinal data from the Finnish population (FinRegistry; n = 5.4 million), incorporating more than 8,000 features spanning up to 50 years. We achieved an area under the curve (AUC) of 0.944, outperforming a baseline model that included only age and sex (AUC = 0.897). The model generalized well to different causes of death (AUC > 0.800 for 45 of 50 causes), including coronavirus disease 2019, which was absent in the training data. Performance varied among demographics, with young females exhibiting the best and older males the worst results. Extensive prediction fairness analyses highlighted disparities among disadvantaged groups, posing challenges to equitable integration into public health interventions. Our model accurately identified short-term mortality risk, potentially serving as a population-wide aging marker.
Affiliation(s)
- Andrius Vabalas
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Tuomo Hartonen
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Pekka Vartiainen
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Pediatric Research Center, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
- Sakari Jukarainen
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Essi Viippola
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Aoxing Liu
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Sara Hägg
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Markus Perola
- The Finnish Institute for Health and Welfare, Helsinki, Finland
- Andrea Ganna
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
25
Patel RA, Weiß CL, Zhu H, Mostafavi H, Simons YB, Spence JP, Pritchard JK. Conditional frequency spectra as a tool for studying selection on complex traits in biobanks. bioRxiv 2024:2024.06.15.599126. [PMID: 38948697 PMCID: PMC11212903 DOI: 10.1101/2024.06.15.599126] [Indexed: 07/02/2024]
Abstract
Natural selection on complex traits is difficult to study in part due to the ascertainment inherent to genome-wide association studies (GWAS). The power to detect a trait-associated variant in GWAS is a function of frequency and effect size - but for traits under selection, the effect size of a variant determines the strength of selection against it, constraining its frequency. To account for GWAS ascertainment, we propose studying the joint distribution of allele frequencies across populations, conditional on the frequencies in the GWAS cohort. Before considering these conditional frequency spectra, we first characterized the impact of selection and non-equilibrium demography on allele frequency dynamics forwards and backwards in time. We then used these results to understand conditional frequency spectra under realistic human demography. Finally, we investigated empirical conditional frequency spectra for GWAS variants associated with 106 complex traits, finding compelling evidence for either stabilizing or purifying selection. Our results provide insight into polygenic score portability and other properties of variants ascertained with GWAS, highlighting the utility of conditional frequency spectra.
Affiliation(s)
- Roshni A. Patel
- Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Clemens L. Weiß
- Stanford Cancer Institute Core, Stanford University School of Medicine, Stanford, CA
- Huisheng Zhu
- Department of Biology, Stanford University, Stanford, CA
- Hakhamanesh Mostafavi
- Center for Human Genetics and Genomics, New York University School of Medicine, New York, NY
- Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, NY
- Jeffrey P. Spence
- Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Jonathan K. Pritchard
- Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Department of Biology, Stanford University, Stanford, CA
26
Lewis ACF, Chisholm RL, Connolly JJ, Esplin ED, Glessner J, Gordon A, Green RC, Hakonarson H, Harr M, Holm IA, Jarvik GP, Karlson E, Kenny EE, Kottyan L, Lennon N, Linder JE, Luo Y, Martin LJ, Perez E, Puckelwartz MJ, Rasmussen-Torvik LJ, Sabatello M, Sharp RR, Smoller JW, Sterling R, Terek S, Wei WQ, Fullerton SM. Managing differential performance of polygenic risk scores across groups: Real-world experience of the eMERGE Network. Am J Hum Genet 2024; 111:999-1005. [PMID: 38688278 PMCID: PMC11179244 DOI: 10.1016/j.ajhg.2024.04.005] [Received: 02/27/2024] [Revised: 04/10/2024] [Accepted: 04/11/2024] [Indexed: 05/02/2024]
Abstract
The differential performance of polygenic risk scores (PRSs) by group is one of the major ethical barriers to their clinical use. It is also one of the main practical challenges for any implementation effort. The social repercussions of how people are grouped in PRS research must be considered in communications with research participants, including return of results. Here, we outline the decisions faced and choices made by a large multi-site clinical implementation study returning PRSs to diverse participants in handling this issue of differential performance. Our approach to managing the complexities associated with the differential performance of PRSs serves as a case study that can help future implementers of PRSs to plot an anticipatory course in response to this issue.
Affiliation(s)
- Anna C F Lewis
- Edmond and Lily Safra Center for Ethics, Harvard University, Cambridge, MA, USA; Department of Genetics, Brigham and Women's Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA.
- Rex L Chisholm
- Center for Genetic Medicine, Northwestern University, Evanston, IL, USA
- John J Connolly
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Joe Glessner
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Adam Gordon
- Center for Genetic Medicine, Northwestern University, Evanston, IL, USA; Department of Pharmacology, Northwestern University, Evanston, IL, USA
- Robert C Green
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Ariadne Labs, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Hakon Hakonarson
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Division of Pulmonary Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Margaret Harr
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Ingrid A Holm
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Gail P Jarvik
- Division of Medical Genetics, Department of Medicine and Department of Genome Science, University of Washington Medical Center, Seattle, WA, USA
- Elizabeth Karlson
- Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA; Mass General Brigham Personalized Medicine, Boston, MA, USA
- Eimear E Kenny
- Institute for Genomic Health, Icahn School of Medicine, New York City, NY, USA; Center for Clinical Translational Genomics, Icahn School of Medicine, New York City, NY, USA; Division of Genomic Medicine, Department of Medicine, Icahn School of Medicine, New York City, NY, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York City, NY, USA
- Leah Kottyan
- Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Niall Lennon
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jodell E Linder
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
- Yuan Luo
- Department of Preventive Medicine, Northwestern University, Evanston, IL, USA
- Lisa J Martin
- Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA; University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Emma Perez
- Mass General Brigham Personalized Medicine, Boston, MA, USA
- Megan J Puckelwartz
- Center for Genetic Medicine, Northwestern University, Evanston, IL, USA; Department of Pharmacology, Northwestern University, Evanston, IL, USA
- Laura J Rasmussen-Torvik
- Center for Genetic Medicine, Northwestern University, Evanston, IL, USA; Department of Preventive Medicine, Northwestern University, Evanston, IL, USA
- Maya Sabatello
- Center for Precision Medicine and Genomics, Department of Medicine, Columbia University Irving Medical Center, New York City, NY, USA; Division of Ethics, Department of Medical Humanities and Ethics, Columbia University Irving Medical Center, New York City, NY, USA
- Jordan W Smoller
- Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA; Psychiatric & Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA, USA
- Rene Sterling
- Division of Genomics and Society, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Shannon Terek
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Stephanie M Fullerton
- Department of Bioethics & Humanities, University of Washington School of Medicine, Seattle, WA, USA
27
Boye C, Nirmalan S, Ranjbaran A, Luca F. Genotype × environment interactions in gene regulation and complex traits. Nat Genet 2024; 56:1057-1068. [PMID: 38858456 DOI: 10.1038/s41588-024-01776-w] [Received: 06/13/2023] [Accepted: 04/25/2024] [Indexed: 06/12/2024]
Abstract
Genotype × environment interactions (GxE) have long been recognized as a key mechanism underlying human phenotypic variation. Technological developments over the past 15 years have dramatically expanded our appreciation of the role of GxE in both gene regulation and complex traits. The richness and complexity of these datasets also required parallel efforts to develop robust and sensitive statistical and computational approaches. Although our understanding of the genetic architecture of molecular and complex traits has been maturing, a large proportion of complex trait heritability remains unexplained. Furthermore, there are increasing efforts to characterize the effect of environmental exposure on human health. We therefore review GxE in human gene regulation and complex traits, advocating for a comprehensive approach that jointly considers genetic and environmental factors in human health and disease.
Affiliation(s)
- Carly Boye
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US
- Shreya Nirmalan
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US
- Ali Ranjbaran
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US
- Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, US.
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, US.
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy.
28
Eastwood SV, Hemani G, Watkins SH, Scally A, Davey Smith G, Chaturvedi N. Ancestry, ethnicity, and race: explaining inequalities in cardiometabolic disease. Trends Mol Med 2024; 30:541-551. [PMID: 38677980 DOI: 10.1016/j.molmed.2024.04.002] [Received: 01/04/2024] [Revised: 03/30/2024] [Accepted: 04/03/2024] [Indexed: 04/29/2024]
Abstract
Population differences in cardiometabolic disease remain unexplained. Misleading assumptions over genetic explanations are partly due to terminology used to distinguish populations, specifically ancestry, race, and ethnicity. These terms differentially implicate environmental and biological causal pathways, which should inform their use. Genetic variation alone accounts for a limited fraction of population differences in cardiometabolic disease. Research effort should focus on societally driven, lifelong environmental determinants of population differences in disease. Rather than pursuing population stratifiers to personalize medicine, we advocate removing socioeconomic barriers to receipt of and adherence to healthcare interventions, which will have markedly greater impact on improving cardiometabolic outcomes. This requires multidisciplinary collaboration and public and policymaker engagement to address inequalities driven by society rather than biology per se.
Affiliation(s)
- Sophie V Eastwood
- MRC Unit for Lifelong Health and Ageing at UCL Population Sciences and Experimental Medicine, Institute of Cardiovascular Sciences Faculty of Population Health Sciences, University College London, London, UK
- Gibran Hemani
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Sarah H Watkins
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Aylwyn Scally
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- George Davey Smith
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Nishi Chaturvedi
- MRC Unit for Lifelong Health and Ageing at UCL Population Sciences and Experimental Medicine, Institute of Cardiovascular Sciences Faculty of Population Health Sciences, University College London, London, UK.
29
Hong SC, Muyas F, Cortés-Ciriano I, Hormoz S. scAI-SNP: a method for inferring ancestry from single-cell data. bioRxiv 2024:2024.05.14.594208. [PMID: 38798590 PMCID: PMC11118306 DOI: 10.1101/2024.05.14.594208] [Indexed: 05/29/2024]
Abstract
Collaborative efforts, such as the Human Cell Atlas, are rapidly accumulating large amounts of single-cell data. To ensure that single-cell atlases are representative of human genetic diversity, we need to determine the ancestry of the donors from whom single-cell data are generated. Self-reporting of race and ethnicity, although important, can be biased and is not always available for the datasets already collected. Here, we introduce scAI-SNP, a tool to infer ancestry directly from single-cell genomics data. To train scAI-SNP, we identified 4.5 million ancestry-informative single-nucleotide polymorphisms (SNPs) in the 1000 Genomes Project dataset across 3201 individuals from 26 population groups. For a query single-cell data set, scAI-SNP uses these ancestry-informative SNPs to compute the contribution of each of the 26 population groups to the ancestry of the donor from whom the cells were obtained. Using diverse single-cell data sets with matched whole-genome sequencing data, we show that scAI-SNP is robust to the sparsity of single-cell data, can accurately and consistently infer ancestry from samples derived from diverse types of tissues and cancer cells, and can be applied to different modalities of single-cell profiling assays, such as single-cell RNA-seq and single-cell ATAC-seq. Finally, we argue that ensuring that single-cell atlases represent diverse ancestry, ideally alongside race and ethnicity, is ultimately important for improved and equitable health outcomes by accounting for human diversity.
Affiliation(s)
- Sung Chul Hong
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Francesc Muyas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - Isidro Cortés-Ciriano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
| | - Sahand Hormoz
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
30
|
Staerk C, Klinkhammer H, Wistuba T, Maj C, Mayr A. Generalizability of polygenic prediction models: how is the R² defined on test data? BMC Med Genomics 2024; 17:132. [PMID: 38755654 PMCID: PMC11100126 DOI: 10.1186/s12920-024-01905-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2023] [Accepted: 05/08/2024] [Indexed: 05/18/2024] Open
Abstract
BACKGROUND Polygenic risk scores (PRS) quantify an individual's genetic predisposition for different traits and are expected to play an increasingly important role in personalized medicine. A crucial challenge in clinical practice is the generalizability and transferability of PRS models to populations with different ancestries. When assessing the generalizability of PRS models for continuous traits, the R² is a commonly used measure to evaluate prediction accuracy. While the R² is a well-defined goodness-of-fit measure for statistical linear models, there exist different definitions for its application on test data, which complicates interpretation and comparison of results. METHODS Based on large-scale genotype data from the UK Biobank, we compare three definitions of the R² on test data for evaluating the generalizability of PRS models to different populations. Polygenic models for several phenotypes, including height, BMI and lipoprotein A, are derived based on training data with European ancestry using state-of-the-art regression methods and are evaluated on various test populations with different ancestries. RESULTS Our analysis shows that the choice of the R² definition can lead to considerably different results on test data, making the comparison of R² values from the literature problematic. While the definition as the squared correlation between predicted and observed phenotypes solely addresses the discriminative performance and always yields values between 0 and 1, definitions of the R² based on the mean squared prediction error (MSPE) with reference to intercept-only models assess both discrimination and calibration. These MSPE-based definitions can yield negative values indicating miscalibrated predictions for out-of-target populations. We argue that the choice of the most appropriate definition depends on the aim of PRS analysis - whether it primarily serves for risk stratification or also for individual phenotype prediction. Moreover, both correlation-based and MSPE-based definitions of R² can provide valuable complementary information. CONCLUSIONS Awareness of the different definitions of the R² on test data is necessary to facilitate the reporting and interpretation of results on PRS generalizability. It is recommended to explicitly state which definition was used when reporting R² values on test data. Further research is warranted to develop and evaluate well-calibrated polygenic models for diverse populations.
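The contrast between the correlation-based and MSPE-based definitions can be illustrated with a small sketch. This is not the authors' code: the function names, the toy phenotype values, and the choice of reference mean for the intercept-only model are assumptions made for illustration (the abstract does not say which mean each MSPE-based variant uses). The predictions are deliberately shifted by a constant so that ranking is perfect but calibration is poor.

```python
from statistics import mean

def r2_squared_correlation(y_true, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    my, mp = mean(y_true), mean(y_pred)
    cov = sum((a - my) * (b - mp) for a, b in zip(y_true, y_pred))
    var_y = sum((a - my) ** 2 for a in y_true)
    var_p = sum((b - mp) ** 2 for b in y_pred)
    return cov ** 2 / (var_y * var_p)

def r2_mspe(y_true, y_pred, reference_mean):
    """1 - MSPE(model) / MSPE(intercept-only model predicting reference_mean)."""
    mspe_model = mean((a - b) ** 2 for a, b in zip(y_true, y_pred))
    mspe_null = mean((a - reference_mean) ** 2 for a in y_true)
    return 1.0 - mspe_model / mspe_null

# Hypothetical test-set phenotypes and predictions (predictions shifted by +10).
y_test = [1.0, 2.0, 3.0, 4.0]
y_pred = [11.0, 12.0, 13.0, 14.0]

r2_corr = r2_squared_correlation(y_test, y_pred)
# Assumption: the intercept-only reference uses the test-set mean.
r2_cal = r2_mspe(y_test, y_pred, reference_mean=mean(y_test))
print(r2_corr, r2_cal)
```

In this toy case the squared correlation is at its maximum while the MSPE-based value is strongly negative, which is the pattern the abstract describes for miscalibrated predictions in out-of-target populations.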
Affiliation(s)
- Christian Staerk
- Department of Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany.
- Institute of Statistics, RWTH Aachen University, Aachen, Germany.
| | - Hannah Klinkhammer
- Department of Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
- Institute for Genomic Statistics and Bioinformatics, Medical Faculty, University of Bonn, Bonn, Germany
| | - Tobias Wistuba
- Department of Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
| | - Carlo Maj
- Center for Human Genetics, University of Marburg, Marburg, Germany
| | - Andreas Mayr
- Department of Medical Biometry, Informatics and Epidemiology, Medical Faculty, University of Bonn, Bonn, Germany
|
31
|
Zhang S, Shu H, Zhou J, Rubin-Sigler J, Yang X, Liu Y, Cooper-Knock J, Monte E, Zhu C, Tu S, Li H, Tong M, Ecker JR, Ichida JK, Shen Y, Zeng J, Tsao PS, Snyder MP. Deconvolution of polygenic risk score in single cells unravels cellular and molecular heterogeneity of complex human diseases. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.14.594252. [PMID: 38798507 PMCID: PMC11118500 DOI: 10.1101/2024.05.14.594252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2024]
Abstract
Polygenic risk scores (PRSs) are commonly used for predicting an individual's genetic risk of complex diseases. Yet, their implication for disease pathogenesis remains largely limited. Here, we introduce scPRS, a geometric deep learning model that constructs single-cell-resolved PRS leveraging reference single-cell chromatin accessibility profiling data to enhance biological discovery as well as disease prediction. Real-world applications across multiple complex diseases, including type 2 diabetes (T2D), hypertrophic cardiomyopathy (HCM), and Alzheimer's disease (AD), showcase the superior prediction power of scPRS compared to traditional PRS methods. Importantly, scPRS not only predicts disease risk but also uncovers disease-relevant cells, such as hormone-high alpha and beta cells for T2D, cardiomyocytes and pericytes for HCM, and astrocytes, microglia and oligodendrocyte progenitor cells for AD. Facilitated by a layered multi-omic analysis, scPRS further identifies cell-type-specific genetic underpinnings, linking disease-associated genetic variants to gene regulation within corresponding cell types. We substantiate the disease relevance of scPRS-prioritized HCM genes and demonstrate that the suppression of these genes in HCM cardiomyocytes is rescued by Mavacamten treatment. Additionally, we establish a novel microglia-specific regulatory relationship between the AD risk variant rs7922621 and its target genes ANXA11 and TSPAN14. We further illustrate the detrimental effects of suppressing these two genes on microglia phagocytosis. Our work provides a multi-tasking, interpretable framework for precise disease prediction and systematic investigation of the genetic, cellular, and molecular basis of complex diseases, laying the methodological foundation for single-cell genetics.
Affiliation(s)
- Sai Zhang
- Department of Epidemiology, University of Florida, Gainesville, FL, USA
- Departments of Biostatistics & Biomedical Engineering, Genetics Institute, McKnight Brain Institute, University of Florida, Gainesville, FL, USA
- Department of Genetics, Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
- These authors contributed equally: Sai Zhang, Hantao Shu, and Jingtian Zhou
| | - Hantao Shu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- These authors contributed equally: Sai Zhang, Hantao Shu, and Jingtian Zhou
| | - Jingtian Zhou
- Arc Institute, Palo Alto, CA, USA
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- These authors contributed equally: Sai Zhang, Hantao Shu, and Jingtian Zhou
| | - Jasper Rubin-Sigler
- Department of Stem Cell Biology and Regenerative Medicine, Eli and Edythe Broad Center for Regenerative Medicine and Stem Cell Research, University of Southern California, Los Angeles, CA, USA
| | - Xiaoyu Yang
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Yuxi Liu
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Johnathan Cooper-Knock
- Sheffield Institute for Translational Neuroscience, University of Sheffield, Sheffield, UK
| | - Emma Monte
- Department of Genetics, Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Chenchen Zhu
- Department of Genetics, Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Sharon Tu
- Department of Stem Cell Biology and Regenerative Medicine, Eli and Edythe Broad Center for Regenerative Medicine and Stem Cell Research, University of Southern California, Los Angeles, CA, USA
| | - Han Li
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Mingming Tong
- Department of Genetics, Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Joseph R. Ecker
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
- Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Justin K. Ichida
- Department of Stem Cell Biology and Regenerative Medicine, Eli and Edythe Broad Center for Regenerative Medicine and Stem Cell Research, University of Southern California, Los Angeles, CA, USA
| | - Yin Shen
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Jianyang Zeng
- School of Engineering, Research Center for Industries of the Future, Westlake University, Hangzhou, Zhejiang, China
| | - Philip S. Tsao
- VA Palo Alto Healthcare System, Palo Alto, CA, USA
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Michael P. Snyder
- Department of Genetics, Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
|
32
|
Saffie Awad P, Makarious MB, Elsayed I, Sanyaolu A, Wild Crea P, Schumacher Schuh AF, Levine KS, Vitale D, Korestky MJ, Kim J, Peixoto Leal T, Perinan MT, Dey S, Noyce AJ, Reyes-Palomares A, Rodriguez-Losada N, Foo JN, Mohamed W, Heilbron K, Norcliffe-Kaufmann L, Rizig M, Okubadejo N, Nalls M, Blauwendraat C, Singleton A, Leonard H, Mata IF, Bandres Ciga S. Insights into Ancestral Diversity in Parkinsons Disease Risk: A Comparative Assessment of Polygenic Risk Scores. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.11.28.23299090. [PMID: 38076954 PMCID: PMC10705647 DOI: 10.1101/2023.11.28.23299090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/21/2023]
Abstract
Objectives To evaluate and compare different polygenic risk score (PRS) models in predicting Parkinson's disease (PD) across diverse ancestries, focusing on identifying the most suitable approach for each population and potentially contributing to equitable advancements in precision medicine. Methods We constructed a total of 105 PRS across individual-level data from seven diverse ancestries. First, a cross-ancestry conventional PRS comparison was implemented by utilizing the 90 known European risk loci with weighted effects from four independent summary statistics including European, East Asian, Latino/Admixed American, and African/Admixed. These models were adjusted by sex, age, and principal components (28 PRS) and by sex, age, and percentage of admixture (28 PRS) for comparison. Second, a novel and refined multi-ancestry best-fit PRS approach was applied across the seven ancestries by leveraging multi-ancestry meta-analyzed summary statistics and using a p-value thresholding approach (49 PRS) to enhance prediction applicability in a global setting. Results European-based PRS models predicted disease status across all ancestries to differing degrees of accuracy. The Ashkenazi Jewish population had the highest odds ratio (OR): 1.96 (95% CI: 1.69-2.25, p < 0.0001), with an area under the curve (AUC) of 68%. Conversely, the East Asian population, despite having fewer predictive variants (84 out of 90), had an OR of 1.37 (95% CI: 1.32-1.42) and an AUC of 62%, illustrating the cross-ancestry transferability of this model. Lower ORs alongside broader confidence intervals were observed in other populations, including Africans (OR = 1.38, 95% CI: 1.12-1.63, p = 0.001). Adjustment by percentage of admixture did not outperform principal components. Multi-ancestry best-fit PRS models improved risk prediction in European, Ashkenazi Jewish, and African ancestries, yet did not surpass conventional PRS in admixed populations such as Latino/American admixed and African admixed populations. Interpretation The present study represents a novel and comprehensive assessment of PRS performance across seven ancestries in PD, highlighting the inadequacy of a 'one size fits all' approach in genetic risk prediction. We demonstrated that European-based PD PRS models are partially transferable to other ancestries and could be improved by a novel best-fit multi-ancestry PRS, especially in non-admixed populations.
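As a rough illustration of what a p-value thresholding PRS involves, the sketch below scores one individual as a weighted sum of effect-allele dosages, keeping only variants whose association p-value falls below a chosen threshold. It is a generic textbook construction, not the study's pipeline: the variant identifiers, effect sizes, p-values, dosages, and thresholds are all hypothetical, and covariate adjustment (sex, age, principal components) is not shown.

```python
def polygenic_score(dosages, variants, p_threshold):
    """Weighted sum of effect-allele dosages over variants with p < p_threshold."""
    return sum(
        dosages.get(v["id"], 0) * v["beta"]
        for v in variants
        if v["p"] < p_threshold
    )

# Hypothetical summary statistics: per-allele effect (log odds ratio) and p-value.
variants = [
    {"id": "snpA", "beta": 0.5, "p": 1e-9},
    {"id": "snpB", "beta": -0.25, "p": 1e-4},
    {"id": "snpC", "beta": 0.125, "p": 0.2},
]
# Hypothetical genotype: number of effect alleles carried at each variant (0, 1 or 2).
dosages = {"snpA": 2, "snpB": 1, "snpC": 1}

# One score per candidate threshold; a "best-fit" approach would pick the
# threshold that predicts best in held-out data (not shown here).
scores = {t: polygenic_score(dosages, variants, t) for t in (5e-8, 1e-3, 1.0)}
print(scores)
```

Loosening the threshold admits more variants into the score, which is why the best-performing threshold can differ between populations.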
|
33
|
Khani M, Cerquera-Cleves C, Kekenadze M, Crea PAW, Singleton AB, Bandres-Ciga S. Towards a Global View of Parkinson's Disease Genetics. Ann Neurol 2024; 95:831-842. [PMID: 38557965 PMCID: PMC11060911 DOI: 10.1002/ana.26905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2023] [Revised: 02/22/2024] [Accepted: 02/25/2024] [Indexed: 04/04/2024]
Abstract
Parkinson's disease (PD) is a global health challenge, yet historically studies of PD have taken place predominantly in European populations. Recent genetics research conducted in non-European populations has revealed novel population-specific genetic loci linked to PD risk, highlighting the importance of studying PD globally. These insights have broadened our understanding of PD etiology, which is crucial for developing disease-modifying interventions. This review comprehensively explores the global genetic landscape of PD, emphasizing the scientific rationale for studying underrepresented populations. It underscores challenges, such as genotype-phenotype heterogeneity and inclusion difficulties for non-European participants, emphasizing the ongoing need for diverse and inclusive research in PD. ANN NEUROL 2024;95:831-842.
Affiliation(s)
- Marzieh Khani
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
| | - Catalina Cerquera-Cleves
- Pontificia Universidad Javeriana, San Ignacio Hospital, Neurology Unit, Bogotá, Colombia
- CHU de Québec Research Center, Axe Neurosciences, Laval University, Quebec City, Canada
| | - Mariam Kekenadze
- Tbilisi State Medical University, Tbilisi, 0141, Georgia
- University College London, Queen Square Institute of Neurology, WC1N 3BG, London, UK
| | - Peter A. Wild Crea
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
| | - Andrew B. Singleton
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
| | - Sara Bandres-Ciga
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
|
34
|
Foreman AL, Warth B, Hessel EVS, Price EJ, Schymanski EL, Cantelli G, Parkinson H, Hecht H, Klánová J, Vlaanderen J, Hilscherova K, Vrijheid M, Vineis P, Araujo R, Barouki R, Vermeulen R, Lanone S, Brunak S, Sebert S, Karjalainen T. Adopting Mechanistic Molecular Biology Approaches in Exposome Research for Causal Understanding. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:7256-7269. [PMID: 38641325 PMCID: PMC11064223 DOI: 10.1021/acs.est.3c07961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/25/2023] [Revised: 03/25/2024] [Accepted: 03/27/2024] [Indexed: 04/21/2024]
Abstract
Through investigating the combined impact of the environmental exposures experienced by an individual throughout their lifetime, exposome research provides opportunities to understand and mitigate negative health outcomes. While current exposome research is driven by epidemiological studies that identify associations between exposures and effects, new frameworks integrating more substantial population-level metadata, including electronic health and administrative records, will shed further light on characterizing environmental exposure risks. Molecular biology offers methods and concepts to study the biological and health impacts of exposomes in experimental and computational systems. Of particular importance is the growing use of omics readouts in epidemiological and clinical studies. This paper calls for the adoption of mechanistic molecular biology approaches in exposome research as an essential step in understanding the genotype and exposure interactions underlying human phenotypes. A series of recommendations are presented to make the necessary and appropriate steps to move from exposure association to causation, with a huge potential to inform precision medicine and population health. This includes establishing hypothesis-driven laboratory testing within the exposome field, supported by appropriate methods to read across from model systems research to human.
Affiliation(s)
- Amy L. Foreman
- European Molecular Biology Laboratory & European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton CB10 1SD, U.K.
| | - Benedikt Warth
- Department of Food Chemistry and Toxicology, University of Vienna, 1090 Vienna, Austria
| | - Ellen V. S. Hessel
- National Institute for Public Health and the Environment (RIVM), Antonie van Leeuwenhoeklaan 9, 3721 MA Bilthoven, The Netherlands
| | - Elliott J. Price
- RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno 60200, Czech Republic
| | - Emma L. Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
| | - Gaia Cantelli
- European Molecular Biology Laboratory & European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton CB10 1SD, U.K.
| | - Helen Parkinson
- European Molecular Biology Laboratory & European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton CB10 1SD, U.K.
| | - Helge Hecht
- RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno 60200, Czech Republic
| | - Jana Klánová
- RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno 60200, Czech Republic
| | - Jelle Vlaanderen
- Institute for Risk Assessment Sciences, Division of Environmental Epidemiology, Utrecht University, Heidelberglaan 8 3584 CS Utrecht, The Netherlands
| | - Klara Hilscherova
- RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno 60200, Czech Republic
| | - Martine Vrijheid
- Institute for Global Health (ISGlobal), Barcelona Biomedical Research Park (PRBB), Doctor Aiguader, 88, 08003 Barcelona, Spain
- Universitat Pompeu Fabra, Carrer de la Mercè, 12, Ciutat Vella, 08002 Barcelona, Spain
- Centro de Investigación Biomédica en Red Epidemiología y Salud Pública (CIBERESP), Av. Monforte de Lemos, 3-5. Pabellón 11, Planta 0, 28029 Madrid, Spain
| | - Paolo Vineis
- Department of Epidemiology and Biostatistics, School of Public Health, Imperial College, London SW7 2AZ, U.K.
| | - Rita Araujo
- European Commission, DG Research and Innovation, Sq. Frère-Orban 8, 1000 Bruxelles, Belgium
| | | | - Roel Vermeulen
- Institute for Risk Assessment Sciences, Division of Environmental Epidemiology, Utrecht University, Heidelberglaan 8 3584 CS Utrecht, The Netherlands
| | - Sophie Lanone
- Univ Paris Est Creteil, INSERM, IMRB, F-94010 Creteil, France
| | - Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Blegdamsvej 3B, 2200 København, Denmark
| | - Sylvain Sebert
- Research Unit of Population Health, University of Oulu, P.O. Box 8000, FI-90014 Oulu, Finland
| | - Tuomo Karjalainen
- European Commission, DG Research and Innovation, Sq. Frère-Orban 8, 1000 Bruxelles, Belgium
|
35
|
Casanova F, Tian Q, Atkins JL, Wood AR, Williamson D, Qian Y, Zweibaum D, Ding J, Melzer D, Ferrucci L, Pilling LC. Iron and risk of dementia: Mendelian randomisation analysis in UK Biobank. J Med Genet 2024; 61:435-442. [PMID: 38191510 DOI: 10.1136/jmg-2023-109295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2023] [Accepted: 11/28/2023] [Indexed: 01/10/2024]
Abstract
BACKGROUND Brain iron deposition is common in dementia, but whether serum iron is a causal risk factor is unknown. We aimed to determine whether genetic predisposition to higher serum iron status biomarkers increased risk of dementia and atrophy of grey matter. METHODS We analysed UK Biobank participants clustered into European (N=451284), African (N=7477) and South Asian (N=9570) groups by genetic similarity to the 1000 genomes project. Using Mendelian randomisation methods, we estimated the association between genetically predicted serum iron (transferrin saturation [TSAT] and ferritin), grey matter volume and genetic liability to clinically defined dementia (including Alzheimer's disease [AD], non-AD dementia, and vascular dementia) from hospital and primary care records. We also performed time-to-event (competing risks) analysis of the TSAT polygenic score on risk of clinically defined non-AD dementia. RESULTS In Europeans, higher genetically predicted TSAT increased genetic liability to dementia (Odds Ratio [OR]: 1.15, 95% Confidence Intervals [CI] 1.04 to 1.26, p=0.0051), non-AD dementia (OR: 1.27, 95% CI 1.12 to 1.45, p=0.00018) and vascular dementia (OR: 1.37, 95% CI 1.12 to 1.69, p=0.0023), but not AD (OR: 1.00, 95% CI 0.86 to 1.15, p=0.97). Higher TSAT was also associated with increased risk of non-AD dementia in participants of African, but not South Asian groups. In survival analysis using a TSAT polygenic score, the effect was independent of apolipoprotein-E ε4 genotype (with adjustment subdistribution Hazard Ratio: 1.74, 95% CI 1.33 to 2.28, p=0.00006). Genetically predicted TSAT was associated with lower grey matter volume in caudate, putamen and thalamus, and not in other areas of interest. DISCUSSION Genetic evidence supports a causal relationship between higher TSAT and risk of clinically defined non-AD and vascular dementia, in European and African groups. This association appears to be independent of apolipoprotein-E ε4.
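The abstract does not state which Mendelian randomisation estimators were used. As a generic illustration only, the sketch below implements the standard fixed-effect inverse-variance-weighted (IVW) estimator from summary statistics: each variant contributes a ratio of its outcome effect to its exposure effect, weighted by the precision of that ratio. All effect sizes and standard errors are hypothetical toy values, and the function name is an assumption.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure-outcome effect and its standard error."""
    # First-order weights: inverse variance of each per-variant Wald ratio.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    # Per-variant Wald ratios: outcome effect divided by exposure effect.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se

# Hypothetical per-variant effects on the exposure and on the log odds of the outcome.
beta_exposure = [0.1, 0.2]
beta_outcome = [0.02, 0.04]
se_outcome = [0.01, 0.01]

estimate, se = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
odds_ratio = math.exp(estimate)  # odds ratio per unit of genetically predicted exposure
print(estimate, se, odds_ratio)
```

The estimate is only interpretable causally if the variants satisfy the usual instrumental-variable assumptions, which this sketch does not check.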
Affiliation(s)
- Francesco Casanova
- Department of Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Qu Tian
- Translational Gerontology Branch Longitudinal Studies Section, National Institute on Aging, Bethesda, Maryland, USA
| | - Janice L Atkins
- Department of Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Andrew R Wood
- Department of Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | | | - Yong Qian
- Translational Gerontology Branch Longitudinal Studies Section, National Institute on Aging, Bethesda, Maryland, USA
| | - David Zweibaum
- Translational Gerontology Branch Longitudinal Studies Section, National Institute on Aging, Bethesda, Maryland, USA
| | - Jun Ding
- Translational Gerontology Branch Longitudinal Studies Section, National Institute on Aging, Bethesda, Maryland, USA
| | - David Melzer
- Department of Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
| | - Luigi Ferrucci
- Translational Gerontology Branch Longitudinal Studies Section, National Institute on Aging, Bethesda, Maryland, USA
| | - Luke C Pilling
- Department of Clinical and Biomedical Sciences, University of Exeter, Exeter, UK
|
36
|
Troubat L, Fettahoglu D, Henches L, Aschard H, Julienne H. Multi-trait GWAS for diverse ancestries: mapping the knowledge gap. BMC Genomics 2024; 25:375. [PMID: 38627641 PMCID: PMC11022331 DOI: 10.1186/s12864-024-10293-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2023] [Accepted: 04/09/2024] [Indexed: 04/19/2024] Open
Abstract
BACKGROUND Approximately 95% of samples analyzed in univariate genome-wide association studies (GWAS) are of European ancestry. This bias toward European ancestry populations in association screening also exists for other analyses and methods that are often developed and tested on European ancestry only. However, existing data in non-European populations, which are often of modest sample size, could benefit from innovative approaches as recently illustrated in the context of polygenic risk scores. METHODS Here, we extend and assess the potential limitations and gains of our multi-trait GWAS pipeline, JASS (Joint Analysis of Summary Statistics), for the analysis of non-European ancestries. To this end, we conducted the joint GWAS of 19 hematological traits and glycemic traits across five ancestries (European (EUR), admixed American (AMR), African (AFR), East Asian (EAS), and South-East Asian (SAS)). RESULTS We detected 367 new genome-wide significant associations in non-European populations (15 in Admixed American (AMR), 72 in African (AFR) and 280 in East Asian (EAS)). New associations detected represent 5%, 17% and 13% of associations in the AFR, AMR and EAS populations, respectively. Overall, multi-trait testing increases the replication of European associated loci in non-European ancestry by 15%. Pleiotropic effects were highly similar at significant loci across ancestries (e.g. the mean correlation between multi-trait genetic effects of EUR and EAS ancestries was 0.88). For hematological traits, strong discrepancies in multi-trait genetic effects are tied to known evolutionary divergences, such as the ACKR1 locus, which is adaptive against P. vivax-induced malaria. CONCLUSIONS Multi-trait GWAS can be a valuable tool to narrow the genetic knowledge gap between European and non-European populations.
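To show why joint testing can recover associations that single-trait tests miss, here is a minimal sketch of an omnibus test on two correlated Z-scores. The abstract does not describe the test JASS applies, so treating it as an omnibus chi-square test is an assumption, as are the function name, the Z-scores, and the correlation under the null. For two traits the quadratic form z'R⁻¹z has a closed form, and the chi-square survival function with 2 degrees of freedom is exp(-x/2).

```python
import math

def omnibus_two_traits(z1, z2, rho):
    """Chi-square statistic z' R^-1 z and its p-value (2 df) for two correlated Z-scores."""
    stat = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    p_value = math.exp(-stat / 2.0)  # survival function of a chi-square with 2 df
    return stat, p_value

# Hypothetical variant: modest effects in opposite directions on two traits
# whose Z-scores are positively correlated under the null (rho = 0.5).
stat, p_value = omnibus_two_traits(3.0, -3.0, 0.5)
print(stat, p_value)
```

Each Z-score of 3 is far from genome-wide significance on its own, yet the joint statistic crosses the conventional 5e-8 threshold because the observed effect pattern runs against the null correlation between the traits.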
Affiliation(s)
- Lucie Troubat
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
| | - Deniz Fettahoglu
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
| | - Léo Henches
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
| | - Hugues Aschard
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Hanna Julienne
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France.
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, F-75015, France.
|
37
|
Lu Z, Wang X, Carr M, Kim A, Gazal S, Mohammadi P, Wu L, Gusev A, Pirruccello J, Kachuri L, Mancuso N. Improved multi-ancestry fine-mapping identifies cis-regulatory variants underlying molecular traits and disease risk. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.15.24305836. [PMID: 38699369 PMCID: PMC11065034 DOI: 10.1101/2024.04.15.24305836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2024]
Abstract
Multi-ancestry statistical fine-mapping of cis-molecular quantitative trait loci (cis-molQTL) aims to improve the precision of distinguishing causal cis-molQTLs from tagging variants. However, existing approaches fail to reflect shared genetic architectures. To solve this limitation, we present the Sum of Shared Single Effects (SuShiE) model, which leverages LD heterogeneity to improve fine-mapping precision, infer cross-ancestry effect size correlations, and estimate ancestry-specific expression prediction weights. We apply SuShiE to mRNA expression measured in PBMCs (n=956) and LCLs (n=814) together with plasma protein levels (n=854) from individuals of diverse ancestries in the TOPMed MESA and GENOA studies. We find SuShiE fine-maps cis-molQTLs for 16% more genes compared with baselines while prioritizing fewer variants with greater functional enrichment. SuShiE infers highly consistent cis-molQTL architectures across ancestries on average; however, we also find evidence of heterogeneity at genes with predicted loss-of-function intolerance, suggesting that environmental interactions may partially explain differences in cis-molQTL effect sizes across ancestries. Lastly, we leverage estimated cis-molQTL effect-sizes to perform individual-level TWAS and PWAS on six white blood cell-related traits in AOU Biobank individuals (n=86k), and identify 44 more genes compared with baselines, further highlighting its benefits in identifying genes relevant for complex disease risk. Overall, SuShiE provides new insights into the cis-genetic architecture of molecular traits.
Affiliation(s)
- Zeyun Lu
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Xinran Wang
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Matthew Carr
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Artem Kim
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Steven Gazal
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA
| | - Pejman Mohammadi
- Center for Immunity and Immunotherapies, Seattle Children’s Research Institute, Seattle, WA, USA
- Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaiʻi Cancer Center, University of Hawaiʻi at Mānoa, Honolulu, HI, USA
| | - Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
| | - James Pirruccello
- Division of Cardiology, University of California San Francisco, San Francisco, CA, USA
| | - Linda Kachuri
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
- Nicholas Mancuso
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA
38
Xiang R, Liu Y, Ben-Eghan C, Ritchie S, Lambert SA, Xu Y, Takeuchi F, Inouye M. Genome-wide analyses of variance in blood cell phenotypes provide new insights into complex trait biology and prediction. medRxiv 2024:2024.04.15.24305830. [PMID: 38699308 PMCID: PMC11065006 DOI: 10.1101/2024.04.15.24305830] [Indexed: 05/05/2024]
Abstract
Blood cell phenotypes are routinely tested in healthcare to inform clinical decisions. Genetic variants influencing mean blood cell phenotypes have been used to understand disease aetiology and improve prediction; however, additional information may be captured by genetic effects on observed variance. Here, we mapped variance quantitative trait loci (vQTL), i.e. genetic loci associated with trait variance, for 29 blood cell phenotypes from the UK Biobank (N~408,111). We discovered 176 independent blood cell vQTLs, of which 147 were not found by additive QTL mapping. vQTLs displayed on average 1.8-fold stronger negative selection than additive QTLs, highlighting that selection acts to reduce extreme blood cell phenotypes. Variance polygenic scores (vPGSs) were constructed to stratify individuals in the INTERVAL cohort (N~40,466), where genetically less variable individuals (low vPGS) had higher conventional PGS accuracy (by ~19%) than genetically more variable individuals. Genetic prediction of blood cell traits improved by ~10% on average when PGS was combined with vPGS. Using Mendelian randomisation and vPGS association analyses, we found that alcohol consumption significantly increased blood cell trait variances, highlighting the utility of blood cell vQTLs and vPGSs to provide novel insight into phenotype aetiology as well as improve prediction.
Affiliation(s)
- Ruidong Xiang
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, Victoria 3083, Australia
- Baker Department of Cardiovascular Research, Translation and Implementation, La Trobe University, Melbourne, VIC, 3086, Australia
- Baker Department of Cardiometabolic Health, The University of Melbourne, VIC, 3010, Australia
- Yang Liu
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Chief Ben-Eghan
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Scott Ritchie
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Samuel A. Lambert
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Yu Xu
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Fumihiko Takeuchi
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Department of Gene Diagnostics and Therapeutics, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan
- Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
39
Zhang J, Zhan J, Jin J, Ma C, Zhao R, O'Connell J, Jiang Y, Koelsch BL, Zhang H, Chatterjee N. An ensemble penalized regression method for multi-ancestry polygenic risk prediction. Nat Commun 2024; 15:3238. [PMID: 38622117 PMCID: PMC11271575 DOI: 10.1038/s41467-024-47357-7] [Received: 03/21/2023] [Accepted: 03/28/2024] [Indexed: 04/17/2024] Open
Abstract
Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of L1 (lasso) and L2 (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations.
Affiliation(s)
- Jingning Zhang
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
- Jin Jin
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Cheng Ma
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Ruzhang Zhao
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Haoyu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Nilanjan Chatterjee
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
- Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD, USA.
40
Gu Y, Maria-Stauffer E, Bedford SA, Romero-Garcia R, Grove J, Børglum AD, Martin H, Baron-Cohen S, Bethlehem RA, Warrier V. Polygenic scores for autism are associated with neurite density in adults and children from the general population. medRxiv 2024:2024.04.10.24305539. [PMID: 38645251 PMCID: PMC11030520 DOI: 10.1101/2024.04.10.24305539] [Indexed: 04/23/2024]
Abstract
Genetic variants linked to autism are thought to change cognition and behaviour by altering the structure and function of the brain. Although a substantial body of literature has identified structural brain differences in autism, it is unknown whether autism-associated common genetic variants are linked to changes in cortical macro- and micro-structure. We investigated this using neuroimaging and genetic data from adults (UK Biobank, N = 31,748) and children (ABCD, N = 4,928). Using polygenic scores and genetic correlations we observe a robust negative association between common variants for autism and a magnetic resonance imaging derived phenotype for neurite density (intracellular volume fraction) in the general population. This result is consistent across both children and adults, in both the cortex and in white matter tracts, and confirmed using polygenic scores and genetic correlations. There were no sex differences in this association. Mendelian randomisation analyses provide no evidence for a causal relationship between autism and intracellular volume fraction, although this should be revisited using better powered instruments. Overall, this study provides evidence for shared common variant genetics between autism and cortical neurite density.
Affiliation(s)
- Yuanjun Gu
- Department of Psychiatry, University of Cambridge, Cambridge, CB2 8AH
- Autism Research Centre, Department of Psychiatry, University of Cambridge, Cambridge, CB2 8AH, UK
- Saashi A. Bedford
- Department of Psychiatry, University of Cambridge, Cambridge, CB2 8AH
- Autism Research Centre, Department of Psychiatry, University of Cambridge, Cambridge, CB2 8AH, UK
- Rafael Romero-Garcia
- Department of Psychiatry, University of Cambridge, Cambridge, CB2 8AH
- Department of Medical Physiology and Biophysics, Instituto de Biomedicina de Sevilla (IBiS), HUVR/CSIC/Universidad de Sevilla/CIBERSAM, ISCIII, 41013, Sevilla, Spain
- Jakob Grove
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, 8210, Denmark
- Center for Genomics and Personalized Medicine (CGPM), Aarhus University, Aarhus, 8000, Denmark
- Department of Biomedicine (Human Genetics) and iSEQ Center, Aarhus University, Aarhus, 8000, Denmark
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark, 8000
- Anders D. Børglum
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, 8210, Denmark
- Center for Genomics and Personalized Medicine (CGPM), Aarhus University, Aarhus, 8000, Denmark
- Department of Biomedicine (Human Genetics) and iSEQ Center, Aarhus University, Aarhus, 8000, Denmark
- Hilary Martin
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Simon Baron-Cohen
- Autism Research Centre, Department of Psychiatry, University of Cambridge, Cambridge, CB2 8AH, UK
- Department of Psychology, University of Cambridge, Cambridge, CB2 3EB, UK
- Varun Warrier
- Department of Psychiatry, University of Cambridge, Cambridge, CB2 8AH
- Human Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Department of Psychology, University of Cambridge, Cambridge, CB2 3EB, UK
41
Hou K, Gogarten S, Kim J, Hua X, Dias JA, Sun Q, Wang Y, Tan T, Atkinson EG, Martin A, Shortt J, Hirbo J, Li Y, Pasaniuc B, Zhang H. Admix-kit: an integrated toolkit and pipeline for genetic analyses of admixed populations. Bioinformatics 2024; 40:btae148. [PMID: 38490256 PMCID: PMC10980565 DOI: 10.1093/bioinformatics/btae148] [Received: 09/30/2023] [Revised: 02/08/2024] [Accepted: 03/13/2024] [Indexed: 03/17/2024] Open
Abstract
SUMMARY: Admixed populations, with their unique and diverse genetic backgrounds, are often underrepresented in genetic studies. This oversight not only limits our understanding but also exacerbates existing health disparities. One major barrier has been the lack of efficient tools tailored for the special challenges of genetic studies of admixed populations. Here, we present admix-kit, an integrated toolkit and pipeline for genetic analyses of admixed populations. Admix-kit implements a suite of methods to facilitate genotype and phenotype simulation, association testing, genetic architecture inference, and polygenic scoring in admixed populations.
AVAILABILITY AND IMPLEMENTATION: The admix-kit package is open-source and available at https://github.com/KangchengHou/admix-kit. Additionally, users can use the pipeline designed for admixed genotype simulation available at https://github.com/UW-GAC/admix-kit_workflow.
Affiliation(s)
- Kangcheng Hou
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, United States
- Stephanie Gogarten
- Department of Biostatistics, University of Washington, Seattle, WA, 98195, United States
- Joohyun Kim
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, United States
- Xing Hua
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, United States
- Julie-Alexia Dias
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02120, United States
- Quan Sun
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, United States
- Ying Wang
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, United States
- Taotao Tan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, United States
- Elizabeth G Atkinson
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, United States
- Alicia Martin
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, United States
- Jonathan Shortt
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, United States
- Jibril Hirbo
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, United States
- Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, United States
- Bogdan Pasaniuc
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, United States
- Haoyu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, United States
42
Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 DOI: 10.1016/j.cell.2024.01.023] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
- New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden.
- Yang I Li
- Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Sohini Ramachandran
- Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 029129, USA
- Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
43
Xiang R, Kelemen M, Xu Y, Harris LW, Parkinson H, Inouye M, Lambert SA. Recent advances in polygenic scores: translation, equitability, methods and FAIR tools. Genome Med 2024; 16:33. [PMID: 38373998 PMCID: PMC10875792 DOI: 10.1186/s13073-024-01304-9] [Received: 06/09/2023] [Accepted: 02/07/2024] [Indexed: 02/21/2024] Open
Abstract
Polygenic scores (PGS) can be used for risk stratification by quantifying individuals' genetic predisposition to disease, and many potentially clinically useful applications have been proposed. Here, we review the latest potential benefits of PGS in the clinic and challenges to implementation. PGS could augment risk stratification through combined use with traditional risk factors (demographics, disease-specific risk factors, family history, etc.), to support diagnostic pathways, to predict groups with therapeutic benefits, and to increase the efficiency of clinical trials. However, there exist challenges to maximizing the clinical utility of PGS, including FAIR (Findable, Accessible, Interoperable, and Reusable) use and standardized sharing of the genomic data needed to develop and recalculate PGS, the equitable performance of PGS across populations and ancestries, the generation of robust and reproducible PGS calculations, and the responsible communication and interpretation of results. We outline how these challenges may be overcome analytically and with more diverse data as well as highlight sustained community efforts to achieve equitable, impactful, and responsible use of PGS in healthcare.
Affiliation(s)
- Ruidong Xiang
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Martin Kelemen
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Yu Xu
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- Laura W Harris
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Michael Inouye
- Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK.
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK.
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK.
- Samuel A Lambert
- Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
44
Yiangou K, Mavaddat N, Dennis J, Zanti M, Wang Q, Bolla MK, Abubakar M, Ahearn TU, Andrulis IL, Anton-Culver H, Antonenkova NN, Arndt V, Aronson KJ, Augustinsson A, Baten A, Behrens S, Bermisheva M, de Gonzalez AB, Białkowska K, Boddicker N, Bodelon C, Bogdanova NV, Bojesen SE, Brantley KD, Brauch H, Brenner H, Camp NJ, Canzian F, Castelao JE, Cessna MH, Chang-Claude J, Chenevix-Trench G, Chung WK, Colonna SV, Couch FJ, Cox A, Cross SS, Czene K, Daly MB, Devilee P, Dörk T, Dunning AM, Eccles DM, Eliassen AH, Engel C, Eriksson M, Evans DG, Fasching PA, Fletcher O, Flyger H, Fritschi L, Gago-Dominguez M, Gentry-Maharaj A, González-Neira A, Guénel P, Hahnen E, Haiman CA, Hamann U, Hartikainen JM, Ho V, Hodge J, Hollestelle A, Honisch E, Hooning MJ, Hoppe R, Hopper JL, Howell S, Howell A, Jakovchevska S, Jakubowska A, Jernström H, Johnson N, Kaaks R, Khusnutdinova EK, Kitahara CM, Koutros S, Kristensen VN, Lacey JV, Lambrechts D, Lejbkowicz F, Lindblom A, Lush M, Mannermaa A, Mavroudis D, Menon U, Murphy RA, Nevanlinna H, Obi N, Offit K, Park-Simon TW, Patel AV, Peng C, Peterlongo P, Pita G, Plaseska-Karanfilska D, Pylkäs K, Radice P, Rashid MU, Rennert G, Roberts E, Rodriguez J, Romero A, Rosenberg EH, Saloustros E, Sandler DP, Sawyer EJ, Schmutzler RK, Scott CG, Shu XO, Southey MC, Stone J, Taylor JA, Teras LR, van de Beek I, Willett W, Winqvist R, Zheng W, Vachon CM, Schmidt MK, Hall P, MacInnis RJ, Milne RL, Pharoah PD, Simard J, Antoniou AC, Easton DF, Michailidou K. Differences in polygenic score distributions in European ancestry populations: implications for breast cancer risk prediction. medRxiv 2024:2024.02.12.24302043. [PMID: 38410445 PMCID: PMC10896416 DOI: 10.1101/2024.02.12.24302043] [Indexed: 02/28/2024]
Abstract
The 313-variant polygenic risk score (PRS313) provides a promising tool for breast cancer risk prediction. However, the PRS313 has not been evaluated across different European populations, even though differences between them could influence risk estimation. Here, we explored the distribution of PRS313 across European populations using genotype data from 94,072 females of European ancestry without breast cancer, from 21 countries participating in the Breast Cancer Association Consortium (BCAC), and 225,105 female participants from the UK Biobank. The mean PRS313 differed markedly across European countries, being highest in south-eastern Europe and lowest in north-western Europe. Using the overall European PRS313 distribution to categorise individuals leads to overestimation and underestimation of risk in some individuals from south-eastern and north-western countries, respectively. Adjustment for principal components explained most of the observed heterogeneity in mean PRS. Country-specific PRS distributions may be used to calibrate risk categories in individuals from different countries.
Affiliation(s)
- Kristia Yiangou
- Biostatistics Unit, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus, 2371
- Nasim Mavaddat
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
- Joe Dennis
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
- Maria Zanti
- Biostatistics Unit, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus, 2371
- Qin Wang
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
- Manjeet K. Bolla
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
- Mustapha Abubakar
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, USA, 20850
- Thomas U. Ahearn
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, USA, 20850
- Irene L. Andrulis
- Fred A. Litwin Center for Cancer Genetics, Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada, M5G 1X5
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada, M5S 1A8
- Hoda Anton-Culver
- Department of Medicine, Genetic Epidemiology Research Institute, University of California Irvine, Irvine, CA, USA, 92617
- Natalia N. Antonenkova
- NN Alexandrov Research Institute of Oncology and Medical Radiology, Minsk, Belarus, 223040
- Volker Arndt
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
- Kristan J. Aronson
- Department of Public Health Sciences, and Cancer Research Institute, Queen’s University, Kingston, ON, Canada, K7L 3N6
- Adinda Baten
- Leuven Multidisciplinary Breast Center, Department of Oncology, Leuven Cancer Institute, University Hospitals Leuven, Leuven, Belgium, 3000
- Sabine Behrens
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
- Marina Bermisheva
- Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, Russia, 450054
- St Petersburg State University, St Petersburg, Russia, 199034
- Katarzyna Białkowska
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland, 71-252
- Nicholas Boddicker
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA, 55905
- Clara Bodelon
- Department of Population Science, American Cancer Society, Atlanta, GA, USA, 30303
- Natalia V. Bogdanova
- NN Alexandrov Research Institute of Oncology and Medical Radiology, Minsk, Belarus, 223040
- Department of Radiation Oncology, Hannover Medical School, Hannover, Germany, 30625
- Gynaecology Research Unit, Hannover Medical School, Hannover, Germany, 30625
- Stig E. Bojesen
- Copenhagen General Population Study, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, Denmark, 2730
- Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, Denmark, 2730
- Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark, 2200
- Kristen D. Brantley
- Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA, 02115
- Hiltrud Brauch
- Dr Margarete Fischer-Bosch-Institute of Clinical Pharmacology, Stuttgart, Germany, 70376
- iFIT-Cluster of Excellence, University of Tübingen, Tübingen, Germany, 72074
- German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Partner Site Tübingen, Tübingen, Germany, 72074
- Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany, 69120
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
- Nicola J. Camp
- Department of Internal Medicine and Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA, 84112
- Federico Canzian
- Genomic Epidemiology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
| | - Jose E. Castelao
- Oncology and Genetics Unit, Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS) Foundation, Complejo Hospitalario Universitario de Santiago, SERGAS, Vigo, Spain, 36312
| | - Melissa H. Cessna
- Department of Pathology, Intermountain Healthcare, Salt Lake City, UT, USA, 84143
- Intermountain Biorepository, Intermountain Healthcare, Salt Lake City, UT, USA, 84143
| | - Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
- Cancer Epidemiology Group, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, Hamburg, Germany, 20246
| | - Georgia Chenevix-Trench
- Cancer Research Program, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia, 4006
| | - Wendy K. Chung
- Departments of Pediatrics and Medicine, Columbia University, New York, NY, USA, 10032
| | - NBCS Collaborators
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital-Radiumhospitalet, Oslo, Norway, 0379
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway, 0450
- Department of Research, Vestre Viken Hospital, Drammen, Norway, 3019
- Section for Breast- and Endocrine Surgery, Department of Cancer, Division of Surgery, Cancer and Transplantation Medicine, Oslo University Hospital-Ullevål, Oslo, Norway, 0450
- Department of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway, 0379
- Department of Pathology, Akershus University Hospital, Lørenskog, Norway, 1478
- Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway, 0379
- Department of Oncology, Division of Surgery, Cancer and Transplantation Medicine, Oslo University Hospital-Radiumhospitalet, Oslo, Norway, 0379
- National Advisory Unit on Late Effects after Cancer Treatment, Oslo University Hospital, Oslo, Norway, 0379
- Department of Oncology, Akershus University Hospital, Lørenskog, Norway, 1478
- Oslo Breast Cancer Research Consortium, Oslo University Hospital, Oslo, Norway, 0379
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, Oslo, Norway, 0379
| | - Sarah V. Colonna
- Department of Internal Medicine and Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA, 84112
| | - Fergus J. Couch
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA, 55905
| | - Angela Cox
- Division of Clinical Medicine, School of Medicine and Population Health, University of Sheffield, Sheffield, UK, S10 2TN
| | - Simon S. Cross
- Division of Neuroscience, School of Medicine and Population Health, University of Sheffield, Sheffield, UK, S10 2TN
| | - Kamila Czene
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 171 65
| | - Mary B. Daly
- Department of Clinical Genetics, Fox Chase Cancer Center, Philadelphia, PA, USA, 19111
| | - Peter Devilee
- Department of Pathology, Leiden University Medical Center, Leiden, the Netherlands, 2333 ZA
- Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands, 2333 ZA
| | - Thilo Dörk
- Gynaecology Research Unit, Hannover Medical School, Hannover, Germany, 30625
| | - Alison M. Dunning
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK, CB1 8RN
| | - Diana M. Eccles
- Faculty of Medicine, University of Southampton, Southampton, UK, SO17 1BJ
| | - A. Heather Eliassen
- Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA, 02115
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA, 02115
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA, 02115
| | - Christoph Engel
- Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany, 04107
- LIFE - Leipzig Research Centre for Civilization Diseases, University of Leipzig, Leipzig, Germany, 04103
| | - Mikael Eriksson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 171 65
| | - D. Gareth Evans
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK, M13 9WL
- North West Genomics Laboratory Hub, Manchester Centre for Genomic Medicine, St Mary’s Hospital, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK, M13 9WL
| | - Peter A. Fasching
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, Friedrich-Alexander University Erlangen-Nuremberg, University Hospital Erlangen, Erlangen, Germany, 91054
| | - Olivia Fletcher
- The Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, UK, SW7 3RP
| | - Henrik Flyger
- Department of Breast Surgery, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, Denmark, 2730
| | - Lin Fritschi
- School of Population Health, Curtin University, Perth, Western Australia, Australia, 6102
| | - Manuela Gago-Dominguez
- Cancer Genetics and Epidemiology Group, Genomic Medicine Group, Fundación Instituto de Investigación Sanitaria de Santiago de Compostela (FIDIS), Complejo Hospitalario Universitario de Santiago, SERGAS, Santiago de Compostela, Spain, 15706
| | - Aleksandra Gentry-Maharaj
- MRC Clinical Trials Unit, Institute of Clinical Trials and Methodology, University College London, London, UK, WC1V 6LJ
- Department of Women’s Cancer, Elizabeth Garrett Anderson Institute for Women’s Health, University College London, London, UK
| | - Anna González-Neira
- Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), Madrid, Spain, 28029
- Spanish Network on Rare Diseases (CIBERER)
| | - Pascal Guénel
- Team ‘Exposome and Heredity’, CESP, Gustave Roussy, INSERM, University Paris-Saclay, UVSQ, Villejuif, France, 94805
| | - Eric Hahnen
- Center for Familial Breast and Ovarian Cancer, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany, 50937
- Center for Integrated Oncology (CIO), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany, 50937
| | - Christopher A. Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA, 90033
| | - Ute Hamann
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
| | - Jaana M. Hartikainen
- Cancer RC, University of Eastern Finland, Kuopio, Finland, 70210
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, Kuopio, Finland, 70210
| | - Vikki Ho
- Health Innovation and Evaluation Hub, Université de Montréal Hospital Research Centre (CRCHUM), Montréal, Québec, Canada
| | - James Hodge
- Department of Population Science, American Cancer Society, Atlanta, GA, USA, 30303
| | - Antoinette Hollestelle
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands, 3015 GD
| | - Ellen Honisch
- Department of Gynecology and Obstetrics, University Hospital Düsseldorf, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany, 40225
| | - Maartje J. Hooning
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands, 3015 GD
| | - Reiner Hoppe
- Dr Margarete Fischer-Bosch-Institute of Clinical Pharmacology, Stuttgart, Germany, 70376
- University of Tübingen, Tübingen, Germany, 72074
| | - John L. Hopper
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia, 3010
| | - Sacha Howell
- Division of Cancer Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Nightingale/Prevent Breast Cancer Centre, Wythenshawe Hospital, Manchester University NHS Foundation Trust, Manchester, UK
- Manchester Breast Centre, Manchester Cancer Research Centre, The Christie Hospital, Manchester, UK
| | - Anthony Howell
- Division of Cancer Sciences, University of Manchester, Manchester, UK, M13 9PL
| | - ABCTB Investigators
- Australian Breast Cancer Tissue Bank, Westmead Institute for Medical Research, University of Sydney, Sydney, New South Wales, Australia, 2145
| | - kConFab Investigators
- Research Department, Peter MacCallum Cancer Center, Melbourne, Victoria, Australia, 3000
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, Victoria, Australia, 3000
| | - Simona Jakovchevska
- Research Centre for Genetic Engineering and Biotechnology ‘Georgi D. Efremov’, MASA, Skopje, Republic of North Macedonia, 1000
| | - Anna Jakubowska
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland, 71-252
- Independent Laboratory of Molecular Biology and Genetic Diagnostics, Pomeranian Medical University, Szczecin, Poland, 71-252
| | - Helena Jernström
- Oncology, Clinical Sciences in Lund, Lund University, Lund, Sweden, 221 85
| | - Nichola Johnson
- The Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, UK, SW7 3RP
| | - Rudolf Kaaks
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
| | - Elza K. Khusnutdinova
- Institute of Biochemistry and Genetics of the Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, Russia, 450054
- Department of Genetics and Fundamental Medicine, Ufa University of Science and Technology, Ufa, Russia, 450076
| | - Cari M. Kitahara
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, MD, USA, 20892
| | - Stella Koutros
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, USA, 20850
| | - Vessela N. Kristensen
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway, 0450
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, Oslo, Norway, 0379
| | - James V. Lacey
- Department of Computational and Quantitative Medicine, City of Hope, Duarte, CA, USA, 91010
- City of Hope Comprehensive Cancer Center, City of Hope, Duarte, CA, USA, 91010
| | - Diether Lambrechts
- Laboratory for Translational Genetics, Department of Human Genetics, KU Leuven, Leuven, Belgium, 3000
- VIB Center for Cancer Biology, VIB, Leuven, Belgium, 3001
| | | | - Annika Lindblom
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden, 171 76
- Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden, 171 76
| | - Michael Lush
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
| | - Arto Mannermaa
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, Kuopio, Finland, 70210
- Translational Cancer Research Area, University of Eastern Finland, Kuopio, Finland, 70210
- Biobank of Eastern Finland, Kuopio University Hospital, Kuopio, Finland
| | - Dimitrios Mavroudis
- Department of Medical Oncology, University Hospital of Heraklion, Heraklion, Greece, 711 10
| | - Usha Menon
- MRC Clinical Trials Unit, Institute of Clinical Trials and Methodology, University College London, London, UK, WC1V 6LJ
| | - Rachel A. Murphy
- School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada, V6T 1Z4
- Cancer Control Research, BC Cancer Agency, Vancouver, BC, Canada, V5Z 1L3
| | - Heli Nevanlinna
- Department of Obstetrics and Gynecology, Helsinki University Hospital, University of Helsinki, Helsinki, Finland, 00290
| | - Nadia Obi
- Institute for Occupational and Maritime Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany, 20246
- Institute for Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany, 20246
| | - Kenneth Offit
- Clinical Genetics Research Lab, Department of Cancer Biology and Genetics, Memorial Sloan Kettering Cancer Center, New York, NY, USA, 10065
- Clinical Genetics Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA, 10065
| | | | - Alpa V. Patel
- Department of Population Science, American Cancer Society, Atlanta, GA, USA, 30303
| | - Cheng Peng
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA, 02115
| | - Paolo Peterlongo
- Genome Diagnostics Program, IFOM ETS - the AIRC Institute of Molecular Oncology, Milan, Italy, 20139
| | - Guillermo Pita
- Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), Madrid, Spain, 28029
| | - Dijana Plaseska-Karanfilska
- Research Centre for Genetic Engineering and Biotechnology ‘Georgi D. Efremov’, MASA, Skopje, Republic of North Macedonia, 1000
| | - Katri Pylkäs
- Laboratory of Cancer Genetics and Tumor Biology, Translational Medicine Research Unit, Biocenter Oulu, University of Oulu, Oulu, Finland, 90220
- Laboratory of Cancer Genetics and Tumor Biology, Northern Finland Laboratory Centre Oulu, Oulu, Finland, 90220
| | - Paolo Radice
- Unit of Predictive Medicine, Molecular Bases of Genetic Risk, Department of Research, Fondazione IRCCS Istituto Nazionale dei Tumori (INT), Milan, Italy, 20133
| | - Muhammad U. Rashid
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), Heidelberg, Germany, 69120
- Department of Basic Sciences, Shaukat Khanum Memorial Cancer Hospital and Research Centre (SKMCH & RC), Lahore, Pakistan, 54000
| | - Gad Rennert
- Technion, Faculty of Medicine and Association for Promotion of Research in Precision Medicine, Haifa, Israel
| | - Eleanor Roberts
- Division of Cancer Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
| | - Juan Rodriguez
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 171 65
| | - Atocha Romero
- Medical Oncology Department, Hospital Universitario Puerta de Hierro, Madrid, Spain, 28222
| | - Efraim H. Rosenberg
- Department of Pathology, The Netherlands Cancer Institute - Antoni van Leeuwenhoek hospital, Amsterdam, the Netherlands, 1066 CX
| | | | - Dale P. Sandler
- Epidemiology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, USA, 27709
| | - Elinor J. Sawyer
- School of Cancer & Pharmaceutical Sciences, Comprehensive Cancer Centre, Guy’s Campus, King’s College London, London, UK
| | - Rita K. Schmutzler
- Center for Familial Breast and Ovarian Cancer, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany, 50937
- Center for Integrated Oncology (CIO), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany, 50937
- Center for Molecular Medicine Cologne (CMMC), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany, 50931
| | - Christopher G. Scott
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA, 55905
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA, 37232
| | - Melissa C. Southey
- Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, Victoria, Australia, 3168
- Department of Clinical Pathology, The University of Melbourne, Melbourne, Victoria, Australia, 3010
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia, 3004
| | - Jennifer Stone
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia, 3010
- Genetic Epidemiology Group, School of Population and Global Health, University of Western Australia, Perth, Western Australia, Australia, 6000
| | - Jack A. Taylor
- Epidemiology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, USA, 27709
- Epigenetic and Stem Cell Biology Laboratory, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, USA, 27709
| | - Lauren R. Teras
- Department of Population Science, American Cancer Society, Atlanta, GA, USA, 30303
| | - Irma van de Beek
- Department of Clinical Genetics, The Netherlands Cancer Institute - Antoni van Leeuwenhoek hospital, Amsterdam, the Netherlands, 1066 CX
| | - Walter Willett
- Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA, USA, 02115
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA, 02115
- Department of Nutrition, Harvard TH Chan School of Public Health, Boston, MA, USA, 02115
| | - Robert Winqvist
- Laboratory of Cancer Genetics and Tumor Biology, Translational Medicine Research Unit, Biocenter Oulu, University of Oulu, Oulu, Finland, 90220
- Laboratory of Cancer Genetics and Tumor Biology, Northern Finland Laboratory Centre Oulu, Oulu, Finland, 90220
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA, 37232
| | - Celine M. Vachon
- Department of Quantitative Health Sciences, Division of Epidemiology, Mayo Clinic, Rochester, MN, USA, 55905
| | - Marjanka K. Schmidt
- Division of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, the Netherlands, 1066 CX
- Division of Psychosocial Research and Epidemiology, The Netherlands Cancer Institute - Antoni van Leeuwenhoek hospital, Amsterdam, the Netherlands, 1066 CX
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, the Netherlands, 2333 ZA
| | - Per Hall
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 171 65
- Department of Oncology, Södersjukhuset, Stockholm, Sweden, 118 83
| | - Robert J. MacInnis
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia, 3010
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia, 3004
| | - Roger L. Milne
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia, 3010
- Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, Victoria, Australia, 3168
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia, 3004
| | - Paul D.P. Pharoah
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA, USA, 90069
| | - Jacques Simard
- Genomics Center, Centre Hospitalier Universitaire de Québec – Université Laval Research Center, Québec City, Québec, Canada, G1V 4G2
| | - Antonis C. Antoniou
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
| | - Douglas F. Easton
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK, CB1 8RN
| | - Kyriaki Michailidou
- Biostatistics Unit, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus, 2371
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK, CB1 8RN
|
45
|
Peyrot WJ, Panagiotaropoulou G, Olde Loohuis LM, Adams MJ, Awasthi S, Ge T, McIntosh AM, Mitchell BL, Mullins N, O'Connell KS, Penninx BWJH, Posthuma D, Ripke S, Ruderfer DM, Uffelmann E, Vilhjalmsson BJ, Zhu Z, Smoller JW, Price AL. Distinguishing different psychiatric disorders using DDx-PRS. medRxiv 2024:2024.02.02.24302228. [PMID: 38352307 PMCID: PMC10862992 DOI: 10.1101/2024.02.02.24302228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/24/2024]
Abstract
Despite great progress on methods for case-control polygenic prediction (e.g. schizophrenia vs. control), there remains an unmet need for a method that genetically distinguishes clinically related disorders (e.g. schizophrenia (SCZ) vs. bipolar disorder (BIP) vs. depression (MDD) vs. control); such a method could have important clinical value, especially at disorder onset when differential diagnosis can be challenging. Here, we introduce a method, Differential Diagnosis-Polygenic Risk Score (DDx-PRS), that jointly estimates posterior probabilities of each possible diagnostic category (e.g. SCZ=50%, BIP=25%, MDD=15%, control=10%) by modeling variance/covariance structure across disorders, leveraging case-control polygenic risk scores (PRS) for each disorder (computed using existing methods) and prior clinical probabilities for each diagnostic category. DDx-PRS uses only summary-level training data and does not use tuning data, facilitating implementation in clinical settings. In simulations, DDx-PRS was well-calibrated (whereas a simpler approach that analyzes each disorder marginally was poorly calibrated), and effective in distinguishing each diagnostic category vs. the rest. We then applied DDx-PRS to Psychiatric Genomics Consortium SCZ/BIP/MDD/control data, including summary-level training data from 3 case-control GWAS (N = 41,917-173,140 cases; total N = 1,048,683) and held-out test data from different cohorts with equal numbers of each diagnostic category (total N = 11,460). DDx-PRS was well-calibrated and well-powered relative to these training sample sizes, attaining AUCs of 0.66 for SCZ vs. rest, 0.64 for BIP vs. rest, 0.59 for MDD vs. rest, and 0.68 for control vs. rest. DDx-PRS produced comparable results to methods that leverage tuning data, confirming that DDx-PRS is an effective method. 
True diagnosis probabilities in top deciles of predicted diagnosis probabilities were considerably larger than prior baseline probabilities, particularly in projections to larger training sample sizes, implying considerable potential for clinical utility under certain circumstances. In conclusion, DDx-PRS is an effective method for distinguishing clinically related disorders.
|
46
|
Lennon NJ, Kottyan LC, Kachulis C, Abul-Husn NS, Arias J, Belbin G, Below JE, Berndt SI, Chung WK, Cimino JJ, Clayton EW, Connolly JJ, Crosslin DR, Dikilitas O, Velez Edwards DR, Feng Q, Fisher M, Freimuth RR, Ge T, Glessner JT, Gordon AS, Patterson C, Hakonarson H, Harden M, Harr M, Hirschhorn JN, Hoggart C, Hsu L, Irvin MR, Jarvik GP, Karlson EW, Khan A, Khera A, Kiryluk K, Kullo I, Larkin K, Limdi N, Linder JE, Loos RJF, Luo Y, Malolepsza E, Manolio TA, Martin LJ, McCarthy L, McNally EM, Meigs JB, Mersha TB, Mosley JD, Musick A, Namjou B, Pai N, Pesce LL, Peters U, Peterson JF, Prows CA, Puckelwartz MJ, Rehm HL, Roden DM, Rosenthal EA, Rowley R, Sawicki KT, Schaid DJ, Smit RAJ, Smith JL, Smoller JW, Thomas M, Tiwari H, Toledo DM, Vaitinadin NS, Veenstra D, Walunas TL, Wang Z, Wei WQ, Weng C, Wiesner GL, Yin X, Kenny EE. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat Med 2024; 30:480-487. [PMID: 38374346 PMCID: PMC10878968 DOI: 10.1038/s41591-024-02796-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 05/25/2023] [Accepted: 01/02/2024] [Indexed: 02/21/2024]
Abstract
Polygenic risk scores (PRSs) have improved in predictive performance, but several challenges remain to be addressed before PRSs can be implemented in the clinic, including reduced predictive performance of PRSs in diverse populations, and the interpretation and communication of genetic results to both providers and patients. To address these challenges, the National Human Genome Research Institute-funded Electronic Medical Records and Genomics (eMERGE) Network has developed a framework and pipeline for return of a PRS-based genome-informed risk assessment to 25,000 diverse adults and children as part of a clinical study. From an initial list of 23 conditions, ten were selected for implementation based on PRS performance, medical actionability and potential clinical utility, including cardiometabolic diseases and cancer. Standardized metrics were considered in the selection process, with additional consideration given to strength of evidence in African and Hispanic populations. We then developed a pipeline for clinical PRS implementation (score transfer to a clinical laboratory, validation and verification of score performance), and used genetic ancestry to calibrate PRS mean and variance, utilizing genetically diverse data from 13,475 participants of the All of Us Research Program cohort to train and test model parameters. Finally, we created a framework for regulatory compliance and developed a PRS clinical report for return to providers and for inclusion in an additional genome-informed risk assessment. The initial experience from eMERGE can inform the approach needed to implement PRS-based testing in diverse clinical settings.
Affiliation(s)
| | - Leah C Kottyan
- Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| | | | | | - Josh Arias
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Gillian Belbin
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Sonja I Berndt
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - James J Cimino
- University of Alabama at Birmingham, Birmingham, AL, USA
| | | | | | - David R Crosslin
- Tulane University, New Orleans, LA, USA
- University of Washington, Seattle, WA, USA
| | | | | | - QiPing Feng
- Vanderbilt University Medical Center, Nashville, TN, USA
| | | | | | - Tian Ge
- Mass General Brigham, Boston, MA, USA
| | | | | | | | | | - Maegan Harden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Margaret Harr
- Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Joel N Hirschhorn
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Boston Children's Hospital, Boston, MA, USA
| | - Clive Hoggart
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Li Hsu
- Fred Hutchinson Cancer Center, Seattle, WA, USA
| | | | | | | | | | - Amit Khera
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Katie Larkin
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Nita Limdi
- University of Alabama at Birmingham, Birmingham, AL, USA
| | | | - Ruth J F Loos
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Yuan Luo
- Northwestern University, Evanston, IL, USA
| | | | - Teri A Manolio
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Lisa J Martin
- Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| | - Li McCarthy
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Tesfaye B Mersha
- Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| | | | | | - Bahram Namjou
- Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| | - Nihal Pai
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | | | - Cynthia A Prows
- Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| | | | - Heidi L Rehm
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Dan M Roden
- Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Robb Rowley
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | | | | | | | | | - Hemant Tiwari
- University of Alabama at Birmingham, Birmingham, AL, USA
| | | | | | | | | | - Zhe Wang
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Wei-Qi Wei
- Vanderbilt University Medical Center, Nashville, TN, USA
| | | | | | | | - Eimear E Kenny
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
47
|
Barr PB, Bigdeli TB, Meyers JL, Peterson RE, Sanchez-Roige S, Mallard TT, Dick DM, Harden KP, Wilkinson A, Graham DP, Nielsen DA, Swann AC, Lipsky RK, Kosten TR, Aslan M, Harvey PD, Kimbrel NA, Beckham JC. Correlates of Risk for Disinhibited Behaviors in the Million Veteran Program Cohort. JAMA Psychiatry 2024; 81:188-197. [PMID: 37938835 PMCID: PMC10633411 DOI: 10.1001/jamapsychiatry.2023.4141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 09/01/2023] [Indexed: 11/10/2023]
Abstract
Importance: Many psychiatric outcomes share a common etiologic pathway reflecting behavioral disinhibition, generally referred to as externalizing (EXT) disorders. Recent genome-wide association studies (GWASs) have demonstrated the overlap between EXT disorders and important aspects of veterans' health, such as suicide-related behaviors and substance use disorders (SUDs). Objective: To explore correlates of risk for EXT disorders within the Veterans Health Administration (VA) Million Veteran Program (MVP). Design, Setting, and Participants: A series of phenome-wide association studies (PheWASs) of polygenic risk scores (PGSs) for EXT disorders was conducted using electronic health records. First, ancestry-specific PheWASs of EXT PGSs were conducted in the African, European, and Hispanic or Latin American ancestries. Next, a conditional PheWAS, covarying for PGSs of comorbid psychiatric problems (depression, schizophrenia, and suicide attempt; European ancestries only), was performed. Lastly, to adjust for unmeasured confounders, a within-family analysis of significant associations from the main PheWAS was performed in full siblings (European ancestries only). This study included the electronic health record data from US veterans from VA health care centers enrolled in MVP. Analyses took place from February 2022 to August 2023 covering a period from October 1999 to January 2020. Exposures: PGSs for EXT, depression, schizophrenia, and suicide attempt. Main Outcomes and Measures: Phecodes for diagnoses derived from the International Statistical Classification of Diseases, Ninth and Tenth Revisions, Clinical Modification, codes from electronic health records. 
Results: Within the MVP (560 824 patients; mean [SD] age, 67.9 [14.3] years; 512 593 male [91.4%]), the EXT PGS was associated with 619 outcomes, of which 188 were independent of risk for comorbid problems or PGSs (from odds ratio [OR], 1.02; 95% CI, 1.01-1.03 for overweight/obesity to OR, 1.44; 95% CI, 1.42-1.47 for viral hepatitis C). Of the significant outcomes, 73 (11.9%) were significant in the African results and 26 (4.5%) were significant in the Hispanic or Latin American results. Within-family analyses uncovered robust associations between EXT PGS and consequences of SUDs, including liver disease, chronic airway obstruction, and viral hepatitis C. Conclusions and Relevance: Results of this cohort study suggest a shared polygenic basis of EXT disorders, independent of risk for other psychiatric problems. In addition, this study found associations between EXT PGS and diagnoses related to SUDs and their sequelae. Overall, this study highlighted the potential negative consequences of EXT disorders for health and functioning in the US veteran population.
Affiliation(s)
- Peter B. Barr
  - VA New York Harbor Healthcare System, Brooklyn
  - Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, New York
  - Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, New York
  - Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, New York
- Tim B. Bigdeli
  - VA New York Harbor Healthcare System, Brooklyn
  - Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, New York
  - Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, New York
  - Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, New York
- Jacquelyn L. Meyers
  - VA New York Harbor Healthcare System, Brooklyn
  - Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, New York
  - Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, New York
  - Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, New York
- Roseann E. Peterson
  - VA New York Harbor Healthcare System, Brooklyn
  - Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, New York
  - Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, New York
- Sandra Sanchez-Roige
  - Department of Psychiatry, University of California San Diego, La Jolla
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
- Travis T. Mallard
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston
  - Department of Psychiatry, Harvard Medical School, Boston, Massachusetts
- Danielle M. Dick
  - Department of Psychiatry, Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey
  - Rutgers Addiction Research Center, Rutgers University, Piscataway, New Jersey
- K. Paige Harden
  - Department of Psychology, University of Texas at Austin, Austin
  - Population Research Center, University of Texas at Austin, Austin
- Anna Wilkinson
  - Michael E. DeBakey VA Medical Center, Houston, Texas
  - The University of Texas Health Science Center at Houston, UTHealth Houston School of Public Health, Houston
  - Michael and Susan Dell Center for Healthy Living, The University of Texas Health Science Center at Houston, Houston
- David P. Graham
  - Michael E. DeBakey VA Medical Center, Houston, Texas
  - Departments of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, Texas
- David A. Nielsen
  - Michael E. DeBakey VA Medical Center, Houston, Texas
  - Departments of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, Texas
- Alan C. Swann
  - Michael E. DeBakey VA Medical Center, Houston, Texas
  - Departments of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, Texas
- Rachele K. Lipsky
  - Michael E. DeBakey VA Medical Center, Houston, Texas
  - Departments of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, Texas
- Thomas R. Kosten
  - Michael E. DeBakey VA Medical Center, Houston, Texas
  - Departments of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, Texas
- Mihaela Aslan
  - Clinical Epidemiology Research Center, VA Connecticut Healthcare System, West Haven, Connecticut
  - Yale University School of Medicine, New Haven, Connecticut
- Philip D. Harvey
  - Research Service, Bruce W. Carter Miami Veterans Affairs Medical Center, Miami, Florida
  - University of Miami Miller School of Medicine, Miami, Florida
- Nathan A. Kimbrel
  - Durham VA Health Care System, Durham, North Carolina
  - VA Mid-Atlantic Mental Illness Research, Education and Clinical Center, Durham, North Carolina
  - Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina
- Jean C. Beckham
  - Durham VA Health Care System, Durham, North Carolina
  - VA Mid-Atlantic Mental Illness Research, Education and Clinical Center, Durham, North Carolina
  - Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, North Carolina
|
48
|
Aw AJ, McRae J, Rahmani E, Song YS. Highly parameterized polygenic scores tend to overfit to population stratification via random effects. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.27.577589. [PMID: 38352303 PMCID: PMC10862757 DOI: 10.1101/2024.01.27.577589] [Indexed: 02/22/2024]
Abstract
Polygenic scores (PGSs), increasingly used in clinical settings, frequently include many genetic variants, with performance typically peaking at thousands of variants. Such highly parameterized PGSs often include variants that do not pass a genome-wide significance threshold. We propose a mathematical perspective that renders the effects of many of these non-significant variants random rather than causal, with the randomness capturing population structure. We devise methods to assess variant effect randomness and population stratification bias. Applying these methods to 141 traits from the UK Biobank, we find that, for many PGSs, the effects of non-significant variants are considerably random, with the extent of randomness associated with the degree of overfitting to population structure of the discovery cohort. Our findings explain why highly parameterized PGSs simultaneously have superior cohort-specific performance and limited generalizability, suggesting the critical need for variant randomness tests in PGS evaluation. Supporting code and a dashboard are available at https://github.com/songlab-cal/StratPGS.
Affiliation(s)
- Alan J. Aw
  - Department of Statistics, University of California, Berkeley
  - Center for Computational Biology, University of California, Berkeley
  - Artificial Intelligence Laboratory, Illumina Inc
- Jeremy McRae
  - Artificial Intelligence Laboratory, Illumina Inc
- Elior Rahmani
  - Department of Computational Medicine, University of California, Los Angeles
- Yun S. Song
  - Department of Statistics, University of California, Berkeley
  - Center for Computational Biology, University of California, Berkeley
  - Computer Science Division, University of California, Berkeley
|
49
|
DePaolo J, Bornstein M, Judy R, Abramowitz S, Verma SS, Levin MG, Arany Z, Damrauer SM. Titin-Truncating variants Predispose to Dilated Cardiomyopathy in Diverse Populations. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.01.17.24301405. [PMID: 38293092 PMCID: PMC10827233 DOI: 10.1101/2024.01.17.24301405] [Indexed: 02/01/2024]
Abstract
Importance: The effect of high percentage spliced in (hiPSI) TTN truncating variants (TTNtvs) on risk of dilated cardiomyopathy (DCM) has historically been studied among population subgroups defined by genetic similarity to European reference populations. This has raised questions about the effect of TTNtvs in diverse populations, especially among individuals genetically similar to African reference populations.
Objective: To determine the effect of TTNtvs on risk of DCM in diverse populations as measured by genetic distance (GD) in principal component (PC) space.
Design: Cohort study.
Setting: Penn Medicine Biobank (PMBB) is a large, diverse biobank.
Participants: Participants were recruited from across the Penn Medicine healthcare system and volunteered to have their electronic health records linked to biospecimen data, including DNA that has undergone whole exome sequencing.
Main Outcomes and Measures: Risk of DCM among individuals carrying a hiPSI TTNtv.
Results: Carrying a hiPSI TTNtv was associated with DCM among PMBB participants across a range of GD deciles from the 1000G European centroid; the effect estimates ranged from odds ratio (OR) = 3.29 (95% confidence interval [CI] 1.26 to 8.56) to OR = 9.39 (95% CI 3.82 to 23.13). When individuals were assigned to population subgroups based on genetic similarity to the 1000G reference populations, hiPSI TTNtvs conferred significant risk of DCM among those genetically similar to the 1000G European reference population (OR = 7.55, 95% CI 4.99 to 11.42, P<0.001) and individuals genetically similar to the 1000G African reference population (OR = 3.50, 95% CI 1.48 to 8.24, P=0.004).
Conclusions and Relevance: TTNtvs are associated with increased risk of DCM among a diverse cohort. There is no significant difference in effect of TTNtvs on DCM risk across deciles of GD from the 1000G European centroid, suggesting genetic background should not be considered when screening individuals for titin-related DCM.
Affiliation(s)
- John DePaolo
  - Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Marc Bornstein
  - Cardiovascular Institute, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
- Renae Judy
  - Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sarah Abramowitz
  - Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Shefali S Verma
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
- Michael G Levin
  - Cardiovascular Institute, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
  - Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA 19104, USA
- Zoltan Arany
  - Cardiovascular Institute, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
- Scott M Damrauer
  - Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Cardiovascular Institute, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
  - Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA 19104, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
|
50
|
Janivara R, Hazra U, Pfennig A, Harlemon M, Kim MS, Eaaswarkhanth M, Chen WC, Ogunbiyi A, Kachambwa P, Petersen LN, Jalloh M, Mensah JE, Adjei AA, Adusei B, Joffe M, Gueye SM, Aisuodionoe-Shadrach OI, Fernandez PW, Rohan TE, Andrews C, Rebbeck TR, Adebiyi AO, Agalliu I, Lachance J. Uncovering the genetic architecture and evolutionary roots of androgenetic alopecia in African men. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.12.575396. [PMID: 38293167 PMCID: PMC10827056 DOI: 10.1101/2024.01.12.575396] [Indexed: 02/01/2024]
Abstract
Androgenetic alopecia is a highly heritable trait. However, much of our understanding about the genetics of male pattern baldness comes from individuals of European descent. Here, we examined a novel dataset comprising 2,136 men from Ghana, Nigeria, Senegal, and South Africa who were genotyped using a custom array. We first tested how genetic predictions of baldness generalize from Europe to Africa, finding that polygenic scores from European GWAS yielded AUC statistics that ranged from 0.513 to 0.546, indicating that genetic predictions of baldness in African populations performed notably worse than in European populations. Subsequently, we conducted the first African GWAS of androgenetic alopecia, focusing on self-reported baldness patterns at age 45. After correcting for present age, population structure, and study site, we identified 266 moderately significant associations, 51 of which were independent (p-value < 10^-5, r^2 < 0.2). Most baldness associations were autosomal, and the X chromosome does not appear to have a large impact on baldness in African men. Finally, we examined the evolutionary causes of continental differences in genetic architecture. Although Neanderthal alleles have previously been associated with skin and hair phenotypes, we did not find evidence that European-ascertained baldness hits were enriched for signatures of ancient introgression. Most loci that are associated with androgenetic alopecia are evolving neutrally. However, multiple baldness-associated SNPs near the EDA2R and AR genes have large allele frequency differences between continents. Collectively, our findings illustrate how evolutionary history contributes to the limited portability of genetic predictions across ancestries.
Affiliation(s)
- Rohini Janivara
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Ujani Hazra
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Aaron Pfennig
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Maxine Harlemon
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
  - Department of Biology, Morgan State University, Baltimore, Maryland, USA
- Michelle S Kim
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
- Wenlong C Chen
  - Strengthening Oncology Services Research Unit, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
  - Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
  - National Cancer Registry, National Institute for Communicable Diseases, a Division of the National Health Laboratory Service, Johannesburg, South Africa
- Paidamoyo Kachambwa
  - Centre for Proteomic and Genomic Research, Cape Town, South Africa
  - Mediclinic Precise Southern Africa, Cape Town, South Africa
- Lindsay N Petersen
  - Centre for Proteomic and Genomic Research, Cape Town, South Africa
  - Mediclinic Precise Southern Africa, Cape Town, South Africa
- Mohamed Jalloh
  - Université Cheikh Anta Diop de Dakar, Dakar, Senegal
  - Université Iba Der Thiam de Thiès, Thiès, Senegal
- James E Mensah
  - Korle-Bu Teaching Hospital and University of Ghana Medical School, Accra, Ghana
- Andrew A Adjei
  - Department of Pathology, University of Ghana Medical School, Accra, Ghana
- Maureen Joffe
  - Strengthening Oncology Services Research Unit, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Oseremen I Aisuodionoe-Shadrach
  - College of Health Sciences, University of Abuja, University of Abuja Teaching Hospital and Cancer Science Centre, Abuja, Nigeria
- Pedro W Fernandez
  - Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Thomas E Rohan
  - Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
- Timothy R Rebbeck
  - Dana-Farber Cancer Institute, Boston, Massachusetts, USA
  - Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Ilir Agalliu
  - Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
- Joseph Lachance
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
|