1
|
Chen X, Zhu Y, Zhong P, Xie G. Multi-trait GWAS identifies pleiotropic loci associated with colorectal cancer in East Asian populations. Front Genet 2025; 16:1590652. [PMID: 40303978 PMCID: PMC12037559 DOI: 10.3389/fgene.2025.1590652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2025] [Accepted: 03/20/2025] [Indexed: 05/02/2025] Open
Abstract
Introduction While genome-wide association studies (GWAS) have identified numerous susceptibility loci in colorectal cancer (CRC), most findings are based on European populations. Additionally, CRC shares genetic architecture with other traits, and multi-trait analysis can improve the discovery of pleiotropic loci. Materials and methods We conducted a multi-trait GWAS using the Multi-Trait Analysis of GWAS (MTAG) framework, leveraging large-scale genomic and phenotypic data from BioBank Japan (BBJ). We also examined genetic correlations between CRC and 70 complex traits, followed by local genetic correlation analysis and enrichment of heritability partitioned by chromatin states and tissue types. Results We identified 25 genome-wide significant loci for CRC and colon polyps, including three novel loci in East Asian populations: BET1L (rs12226698, 11p15.5), OAS1 (rs2525858, 12q24.13), and BMP2 (rs4813802, 20p12.3). While BMP2 had been previously reported in European CRC studies, BET1L and OAS1 represent novel associations in East Asians. Colocalization analysis confirmed strong shared association signals between BET1L and OAS1 in CRC and colon polyps, supporting their pleiotropic effects in colorectal neoplasia. BET1L was further identified in the multi-trait analysis of CRC and myocardial infarction. Similarly, OAS1 was significantly associated with CRC and angina pectoris. Functional annotation revealed that these loci serve as expression quantitative trait loci (eQTLs) in colorectal tissues and immune-related pathways. Conclusion Our study identifies novel pleiotropic loci associated with CRC in East Asians, emphasizing the importance of population-specific genetic studies. The findings provide new insights into the genetic architecture of CRC and its shared pathways with other complex diseases.
Affiliation(s)
- Xiqi Chen
- Department of Emergency Surgery, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, Shandong, China
| | - Yong Zhu
- Department of Emergency Surgery, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, Shandong, China
| | - Peng Zhong
- Department of Cardiology, Jining No.1 People’s Hospital, Jining, Shandong, China
| | - Guangdong Xie
- Department of Emergency Surgery, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, Shandong, China
|
2
|
Verras GI, Hamady ZZ, Collins A, Tapper W. Utility of Polygenic Risk Scores (PRSs) in Predicting Pancreatic Cancer: A Systematic Review and Meta-Analysis of Common-Variant and Mixed Scores with Insights into Rare Variant Analysis. Cancers (Basel) 2025; 17:241. [PMID: 39858023 PMCID: PMC11764467 DOI: 10.3390/cancers17020241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2024] [Revised: 01/03/2025] [Accepted: 01/08/2025] [Indexed: 01/27/2025] Open
Abstract
Pancreatic adenocarcinoma is the most common histological subtype of pancreatic cancer, representing approximately 85% of all cases [...].
Affiliation(s)
- Georgios Ioannis Verras
- School of Human Development and Health, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, UK; (Z.Z.H.); (A.C.)
- Department of General Surgery, University Hospital Southampton, Southampton SO16 6YD, UK
| | - Zaed Z. Hamady
- School of Human Development and Health, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, UK; (Z.Z.H.); (A.C.)
- Department of General Surgery, University Hospital Southampton, Southampton SO16 6YD, UK
| | - Andrew Collins
- School of Human Development and Health, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, UK; (Z.Z.H.); (A.C.)
| | - William Tapper
- School of Human Development and Health, Faculty of Medicine, University of Southampton, Southampton SO16 6YD, UK; (Z.Z.H.); (A.C.)
|
3
|
Ziyatdinov A, Mbatchou J, Marcketta A, Backman J, Gaynor S, Zou Y, Joseph T, Geraghty B, Herman J, Watanabe K, Ghosh A, Kosmicki J, Locke A, Thornton T, Kang HM, Ferreira M, Baras A, Abecasis G, Marchini J. Joint testing of rare variant burden scores using non-negative least squares. Am J Hum Genet 2024; 111:2139-2149. [PMID: 39366334 PMCID: PMC11480795 DOI: 10.1016/j.ajhg.2024.08.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 08/23/2024] [Accepted: 08/27/2024] [Indexed: 10/06/2024] Open
Abstract
Gene-based burden tests are a popular and powerful approach for analysis of exome-wide association studies. These approaches combine sets of variants within a gene into a single burden score that is then tested for association. Typically, a range of burden scores are calculated and tested across a range of annotation classes and frequency bins. Correlation between these tests can complicate the multiple testing correction and hamper interpretation of the results. We introduce a method called the sparse burden association test (SBAT) that tests the joint set of burden scores under the assumption that causal burden scores act in the same effect direction. The method simultaneously assesses the significance of the model fit and selects the set of burden scores that best explain the association. Using simulated data, we show that the method is well calibrated and highlight scenarios where the test outperforms existing gene-based tests. We apply the method to 73 quantitative traits from the UK Biobank, showing that SBAT is a valuable additional gene-based test when combined with other existing approaches. This test is implemented in the REGENIE software.
Affiliation(s)
| | | | | | | | | | - Yuxin Zou
- Regeneron Genetics Center, Tarrytown, NY, USA
| | | | | | | | | | | | | | - Adam Locke
- Regeneron Genetics Center, Tarrytown, NY, USA
| | | | | | | | - Aris Baras
- Regeneron Genetics Center, Tarrytown, NY, USA
|
4
|
Auvergne A, Traut N, Henches L, Troubat L, Frouin A, Boetto C, Kazem S, Julienne H, Toro R, Aschard H. Multitrait Analysis to Decipher the Intertwined Genetic Architecture of Neuroanatomical Phenotypes and Psychiatric Disorders. BIOLOGICAL PSYCHIATRY. COGNITIVE NEUROSCIENCE AND NEUROIMAGING 2024:S2451-9022(24)00266-0. [PMID: 39260564 DOI: 10.1016/j.bpsc.2024.08.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Revised: 06/28/2024] [Accepted: 08/12/2024] [Indexed: 09/13/2024]
Abstract
BACKGROUND There is increasing evidence of shared genetic factors between psychiatric disorders and brain magnetic resonance imaging (MRI) phenotypes. However, deciphering the joint genetic architecture of these outcomes has proven to be challenging, and new approaches are needed to infer the genetic structures that may underlie those phenotypes. Multivariate analyses are a meaningful approach to reveal links between MRI phenotypes and psychiatric disorders missed by univariate approaches. METHODS First, we conducted univariate and multivariate genome-wide association studies for 9 MRI-derived brain volume phenotypes in 20,000 UK Biobank participants. Next, we performed various complementary enrichment analyses to assess whether and how univariate and multitrait approaches could distinguish disorder-associated and non-disorder-associated variants from 6 psychiatric disorders: bipolar disorder, attention-deficit/hyperactivity disorder, autism, schizophrenia, obsessive-compulsive disorder, and major depressive disorder. Finally, we conducted a clustering analysis of top associated variants based on their MRI multitrait association using an optimized k-medoids approach. RESULTS A univariate MRI genome-wide association study revealed only negligible genetic correlations with psychiatric disorders, while a multitrait genome-wide association study identified multiple new associations and showed significant enrichment for variants related to both attention-deficit/hyperactivity disorder and schizophrenia. Clustering analyses also detected 2 clusters that showed not only enrichment for association with attention-deficit/hyperactivity disorder and schizophrenia but also a consistent direction of effects. Functional annotation analyses of those clusters pointed to multiple potential mechanisms, suggesting in particular a role of neurotrophin pathways in both MRI phenotypes and schizophrenia. 
CONCLUSIONS Our results show that multitrait association signature can be used to infer genetically driven latent MRI variables associated with psychiatric disorders, thereby opening paths for future biomarker development.
Affiliation(s)
- Antoine Auvergne
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France.
| | - Nicolas Traut
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Léo Henches
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Lucie Troubat
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Arthur Frouin
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Christophe Boetto
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Sayeh Kazem
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Hanna Julienne
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Roberto Toro
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
| | - Hugues Aschard
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts.
|
5
|
Kontou PI, Bagos PG. The goldmine of GWAS summary statistics: a systematic review of methods and tools. BioData Min 2024; 17:31. [PMID: 39238044 PMCID: PMC11375927 DOI: 10.1186/s13040-024-00385-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 08/27/2024] [Indexed: 09/07/2024] Open
Abstract
Genome-wide association studies (GWAS) have revolutionized our understanding of the genetic architecture of complex traits and diseases. GWAS summary statistics have become essential tools for various genetic analyses, including meta-analysis, fine-mapping, and risk prediction. However, the increasing number of GWAS summary statistics and the diversity of software tools available for their analysis can make it challenging for researchers to select the most appropriate tools for their specific needs. This systematic review aims to provide a comprehensive overview of the currently available software tools and databases for GWAS summary statistics analysis. We conducted a comprehensive literature search to identify relevant software tools and databases. We categorized the tools and databases by their functionality, including data management, quality control, single-trait analysis, and multiple-trait analysis. We also compared the tools and databases based on their features, limitations, and user-friendliness. Our review identified a total of 305 functioning software tools and databases dedicated to GWAS summary statistics, each with unique strengths and limitations. We provide descriptions of the key features of each tool and database, including their input/output formats, data types, and computational requirements. We also discuss the overall usability and applicability of each tool for different research scenarios. This comprehensive review will serve as a valuable resource for researchers who are interested in using GWAS summary statistics to investigate the genetic basis of complex traits and diseases. By providing a detailed overview of the available tools and databases, we aim to facilitate informed tool selection and maximize the effectiveness of GWAS summary statistics analysis.
Affiliation(s)
| | - Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, 35131, Lamia, Greece.
|
6
|
Suzuki Y, Ménager H, Brancotte B, Vernet R, Nerin C, Boetto C, Auvergne A, Linhard C, Torchet R, Lechat P, Troubat L, Cho MH, Bouzigon E, Aschard H, Julienne H. Trait selection strategy in multi-trait GWAS: Boosting SNP discoverability. HGG ADVANCES 2024; 5:100319. [PMID: 38872309 PMCID: PMC11260573 DOI: 10.1016/j.xhgg.2024.100319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 06/11/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024] Open
Abstract
Since the first genome-wide association studies (GWASs), thousands of variant-trait associations have been discovered. However, comprehensively mapping the genetic determinant of complex traits through univariate testing can require prohibitive sample sizes. Multi-trait GWAS can circumvent this issue and improve statistical power by leveraging the joint genetic architecture of human phenotypes. Although many methodological hurdles of multi-trait testing have been solved, the strategy to select traits has been overlooked. In this study, we conducted multi-trait GWAS on approximately 20,000 combinations of 72 traits using an omnibus test as implemented in the Joint Analysis of Summary Statistics. We assessed which genetic features of the sets of traits analyzed were associated with an increased detection of variants compared with univariate screening. Several features of the set of traits, including the heritability, the number of traits, and the genetic correlation, drive the multi-trait test gain. Using these features jointly in predictive models captures a large fraction of the power gain of the multi-trait test (Pearson's r between the observed and predicted gain equals 0.43, p < 1.6 × 10⁻⁶⁰). Applying an alternative multi-trait approach (Multi-Trait Analysis of GWAS), we identified similar features of interest, but with an overall 70% lower number of new associations. Finally, selecting sets based on our data-driven models systematically outperformed the common strategy of selecting clinically similar traits. This work provides a unique picture of the determinant of multi-trait GWAS statistical power and outlines practical strategies for multi-trait testing.
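Editorial illustration (not part of the abstract): the abstract mentions an omnibus test on summary statistics. A minimal sketch of one common form of such a test is given below, restricted to two traits so that it needs only the Python standard library. It assumes the statistic is the quadratic form z^T Σ^{-1} z, compared against a chi-square distribution with 2 degrees of freedom; whether this matches the cited software in every detail is an assumption. The function name `omnibus_two_traits` and the null correlation `rho` between z-scores are hypothetical names introduced here.

```python
import math


def omnibus_two_traits(z1: float, z2: float, rho: float) -> tuple[float, float]:
    """Omnibus statistic and p-value for one SNP tested on two traits.

    z1, z2: univariate association z-scores for the two traits.
    rho:    correlation between the two z-scores under the null
            (assumed known here; in practice it must be estimated).
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Quadratic form z^T Sigma^{-1} z for the 2x2 matrix Sigma = [[1, rho], [rho, 1]].
    stat = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    # Survival function of a chi-square with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Two modest univariate signals (|z| = 3) with opposite signs on positively
# correlated traits combine into a much stronger joint signal.
stat, p = omnibus_two_traits(3.0, -3.0, 0.5)
print(f"statistic = {stat:.1f}, p = {p:.2e}")
```

In this toy case neither trait alone would reach the conventional genome-wide threshold of 5 × 10⁻⁸, while the joint statistic does, which is the kind of power gain the abstract refers to.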
Affiliation(s)
- Yuka Suzuki
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, 75015 Paris, France.
| | - Hervé Ménager
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, 75015 Paris, France
| | - Bryan Brancotte
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, 75015 Paris, France
| | - Raphaël Vernet
- Université Paris Cité, Institut National de la Santé et de la Recherche Médicale (INSERM), UMR-1124, Group of Genomic Epidemiology of Multifactorial Diseases, Paris, France
| | - Cyril Nerin
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, 75015 Paris, France
| | - Christophe Boetto
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, 75015 Paris, France
| | - Antoine Auvergne
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, 75015 Paris, France
| | - Christophe Linhard
- Université Paris Cité, Institut National de la Santé et de la Recherche Médicale (INSERM), UMR-1124, Group of Genomic Epidemiology of Multifactorial Diseases, Paris, France
| | - Rachel Torchet
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, 75015 Paris, France
| | - Pierre Lechat
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, 75015 Paris, France
| | - Lucie Troubat
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, 75015 Paris, France
| | - Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, 181 Longwood Avenue, Boston, MA 02115, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Emmanuelle Bouzigon
- Université Paris Cité, Institut National de la Santé et de la Recherche Médicale (INSERM), UMR-1124, Group of Genomic Epidemiology of Multifactorial Diseases, Paris, France
| | - Hugues Aschard
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, 75015 Paris, France.
| | - Hanna Julienne
- Institut Pasteur, Université Paris Cité, Department of Computational Biology, 75015 Paris, France; Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, 75015 Paris, France.
|
7
|
Cai Z, Iso-Touru T, Sanchez MP, Kadri N, Bouwman AC, Chitneedi PK, MacLeod IM, Vander Jagt CJ, Chamberlain AJ, Gredler-Grandl B, Spengeler M, Lund MS, Boichard D, Kühn C, Pausch H, Vilkki J, Sahana G. Meta-analysis of six dairy cattle breeds reveals biologically relevant candidate genes for mastitis resistance. Genet Sel Evol 2024; 56:54. [PMID: 39009986 PMCID: PMC11247842 DOI: 10.1186/s12711-024-00920-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Accepted: 06/26/2024] [Indexed: 07/17/2024] Open
Abstract
BACKGROUND Mastitis is a disease that incurs significant costs in the dairy industry. A promising approach to mitigate its negative effects is to genetically improve the resistance of dairy cattle to mastitis. A meta-analysis of genome-wide association studies (GWAS) across multiple breeds for clinical mastitis (CM) and its indicator trait, somatic cell score (SCS), is a powerful method to identify functional genetic variants that impact mastitis resistance. RESULTS We conducted meta-analyses of eight and fourteen GWAS on CM and SCS, respectively, using 30,689 and 119,438 animals from six dairy cattle breeds. Methods for the meta-analyses were selected to properly account for the multi-breed structure of the GWAS data. Our study revealed 58 lead markers that were associated with mastitis incidence, including 16 loci that did not overlap with previously identified quantitative trait loci (QTL), as curated at the Animal QTLdb. Post-GWAS analysis techniques such as gene-based analysis and genomic feature enrichment analysis enabled prioritization of 31 candidate genes and 14 credible candidate causal variants that affect mastitis. CONCLUSIONS Our list of candidate genes can help to elucidate the genetic architecture underlying mastitis resistance and provide better tools for the prevention or treatment of mastitis, ultimately contributing to more sustainable animal production.
Affiliation(s)
- Zexi Cai
- Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark.
| | - Terhi Iso-Touru
- Natural Resources Institute Finland (Luke), 31600, Jokioinen, Finland
| | - Marie-Pierre Sanchez
- Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
| | - Naveen Kadri
- Animal Genomics, ETH Zurich, 8092, Zurich, Switzerland
| | - Aniek C Bouwman
- Wageningen University and Research, Animal Breeding and Genomics, P.O. Box 338, 6700, AH, Wageningen, The Netherlands
| | - Praveen Krishna Chitneedi
- Institute of Genome Biology, Research Institute for Farm Animal Biology (FBN), 18196, Dummerstorf, Germany
| | - Iona M MacLeod
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
| | | | - Amanda J Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, VIC, Australia
| | - Birgit Gredler-Grandl
- Wageningen University and Research, Animal Breeding and Genomics, P.O. Box 338, 6700, AH, Wageningen, The Netherlands
| | | | - Mogens Sandø Lund
- Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark
| | - Didier Boichard
- Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
| | - Christa Kühn
- Institute of Genome Biology, Research Institute for Farm Animal Biology (FBN), 18196, Dummerstorf, Germany
- Agricultural and Environmental Faculty, University Rostock, 18059, Rostock, Germany
| | - Hubert Pausch
- Animal Genomics, ETH Zurich, 8092, Zurich, Switzerland
| | - Johanna Vilkki
- Natural Resources Institute Finland (Luke), 31600, Jokioinen, Finland
| | - Goutam Sahana
- Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark
|
8
|
Boetto C, Frouin A, Henches L, Auvergne A, Suzuki Y, Patin E, Bredon M, Chiu A, Consortium MI, Sankararaman S, Zaitlen N, Kennedy SP, Quintana-Murci L, Duffy D, Sokol H, Aschard H. MANOCCA: a robust and computationally efficient test of covariance in high-dimension multivariate omics data. Brief Bioinform 2024; 25:bbae272. [PMID: 38856173 PMCID: PMC11163461 DOI: 10.1093/bib/bbae272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 04/16/2024] [Accepted: 05/28/2024] [Indexed: 06/11/2024] Open
Abstract
Multivariate analysis is becoming central in studies investigating high-throughput molecular data, yet, some important features of these data are seldom explored. Here, we present MANOCCA (Multivariate Analysis of Conditional CovAriance), a powerful method to test for the effect of a predictor on the covariance matrix of a multivariate outcome. The proposed test is by construction orthogonal to tests based on the mean and variance and is able to capture effects that are missed by both approaches. We first compare the performances of MANOCCA with existing correlation-based methods and show that MANOCCA is the only test correctly calibrated in simulation mimicking omics data. We then investigate the impact of reducing the dimensionality of the data using principal component analysis when the sample size is smaller than the number of pairwise covariance terms analysed. We show that, in many realistic scenarios, the maximum power can be achieved with a limited number of components. Finally, we apply MANOCCA to 1000 healthy individuals from the Milieu Interieur cohort, to assess the effect of health, lifestyle and genetic factors on the covariance of two sets of phenotypes, blood biomarkers and flow cytometry-based immune phenotypes. Our analyses identify significant associations between multiple factors and the covariance of both omics data.
Affiliation(s)
- Christophe Boetto
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
| | - Arthur Frouin
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
| | - Léo Henches
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
| | - Antoine Auvergne
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
| | - Yuka Suzuki
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
| | - Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR2000, 25-28 rue Dr Roux, 75015 Paris, France
| | - Marius Bredon
- Sorbonne Université, INSERM, Centre de recherche Saint-Antoine, CRSA, Microbiota, Gut and Inflammation Laboratory, Hôpital Saint-Antoine (UMR S938) Sorbonne Université, 27 rue Chaligny, 75012 Paris, France
| | - Alec Chiu
- Department of Human Genetics, University California Los Angeles, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA 90095-7088, United States
| | | | - Sriram Sankararaman
- Department of Human Genetics, University California Los Angeles, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA 90095-7088, United States
| | - Noah Zaitlen
- Department of Human Genetics, University California Los Angeles, 695 Charles E. Young Drive South, Box 708822, Los Angeles, CA 90095-7088, United States
| | - Sean P Kennedy
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
| | - Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, Université Paris Cité, CNRS UMR2000, 25-28 rue Dr Roux, 75015 Paris, France
- Chair of Human Genomics and Evolution, Collège de France, 11 Pl. Marcelin Berthelot, 75005 Paris, France
| | - Darragh Duffy
- Translational Immunology Unit, Institut Pasteur, Université de Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
| | - Harry Sokol
- Sorbonne Université, INSERM, Centre de recherche Saint-Antoine, CRSA, Microbiota, Gut and Inflammation Laboratory, Hôpital Saint-Antoine (UMR S938) Sorbonne Université, 27 rue Chaligny, 75012 Paris, France
- Paris Center for Microbiome Medicine, Fédération Hospitalo-Universitaire, 184 rue du Faubourg Saint-Antoine, 75571 PARIS Cedex 12, France
- Gastroenterology Department, AP-HP, Saint Antoine Hospital, 184 rue du faubourg Saint-Antoine, 75012 Paris, France
- INRAE Micalis & AgroParisTech, UMR1319, Micalis & AgroParisTech, 4 avenue Jean Jaurès, 78352 Jouy en Josas, France
| | - Hugues Aschard
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, 25-28 rue du Dr Roux, 75015 Paris, France
- Department of Epidemiology, Harvard TH Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, United States
|
9
|
Troubat L, Fettahoglu D, Henches L, Aschard H, Julienne H. Multi-trait GWAS for diverse ancestries: mapping the knowledge gap. BMC Genomics 2024; 25:375. [PMID: 38627641 PMCID: PMC11022331 DOI: 10.1186/s12864-024-10293-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 04/09/2024] [Indexed: 04/19/2024] Open
Abstract
BACKGROUND Approximately 95% of samples analyzed in univariate genome-wide association studies (GWAS) are of European ancestry. This bias toward European ancestry populations in association screening also exists for other analyses and methods that are often developed and tested on European ancestry only. However, existing data in non-European populations, which are often of modest sample size, could benefit from innovative approaches as recently illustrated in the context of polygenic risk scores. METHODS Here, we extend and assess the potential limitations and gains of our multi-trait GWAS pipeline, JASS (Joint Analysis of Summary Statistics), for the analysis of non-European ancestries. To this end, we conducted the joint GWAS of 19 hematological traits and glycemic traits across five ancestries (European (EUR), admixed American (AMR), African (AFR), East Asian (EAS), and South Asian (SAS)). RESULTS We detected 367 new genome-wide significant associations in non-European populations (15 in Admixed American (AMR), 72 in African (AFR) and 280 in East Asian (EAS)). New associations detected represent 5%, 17% and 13% of associations in the AFR, AMR and EAS populations, respectively. Overall, multi-trait testing increases the replication of European associated loci in non-European ancestry by 15%. Pleiotropic effects were highly similar at significant loci across ancestries (e.g. the mean correlation between multi-trait genetic effects of EUR and EAS ancestries was 0.88). For hematological traits, strong discrepancies in multi-trait genetic effects are tied to known evolutionary divergences: the ACKR1 locus, which is adaptive in overcoming P. vivax-induced malaria. CONCLUSIONS Multi-trait GWAS can be a valuable tool to narrow the genetic knowledge gap between European and non-European populations.
Affiliation(s)
- Lucie Troubat
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
| | - Deniz Fettahoglu
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
| | - Léo Henches
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
| | - Hugues Aschard
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Hanna Julienne
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, F-75015, France.
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, F-75015, France.
|
10
|
Buchardt AS, Zhou X, Ekstrøm CT. Joint regression analysis of multiple traits based on genetic relationships. BIOINFORMATICS ADVANCES 2024; 4:vbad192. [PMID: 38264461 PMCID: PMC10805347 DOI: 10.1093/bioadv/vbad192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Revised: 12/11/2023] [Accepted: 12/31/2023] [Indexed: 01/25/2024]
Abstract
Motivation Polygenic scores (PGSs) are widely available and employed in genomic data analyses for predicting and understanding genetic architectures. Existing approaches either require information on SNP level, do not infer clusters of traits sharing genetic characteristics, or do not have any immediate predictive properties. Results Here, we present geneJAM, which is a novel clustering and estimation method using PGSs for inferring a genetic relationship among multiple, simultaneously measured and potentially correlated traits in a multivariate GWAS. Using graphical lasso, we estimate a sparse covariance matrix of the PGSs and obtain clusters of traits sharing genetic characteristics. We use the clusters to specify the structure of the error covariance matrix of a generalized least squares (GLS) model and use the feasible GLS estimator for estimating a linear regression model with a certain unknown degree of correlation between the residuals. The method suits many biology studies well with traits embedded in some genetic functioning groups and facilitates development of the PGS research. We compare the method with fully parametric techniques on simulated data and illustrate the utility of the methods by examining a heterogeneous stock mouse data set from the Wellcome Trust Centre for Human Genetics. We demonstrate that the method successfully identifies clusters of traits and increases precision, power, and computational efficiency. Availability and implementation GeneJAM is implemented in R and available at: https://github.com/abuchardt/geneJAM.
Affiliation(s)
- Ann-Sophie Buchardt
  - Department of Public Health, University of Copenhagen, 1014 Copenhagen, Denmark
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, United States
- Claus Thorn Ekstrøm
  - Department of Public Health, University of Copenhagen, 1014 Copenhagen, Denmark
|
11
|
Suzuki Y, Ménager H, Brancotte B, Vernet R, Nerin C, Boetto C, Auvergne A, Linhard C, Torchet R, Lechat P, Troubat L, Cho MH, Bouzigon E, Aschard H, Julienne H. Trait selection strategy in multi-trait GWAS: Boosting SNPs discoverability. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.27.564319. [PMID: 37961722 PMCID: PMC10634875 DOI: 10.1101/2023.10.27.564319] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/15/2023]
Abstract
Since the first Genome-Wide Association Studies (GWAS), thousands of variant-trait associations have been discovered. However, the sample size required to detect additional variants using standard univariate association screening is increasingly prohibitive. Multi-trait GWAS offers a relevant alternative: it can improve statistical power and lead to new insights about gene function and the joint genetic architecture of human phenotypes. Although many methodological hurdles of multi-trait testing have been discussed, the strategy for selecting traits, among overwhelming possibilities, has been overlooked. In this study, we conducted extensive multi-trait tests using JASS (Joint Analysis of Summary Statistics) and assessed which genetic features of the analysed sets were associated with an increased detection of variants as compared to univariate screening. Our analyses identified multiple factors associated with the gain in association detection in multi-trait tests. Together, these factors are predictive of the gain of the multi-trait test (Pearson's ρ equal to 0.43 between the observed and predicted gain, P < 1.6 × 10⁻⁶⁰). Applying an alternative multi-trait approach (MTAG, multi-trait analysis of GWAS), we found that in most scenarios, but particularly those with larger numbers of traits, JASS outperformed MTAG. Finally, we benchmarked several strategies for selecting sets of traits, including the prevalent strategy of selecting clinically similar traits, which systematically underperformed selecting clinically heterogeneous traits or selecting sets derived from our data-driven models. This work provides a unique picture of the determinants of multi-trait GWAS statistical power and outlines practical strategies for multi-trait testing.
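To make the idea of a power gain over univariate screening concrete, here is a minimal sketch of a two-trait omnibus test on summary-statistic Z-scores. The abstract does not say which JASS test was run, so the omnibus statistic is used here only as one standard multi-trait test; the function names, the Z-scores, and the null correlation `rho` are all hypothetical.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional GWAS significance threshold


def omnibus_two_traits(z1, z2, rho):
    """Omnibus statistic z' R^{-1} z for two traits, where R = [[1, rho], [rho, 1]]
    is the correlation of the two Z-scores under the null (assumed known)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    stat = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    # A chi-square variable with 2 degrees of freedom has survival function exp(-x/2)
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


def univariate_p(z):
    """Two-sided p-value for a single Z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical variant: opposite-sign effects on two positively correlated traits
stat, p_joint = omnibus_two_traits(4.5, -4.5, rho=0.5)
p_single = univariate_p(4.5)
joint_only = p_joint < GENOME_WIDE_ALPHA and p_single >= GENOME_WIDE_ALPHA
```

In this constructed case neither trait reaches the genome-wide threshold on its own, while the joint statistic does, because the observed effect pattern runs against the null correlation.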
Affiliation(s)
- Yuka Suzuki
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, 75015 France
- Hervé Ménager
  - Institut Pasteur, Université Paris Cité, Bioinformatics of Biostatistics Hub, F-75015 Paris, France
- Bryan Brancotte
  - Institut Pasteur, Université Paris Cité, Bioinformatics of Biostatistics Hub, F-75015 Paris, France
- Raphaël Vernet
  - Université Paris Cité, Institut National de la Santé et de la Recherche Médicale (INSERM), UMR-1124, Group of Genomic Epidemiology of Multifactorial Diseases, Paris, France
- Cyril Nerin
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, 75015 France
- Christophe Boetto
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, 75015 France
- Antoine Auvergne
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, 75015 France
- Christophe Linhard
  - Université Paris Cité, Institut National de la Santé et de la Recherche Médicale (INSERM), UMR-1124, Group of Genomic Epidemiology of Multifactorial Diseases, Paris, France
- Rachel Torchet
  - Institut Pasteur, Université Paris Cité, Bioinformatics of Biostatistics Hub, F-75015 Paris, France
- Pierre Lechat
  - Institut Pasteur, Université Paris Cité, Bioinformatics of Biostatistics Hub, F-75015 Paris, France
- Lucie Troubat
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, 75015 France
- Michael H. Cho
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, 181 Longwood Ave, Boston, MA, 02115, USA
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Emmanuelle Bouzigon
  - Université Paris Cité, Institut National de la Santé et de la Recherche Médicale (INSERM), UMR-1124, Group of Genomic Epidemiology of Multifactorial Diseases, Paris, France
- Hugues Aschard
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, 75015 France
- Hanna Julienne
  - Institut Pasteur, Université Paris Cité, Department of Computational Biology, Paris, 75015 France
  - Institut Pasteur, Université Paris Cité, Bioinformatics of Biostatistics Hub, F-75015 Paris, France
|
12
|
Gagliano Taliun SA, Dinsmore IR, Mirshahi T, Chang AR, Paterson AD, Barua M. GWAS for the composite traits of hematuria and albuminuria. Sci Rep 2023; 13:18084. [PMID: 37872228 PMCID: PMC10593773 DOI: 10.1038/s41598-023-45102-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2023] [Accepted: 10/16/2023] [Indexed: 10/25/2023] Open
Abstract
Our GWAS of hematuria in the UK Biobank identified 6 loci, some of which overlap with loci for albuminuria, suggesting pleiotropy. Since clinical syndromes are often defined by combinations of traits, generating a combined phenotype can improve power to detect loci influencing multiple characteristics. Thus, the composite trait of hematuria and albuminuria was chosen to enrich for glomerular pathologies. Cases had both hematuria defined by ICD codes and albuminuria defined as uACR > 3 mg/mmol. Controls had neither an ICD code for hematuria nor a uACR > 3 mg/mmol. A total of 2429 cases and 343,509 controls from the UK Biobank were included. eGFR was lower in cases compared to controls, with the exception of the comparison in females using CKD-EPI after age adjustment. Variants at 4 loci met genome-wide significance with the following nearest genes: COL4A4, TRIM27, ETV1 and CUBN. TRIM27 is part of the extended MHC locus. All loci with the exception of ETV1 were replicated in the Geisinger MyCode cohort. The previous GWAS of hematuria reported COL4A3-COL4A4 variants and HLA-B*0801 within MHC, which is in linkage disequilibrium with the TRIM27 variant (D' = 0.59). TRIM27 is highly expressed in the tubules. Additional loci included a coding sequence variant in CUBN (p.Ala2914Val, MAF = 0.014 (A), p = 3.29E-8, OR = 2.09, 95% CI = 1.61-2.72). Overall, GWAS for the composite trait of hematuria and albuminuria identified 4 loci, 2 of which were not previously identified in a GWAS of hematuria.
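The case and control definitions in this abstract amount to a simple classification rule, sketched below. The function name, the input encoding, and the handling of missing uACR values are assumptions made for illustration; only the 3 mg/mmol threshold and the both/neither logic come from the abstract.

```python
UACR_THRESHOLD = 3.0  # mg/mmol, as stated in the abstract


def composite_status(has_hematuria_icd, uacr_mg_per_mmol):
    """Return 'case', 'control', or None (excluded) for the composite trait.

    has_hematuria_icd: True if the participant has an ICD code for hematuria.
    uacr_mg_per_mmol: urine albumin-to-creatinine ratio, or None if missing
    (treating a missing value as an exclusion is an assumption of this sketch).
    """
    if uacr_mg_per_mmol is None:
        return None
    has_albuminuria = uacr_mg_per_mmol > UACR_THRESHOLD
    if has_hematuria_icd and has_albuminuria:
        return "case"      # both hematuria and albuminuria
    if not has_hematuria_icd and not has_albuminuria:
        return "control"   # neither hematuria nor albuminuria
    return None            # exactly one of the two: in neither group


# Hypothetical participants
statuses = [
    composite_status(True, 12.5),
    composite_status(False, 1.1),
    composite_status(True, 0.8),
    composite_status(False, 7.0),
]
```

Participants with only one of the two findings fall outside both groups, which is what makes the composite phenotype stricter than either trait alone.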
Affiliation(s)
- Sarah A Gagliano Taliun
  - Department of Medicine and Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
  - Montréal Heart Institute, Montréal, QC, Canada
- Ian R Dinsmore
  - Department of Genomic Health, Geisinger, Danville, PA, USA
- Alexander R Chang
  - Department of Population Health Sciences, Center for Kidney Health Research, Geisinger, Danville, PA, USA
  - Department of Nephrology, Geisinger, Danville, PA, USA
- Andrew D Paterson
  - Divisions of Epidemiology and Biostatistics, Dalla Lana School of Public Health, Toronto, ON, Canada
  - Genetics and Genome Biology, Research Institute at the Hospital for Sick Children, Toronto, ON, Canada
  - Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
- Moumita Barua
  - Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
  - Division of Nephrology, University Health Network, Toronto, ON, Canada
  - Department of Medicine, University of Toronto, Toronto, ON, Canada
  - Toronto General Hospital Research Institute, 8NU-855, 200 Elizabeth Street, Toronto, ON, M5G2C4, Canada
|
13
|
Kim MS, Song M, Kim B, Shim I, Kim DS, Natarajan P, Do R, Won HH. Prioritization of therapeutic targets for dyslipidemia using integrative multi-omics and multi-trait analysis. Cell Rep Med 2023; 4:101112. [PMID: 37582372 PMCID: PMC10518515 DOI: 10.1016/j.xcrm.2023.101112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2022] [Revised: 12/22/2022] [Accepted: 06/19/2023] [Indexed: 08/17/2023]
Abstract
Drug targets with genetic support are several-fold more likely to succeed in clinical trials. We introduce a genetics-driven approach based on causal inference that can inform drug target prioritization, repurposing, and the adverse effects of using lipid-lowering agents. Given that a multi-trait approach increases the power to detect meaningful variants/genes, we conduct multi-omics and multi-trait analyses, followed by network connectivity investigations, and prioritize 30 potential therapeutic targets for dyslipidemia, including SORT1, PSRC1, CELSR2, PCSK9, HMGCR, APOB, GRN, HFE2, FJX1, C1QTNF1, and SLC5A8. 20% (6/30) of prioritized targets from our hypothesis-free drug target search are either approved or under investigation for dyslipidemia. The prioritized targets are 22-fold more likely to be approved or under investigation in clinical trials than genome-wide association study (GWAS)-curated targets. Our results demonstrate that the genetics-driven approach used in this study is a promising strategy for prioritizing targets while informing about potential adverse effects and repurposing opportunities.
Affiliation(s)
- Min Seo Kim
  - Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
- Minku Song
  - Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
- Beomsu Kim
  - Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
- Injeong Shim
  - Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
- Dan Say Kim
  - Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
- Pradeep Natarajan
  - Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Ron Do
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Hong-Hee Won
  - Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
  - Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea
|
14
|
Yang M, Zinkgraf M, Fitzgerald-Cook C, Harrison BR, Putzier A, Promislow DEL, Wang AM. Using Drosophila to identify naturally occurring genetic modifiers of amyloid beta 42- and tau-induced toxicity. G3 (BETHESDA, MD.) 2023; 13:jkad132. [PMID: 37311212 PMCID: PMC10468303 DOI: 10.1093/g3journal/jkad132] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Revised: 04/15/2023] [Accepted: 05/15/2023] [Indexed: 06/15/2023]
Abstract
Alzheimer's disease is characterized by 2 pathological proteins, amyloid beta 42 and tau. The majority of Alzheimer's disease cases in the population are sporadic and late-onset Alzheimer's disease, which exhibits high levels of heritability. While several genetic risk factors for late-onset Alzheimer's disease have been identified and replicated in independent studies, including the ApoE ε4 allele, the great majority of the heritability of late-onset Alzheimer's disease remains unexplained, likely due to the aggregate effects of a very large number of genes with small effect sizes, as well as to biases in sample collection and statistical approaches. Here, we present an unbiased forward genetic screen in Drosophila looking for naturally occurring modifiers of amyloid beta 42- and tau-induced ommatidial degeneration. Our results identify 14 significant SNPs, which map to 12 potential genes in 8 unique genomic regions. Our hits that are significant after genome-wide correction identify genes involved in neuronal development, signal transduction, and organismal development. Looking more broadly at suggestive hits (P < 10⁻⁵), we see significant enrichment in genes associated with neurogenesis, development, and growth as well as significant enrichment in genes whose orthologs have been identified as significantly or suggestively associated with Alzheimer's disease in human GWAS. These latter genes include ones whose orthologs are in close proximity to regions in the human genome that are associated with Alzheimer's disease, but where a causal gene has not been identified. Together, our results illustrate the potential for complementary and convergent evidence provided through multitrait GWAS in Drosophila to supplement and inform human studies, helping to identify the remaining heritability and novel modifiers of complex diseases.
Affiliation(s)
- Ming Yang
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Matthew Zinkgraf
  - Department of Biology, Western Washington University, Bellingham, WA 98225, USA
- Cecilia Fitzgerald-Cook
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Benjamin R Harrison
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Alexandra Putzier
  - Department of Biology, Western Washington University, Bellingham, WA 98225, USA
- Daniel E L Promislow
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Department of Biology, University of Washington, Seattle, WA 98195, USA
- Adrienne M Wang
  - Department of Biology, Western Washington University, Bellingham, WA 98225, USA
|
15
|
Suarez-Pajes E, Tosco-Herrera E, Ramirez-Falcon M, Gonzalez-Barbuzano S, Hernandez-Beeftink T, Guillen-Guio B, Villar J, Flores C. Genetic Determinants of the Acute Respiratory Distress Syndrome. J Clin Med 2023; 12:3713. [PMID: 37297908 PMCID: PMC10253474 DOI: 10.3390/jcm12113713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2023] [Revised: 05/18/2023] [Accepted: 05/25/2023] [Indexed: 06/12/2023] Open
Abstract
Acute respiratory distress syndrome (ARDS) is a life-threatening lung condition that arises from multiple causes, including sepsis, pneumonia, trauma, and severe coronavirus disease 2019 (COVID-19). Given the heterogeneity of causes and the lack of specific therapeutic options, it is crucial to understand the genetic and molecular mechanisms that underlie this condition. The identification of genetic risks and pharmacogenetic loci, which are involved in determining drug responses, could help enhance early patient diagnosis, assist in risk stratification of patients, and reveal novel targets for pharmacological interventions, including possibilities for drug repositioning. Here, we highlight the basis and importance of the most common genetic approaches to understanding the pathogenesis of ARDS and its critical triggers. We summarize the findings of screening common genetic variation via genome-wide association studies and analyses based on other approaches, such as polygenic risk scores, multi-trait analyses, or Mendelian randomization studies. We also provide an overview of results from rare genetic variation studies using Next-Generation Sequencing techniques and their links with inborn errors of immunity. Lastly, we discuss the genetic overlap between severe COVID-19 and ARDS by other causes.
Affiliation(s)
- Eva Suarez-Pajes
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Eva Tosco-Herrera
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Melody Ramirez-Falcon
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Silvia Gonzalez-Barbuzano
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Tamara Hernandez-Beeftink
  - Department of Population Health Sciences, University of Leicester, Leicester LE1 7RH, UK
  - NIHR Leicester Biomedical Research Centre, University of Leicester, Leicester LE1 7RH, UK
- Beatriz Guillen-Guio
  - Department of Population Health Sciences, University of Leicester, Leicester LE1 7RH, UK
  - NIHR Leicester Biomedical Research Centre, University of Leicester, Leicester LE1 7RH, UK
- Jesús Villar
  - CIBER de Enfermedades Respiratorias (CIBERES), Instituto de Salud Carlos III, 28029 Madrid, Spain
  - Research Unit, Hospital Universitario de Gran Canaria Dr. Negrín, 35019 Las Palmas de Gran Canaria, Spain
- Carlos Flores
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
  - CIBER de Enfermedades Respiratorias (CIBERES), Instituto de Salud Carlos III, 28029 Madrid, Spain
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
  - Faculty of Health Sciences, University of Fernando Pessoa Canarias, 35450 Las Palmas de Gran Canaria, Spain
|
16
|
Li A, Liu S, Bakshi A, Jiang L, Chen W, Zheng Z, Sullivan PF, Visscher PM, Wray NR, Yang J, Zeng J. mBAT-combo: A more powerful test to detect gene-trait associations from GWAS data. Am J Hum Genet 2023; 110:30-43. [PMID: 36608683 PMCID: PMC9892780 DOI: 10.1016/j.ajhg.2022.12.006] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Accepted: 12/08/2022] [Indexed: 01/07/2023] Open
Abstract
Gene-based association tests aggregate multiple SNP-trait associations into sets defined by gene boundaries and are widely used in post-GWAS analysis. A common approach for gene-based tests is to combine SNP associations by computing the sum of χ² statistics. However, this strategy ignores the directions of SNP effects, which could result in a loss of power for SNPs with masking effects, e.g., when the product of two SNP effects and the linkage disequilibrium (LD) correlation is negative. Here, we introduce "mBAT-combo," a set-based test that is better powered than other methods to detect multi-SNP associations in the context of masking effects. We validate the method through simulations and applications to real data. We find that of 35 blood and urine biomarker traits in the UK Biobank, 34 traits show evidence for masking effects in a total of 4,273 gene-trait pairs, indicating that masking effects are common in complex traits. We further validate the improved power of our method in height, body mass index, and schizophrenia with different GWAS sample sizes and show that on average 95.7% of the genes detected only by mBAT-combo with smaller sample sizes can be identified by the single-SNP approach with a 1.7-fold increase in sample size. Eleven genes significant only in mBAT-combo for schizophrenia are confirmed by functionally informed fine-mapping or Mendelian randomization integrating gene expression data. The framework of mBAT-combo can be applied to any set of SNPs to refine trait-association signals hidden in genomic regions with complex LD structures.
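The masking condition mentioned in this abstract can be illustrated with a short sketch. It assumes standardized genotypes and reads "SNP effects" as joint (conditional) effects, in which case the marginal effect of each SNP equals its own joint effect plus the LD correlation times the other SNP's joint effect. The function names and numeric values are hypothetical, and this is not the mBAT-combo test itself.

```python
def marginal_effects(b1, b2, r):
    """Marginal single-SNP effects implied by joint effects b1 and b2 and
    LD correlation r, assuming standardized genotypes (an assumption here)."""
    return b1 + r * b2, b2 + r * b1


def has_masking(b1, b2, r):
    """Masking condition from the abstract: effect product times LD is negative."""
    return b1 * b2 * r < 0


# Hypothetical pair of SNPs with same-direction joint effects in negative LD
b1, b2, r = 0.1, 0.1, -0.8
m1, m2 = marginal_effects(b1, b2, r)
masked = has_masking(b1, b2, r)

# Same joint effects in positive LD: no masking, marginal effects are inflated
u1, u2 = marginal_effects(0.1, 0.1, 0.8)
unmasked = has_masking(0.1, 0.1, 0.8)
```

With masking, both marginal effects shrink toward zero, so single-SNP χ² statistics (and their sum) are small even though each SNP has a real joint effect.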
Affiliation(s)
- Ang Li
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
- Shouye Liu
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
- Andrew Bakshi
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Wenhan Chen
  - Epigenetics Research Laboratory, Genomics and Epigenetics Theme, Garvan Institute of Medical Research, Sydney, NSW, Australia
- Zhili Zheng
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
- Patrick F Sullivan
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
  - Departments of Genetics and Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Peter M Visscher
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
- Naomi R Wray
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
  - Queensland Brain Institute, University of Queensland, Brisbane, QLD, Australia
- Jian Yang
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- Jian Zeng
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
|
17
|
mGWAS-Explorer: Linking SNPs, Genes, Metabolites, and Diseases for Functional Insights. Metabolites 2022; 12:metabo12060526. [PMID: 35736459 PMCID: PMC9230867 DOI: 10.3390/metabo12060526] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2022] [Revised: 05/24/2022] [Accepted: 05/31/2022] [Indexed: 11/25/2022] Open
Abstract
Tens of thousands of single-nucleotide polymorphisms (SNPs) have been identified as significantly associated with metabolite abundance in over 65 genome-wide association studies with metabolomics (mGWAS) to date. Obtaining mechanistic or functional insights from these associations for translational applications has become a key research area in the mGWAS community. Here, we introduce mGWAS-Explorer, a user-friendly web-based platform to help connect SNPs, metabolites, genes, and their known disease associations via powerful network visual analytics. The application of mGWAS-Explorer is demonstrated using COVID-19 and type 2 diabetes case studies.
|
18
|
Ballard JL, O'Connor LJ. Shared components of heritability across genetically correlated traits. Am J Hum Genet 2022; 109:989-1006. [PMID: 35477001 PMCID: PMC9247834 DOI: 10.1016/j.ajhg.2022.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2021] [Accepted: 04/01/2022] [Indexed: 11/01/2022] Open
Abstract
Most disease-associated genetic variants are pleiotropic, affecting multiple genetically correlated traits. Their pleiotropic associations can be mechanistically informative: if many variants have similar patterns of association, they may act via similar pleiotropic mechanisms, forming a shared component of heritability. We developed pleiotropic decomposition regression (PDR) to identify shared components and their underlying genetic variants. We validated PDR on simulated data and identified limitations of existing methods in recovering the true components. We applied PDR to three clusters of five to six traits genetically correlated with coronary artery disease (CAD), asthma, and type II diabetes (T2D), producing biologically interpretable components. For CAD, PDR identified components related to BMI, hypertension, and cholesterol, and it clarified the relationship among these highly correlated risk factors. We assigned variants to components, calculated their posterior-mean effect sizes, and performed out-of-sample validation. Our posterior-mean effect sizes pool statistical power across traits and substantially boost the correlation (r²) between true and estimated effect sizes (compared with the original summary statistics) by 94% and 70% for asthma and T2D out of sample, respectively, and by a predicted 300% for CAD.
Affiliation(s)
- Jenna Lee Ballard
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Luke Jen O'Connor
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|