1
Mews MA, Naj AC, Griswold AJ, Below JE, Bush WS. Brain and blood transcriptome-wide association studies identify five novel genes associated with Alzheimer's disease. J Alzheimers Dis 2025; 105:228-244. [PMID: 40111921 DOI: 10.1177/13872877251326288] [Indexed: 03/22/2025]
Abstract
Background: Genome-wide association studies (GWAS) have identified numerous genetic variants associated with Alzheimer's disease (AD), but their functional implications remain unclear. Transcriptome-wide association studies (TWAS) offer enhanced statistical power by analyzing genetic associations at the gene level rather than at the variant level, enabling assessment of how genetically-regulated gene expression influences AD risk. However, previous AD-TWAS have been limited by small expression quantitative trait loci (eQTL) reference datasets or reliance on AD-by-proxy phenotypes. Objective: To perform the most powerful AD-TWAS to date using summary statistics from the largest available brain and blood cis-eQTL meta-analyses applied to the largest clinically-adjudicated AD GWAS. Methods: We implemented the OTTERS TWAS pipeline to predict gene expression using the largest available cis-eQTL data from cortical brain tissue (MetaBrain; N = 2683) and blood (eQTLGen; N = 31,684), and then applied these models to AD-GWAS data (Cases = 21,982; Controls = 44,944). Results: We identified and validated five novel gene associations in cortical brain tissue (PRKAG1, C3orf62, LYSMD4, ZNF439, SLC11A2) and six genes proximal to known AD-related GWAS loci (Blood: MYBPC3; Brain: MTCH2, CYB561, MADD, PSMA5, ANXA11). Further, using causal eQTL fine-mapping, we generated sparse models that retained the strength of the AD-TWAS association for MTCH2, MADD, ZNF439, CYB561, and MYBPC3. Conclusions: Our comprehensive AD-TWAS discovered new gene associations and provided insights into the functional relevance of previously associated variants, which enables us to further understand the genetic architecture underlying AD risk.
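As a rough illustration of how a summary-statistic TWAS applies an expression prediction model to GWAS data, the sketch below computes the commonly used burden-style gene-level Z-score, Z = wᵀz / sqrt(wᵀRw), from per-SNP eQTL weights w, GWAS Z-scores z, and a reference LD matrix R. This is a generic textbook form, not the OTTERS implementation; the function names and the toy numbers are hypothetical and do not come from the study.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level Z-score from eQTL weights, GWAS Z-scores, and an LD matrix.

    weights : per-SNP expression prediction weights (hypothetical values)
    gwas_z  : per-SNP GWAS Z-scores for the same SNPs, in the same order
    ld      : square LD (correlation) matrix for those SNPs, as nested lists
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")
    # Numerator: weighted sum of the GWAS Z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Denominator: variance of the weighted sum under the null, w' R w.
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    if variance <= 0:
        raise ValueError("w' R w must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Toy example: three SNPs, the first two in moderate LD.
    w = [0.4, 0.1, -0.2]
    z = [3.1, 2.5, -1.0]
    R = [[1.0, 0.5, 0.0],
         [0.5, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
    gene_z = twas_z(w, z, R)
    print(gene_z, two_sided_p(gene_z))
```

Dividing by sqrt(wᵀRw) accounts for correlation between SNPs, so a gene whose model relies on several SNPs in high LD is not counted as several independent signals.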
Affiliation(s)
- Makaela A Mews
  - System Biology and Bioinformatics, Department of Nutrition, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Adam C Naj
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Anthony J Griswold
  - John P. Hussman Institute for Human Genomics, University of Miami, Miami, FL, USA
  - Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami, Miami, FL, USA
- Jennifer E Below
  - Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- William S Bush
  - Department of Population and Quantitative Health Sciences, Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, OH, USA
2
Qi G, Lila E, Ji Z, Shojaie A, Battle A, Sun W. Transcriptome-wide association studies at cell state level using single-cell eQTL data. medRxiv 2025:2025.03.17.25324128. [PMID: 40166533 PMCID: PMC11957072 DOI: 10.1101/2025.03.17.25324128] [Indexed: 04/02/2025]
Abstract
Transcriptome-wide association studies (TWAS) have been widely used to prioritize relevant genes for diseases. Current TWAS methods test gene-disease associations at the bulk tissue or cell-type-specific pseudobulk level, which does not account for heterogeneity within cell types. We present TWiST, a statistical method for TWAS analysis at cell state resolution using single-cell expression quantitative trait loci (eQTL) data. Our method uses pseudotime to represent cell states and models the effect of gene expression on a trait as a continuous pseudotemporal curve. It therefore allows flexible hypothesis testing of global, dynamic, and nonlinear effects. Through simulation studies and real data analysis, we demonstrated that TWiST leads to significantly improved power compared to pseudobulk methods that ignore heterogeneity due to cell states. Application to the OneK1K study identified hundreds of genes with dynamic effects on autoimmune diseases along the trajectory of immune cell differentiation. TWiST presents great promise for understanding disease genetics using single-cell eQTL studies.
3
He J, Perera D, Wen W, Ping J, Li Q, Lyu L, Chen Z, Shu X, Long J, Cai Q, Shu XO, Yin Z, Zheng W, Long Q, Guo X. Enhancing disease risk gene discovery by integrating transcription factor-linked trans-variants into transcriptome-wide association analyses. Nucleic Acids Res 2025; 53:gkae1035. [PMID: 39535029 PMCID: PMC11724290 DOI: 10.1093/nar/gkae1035] [Received: 06/26/2024] [Revised: 10/14/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024] Open
Abstract
Transcriptome-wide association studies (TWAS) have been successful in identifying disease susceptibility genes by integrating gene expression predicted from cis-variants with genome-wide association study (GWAS) data. However, the use of trans-variants for predicting gene expression remains largely unexplored. Here, we introduce transTF-TWAS, which incorporates transcription factor (TF)-linked trans-variants to enhance model building for TF downstream target genes. Using data from the Genotype-Tissue Expression project, we predicted gene expression and alternative splicing and applied these prediction models to large GWAS datasets for breast, prostate, and lung cancers and other diseases. We demonstrate that transTF-TWAS outperforms other existing TWAS approaches in both constructing gene expression prediction models and identifying disease-associated genes, as shown by simulations and real data analysis. Our transTF-TWAS approach significantly contributes to the discovery of disease risk genes. Findings from this study shed new light on several genetically driven key TF regulators and their associated TF-gene regulatory networks underlying disease susceptibility.
Affiliation(s)
- Jingni He
  - Department of Biochemistry & Molecular Biology, University of Calgary, HMRB 231, 3330 Hospital Drive NW, Calgary, AB T2N 4N1, Canada
  - Department of Neuroscience, School of Translational Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, The Alfred Centre, Level 6, 99 Commercial Road, Melbourne, VIC 3004, Australia
- Deshan Perera
  - Department of Biochemistry & Molecular Biology, University of Calgary, HMRB 231, 3330 Hospital Drive NW, Calgary, AB T2N 4N1, Canada
- Wanqing Wen
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Jie Ping
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Qing Li
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Linshuoshuo Lyu
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Zhishan Chen
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Xiang Shu
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, 633 3rd Ave, 3rd Floor, New York, NY, 10017, USA
- Jirong Long
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Qiuyin Cai
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Xiao-Ou Shu
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Zhijun Yin
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Wei Zheng
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
- Quan Long
  - Department of Biochemistry & Molecular Biology, University of Calgary, HMRB 231, 3330 Hospital Drive NW, Calgary, AB T2N 4N1, Canada
  - Department of Medical Genetics, University of Calgary, 3330 Hospital Drive NW, Calgary, AB T2N 4N2, Canada
  - Department of Mathematics & Statistics, University of Calgary, Mathematical Sciences 476, 2500 University Drive NW, Calgary, AB, T2N 1N4, Canada
  - Alberta Children's Hospital Research Institute, University of Calgary, Heritage Medical Research Building, 3330 Hospital Dr. NW, Calgary, AB T2N 4N1, Canada
  - Hotchkiss Brain Institute, University of Calgary, Health Research Innovation Centre, 3330 Hospital Drive NW, Calgary, Alberta, T2N 4N1, Canada
- Xingyi Guo
  - Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, 2525 West End Ave, Nashville, TN 37203, USA
4
Zucker R, Kelman G, Linial M. PWAS Hub: exploring gene-based associations of complex diseases with sex dependency. Nucleic Acids Res 2025; 53:D1132-D1143. [PMID: 39565197 PMCID: PMC11701668 DOI: 10.1093/nar/gkae1125] [Received: 09/10/2024] [Revised: 10/15/2024] [Accepted: 11/18/2024] [Indexed: 11/21/2024] Open
Abstract
The Proteome-Wide Association Study (PWAS) is a protein-based genetic association approach designed to complement traditional variant-based methods like GWAS. PWAS operates in two stages. First, machine learning models predict the impact of genetic variants on protein-coding genes, generating effect scores. Second, these scores are aggregated into a gene-damaging score for each individual, which is then used in case-control statistical tests to identify genes significantly linked to specific phenotypes. PWAS Hub (v1.2) is a user-friendly platform that facilitates the exploration of gene-disease associations using clinical and genetic data from the UK Biobank (UKB), encompassing 500k individuals. PWAS Hub reports on 819 diseases and phenotypes determined by PheCode and ICD-10 clinical codes, each with a minimum of 400 affected individuals. PWAS-derived gene associations were reported for 72% of the tested phenotypes. The PWAS Hub also analyzes gene associations separately for males and females, considering sex-specific genetic effects, inheritance patterns (dominant and recessive), and gene pleiotropy. We illustrated the utility of the PWAS Hub for primary (essential) hypertension (I10), type 2 diabetes mellitus (E11), and specified haematuria (R31), which showed sex-dependent genetic signals. The PWAS Hub, available at pwas.huji.ac.il, is a valuable resource for studying genetic contributions to common diseases and sex-specific effects.
Affiliation(s)
- Roei Zucker
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Guy Kelman
  - The Jerusalem Center for Personalized Computational Medicine, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
- Michal Linial
  - Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
5
Shao M, Chen K, Zhang S, Tian M, Shen Y, Cao C, Gu N. Multiome-wide Association Studies: Novel Approaches for Understanding Diseases. Genomics Proteomics Bioinformatics 2024; 22:qzae077. [PMID: 39471467 PMCID: PMC11630051 DOI: 10.1093/gpbjnl/qzae077] [Received: 08/16/2023] [Revised: 10/06/2024] [Accepted: 10/23/2024] [Indexed: 11/01/2024]
Abstract
The rapid development of multiome (transcriptome, proteome, cistrome, imaging, and regulome)-wide association study methods has opened new avenues for biologists to understand the susceptibility genes underlying complex diseases. Thorough comparisons of these methods are essential for selecting the most appropriate tool for a given research objective. This review provides a detailed categorization and summary of the statistical models, use cases, and advantages of recent multiome-wide association studies. In addition, to illustrate gene-disease association studies based on transcriptome-wide association study (TWAS), we collected 478 disease entries across 22 categories from 235 manually reviewed publications. Our analysis reveals that mental disorders are the diseases most frequently studied by TWAS, indicating its potential to deepen our understanding of the genetic architecture of complex diseases. In summary, this review underscores the importance of multiome-wide association studies in elucidating complex diseases and highlights the significance of selecting the appropriate method for each study.
Affiliation(s)
- Mengting Shao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Kaiyang Chen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Shuting Zhang
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Min Tian
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Yan Shen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Ning Gu
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
  - Nanjing Key Laboratory for Cardiovascular Information and Health Engineering Medicine, Institute of Clinical Medicine, Nanjing Drum Tower Hospital, Medical School, Nanjing University, Nanjing 210093, China
6
Kim S, Qin Y, Park HJ, Bohn RIC, Yue M, Xu Z, Forno E, Chen W, Celedón JC. MOSES: a methylation-based gene association approach for unveiling environmentally regulated genes linked to a trait or disease. Clin Epigenetics 2024; 16:161. [PMID: 39558360 PMCID: PMC11574994 DOI: 10.1186/s13148-024-01776-x] [Received: 06/18/2024] [Accepted: 11/06/2024] [Indexed: 11/20/2024] Open
Abstract
Background: DNA methylation is a critical regulatory mechanism of gene expression, influencing various human diseases and traits. While traditional expression quantitative trait loci (eQTL) studies have helped elucidate the genetic regulation of gene expression, there is a growing need to explore environmental influences on gene expression. Existing methods such as PrediXcan and FUSION focus on genotype-based associations but overlook the impact of environmental factors. To address this gap, we present MOSES (methylation-based gene association), a novel approach that utilizes DNA methylation to identify environmentally regulated genes associated with traits or diseases without relying on measured gene expression. Results: MOSES involves training, imputation, and association testing. It employs elastic-net penalized regression models to estimate the influence of CpGs and SNPs (if available) on gene expression. We developed and compared four MOSES versions incorporating different methylation and genetic data: (1) cis-DNA methylation within 1 Mb of promoter regions, (2) both cis-SNPs and cis-CpGs, (3) both cis-CpGs and some trans-CpGs (±5 Mb from promoter regions), and (4) long-range DNA methylation (±10 Mb from promoter regions). Our analysis using nasal epithelium and white blood cell data from the Epigenetic Variation and Childhood Asthma in Puerto Ricans (EVA-PR) study demonstrated that MOSES, particularly the version incorporating long-range CpGs (MOSES-DNAm 10 M), significantly outperformed existing methods like PrediXcan, MethylXcan, and Biomethyl in predicting gene expression. MOSES-DNAm 10 M identified more differentially expressed genes (DEGs) associated with atopic asthma, particularly those involved in immune pathways, highlighting its superior performance in uncovering environmentally regulated genes. Further application of MOSES to lung tissue data from idiopathic pulmonary fibrosis (IPF) patients confirmed its robustness and versatility across different diseases and tissues. Conclusion: MOSES represents an innovative advancement in gene association studies, leveraging DNA methylation to capture the influence of environmental factors on gene expression. By incorporating long-range CpGs, MOSES-DNAm 10 M provides superior predictive accuracy and gene association capabilities compared to traditional genotype-based methods. This novel approach offers valuable insights into the complex interplay between genetics and the environment, enhancing our understanding of disease mechanisms and potentially guiding therapeutic strategies. The user-friendly MOSES R package is publicly available to advance studies in various diseases, including immune-related conditions like asthma.
Affiliation(s)
- Soyeon Kim
  - Division of Pulmonary Medicine, Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Yidi Qin
  - Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Hyun Jung Park
  - Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Rebecca I Caldino Bohn
  - Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Molin Yue
  - Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Zhongli Xu
  - School of Medicine, Tsinghua University, Beijing, China
- Erick Forno
  - Division of Pulmonary Medicine, Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Wei Chen
  - Division of Pulmonary Medicine, Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Juan C Celedón
  - Division of Pulmonary Medicine, Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
7
Chen BD, Lee C, Tapia AL, Reiner AP, Tang H, Kooperberg C, Manson JE, Li Y, Raffield LM. Proteome-wide association study using cis and trans variants and applied to blood cell and lipid-related traits in the Women's Health Initiative study. Genet Epidemiol 2024; 48:310-323. [PMID: 38940271 DOI: 10.1002/gepi.22578] [Received: 09/11/2023] [Revised: 05/26/2024] [Accepted: 06/13/2024] [Indexed: 06/29/2024]
Abstract
In most Proteome-Wide Association Studies (PWAS), variants near the protein-coding gene (±1 Mb), also known as cis single nucleotide polymorphisms (SNPs), are used to predict protein levels, which are then tested for association with phenotypes. However, proteins can be regulated through variants outside of the cis region. An intermediate GWAS step to identify protein quantitative trait loci (pQTL) allows for the inclusion of trans SNPs outside the cis region in protein-level prediction models. Here, we assess the prediction of 540 proteins in 1002 individuals from the Women's Health Initiative (WHI), split equally into a GWAS set, an elastic net training set, and a testing set. We compared the testing r² between measured and predicted protein levels using this proposed approach to the testing r² using only cis SNPs. The two methods usually resulted in similar testing r², but some proteins showed a significant increase in testing r² with our method. For example, for cartilage acidic protein 1, the testing r² increased from 0.101 to 0.351. We also demonstrate reproducible findings for predicted protein association with lipid and blood cell traits in WHI participants without proteomics data and in UK Biobank utilizing our PWAS weights.
Affiliation(s)
- Brian D Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Chanhwa Lee
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Amanda L Tapia
  - Department of Psychiatry, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Alexander P Reiner
  - Department of Epidemiology, University of Washington, Seattle, Washington, USA
- Hua Tang
  - Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- JoAnn E Manson
  - Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Laura M Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
8
Stone K, Platig J, Quackenbush J, Fagny M. The Importance of Regulatory Network Structure for Complex Trait Heritability and Evolution. bioRxiv 2024:2024.02.27.582063. [PMID: 38464142 PMCID: PMC10925220 DOI: 10.1101/2024.02.27.582063] [Indexed: 03/12/2024]
Abstract
Complex traits are determined by many loci, mostly regulatory elements, that, through combinatorial interactions, can affect multiple traits. Such high levels of epistasis and pleiotropy have been proposed in the omnigenic model and may explain why such a large part of complex trait heritability is usually missed by genome-wide association studies while raising questions about the possibility for such traits to evolve in response to environmental constraints. To explore the molecular bases of complex traits and understand how they can adapt, we systematically analyzed the distribution of SNP heritability for ten traits across 29 tissue-specific Expression Quantitative Trait Locus (eQTL) networks. We find that heritability is clustered in a small number of tissue-specific, functionally relevant SNP-gene modules and that the greatest heritability occurs in local "hubs" that are both the cornerstone of the network's modules and tissue-specific regulatory elements. The network structure could thus both amplify the genotype-phenotype connection and buffer the deleterious effect of the genetic variations on other traits. We confirm that this structure has allowed complex traits to evolve in response to environmental constraints, with the local "hubs" being the preferential targets of past and ongoing directional selection. Together, these results provide a conceptual framework for understanding complex trait architecture and evolution.
Affiliation(s)
- Katherine Stone
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Data Science and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- John Platig
  - Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, USA
  - Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia, USA
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, USA
- John Quackenbush
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Data Science and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, Massachusetts, United States
- Maud Fagny
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Data Science and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, Genetique Quantitative et Evolution - Le Moulon, Gif-sur-Yvette 91190 France
9
Kontou PI, Bagos PG. The goldmine of GWAS summary statistics: a systematic review of methods and tools. BioData Min 2024; 17:31. [PMID: 39238044 PMCID: PMC11375927 DOI: 10.1186/s13040-024-00385-x] [Received: 02/09/2024] [Accepted: 08/27/2024] [Indexed: 09/07/2024] Open
Abstract
Genome-wide association studies (GWAS) have revolutionized our understanding of the genetic architecture of complex traits and diseases. GWAS summary statistics have become essential tools for various genetic analyses, including meta-analysis, fine-mapping, and risk prediction. However, the increasing number of GWAS summary statistics and the diversity of software tools available for their analysis can make it challenging for researchers to select the most appropriate tools for their specific needs. This systematic review aims to provide a comprehensive overview of the currently available software tools and databases for GWAS summary statistics analysis. We conducted a comprehensive literature search to identify relevant software tools and databases. We categorized the tools and databases by their functionality, including data management, quality control, single-trait analysis, and multiple-trait analysis. We also compared the tools and databases based on their features, limitations, and user-friendliness. Our review identified a total of 305 functioning software tools and databases dedicated to GWAS summary statistics, each with unique strengths and limitations. We provide descriptions of the key features of each tool and database, including their input/output formats, data types, and computational requirements. We also discuss the overall usability and applicability of each tool for different research scenarios. This comprehensive review will serve as a valuable resource for researchers who are interested in using GWAS summary statistics to investigate the genetic basis of complex traits and diseases. By providing a detailed overview of the available tools and databases, we aim to facilitate informed tool selection and maximize the effectiveness of GWAS summary statistics analysis.
Affiliation(s)
- Pantelis G Bagos
  - Department of Computer Science and Biomedical Informatics, University of Thessaly, 35131, Lamia, Greece
10
Hu T, Parrish RL, Dai Q, Buchman AS, Tasaki S, Bennett DA, Seyfried NT, Epstein MP, Yang J. Omnibus proteome-wide association study identifies 43 risk genes for Alzheimer disease dementia. Am J Hum Genet 2024; 111:1848-1863. [PMID: 39079537 PMCID: PMC11393696 DOI: 10.1016/j.ajhg.2024.07.001] [Received: 01/18/2024] [Revised: 06/28/2024] [Accepted: 07/02/2024] [Indexed: 09/08/2024] Open
Abstract
Transcriptome-wide association study (TWAS) tools have been applied to conduct proteome-wide association studies (PWASs) by integrating proteomics data with genome-wide association study (GWAS) summary data. The genetic effects of PWAS-identified significant genes are potentially mediated through genetically regulated protein abundance, thus informing the underlying disease mechanisms better than GWAS loci. However, existing TWAS/PWAS tools are limited by considering only one statistical model. We propose an omnibus PWAS pipeline to account for multiple statistical models and demonstrate improved performance by simulation and application studies of Alzheimer disease (AD) dementia. We employ the Aggregated Cauchy Association Test to derive omnibus PWAS (PWAS-O) p values from PWAS p values obtained by three existing tools assuming complementary statistical models: TIGAR, PrediXcan, and FUSION. Our simulation studies demonstrated improved power, with well-calibrated type I error, for PWAS-O over all three individual tools. We applied PWAS-O to studying AD dementia with reference proteomic data profiled from dorsolateral prefrontal cortex of postmortem brains from individuals of European ancestry. We identified 43 risk genes, including 5 not identified by previous studies, which are interconnected through a protein-protein interaction network that includes the well-known AD risk genes TOMM40, APOC1, and APOC2. We also validated causal genetic effects mediated through the proteome for 27 (63%) PWAS-O risk genes, providing insights into the underlying biological mechanisms of AD dementia and highlighting promising targets for therapeutic development. PWAS-O can be easily applied to studying other complex diseases.
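To make the omnibus step concrete, here is a minimal sketch of the Aggregated Cauchy Association Test in its standard form: each p value is mapped to a Cauchy variate, tan((0.5 − p)π), the variates are averaged with nonnegative weights, and the combined p value is read from the Cauchy tail. The function name `acat`, the equal default weights, and the three example p values are assumptions for illustration; they are not taken from the PWAS-O software or its results.

```python
import math


def acat(p_values, weights=None):
    """Combine p values with the Aggregated Cauchy Association Test.

    p_values : p values strictly between 0 and 1 (e.g. one per tool)
    weights  : optional nonnegative weights; equal weights are assumed if omitted
    """
    if not p_values:
        raise ValueError("at least one p value is required")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values) or any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative and match p_values")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must not all be zero")
    # Weighted average of Cauchy-transformed p values.
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values)) / total
    # Upper tail of the standard Cauchy distribution at the statistic.
    return 0.5 - math.atan(statistic) / math.pi


if __name__ == "__main__":
    # Hypothetical p values for one gene from three tools.
    print(acat([0.004, 0.03, 0.2]))
```

This simple version loses precision for extremely small p values, where implementations typically switch to the approximation tan((0.5 − p)π) ≈ 1/(pπ).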
Affiliation(s)
- Tingyang Hu
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA; Division of Biostatistics and Bioinformatics, Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Randy L Parrish
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA; Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA 30322, USA
- Qile Dai
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA; Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA 30322, USA
- Aron S Buchman
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
- Shinya Tasaki
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
- David A Bennett
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
- Nicholas T Seyfried
  - Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Michael P Epstein
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Jingjing Yang
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
11
|
Hu T, Liu Q, Dai Q, Parrish RL, Buchman AS, Tasaki S, Seyfried NT, Wang Y, Bennett DA, De Jager PL, Epstein MP, Yang J. Proteome-wide association studies using summary pQTL data of three tissues identified 30 risk genes of Alzheimer's disease dementia. medRxiv 2024:2024.03.28.24305044. [PMID: 38585769 PMCID: PMC10996749 DOI: 10.1101/2024.03.28.24305044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Background Proteome-wide association study (PWAS) integrating proteomic data with genome-wide association study (GWAS) summary data is a powerful tool for studying Alzheimer's disease (AD) dementia. Existing PWAS analyses of AD often rely on the availability of individual-level proteomic and genetic data of a reference panel. Leveraging summary protein quantitative trait loci (pQTL) reference data of multiple AD-relevant tissues is expected to improve PWAS findings of AD dementia. Methods We conducted PWAS of AD dementia by integrating publicly available summary pQTL data of brain, cerebrospinal fluid (CSF), and plasma tissues, with the latest GWAS summary data of AD dementia. For each target protein per tissue, we employed our recently published OTTERS tool to obtain an omnibus PWAS p-value, to test whether the genetically regulated protein abundance in the corresponding tissue is associated with AD dementia. Protein-protein interactions and enriched pathways of identified significant PWAS risk genes were analyzed by STRING. The potential causal effects of these PWAS risk genes were assessed by probabilistic Mendelian randomization analyses. Results We identified 30 unique significant PWAS risk genes for AD dementia, including 11 for brain, 9 for CSF, and 16 for plasma tissues. Four of these were shared by at least two tissues, and gene MAPK3 was found in all three tissues. We found that 11 of these PWAS risk genes were associated with AD or AD pathological hallmarks as shown in GWAS Catalog; 18 of these were detected by transcriptome-wide association studies (TWAS); and 25 of these, including 8 out of 9 novel genes, were interconnected within a protein-protein interaction network involving the well-known AD risk gene APOE. In particular, these PWAS risk genes were enriched in immune response, glial cell proliferation, and high-density lipoprotein particle clearance pathways. Mediated causal effects were validated for 13 PWAS risk genes (43.3%).
Conclusions Our findings provide novel insights into the genetic mechanisms of AD dementia in brain, CSF, and plasma tissues, and targets for developing therapeutic interventions. We also demonstrated the effectiveness of integrating summary pQTL and GWAS data for mapping risk genes of complex human diseases.
Affiliation(s)
- Tingyang Hu
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
  - Division of Biostatistics and Bioinformatics, Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Qiang Liu
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
  - Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Qile Dai
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
  - Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Randy L. Parrish
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
  - Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Aron S. Buchman
  - Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Shinya Tasaki
  - Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Nicholas T. Seyfried
  - Department of Biochemistry, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Yanling Wang
  - Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- David A. Bennett
  - Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Philip L. De Jager
  - Center for Translational and Computational Neuroimmunology, Department of Neurology and Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Michael P. Epstein
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Jingjing Yang
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
12
|
Simmonds E, Leonenko G, Yaman U, Bellou E, Myers A, Morgan K, Brookes K, Hardy J, Salih D, Escott-Price V. Chromosome X-wide association study in case control studies of pathologically confirmed Alzheimer's disease in a European population. Transl Psychiatry 2024; 14:358. [PMID: 39231932 PMCID: PMC11375158 DOI: 10.1038/s41398-024-03058-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 08/20/2024] [Accepted: 08/22/2024] [Indexed: 09/06/2024]
Abstract
Although several genome-wide association studies have highlighted genetic variants associated with Alzheimer's disease (AD), the X chromosome is often excluded from the analysis. We conducted an X-chromosome-wide association study (XWAS) in three independent studies with a pathologically confirmed phenotype (total 1970 cases and 1113 controls). The XWAS was performed in males and females separately, and these results were then meta-analysed. Four suggestively associated genes that replicate across at least two studies were identified and may be of interest for further study in AD: DDX53 (rs12006935, OR = 0.52, p = 6.9e-05), IL1RAPL1 (rs6628450, OR = 0.36, p = 4.2e-05; rs137983810, OR = 0.52, p = 0.0003), TBX22 (rs5913102, OR = 0.74, p = 0.0003) and SH3BGRL (rs186553004, OR = 0.35, p = 0.0005; rs113157993, OR = 0.52, p = 0.0003). The SNP rs5913102 in TBX22 achieves chromosome-wide significance in meta-analysed data. DDX53 shows highest expression in astrocytes, IL1RAPL1 is most highly expressed in oligodendrocytes and neurons and SH3BGRL is most highly expressed in microglia. We have also identified SNPs in the NXF5 gene at chromosome-wide significance in females (rs5944989, OR = 0.62, p = 1.1e-05) but not in males (p = 0.83). The discovery of relevant AD associated genes on the X chromosome may identify AD risk differences and similarities based on sex and lead to the development of sex-stratified therapeutics.
Affiliation(s)
- Emily Simmonds
  - Dementia Research Institute, Cardiff University, Cardiff, UK
- Ganna Leonenko
  - Dementia Research Institute, Cardiff University, Cardiff, UK
- Umran Yaman
  - Institute of Neurology, University College London, London, UK
- Eftychia Bellou
  - Dementia Research Institute, Cardiff University, Cardiff, UK
- Amanda Myers
  - Department of Cell Biology, University of Miami, Miller School of Medicine, Miami, FL, USA
- Keeley Brookes
  - Biosciences, School of Science and Technology, Nottingham Trent University, Nottingham, UK
- John Hardy
  - Institute of Neurology, University College London, London, UK
- Dervis Salih
  - Institute of Neurology, University College London, London, UK
- Valentina Escott-Price
  - Dementia Research Institute, Cardiff University, Cardiff, UK.
  - Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff, UK.
13
|
Parrish RL, Buchman AS, Tasaki S, Wang Y, Avey D, Xu J, De Jager PL, Bennett DA, Epstein MP, Yang J. SR-TWAS: leveraging multiple reference panels to improve transcriptome-wide association study power by ensemble machine learning. Nat Commun 2024; 15:6646. [PMID: 39103319 DOI: 10.1038/s41467-024-50983-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 07/26/2024] [Indexed: 08/07/2024]
Abstract
Multiple reference panels of a given tissue or multiple tissues often exist, and multiple regression methods could be used for training gene expression imputation models for transcriptome-wide association studies (TWAS). To leverage expression imputation models (i.e., base models) trained with multiple reference panels, regression methods, and tissues, we develop a Stacked Regression based TWAS (SR-TWAS) tool which can obtain optimal linear combinations of base models for a given validation transcriptomic dataset. Both simulation and real studies show that SR-TWAS improves power, due to increased training sample sizes and borrowed strength across multiple regression methods and tissues. Leveraging base models across multiple reference panels, tissues, and regression methods, our real studies identify 6 independent significant risk genes for Alzheimer's disease (AD) dementia for supplementary motor area tissue and 9 independent significant risk genes for Parkinson's disease (PD) for substantia nigra tissue. Relevant biological interpretations are found for these significant risk genes.
Affiliation(s)
- Randy L Parrish
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
  - Department of Biostatistics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Aron S Buchman
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Shinya Tasaki
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Yanling Wang
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Denis Avey
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Jishu Xu
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Philip L De Jager
  - Center for Translational and Computational Neuroimmunology, Department of Neurology and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY, 10032, USA
- David A Bennett
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
- Michael P Epstein
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Jingjing Yang
  - Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
14
|
Ren S, Sun C, Zhai W, Wei W, Liu J. Gaining new insights into the etiology of ulcerative colitis through a cross-tissue transcriptome-wide association study. Front Genet 2024; 15:1425370. [PMID: 39092429 PMCID: PMC11291327 DOI: 10.3389/fgene.2024.1425370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 06/25/2024] [Indexed: 08/04/2024]
Abstract
Background Genome-wide association studies (GWASs) have identified 38 loci associated with ulcerative colitis (UC) susceptibility, but the risk genes and their biological mechanisms remain to be comprehensively elucidated. Methods Multi-marker analysis of genomic annotation (MAGMA) software was used to annotate genes on GWAS summary statistics of UC from the FinnGen database. Genetic analysis was performed to identify risk genes. Cross-tissue transcriptome-wide association study (TWAS) using the unified test for molecular signatures (UTMOST) was performed to integrate GWAS summary statistics with the gene expression matrix (from the Genotype-Tissue Expression Project). Subsequently, we used FUSION software to select key genes from the individual tissues. Additionally, conditional and joint analysis was conducted to improve our understanding of UC. Fine-mapping of causal gene sets (FOCUS) software was employed to accurately locate risk genes. The results of the four genetic analyses (MAGMA, UTMOST, FUSION and FOCUS) were combined to obtain a set of UC risk genes. Finally, Mendelian randomization (MR) analysis and Bayesian colocalization analysis were conducted to determine the causal relationship between the risk genes and UC. To test the robustness of our findings, the same approaches were applied to GWAS data of UC from IEU. Results After multiple-testing correction, PIM3 was identified as a risk gene for UC. The results of Bayesian colocalization analysis showed that the posterior probability of hypothesis 4 was 0.997, and 0.954 in the validation dataset. MR was conducted using the inverse variance weighting method and two single nucleotide polymorphisms (SNPs, rs28645887 and rs62231924) were included in the analysis (p < 0.001, 95%CI: 1.45-1.89). In the validation dataset, the MR result was p < 0.001, 95%CI: 1.19-1.72, indicating a clear causal relationship between PIM3 and UC.
Conclusion Our study validated PIM3 as a key risk gene for UC and its expression level may be related to the risk of UC, providing a novel reference for further improving the current understanding of the genetic structure of UC.
Affiliation(s)
- Shijie Ren
  - Graduate School, Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
- Chaodi Sun
  - Graduate School, Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
- Wenjing Zhai
  - Graduate School, Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
- Wenli Wei
  - Graduate School, Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
- Jianping Liu
  - Graduate School, Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
  - Department of Gastroenterology, The First Affiliated Hospital of Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
15
|
Chen DM, Dong R, Kachuri L, Hoffmann TJ, Jiang Y, Berndt SI, Shelley JP, Schaffer KR, Machiela MJ, Freedman ND, Huang WY, Li SA, Lilja H, Justice AC, Madduri RK, Rodriguez AA, Van Den Eeden SK, Chanock SJ, Haiman CA, Conti DV, Klein RJ, Mosley JD, Witte JS, Graff RE. Transcriptome-wide association analysis identifies candidate susceptibility genes for prostate-specific antigen levels in men without prostate cancer. HGG Adv 2024; 5:100315. [PMID: 38845201 PMCID: PMC11262184 DOI: 10.1016/j.xhgg.2024.100315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 05/31/2024] [Accepted: 06/03/2024] [Indexed: 06/18/2024]
Abstract
Deciphering the genetic basis of prostate-specific antigen (PSA) levels may improve their utility for prostate cancer (PCa) screening. Using genome-wide association study (GWAS) summary statistics from 95,768 PCa-free men, we conducted a transcriptome-wide association study (TWAS) to examine impacts of genetically predicted gene expression on PSA. Analyses identified 41 statistically significant (p < 0.05/12,192 = 4.10 × 10⁻⁶) associations in whole blood and 39 statistically significant (p < 0.05/13,844 = 3.61 × 10⁻⁶) associations in prostate tissue, with 18 genes associated in both tissues. Cross-tissue analyses identified 155 statistically significant (p < 0.05/22,249 = 2.25 × 10⁻⁶) genes. Out of 173 unique PSA-associated genes across analyses, we replicated 151 (87.3%) in a TWAS of 209,318 PCa-free individuals from the Million Veteran Program. Based on conditional analyses, we found 20 genes (11 single tissue, nine cross-tissue) that were associated with PSA levels in the discovery TWAS that were not attributable to a lead variant from a GWAS. Ten of these 20 genes replicated, and two of the replicated genes had colocalization probability of >0.5: CCNA2 and HIST1H2BN. Six of the 20 identified genes are not known to impact PCa risk. Fine-mapping based on whole blood and prostate tissue revealed five protein-coding genes with evidence of causal relationships with PSA levels. Of these five genes, four exhibited evidence of colocalization and one was conditionally independent of previous GWAS findings. These results yield hypotheses that should be further explored to improve understanding of genetic factors underlying PSA levels.
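The three significance thresholds quoted in this abstract are Bonferroni corrections, that is, a family-wise alpha of 0.05 divided by the number of genes tested in each analysis. The short sketch below reproduces that arithmetic; the helper name `bonferroni_threshold` is an assumption of this example, while the test counts are taken from the abstract.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

# Number of genes tested per analysis, as reported in the abstract.
analyses = [("whole blood", 12192), ("prostate tissue", 13844), ("cross-tissue", 22249)]

for label, n_genes in analyses:
    print(f"{label}: p < {bonferroni_threshold(n_genes):.2e}")
```

The printed values round to 4.10e-06, 3.61e-06, and 2.25e-06, matching the thresholds stated in the abstract.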
Affiliation(s)
- Dorothy M Chen
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA
- Ruocheng Dong
  - Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305, USA
- Linda Kachuri
  - Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305, USA; Stanford Cancer Institute, Stanford University, Stanford, CA 94305, USA
- Thomas J Hoffmann
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
- Yu Jiang
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA
- Sonja I Berndt
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20814, USA
- John P Shelley
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Kerry R Schaffer
  - Department of Internal Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Mitchell J Machiela
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20814, USA
- Neal D Freedman
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20814, USA
- Wen-Yi Huang
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20814, USA
- Shengchao A Li
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20814, USA
- Hans Lilja
  - Departments of Pathology and Laboratory Medicine, Surgery, Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Department of Translational Medicine, Lund University, 21428 Malmö, Sweden
- Stephen J Chanock
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20814, USA
- Christopher A Haiman
  - Center for Genetic Epidemiology, Department of Population and Preventive Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA 90032, USA; Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- David V Conti
  - Center for Genetic Epidemiology, Department of Population and Preventive Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA 90032, USA; Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Robert J Klein
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jonathan D Mosley
  - Departments of Internal Medicine and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- John S Witte
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305, USA; Departments of Biomedical Data Science and Genetics (by courtesy), Stanford University, Stanford, CA 94305, USA.
- Rebecca E Graff
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA.
16
|
He J, Perera D, Wen W, Ping J, Li Q, Lyu L, Chen Z, Shu X, Long J, Cai Q, Shu XO, Zheng W, Long Q, Guo X. Enhancing Disease Risk Gene Discovery by Integrating Transcription Factor-Linked Trans-located Variants into Transcriptome-Wide Association Analyses. medRxiv 2024:2023.10.10.23295443. [PMID: 37873299 PMCID: PMC10593059 DOI: 10.1101/2023.10.10.23295443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Transcriptome-wide association studies (TWAS) have been successful in identifying disease susceptibility genes by integrating gene expression predicted from cis-variants with genome-wide association study (GWAS) data. However, the use of trans-located variants for predicting gene expression remains largely unexplored. Here, we introduce transTF-TWAS, which incorporates transcription factor (TF)-linked trans-located variants to enhance model building. Using data from the Genotype-Tissue Expression project, we predict gene expression and alternative splicing and apply these models to large GWAS datasets for breast, prostate, and lung cancers. We demonstrate that transTF-TWAS outperforms other existing TWAS approaches in both constructing gene prediction models and identifying disease-associated genes, as evidenced by simulations and real data analysis. Our transTF-TWAS approach significantly contributes to the discovery of disease risk genes. Findings from this study have shed new light on several genetically driven key regulators and their associated regulatory networks underlying disease susceptibility.
17
|
Zucker R, Kovalerchik M, Stern A, Kaufman H, Linial M. Revealing the genetic complexity of hypothyroidism: integrating complementary association methods. Front Genet 2024; 15:1409226. [PMID: 38919955 PMCID: PMC11196612 DOI: 10.3389/fgene.2024.1409226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Accepted: 05/16/2024] [Indexed: 06/27/2024]
Abstract
Hypothyroidism is a common endocrine disorder whose prevalence increases with age. The disease manifests itself when the thyroid gland fails to produce sufficient thyroid hormones. The disorder includes cases of congenital hypothyroidism (CH), but most cases exhibit hormonal feedback dysregulation and destruction of the thyroid gland by autoantibodies. In this study, we sought to identify causal genes for hypothyroidism in large populations. The study used the UK-Biobank (UKB) database, reporting on 13,687 cases of European ancestry. We used a GWAS compilation from Open Targets (OT) and tuned protocols focusing on genes and coding regions, along with complementary association methods of PWAS (proteome-based) and TWAS (transcriptome-based). Comparing summary statistics from numerous GWAS revealed a limited number of variants associated with thyroid development. The proteome-wide association study method identified 77 statistically significant genes, half of which are located within the Chr6-MHC locus and are enriched with autoimmunity-related genes. While coding GWAS and PWAS highlighted the centrality of immune-related genes, OT and transcriptome-wide association study mostly identified genes involved in thyroid developmental programs. We used independent populations from Finland (FinnGen) and the Taiwan cohort to validate the PWAS results. The higher prevalence in females relative to males is substantiated, as the polygenic risk score prediction of hypothyroidism relied mostly on the genetics of the female group. Comparing results from OT, TWAS, and PWAS revealed the complementary facets of hypothyroidism's etiology. This study underscores the significance of synthesizing gene-phenotype association methods for this common, intricate disease. We propose that the integration of established association methods enhances interpretability and clinical utility.
Affiliation(s)
- Roei Zucker
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Michael Kovalerchik
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Amos Stern
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Hadasa Kaufman
  - Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Michal Linial
  - Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
18
|
Head ST, Dezem F, Todor A, Yang J, Plummer J, Gayther S, Kar S, Schildkraut J, Epstein MP. Cis- and trans-eQTL TWASs of breast and ovarian cancer identify more than 100 susceptibility genes in the BCAC and OCAC consortia. Am J Hum Genet 2024; 111:1084-1099. [PMID: 38723630 PMCID: PMC11179407 DOI: 10.1016/j.ajhg.2024.04.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 04/11/2024] [Accepted: 04/16/2024] [Indexed: 05/21/2024]
Abstract
Transcriptome-wide association studies (TWASs) have investigated the role of genetically regulated transcriptional activity in the etiologies of breast and ovarian cancer. However, methods applied to date have focused on the regulatory effects of risk-associated SNPs thought to act in cis on a nearby target gene. With growing evidence for distal (trans) regulatory effects of variants on gene expression, we performed TWASs of breast and ovarian cancer using a Bayesian genome-wide TWAS method (BGW-TWAS) that considers effects of both cis- and trans-expression quantitative trait loci (eQTLs). We applied BGW-TWAS to whole-genome and RNA sequencing data in breast and ovarian tissues from the Genotype-Tissue Expression project to train expression imputation models. We applied these models to large-scale GWAS summary statistic data from the Breast Cancer and Ovarian Cancer Association Consortia to identify genes associated with risk of overall breast cancer, non-mucinous epithelial ovarian cancer, and 10 cancer subtypes. We identified 101 genes significantly associated with risk of breast cancer phenotypes and 8 with risk of ovarian cancer phenotypes. These loci include established risk genes and several novel candidate risk loci, such as ACAP3, whose associations are predominantly driven by trans-eQTLs. We replicated several associations using summary statistics from an independent GWAS of these cancer phenotypes. We further used genotype and expression data in normal and tumor breast tissue from the Cancer Genome Atlas to examine the performance of our trained expression imputation models. This work represents an in-depth look into the role of trans-eQTLs in the complex molecular mechanisms underlying these diseases.
Affiliation(s)
- S Taylor Head
  - Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
- Felipe Dezem
  - Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Andrei Todor
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
- Jingjing Yang
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
- Jasmine Plummer
  - Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Simon Gayther
  - Department of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Siddhartha Kar
  - Early Cancer Institute, Department of Oncology, University of Cambridge, Cambridge CB2 0XZ, UK
- Joellen Schildkraut
  - Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
- Michael P Epstein
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA.
19
|
Guo S, Yang J. Bayesian genome-wide TWAS with reference transcriptomic data of brain and blood tissues identified 141 risk genes for Alzheimer's disease dementia. Alzheimers Res Ther 2024; 16:120. [PMID: 38824563 PMCID: PMC11144322 DOI: 10.1186/s13195-024-01488-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 05/27/2024] [Indexed: 06/03/2024]
Abstract
BACKGROUND Transcriptome-wide association study (TWAS) is an influential tool for identifying genes associated with complex diseases whose genetic effects are likely mediated through the transcriptome. TWAS utilizes reference genetic and transcriptomic data to estimate effect sizes of genetic variants on gene expression (i.e., effect sizes of a broad sense of expression quantitative trait loci, eQTL). These estimated effect sizes are employed as variant weights in gene-based association tests, facilitating the mapping of risk genes with genome-wide association study (GWAS) data. However, most existing TWAS of Alzheimer's disease (AD) dementia are limited to studying only cis-eQTL proximal to the test gene. To overcome this limitation, we applied the Bayesian Genome-wide TWAS (BGW-TWAS) method to leverage both cis- and trans-eQTL of brain and blood tissues, in order to enhance mapping risk genes for AD dementia. METHODS We first applied BGW-TWAS to the Genotype-Tissue Expression (GTEx) V8 dataset to estimate cis- and trans-eQTL effect sizes of the prefrontal cortex, cortex, and whole blood tissues. Estimated eQTL effect sizes were integrated with the summary data of the most recent GWAS of AD dementia to obtain BGW-TWAS (i.e., gene-based association test) p-values of AD dementia per gene per tissue type. Then we used the aggregated Cauchy association test to combine TWAS p-values across three tissues to obtain omnibus TWAS p-values per gene. RESULTS We identified 85 genes in prefrontal cortex, 82 in cortex, and 76 in whole blood that were significantly associated with AD dementia. By combining BGW-TWAS p-values across these three tissues, we obtained 141 significant risk genes including 34 genes primarily due to trans-eQTL and 35 mapped risk genes in GWAS Catalog.
With these 141 significant risk genes, we detected functional clusters comprised of both known mapped GWAS risk genes of AD in GWAS Catalog and our identified TWAS risk genes by protein-protein interaction network analysis, as well as several enriched phenotypes related to AD. CONCLUSION We applied BGW-TWAS and aggregated Cauchy test methods to integrate both cis- and trans- eQTL data of brain and blood tissues with GWAS summary data, identifying 141 TWAS risk genes of AD dementia. These identified risk genes provide novel insights into the underlying biological mechanisms of AD dementia and potential gene targets for therapeutics development.
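The omnibus step described in the abstract above, combining per-tissue p-values with the aggregated Cauchy association test, can be sketched as follows. This is a minimal illustration of the standard Cauchy combination formula, not the authors' code; the function name `acat`, the equal default weights, and the example p-values are assumptions made here for illustration.

```python
import math

def acat(pvalues, weights=None):
    """Combine p-values with the aggregated Cauchy association test.

    Assumption: equal weights unless given; the abstract does not say
    how the three tissues were weighted.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)
    total = sum(weights)
    weights = [w / total for w in weights]  # normalise weights to sum to 1
    # Map each p-value to a standard Cauchy variate and take the weighted sum.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, pvalues))
    # Upper tail of the standard Cauchy distribution gives the omnibus p-value.
    return 0.5 - math.atan(stat) / math.pi

# Hypothetical per-tissue p-values for one gene
# (prefrontal cortex, cortex, whole blood).
omnibus_p = acat([0.01, 0.2, 0.6])
```

Plain floating-point arithmetic loses precision for extremely small p-values, so a production implementation would typically switch to an asymptotic approximation in that range.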
Affiliation(s)
- Shuyi Guo
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
|
20
|
Zhang Y, Wang M, Li Z, Yang X, Li K, Xie A, Dong F, Wang S, Yan J, Liu J. An overview of detecting gene-trait associations by integrating GWAS summary statistics and eQTLs. Sci China Life Sci 2024; 67:1133-1154. [PMID: 38568343 DOI: 10.1007/s11427-023-2522-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 01/29/2024] [Indexed: 06/07/2024]
Abstract
Detecting genes that affect specific traits (such as human diseases and crop yields) is important for treating complex diseases and improving crop quality. Genome-wide association studies (GWAS) provide new insights and directions for understanding complex traits by identifying important single nucleotide polymorphisms, and GWAS summary statistics for many complex traits have accumulated in recent years. Studies have shown that GWAS risk loci and expression quantitative trait loci (eQTLs) often overlap, which has made gene expression an important intermediary for revealing the regulatory role of GWAS loci. In this review, we survey three types of gene-trait association detection methods that integrate GWAS summary statistics and eQTL data: colocalization methods, transcriptome-wide association study-oriented approaches, and Mendelian randomization-related methods. At the theoretical level, we discuss the differences, relationships, advantages, and disadvantages of the algorithms in these three categories. To assess the performance of the methods, we summarize the significant gene sets reported in 16 studies as influencing high-density lipoprotein, low-density lipoprotein, total cholesterol, and triglyceride levels, and we compare the algorithms using the datasets for these four lipid traits. We analyze the advantages and limitations of the algorithms based on the experimental results and suggest directions for follow-up studies on detecting gene-trait associations.
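As a concrete instance of the Mendelian randomization-related family mentioned in the abstract above, the simplest single-variant estimator is the Wald ratio, which divides the variant's GWAS effect by its eQTL effect. The sketch below is illustrative only: the function name `wald_ratio` and the example numbers are assumptions, and the standard error uses the first-order approximation that ignores uncertainty in the eQTL effect.

```python
import math

def wald_ratio(beta_gwas, se_gwas, beta_eqtl):
    """Wald ratio estimate of the effect of expression on a trait.

    beta_gwas, se_gwas: variant effect on the trait and its standard error.
    beta_eqtl: effect of the same variant on gene expression.
    """
    if beta_eqtl == 0:
        raise ValueError("the eQTL effect must be non-zero")
    estimate = beta_gwas / beta_eqtl
    # First-order standard error; treats beta_eqtl as known without error.
    se = abs(se_gwas / beta_eqtl)
    z = estimate / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, se, z, p

# Hypothetical summary statistics for one variant.
estimate, se, z, p = wald_ratio(beta_gwas=0.1, se_gwas=0.02, beta_eqtl=0.5)
```

A weak eQTL effect inflates the ratio and makes the first-order standard error unreliable, which is one reason such methods usually restrict themselves to strongly associated variants.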
Affiliation(s)
- Yang Zhang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, 430070, China
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Mengyao Wang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, 430070, China
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Zhenguo Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, 430070, China
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Xuan Yang
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, 430070, China
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Keqin Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, 430070, China
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Ao Xie
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, 430070, China
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Fang Dong
- College of Life Sciences, Nankai University, Tianjin, 300071, China
| | - Shihan Wang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| | - Jianbing Yan
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Jianxiao Liu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China.
- Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, 430070, China.
- Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, 430070, China.
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China.
|
21
|
Parrish RL, Buchman AS, Tasaki S, Wang Y, Avey D, Xu J, De Jager PL, Bennett DA, Epstein MP, Yang J. SR-TWAS: Leveraging Multiple Reference Panels to Improve TWAS Power by Ensemble Machine Learning. medRxiv 2024:2023.06.20.23291605. [PMID: 37425698 PMCID: PMC10327185 DOI: 10.1101/2023.06.20.23291605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Multiple reference panels of a given tissue or multiple tissues often exist, and multiple regression methods could be used for training gene expression imputation models for TWAS. To leverage expression imputation models (i.e., base models) trained with multiple reference panels, regression methods, and tissues, we develop a Stacked Regression based TWAS (SR-TWAS) tool which can obtain optimal linear combinations of base models for a given validation transcriptomic dataset. Both simulation and real studies showed that SR-TWAS improved power, due to increased effective training sample sizes and borrowed strength across multiple regression methods and tissues. Leveraging base models across multiple reference panels, tissues, and regression methods, our real application studies identified 6 independent significant risk genes for Alzheimer's disease (AD) dementia for supplementary motor area tissue and 9 independent significant risk genes for Parkinson's disease (PD) for substantia nigra tissue. Relevant biological interpretations were found for these significant risk genes.
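The idea of an optimal linear combination of base models on a validation set can be illustrated with two base models. The sketch below is a toy under stated assumptions, not the SR-TWAS implementation: it assumes the combination weights are non-negative and sum to one and that the objective is squared error on the validation expression values. The function name `stack_two_models` and the data are hypothetical.

```python
def stack_two_models(y, pred_a, pred_b):
    """Return the weight w in [0, 1] minimising the squared error of
    w * pred_a + (1 - w) * pred_b against validation values y."""
    diff = [a - b for a, b in zip(pred_a, pred_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        return 0.5  # identical predictions: any weight is equally good
    # Unconstrained least-squares solution for w.
    num = sum((yi - b) * d for yi, b, d in zip(y, pred_b, diff))
    w = num / denom
    # The objective is convex in w, so clipping gives the constrained optimum.
    return min(1.0, max(0.0, w))

# Hypothetical validation expression values and two base-model predictions.
y_valid = [1.0, 2.0, 2.0]
model_a = [0.0, 2.0, 4.0]
model_b = [2.0, 2.0, 0.0]
weight_a = stack_two_models(y_valid, model_a, model_b)
stacked = [weight_a * a + (1 - weight_a) * b for a, b in zip(model_a, model_b)]
```

With more than two base models the same objective becomes a small constrained least-squares problem over the weight simplex, which needs a numerical solver rather than a closed form.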
|
22
|
Mews MA, Naj AC, Griswold AJ, Below JE, Bush WS. Brain and Blood Transcriptome-Wide Association Studies Identify Five Novel Genes Associated with Alzheimer's Disease. medRxiv 2024:2024.04.17.24305737. [PMID: 38699333 PMCID: PMC11065015 DOI: 10.1101/2024.04.17.24305737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
INTRODUCTION Transcriptome-wide Association Studies (TWAS) extend genome-wide association studies (GWAS) by integrating genetically-regulated gene expression models. We performed the most powerful AD-TWAS to date, using summary statistics from cis-eQTL meta-analyses and the largest clinically-adjudicated Alzheimer's Disease (AD) GWAS. METHODS We implemented the OTTERS TWAS pipeline, leveraging cis-eQTL data from cortical brain tissue (MetaBrain; N=2,683) and blood (eQTLGen; N=31,684) to predict gene expression, then applied these models to AD-GWAS data (Cases=21,982; Controls=44,944). RESULTS We identified and validated five novel gene associations in cortical brain tissue (PRKAG1, C3orf62, LYSMD4, ZNF439, SLC11A2) and six genes proximal to known AD-related GWAS loci (Blood: MYBPC3; Brain: MTCH2, CYB561, MADD, PSMA5, ANXA11). Further, using causal eQTL fine-mapping, we generated sparse models that retained the strength of the AD-TWAS association for MTCH2, MADD, ZNF439, CYB561, and MYBPC3. DISCUSSION Our comprehensive AD-TWAS discovered new gene associations and provided insights into the functional relevance of previously associated variants.
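Applying an expression prediction model to GWAS summary statistics is commonly done with a weighted burden Z-score, in which the eQTL-derived variant weights are combined with per-variant GWAS Z-scores and scaled by the linkage disequilibrium (LD) among the variants. The sketch below shows that general form under the assumption of standardized genotypes; the abstract does not state the exact test statistic used in the pipeline, and the function name `twas_z` and the toy inputs are hypothetical.

```python
import math

def twas_z(weights, gwas_z, ld):
    """Summary-statistic TWAS burden Z-score for one gene.

    weights: eQTL-derived weights for the gene's cis variants.
    gwas_z:  GWAS Z-scores for the same variants, in the same order.
    ld:      LD correlation matrix (list of lists) for those variants.
    """
    m = len(weights)
    if len(gwas_z) != m or len(ld) != m or any(len(row) != m for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching sizes")
    # Numerator: weighted sum of the variant-level GWAS Z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the predicted expression: w' R w.
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    if variance <= 0:
        raise ValueError("predicted expression has non-positive variance")
    return numerator / math.sqrt(variance)

# Hypothetical two-variant gene with moderately correlated variants.
z_gene = twas_z([0.4, -0.1], [3.1, -0.5], [[1.0, 0.3], [0.3, 1.0]])
```

Sparse models, such as those obtained after fine-mapping, simply have fewer non-zero entries in `weights`; the statistic itself is unchanged.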
|
23
|
Wu YS, Zheng WH, Liu TH, Sun Y, Xu YT, Shao LZ, Cai QY, Tang YQ. Joint-tissue integrative analysis identifies high-risk genes for Parkinson's disease. Front Neurosci 2024; 18:1309684. [PMID: 38576865 PMCID: PMC10991821 DOI: 10.3389/fnins.2024.1309684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 02/22/2024] [Indexed: 04/06/2024] Open
Abstract
The loss of dopaminergic neurons in the substantia nigra and the abnormal accumulation of synuclein proteins and neurotransmitters in Lewy bodies constitute the primary symptoms of Parkinson's disease (PD). Besides environmental factors, scholars are in the early stages of comprehending the genetic factors involved in the pathogenic mechanism of PD. Although genome-wide association studies (GWAS) have unveiled numerous genetic variants associated with PD, precisely pinpointing the causal variants remains challenging due to strong linkage disequilibrium (LD) among them. Addressing this issue, expression quantitative trait locus (eQTL) cohorts were employed in a transcriptome-wide association study (TWAS) to infer the genetic correlation between gene expression and a particular trait. Utilizing the TWAS theory alongside the enhanced Joint-Tissue Imputation (JTI) technique and Mendelian Randomization (MR) framework (MR-JTI), we identified a total of 159 PD-associated genes by amalgamating LD score, GTEx eQTL data, and GWAS summary statistic data from a substantial cohort. Subsequently, Fisher's exact test was conducted on these PD-associated genes using 5,152 differentially expressed genes sourced from 12 PD-related datasets. Ultimately, 29 highly credible PD-associated genes, including CTX1B, SCNA, and ARSA, were uncovered. Furthermore, GO and KEGG enrichment analyses indicated that these genes primarily function in tissue synthesis, regulation of neuron projection development, vesicle organization and transportation, and lysosomal impact. The potential PD-associated genes identified in this study not only offer fresh insights into the disease's pathophysiology but also suggest potential biomarkers for early disease detection.
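The overlap step in the abstract above, testing whether TWAS-identified genes are enriched among differentially expressed genes, corresponds to a one-sided Fisher's exact test, which is a hypergeometric tail probability. The sketch below is illustrative only: the abstract does not give the background gene count or the exact test layout, so the function name `enrichment_pvalue` and all numbers in the example are hypothetical.

```python
from math import comb

def enrichment_pvalue(overlap, n_selected, n_marked, n_total):
    """One-sided Fisher's exact test for over-representation.

    overlap:    selected genes that are also marked (e.g. differentially expressed)
    n_selected: number of selected genes (e.g. TWAS hits)
    n_marked:   number of marked genes in the background
    n_total:    size of the background gene set
    """
    upper = min(n_selected, n_marked)
    if not 0 <= overlap <= upper or n_marked > n_total or n_selected > n_total:
        raise ValueError("inconsistent counts")
    # P(X >= overlap) where X follows a hypergeometric distribution.
    tail = sum(comb(n_marked, k) * comb(n_total - n_marked, n_selected - k)
               for k in range(overlap, upper + 1))
    return tail / comb(n_total, n_selected)

# Hypothetical toy counts: 6 of 10 selected genes fall among
# 20 marked genes in a background of 100 genes.
p_enrich = enrichment_pvalue(overlap=6, n_selected=10, n_marked=20, n_total=100)
```

The result depends strongly on the choice of background set, so the background should be the genes that were actually testable in both analyses.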
Affiliation(s)
- Ya-Shi Wu
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Department of Cell Biology and Medical Genetics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Wen-Han Zheng
- Department of Cell Biology and Medical Genetics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Tai-Hang Liu
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Yan Sun
- Department of Cell Biology and Medical Genetics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Yu-Ting Xu
- Department of Cell Biology and Medical Genetics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Li-Zhen Shao
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Qin-Yu Cai
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Ya Qin Tang
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
|
24
|
Xu X, Khunsriraksakul C, Eales JM, Rubin S, Scannali D, Saluja S, Talavera D, Markus H, Wang L, Drzal M, Maan A, Lay AC, Prestes PR, Regan J, Diwadkar AR, Denniff M, Rempega G, Ryszawy J, Król R, Dormer JP, Szulinska M, Walczak M, Antczak A, Matías-García PR, Waldenberger M, Woolf AS, Keavney B, Zukowska-Szczechowska E, Wystrychowski W, Zywiec J, Bogdanski P, Danser AHJ, Samani NJ, Guzik TJ, Morris AP, Liu DJ, Charchar FJ, Tomaszewski M. Genetic imputation of kidney transcriptome, proteome and multi-omics illuminates new blood pressure and hypertension targets. Nat Commun 2024; 15:2359. [PMID: 38504097 PMCID: PMC10950894 DOI: 10.1038/s41467-024-46132-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 02/14/2024] [Indexed: 03/21/2024] Open
Abstract
Genetic mechanisms of blood pressure (BP) regulation remain poorly defined. Using kidney-specific epigenomic annotations and 3D genome information we generated and validated gene expression prediction models for the purpose of transcriptome-wide association studies in 700 human kidneys. We identified 889 kidney genes associated with BP of which 399 were prioritised as contributors to BP regulation. Imputation of kidney proteome and microRNAome uncovered 97 renal proteins and 11 miRNAs associated with BP. Integration with plasma proteomics and metabolomics illuminated circulating levels of myo-inositol, 4-guanidinobutanoate and angiotensinogen as downstream effectors of several kidney BP genes (SLC5A11, AGMAT, AGT, respectively). We showed that genetically determined reduction in renal expression may mimic the effects of rare loss-of-function variants on kidney mRNA/protein and lead to an increase in BP (e.g., ENPEP). We demonstrated a strong correlation (r = 0.81) in expression of protein-coding genes between cells harvested from urine and the kidney highlighting a diagnostic potential of urinary cell transcriptomics. We uncovered adenylyl cyclase activators as a repurposing opportunity for hypertension and illustrated examples of BP-elevating effects of anticancer drugs (e.g. tubulin polymerisation inhibitors). Collectively, our studies provide new biological insights into genetic regulation of BP with potential to drive clinical translation in hypertension.
Affiliation(s)
- Xiaoguang Xu
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | | | - James M Eales
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - Sebastien Rubin
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - David Scannali
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - Sushant Saluja
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - David Talavera
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - Havell Markus
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
| | - Lida Wang
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
| | - Maciej Drzal
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - Akhlaq Maan
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - Abigail C Lay
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - Priscilla R Prestes
- Health Innovation and Transformation Centre, Federation University Australia, Ballarat, Australia
| | - Jeniece Regan
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
| | - Avantika R Diwadkar
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
| | - Matthew Denniff
- Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
| | - Grzegorz Rempega
- Department of Urology, Medical University of Silesia, Katowice, Poland
| | - Jakub Ryszawy
- Department of Urology, Medical University of Silesia, Katowice, Poland
| | - Robert Król
- Department of General, Vascular and Transplant Surgery, Faculty of Medical Sciences in Katowice, Medical University of Silesia, Katowice, Poland
| | - John P Dormer
- Department of Cellular Pathology, University Hospitals of Leicester, Leicester, UK
| | - Monika Szulinska
- Department of Obesity, Metabolic Disorders Treatment and Clinical Dietetics, Karol Marcinkowski University of Medical Sciences, Poznan, Poland
| | - Marta Walczak
- Department of Internal Diseases, Metabolic Disorders and Arterial Hypertension, Poznan University of Medical Sciences, Poznan, Poland
| | - Andrzej Antczak
- Department of Urology and Uro-oncology, Karol Marcinkowski University of Medical Sciences, Poznan, Poland
| | - Pamela R Matías-García
- Institute of Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- Research Unit Molecular Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- German Research Center for Cardiovascular Disease (DZHK), partner site Munich Heart Alliance, Munich, Germany
| | - Melanie Waldenberger
- Institute of Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- Research Unit Molecular Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- German Research Center for Cardiovascular Disease (DZHK), partner site Munich Heart Alliance, Munich, Germany
| | - Adrian S Woolf
- Division of Cell Matrix Biology and Regenerative Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Royal Manchester Children's Hospital and Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust, Manchester, UK
| | - Bernard Keavney
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust Manchester, Manchester Royal Infirmary, Manchester, UK
| | | | - Wojciech Wystrychowski
- Department of General, Vascular and Transplant Surgery, Faculty of Medical Sciences in Katowice, Medical University of Silesia, Katowice, Poland
| | - Joanna Zywiec
- Department of Internal Medicine, Diabetology and Nephrology, Zabrze, Medical University of Silesia, Katowice, Poland
| | - Pawel Bogdanski
- Department of Obesity, Metabolic Disorders Treatment and Clinical Dietetics, Karol Marcinkowski University of Medical Sciences, Poznan, Poland
| | - A H Jan Danser
- Department of Internal Medicine, Division of Pharmacology and Vascular Medicine, Erasmus Medical Centre, Rotterdam, The Netherlands
| | - Nilesh J Samani
- Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
- NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
| | - Tomasz J Guzik
- Department of Internal Medicine, Jagiellonian University Medical College, Kraków, Poland
- Centre for Cardiovascular Sciences, Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
- Center for Medical Genomics OMICRON, Jagiellonian University Medical College, Kraków, Poland
| | - Andrew P Morris
- Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Division of Musculoskeletal & Dermatological Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
| | - Dajiang J Liu
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
| | - Fadi J Charchar
- Health Innovation and Transformation Centre, Federation University Australia, Ballarat, Australia
- Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
- Department of Physiology, University of Melbourne, Melbourne, Australia
| | - Maciej Tomaszewski
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK.
- Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust Manchester, Manchester Royal Infirmary, Manchester, UK.
|
25
|
Jia K, Shen J. Transcriptome-wide association studies associated with Crohn's disease: challenges and perspectives. Cell Biosci 2024; 14:29. [PMID: 38403629 PMCID: PMC10895848 DOI: 10.1186/s13578-024-01204-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 02/04/2024] [Indexed: 02/27/2024] Open
Abstract
Crohn's disease (CD) is regarded as a lifelong progressive disease affecting all segments of the intestinal tract and multiple organs. Based on genome-wide association studies (GWAS) and gene expression data, transcriptome-wide association studies (TWAS) can help identify susceptibility genes associated with pathogenesis and disease behavior. In this review, we give an overview of seven reported TWASs of CD, summarize their study designs, and discuss the key methods and steps used in TWAS that affect the prioritization of susceptibility genes. We summarize the screening of tissue-specific susceptibility genes for CD and discuss the reported potential pathological mechanisms of overlapping susceptibility genes related to CD in a given tissue type. By performing GO pathway enrichment analysis for susceptibility genes, we observed that ileal lipid-related metabolism and colonic extracellular vesicles may be involved in the pathogenesis of CD. We further point out the low reproducibility of TWAS associated with CD and discuss the reasons for this issue and strategies for addressing it. Future TWAS should be designed around large-scale, unified cohorts, unified analysis pipelines, and fully classified databases of expression trait loci.
Affiliation(s)
- Keyu Jia
- Laboratory of Medicine, Baoshan Branch, Ren Ji Hospital, School of Medicine, Nephrology department, Shanghai Jiao Tong University, 1058 Huanzhen Northroad, Shanghai, 200444, China
| | - Jun Shen
- Laboratory of Medicine, Baoshan Branch, Ren Ji Hospital, School of Medicine, Nephrology department, Shanghai Jiao Tong University, 1058 Huanzhen Northroad, Shanghai, 200444, China.
- Division of Gastroenterology and Hepatology, Key Laboratory of Gastroenterology and Hepatology, Ministry of Health, Inflammatory Bowel Research Center, Ren Ji Hospital, School of Medicine, Shanghai Institute of Digestive Disease, Shanghai Jiao Tong University, Shanghai, China.
- NHC Key Laboratory of Digestive Diseases, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China.
- Division of Gastroenterology and Hepatology, Baoshan Branch, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
|
26
|
He J, Li Q, Zhang Q. rvTWAS: identifying gene-trait association using sequences by utilizing transcriptome-directed feature selection. Genetics 2024; 226:iyad204. [PMID: 38001381 DOI: 10.1093/genetics/iyad204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 11/14/2023] [Accepted: 11/16/2023] [Indexed: 11/26/2023] Open
Abstract
Toward the identification of the genetic basis of complex traits, transcriptome-wide association study (TWAS) has been successful in integrating transcriptome data. However, TWAS is applicable only to common variants, excluding rare variants in exome or whole-genome sequences. This is partly because of the inherent limitation of TWAS protocols that rely on predicting gene expression. Our previous research revealed an insight into TWAS: its 2 steps, building and applying the expression prediction models, are essentially genetic feature selection and aggregation and do not have to involve prediction. Based on this insight, the inability of rare variants to predict expression traits is no longer an obstacle. Herein, we developed "rare variant TWAS," or rvTWAS, which first uses a Bayesian model to conduct expression-directed feature selection and then uses a kernel machine to carry out feature aggregation, forming a model that leverages expression for association mapping including rare variants. We demonstrated the performance of rvTWAS by thorough simulations and real data analysis in 3 psychiatric disorders, namely schizophrenia, bipolar disorder, and autism spectrum disorder. We confirmed that rvTWAS outperforms existing TWAS protocols and revealed additional genes underlying psychiatric disorders. In particular, we formed a hypothetical mechanism in which zinc finger genes impact all 3 disorders through transcriptional regulation. rvTWAS will open a door for sequence-based association mapping that integrates gene expression.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
| | - Qing Li
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
| | - Qingrun Zhang
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary T2N 1N4, Canada
|
27
|
Liu L, Yan R, Guo P, Ji J, Gong W, Xue F, Yuan Z, Zhou X. Conditional transcriptome-wide association study for fine-mapping candidate causal genes. Nat Genet 2024; 56:348-356. [PMID: 38279040 DOI: 10.1038/s41588-023-01645-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Accepted: 12/08/2023] [Indexed: 01/28/2024]
Abstract
Transcriptome-wide association studies (TWASs) aim to integrate genome-wide association studies with expression-mapping studies to identify genes with genetically predicted expression (GReX) associated with a complex trait. In the present report, we develop a method, GIFT (gene-based integrative fine-mapping through conditional TWAS), that performs conditional TWAS analysis by explicitly controlling for GReX of all other genes residing in a local region to fine-map putatively causal genes. GIFT is frequentist in nature, explicitly models both expression correlation and cis-single nucleotide polymorphism linkage disequilibrium across multiple genes and uses a likelihood framework to account for expression prediction uncertainty. As a result, GIFT produces calibrated P values and is effective for fine-mapping. We apply GIFT to analyze six traits in the UK Biobank, where GIFT narrows down the set size of putatively causal genes by 32.16-91.32% compared with existing TWAS fine-mapping approaches. The genes identified by GIFT highlight the importance of vessel regulation in determining blood pressures and lipid metabolism for regulating lipid levels.
Affiliation(s)
- Lu Liu
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Ran Yan
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Ping Guo
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Jiadong Ji
- Institute for Financial Studies, Shandong University, Jinan, China
| | - Weiming Gong
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Fuzhong Xue
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Zhongshang Yuan
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, China.
- Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, China.
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA.
|
28
|
Goddard TR, Brookes KJ, Sharma R, Moemeni A, Rajkumar AP. Dementia with Lewy Bodies: Genomics, Transcriptomics, and Its Future with Data Science. Cells 2024; 13:223. [PMID: 38334615 PMCID: PMC10854541 DOI: 10.3390/cells13030223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/17/2024] [Accepted: 01/23/2024] [Indexed: 02/10/2024] Open
Abstract
Dementia with Lewy bodies (DLB) is a significant public health issue. It is the second most common neurodegenerative dementia and presents with severe neuropsychiatric symptoms. Genomic and transcriptomic analyses have provided some insight into disease pathology. Variants within SNCA, GBA, APOE, SNCB, and MAPT have been shown to be associated with DLB in repeated genomic studies. Transcriptomic analysis, conducted predominantly on candidate genes, has identified signatures of synuclein aggregation, protein degradation, amyloid deposition, neuroinflammation, mitochondrial dysfunction, and the upregulation of heat-shock proteins in DLB. Yet, the understanding of DLB molecular pathology is incomplete. This precipitates the current clinical position whereby there are no available disease-modifying treatments or blood-based diagnostic biomarkers. Data science methods have the potential to improve disease understanding, optimising therapeutic intervention and drug development, to reduce disease burden. Genomic prediction will facilitate the early identification of cases and the timely application of future disease-modifying treatments. Transcript-level analyses across the entire transcriptome and machine learning analysis of multi-omic data will uncover novel signatures that may provide clues to DLB pathology and improve drug development. This review will discuss the current genomic and transcriptomic understanding of DLB, highlight gaps in the literature, and describe data science methods that may advance the field.
Affiliation(s)
- Thomas R. Goddard
- Mental Health and Clinical Neurosciences Academic Unit, Institute of Mental Health, School of Medicine, University of Nottingham, Nottingham NG7 2TU, UK
| | - Keeley J. Brookes
- Department of Biosciences, School of Science & Technology, Nottingham Trent University, Nottingham NG11 8NS, UK
| | - Riddhi Sharma
- Biodiscovery Institute, School of Medicine, University of Nottingham, Nottingham NG7 2RD, UK
- UK Health Security Agency, Radiation Effects Department, Radiation Protection Science Division, Harwell Science Campus, Didcot, Oxfordshire OX11 0RQ, UK
| | - Armaghan Moemeni
- School of Computer Science, University of Nottingham, Nottingham NG8 1BB, UK
| | - Anto P. Rajkumar
- Mental Health and Clinical Neurosciences Academic Unit, Institute of Mental Health, School of Medicine, University of Nottingham, Nottingham NG7 2TU, UK
|
29
|
Yang J, Liu X, Oveisgharan S, Zammit AR, Nag S, Bennett DA, Buchman AS. Inferring Alzheimer's disease pathologic traits from clinical measures in living adults. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.05.08.23289668. [PMID: 37214885 PMCID: PMC10197717 DOI: 10.1101/2023.05.08.23289668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Background: Alzheimer's disease neuropathologic changes (AD-NC) are important for identifying people at high risk for AD dementia (ADD) and for subtyping ADD. Objective: To develop imputation models based on clinical measures to infer AD-NC. Methods: We used penalized generalized linear regression to train imputation models for four AD-NC traits (amyloid-β, tangles, global AD pathology, and pathologic AD) in Rush Memory and Aging Project decedents, using clinical measures at the last visit prior to death as predictors. We validated these models by inferring AD-NC traits with clinical measures at the last visit prior to death for independent Religious Orders Study (ROS) decedents. We inferred baseline AD-NC traits for all ROS participants at study entry, and then tested whether inferred AD-NC traits at study entry predicted incident ADD and postmortem pathologic AD. Results: Inferred AD-NC traits at the last visit prior to death were related to postmortem measures with R² = (0.188, 0.316, 0.262) for amyloid-β, tangles, and global AD pathology, respectively, and a prediction area under the receiver operating characteristic curve (AUC) of 0.765 for pathologic AD. Inferred baseline levels of all four AD-NC traits predicted ADD. The strongest prediction was obtained by the inferred baseline probabilities of pathologic AD, with AUC = (0.919, 0.896) for predicting the development of ADD in 3 and 5 years from baseline. The inferred baseline levels of all four AD-NC traits significantly discriminated pathologic AD profiled eight years later, with p-values < 1.4 × 10⁻¹⁰. Conclusion: Inferred AD-NC traits based on clinical measures may provide effective AD biomarkers that can estimate the burden of AD-NC traits in aging adults.
Affiliation(s)
- Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, 615 Michael St, Atlanta, GA, 30322, USA
| | - Xizhu Liu
- Department of Biostatistics, Yale University School of Public Health, 60 College St, New Haven, CT, 06510, USA
| | - Shahram Oveisgharan
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - Andrea R. Zammit
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - Sukriti Nag
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - David A Bennett
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - Aron S Buchman
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
|
30
|
Shi JJ, Mao CY, Guo YZ, Fan Y, Hao XY, Li SJ, Tian J, Hu ZW, Li MJ, Li JD, Ma DR, Guo MN, Zuo CY, Liang YY, Xu YM, Yang J, Shi CH. Joint analysis of proteome, transcriptome, and multi-trait analysis to identify novel Parkinson's disease risk genes. Aging (Albany NY) 2024; 16:1555-1580. [PMID: 38240717 PMCID: PMC10866412 DOI: 10.18632/aging.205444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 12/04/2023] [Indexed: 02/06/2024]
Abstract
Genome-wide association studies (GWAS) have identified multiple risk variants for Parkinson's disease (PD). Nevertheless, how these risk variants confer PD risk remains largely unknown. We conducted a proteome-wide association study (PWAS) and summary-data-based Mendelian randomization (SMR) analysis by integrating PD GWAS with proteome and protein quantitative trait loci (pQTL) data from human brain, plasma, and CSF. We also performed a large transcriptome-wide association study (TWAS) and fine-mapping of causal gene sets (FOCUS), leveraging joint-tissue imputation (JTI) prediction models of 22 tissues to identify and prioritize putatively causal genes. To boost statistical power and identify additional PD risk genes, we further conducted PWAS, SMR, TWAS, and FOCUS using a multi-trait analysis of GWAS (MTAG). In this large-scale study, we identified 16 genes whose genetically regulated protein abundance levels were associated with PD risk. Through TWAS and FOCUS analyses of PD and correlated traits, we discovered 26 causal genes related to PD that had not been reported in previous TWAS. Five genes (CD38, GPNMB, RAB29, TMEM175, TTC19) showed significant associations with PD at both the proteome-wide and transcriptome-wide levels. Our study provides new insights into the etiology and underlying genetic architecture of PD.
Affiliation(s)
- Jing-Jing Shi
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Cheng-Yuan Mao
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Ya-Zhou Guo
- School of Life Sciences, Westlake University, Hangzhou 310024, Zhejiang, China
| | - Yu Fan
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Xiao-Yan Hao
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Shuang-Jie Li
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Jie Tian
- Zhengzhou Railway Vocational and Technical College, Zhengzhou 450000, Henan, China
| | - Zheng-Wei Hu
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Meng-Jie Li
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Jia-Di Li
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Dong-Rui Ma
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Meng-Nan Guo
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Chun-Yan Zuo
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Yuan-Yuan Liang
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Yu-Ming Xu
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- NHC Key Laboratory of Prevention and Treatment of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Henan Key Laboratory of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Institute of Neuroscience, Zhengzhou University, Zhengzhou 450000, Henan, China
| | - Jian Yang
- School of Life Sciences, Westlake University, Hangzhou 310024, Zhejiang, China
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, Zhejiang, China
| | - Chang-He Shi
- Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- NHC Key Laboratory of Prevention and Treatment of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Henan Key Laboratory of Cerebrovascular Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou 450000, Henan, China
- Institute of Neuroscience, Zhengzhou University, Zhengzhou 450000, Henan, China
|
31
|
Yang J, Liu X, Oveisgharan S, Zammit AR, Nag S, Bennett DA, Buchman AS. Inferring Alzheimer's Disease Pathologic Traits from Clinical Measures in Living Adults. J Alzheimers Dis 2024; 98:95-107. [PMID: 38427476 PMCID: PMC11034758 DOI: 10.3233/jad-230639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2024]
Abstract
Background: Alzheimer's disease neuropathologic changes (AD-NC) are important for identifying people at high risk for AD dementia (ADD) and for subtyping ADD. Objective: To develop imputation models based on clinical measures to infer AD-NC. Methods: We used penalized generalized linear regression to train imputation models for four AD-NC traits (amyloid-β, tangles, global AD pathology, and pathologic AD) in Rush Memory and Aging Project decedents, using clinical measures at the last visit prior to death as predictors. We validated these models by inferring AD-NC traits with clinical measures at the last visit prior to death for independent Religious Orders Study (ROS) decedents. We inferred baseline AD-NC traits for all ROS participants at study entry, and then tested whether inferred AD-NC traits at study entry predicted incident ADD and postmortem pathologic AD. Results: Inferred AD-NC traits at the last visit prior to death were related to postmortem measures with R² = (0.188, 0.316, 0.262) for amyloid-β, tangles, and global AD pathology, respectively, and a prediction area under the receiver operating characteristic curve (AUC) of 0.765 for pathologic AD. Inferred baseline levels of all four AD-NC traits predicted ADD. The strongest prediction was obtained by the inferred baseline probabilities of pathologic AD, with AUC = (0.919, 0.896) for predicting the development of ADD in 3 and 5 years from baseline. The inferred baseline levels of all four AD-NC traits significantly discriminated pathologic AD profiled eight years later, with p-values < 1.4 × 10⁻¹⁰. Conclusions: Inferred AD-NC traits based on clinical measures may provide effective AD biomarkers that can estimate the burden of AD-NC traits in aging adults.
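Illustrative note: the AUC values quoted in this abstract measure how well an inferred probability separates cases from non-cases. The sketch below is not the authors' code. It computes AUC from its rank interpretation, the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions made up for this example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = case, e.g. pathologic AD).
    scores: iterable of predicted probabilities, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above non-case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in sample size and is used here only for clarity; for large cohorts a rank-based implementation from an established statistics library would be the usual choice.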
Affiliation(s)
- Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, 615 Michael St, Atlanta, GA, 30322, USA
| | - Xizhu Liu
- Department of Biostatistics, Yale University School of Public Health, 60 College St, New Haven, CT, 06510, USA
| | - Shahram Oveisgharan
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - Andrea R. Zammit
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - Sukriti Nag
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - David A Bennett
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
| | - Aron S Buchman
- Rush Alzheimer’s Disease Center, Rush University Medicine Center, 1620 W Harrison St, Chicago, IL, 60612, USA
|
32
|
Zhang X, Gomez L, Below JE, Naj AC, Martin ER, Kunkle BW, Bush WS. An X Chromosome Transcriptome Wide Association Study Implicates ARMCX6 in Alzheimer's Disease. J Alzheimers Dis 2024; 98:1053-1067. [PMID: 38489177 DOI: 10.3233/jad-231075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2024]
Abstract
Background: The X chromosome is often omitted in disease association studies despite containing thousands of genes that may provide insight into well-known sex differences in the risk of Alzheimer's disease (AD). Objective: To model the expression of X chromosome genes and evaluate their impact on AD risk in a sex-stratified manner. Methods: Using elastic net, we evaluated multiple modeling strategies in a set of 175 whole blood samples and 126 brain cortex samples with whole genome sequencing and RNA-seq data. SNPs (MAF > 0.05) within the cis-regulatory window were used to train tissue-specific models of each gene. We applied the best models in both tissues to sex-stratified summary statistics from a meta-analysis of Alzheimer's Disease Genetics Consortium (ADGC) studies to identify AD-related genes on the X chromosome. Results: Across different model parameters, sample sex, and tissue types, we modeled the expression of 217 genes (95 genes in blood and 135 genes in brain cortex). The average model R² was 0.12 (range 0.03 to 0.34). We also compared sex-stratified and sex-combined models on the X chromosome. We further investigated genes that escaped X chromosome inactivation (XCI) to determine whether their genetic regulation patterns were distinct. We found ten genes associated with AD at p < 0.05, with only ARMCX6 in female brain cortex (p = 0.008) nearing the significance threshold after adjusting for multiple testing (α = 0.002). Conclusions: We optimized the expression prediction of X chromosome genes, applied these models to sex-stratified AD GWAS summary statistics, and identified one putative AD risk gene, ARMCX6.
Affiliation(s)
- Xueyi Zhang
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
| | - Lissette Gomez
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Jennifer E Below
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Adam C Naj
- Department of Biostatistics, Epidemiology, and Informatics, Penn Neurodegeneration Genomics Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
| | - Eden R Martin
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Brian W Kunkle
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - William S Bush
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
|
33
|
Kim S, Qin Y, Park HJ, Yue M, Xu Z, Forno E, Chen W, Celedón JC. Methyl-TWAS: A powerful method for in silico transcriptome-wide association studies (TWAS) using long-range DNA methylation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.10.566586. [PMID: 38014125 PMCID: PMC10680683 DOI: 10.1101/2023.11.10.566586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
In silico transcriptome-wide association studies (TWAS) are commonly used to test whether expression of specific genes is linked to a complex trait. However, genotype-based in silico TWAS, such as PrediXcan, exhibit low prediction accuracy for a majority of genes because genotypic data lack tissue- and disease-specificity and are not affected by the environment. Because methylation is tissue-specific and, like gene expression, can be modified by environment or disease status, methylation should predict gene expression with more accuracy than SNPs. Therefore, we propose Methyl-TWAS, the first approach that utilizes long-range methylation markers to impute gene expression for in silico TWAS through penalized regression. Methyl-TWAS 1) predicts epigenetically regulated/associated expression (eGReX), which incorporates tissue-specific expression and both genetically regulated (GReX) and environmentally regulated expression, to identify differentially expressed genes (DEGs) that could not be identified by genotype-based methods; and 2) incorporates both cis- and trans-CpGs, including various regulatory regions, to identify DEGs that would be missed using cis-methylation only. Methyl-TWAS outperforms PrediXcan and two other methods in imputing gene expression in the nasal epithelium, particularly for immunity-related genes and DEGs in atopic asthma. Methyl-TWAS identified 3,681 (85.2%) of the 4,316 DEGs identified in a previous TWAS of atopic asthma using measured expression, while PrediXcan could not identify any gene. Methyl-TWAS also outperforms PrediXcan for expression imputation as well as in silico TWAS in white blood cells. Methyl-TWAS is a valuable tool for in silico TWAS, leveraging a growing body of publicly available genome-wide DNA methylation data for a variety of human tissues.
Affiliation(s)
- Soyeon Kim
- Division of Pulmonary Medicine, Department of Pediatrics, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Yidi Qin
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Hyun Jung Park
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Molin Yue
- Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Zhongli Xu
- Division of Pulmonary Medicine, Department of Pediatrics, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
- School of Medicine, Tsinghua University, Beijing, China
| | - Erick Forno
- Division of Pulmonary Medicine, Department of Pediatrics, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Wei Chen
- Division of Pulmonary Medicine, Department of Pediatrics, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
| | - Juan C. Celedón
- Division of Pulmonary Medicine, Department of Pediatrics, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
|
34
|
Head ST, Dezem F, Todor A, Yang J, Plummer J, Gayther S, Kar S, Schildkraut J, Epstein MP. Cis- and trans-eQTL TWAS of breast and ovarian cancer identify more than 100 risk associated genes in the BCAC and OCAC consortia. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.09.566218. [PMID: 38014246 PMCID: PMC10680675 DOI: 10.1101/2023.11.09.566218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Transcriptome-wide association studies (TWAS) have investigated the role of genetically regulated transcriptional activity in the etiologies of breast and ovarian cancer. However, methods applied to date have only considered regulatory effects of risk-associated SNPs thought to act in cis on a nearby target gene. With growing evidence for distal (trans) regulatory effects of variants on gene expression, we performed TWAS of breast and ovarian cancer using a Bayesian genome-wide TWAS method (BGW-TWAS) that considers effects of both cis- and trans-expression quantitative trait loci (eQTLs). We applied BGW-TWAS to whole genome and RNA sequencing data in breast and ovarian tissues from the Genotype-Tissue Expression project to train expression imputation models. We applied these models to large-scale GWAS summary statistic data from the Breast Cancer and Ovarian Cancer Association Consortia to identify genes associated with risk of overall breast cancer, non-mucinous epithelial ovarian cancer, and 10 cancer subtypes. We identified 101 genes significantly associated with risk of breast cancer phenotypes and 8 with risk of ovarian cancer phenotypes. These loci include established risk genes and several novel candidate risk loci, such as ACAP3, whose associations are predominantly driven by trans-eQTLs. We replicated several associations using summary statistics from an independent GWAS of these cancer phenotypes. We further used genotype and expression data in normal and tumor breast tissue from the Cancer Genome Atlas to examine the performance of our trained expression imputation models. This work represents a first look into the role of trans-eQTLs in the complex molecular mechanisms underlying these diseases.
Affiliation(s)
- S. Taylor Head
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
| | - Felipe Dezem
- Department of Developmental Neurobiology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Andrei Todor
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
| | - Jingjing Yang
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
| | - Jasmine Plummer
- Department of Developmental Neurobiology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Simon Gayther
- Department of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Siddhartha Kar
- Early Cancer Institute, Department of Oncology, University of Cambridge, Cambridge CB2 0XZ, UK
| | - Joellen Schildkraut
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
| | - Michael P. Epstein
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA 30322, USA
|
35
|
Jung S, Lee CH, Sul JH, Han B. Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data. HGG ADVANCES 2023; 4:100223. [PMID: 37576186 PMCID: PMC10413136 DOI: 10.1016/j.xhgg.2023.100223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 05/04/2023] [Indexed: 08/15/2023] Open
Abstract
Accurate imputation of tissue-specific gene expression can be a powerful tool for understanding the biological mechanisms underlying human complex traits. Existing imputation methods can be grouped into two categories according to the types of predictors used. The first category uses genotype data, while the second category uses whole-blood expression data. Both data types can be easily collected from blood, avoiding invasive tissue biopsies. In this study, we attempted to build an optimal predictive model for imputing tissue-specific gene expression by combining the genotype and whole-blood expression data. We first evaluated the imputation performance of each standalone model (using genotype data [GEN model] and using whole-blood expression data [WBE model]) using their respective data types across 47 human tissues. The WBE model outperformed the GEN model in most tissues by a large margin. Then, we developed several combined models that leverage both types of predictors to further improve imputation performance. We tried various strategies, including utilizing a merged dataset of the two data types (MERGED models) and integrating the imputation outcomes of the two standalone models (inverse variance-weighted [IVW] models). We found that one of the MERGED models noticeably outperformed the standalone models. This model involved a fixed ratio between the two regularization penalty factors for the two predictor types so that the contribution of the whole-blood transcriptome is upweighted compared with the genotype. Our study suggests that one can improve the imputation of tissue-specific gene expression by combining the genotype and whole-blood expression, but the improvement can be largely dependent on the combination strategy chosen.
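Illustrative note: the inverse variance-weighted (IVW) strategy named in this abstract combines two estimates by weighting each with the reciprocal of its variance, so the more precise estimate contributes more. The sketch below shows only that generic formula. How the authors estimated the variance of each standalone model is not stated in the abstract, so the function `ivw_combine` and all numbers here are assumptions for illustration.

```python
def ivw_combine(pred_a, var_a, pred_b, var_b):
    """Inverse variance-weighted average of two predictions.

    Returns the combined prediction and its variance, assuming the two
    inputs are independent estimates of the same quantity.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    w_a = 1.0 / var_a          # weight of the first estimate
    w_b = 1.0 / var_b          # weight of the second estimate
    combined = (w_a * pred_a + w_b * pred_b) / (w_a + w_b)
    combined_var = 1.0 / (w_a + w_b)
    return combined, combined_var


# Hypothetical values: a genotype-based (GEN) and a whole-blood-expression-based
# (WBE) prediction of one gene's expression in one sample.
gen_pred, gen_var = 0.40, 0.09
wbe_pred, wbe_var = 0.10, 0.01
estimate, variance = ivw_combine(gen_pred, gen_var, wbe_pred, wbe_var)
print(f"combined = {estimate:.3f}, variance = {variance:.4f}")
```

With these toy inputs the combined value sits much closer to the lower-variance prediction, which is the defining behavior of an IVW average.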
Affiliation(s)
- Sunwoo Jung
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
| | - Cue Hyunkyu Lee
- Department of Biostatistics, Columbia University, New York, NY, USA
| | - Jae Hoon Sul
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Buhm Han
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Republic of Korea
- Department of Biomedical Sciences, BK21 Plus Biomedical Science Project, Seoul National University College of Medicine, Seoul, Republic of Korea
|
36
|
Stolzenburg LR, Esmaeeli S, Kulkarni AS, Murphy E, Kwon T, Preiss C, Bahnassawy L, Stender JD, Manos JD, Reinhardt P, Rahimov F, Waring JF, Ramathal CY. Functional characterization of a single nucleotide polymorphism associated with Alzheimer's disease in a hiPSC-based neuron model. PLoS One 2023; 18:e0291029. [PMID: 37751459 PMCID: PMC10521995 DOI: 10.1371/journal.pone.0291029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Accepted: 08/20/2023] [Indexed: 09/28/2023] Open
Abstract
Neurodegenerative diseases encompass a group of debilitating conditions resulting from progressive nerve cell death. Of these, Alzheimer's disease (AD) occurs most frequently, but is currently incurable and has limited treatment success. Late onset AD, the most common form, is highly heritable but is caused by a combination of non-genetic risk factors and many low-effect genetic variants whose disease-causing mechanisms remain unclear. By mining the FinnGen study database of phenome-wide association studies, we identified a rare variant, rs148726219, enriched in the Finnish population that is associated with AD risk and dementia, and appears to have arisen on a common haplotype with older AD-associated variants such as rs429358. The rs148726219 variant lies in an overlapping intron of the FosB proto-oncogene (FOSB) and ERCC excision repair 1 (ERCC1) genes. To understand the impact of this SNP on disease phenotypes, we performed CRISPR/Cas9 editing in a human induced pluripotent stem cell (hiPSC) line to generate isogenic clones harboring heterozygous and homozygous alleles of rs148726219. hiPSC clones differentiated into induced excitatory neurons (iNs) did not exhibit detectable molecular or morphological variation in differentiation potential compared to isogenic controls. However, global transcriptome analysis showed differential regulation of nearby genes and upregulation of several biological pathways related to neuronal function, particularly synaptogenesis and calcium signaling, specifically in mature iNs harboring rs148726219 homozygous and heterozygous alleles. Functional differences in iN circuit maturation as measured by calcium imaging were observed across genotypes. Edited mature iNs also displayed downregulation of unfolded protein response and cell death pathways. This study implicates a phenotypic impact of rs148726219 in the context of mature neurons, consistent with its identification in late onset AD, and underscores a hiPSC-based experimental model to functionalize GWAS-identified variants.
Affiliation(s)
| | - Sahar Esmaeeli
- AbbVie Inc., North Chicago, Illinois, United States of America
| | - Erin Murphy
- AbbVie Inc., North Chicago, Illinois, United States of America
| | - Taekyung Kwon
- AbbVie, Cambridge Research Center, Cambridge, Massachusetts, United States of America
| | - Christina Preiss
- AbbVie, Cambridge Research Center, Cambridge, Massachusetts, United States of America
| | - Lamiaa Bahnassawy
- AbbVie Deutschland GmbH & Co. KG, Neuroscience Discovery, Knollstrasse, Ludwigshafen, Germany
| | - Justine D. Manos
- AbbVie, Cambridge Research Center, Cambridge, Massachusetts, United States of America
| | - Peter Reinhardt
- AbbVie Deutschland GmbH & Co. KG, Neuroscience Discovery, Knollstrasse, Ludwigshafen, Germany
| | - Fedik Rahimov
- AbbVie Inc., North Chicago, Illinois, United States of America
|
37
|
de Leeuw C, Werme J, Savage JE, Peyrot WJ, Posthuma D. On the interpretation of transcriptome-wide association studies. PLoS Genet 2023; 19:e1010921. [PMID: 37676898 PMCID: PMC10508613 DOI: 10.1371/journal.pgen.1010921] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/04/2023] [Revised: 09/19/2023] [Accepted: 08/15/2023] [Indexed: 09/09/2023] Open
Abstract
Transcriptome-wide association studies (TWAS) aim to detect relationships between gene expression and a phenotype, and are commonly used for secondary analysis of genome-wide association study (GWAS) results. Results from TWAS analyses are often interpreted as indicating a genetic relationship between gene expression and a phenotype, but this interpretation is not consistent with the null hypothesis that is evaluated in the traditional TWAS framework. In this study we provide a mathematical outline of this TWAS framework, and elucidate what interpretations are warranted given the null hypothesis it actually tests. We then use both simulations and real data analysis to assess the implications of misinterpreting TWAS results as indicative of a genetic relationship between gene expression and the phenotype. Our simulation results show considerably inflated type 1 error rates for TWAS when interpreted this way, with 41% of significant TWAS associations detected in the real data analysis found to have insufficient statistical evidence to infer such a relationship. This demonstrates that in current implementations, TWAS cannot reliably be used to investigate genetic relationships between gene expression and a phenotype, but that local genetic correlation analysis can serve as a potential alternative.
Affiliation(s)
- Christiaan de Leeuw
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
| | - Josefin Werme
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
| | - Jeanne E. Savage
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
| | - Wouter J. Peyrot
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Department of Psychiatry, Amsterdam UMC, location VUmc, Amsterdam, the Netherlands
| | - Danielle Posthuma
- Department of Complex Trait Genetics, Centre for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Department of Child and Adolescent Psychology and Psychiatry, section Complex Trait Genetics, Amsterdam Neuroscience, VU University Medical Centre, Amsterdam, The Netherlands
|
38
|
Guo S, Yang J. Bayesian genome-wide TWAS with reference transcriptomic data of brain and blood tissues identified 93 risk genes for Alzheimer's disease dementia. medRxiv 2023:2023.07.06.23292336. [PMID: 37503151 PMCID: PMC10370241 DOI: 10.1101/2023.07.06.23292336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Background: Transcriptome-wide association study (TWAS) is an influential tool for identifying novel genes associated with complex diseases, where their genetic effects may be mediated through the transcriptome. TWAS utilizes reference genetic and transcriptomic data to estimate genetic effect sizes on expression quantitative traits of target genes (i.e., effect sizes of expression quantitative trait loci, eQTL, in a broad sense). These estimated effect sizes are then employed as variant weights in burden gene-based association test statistics, facilitating the mapping of risk genes for complex diseases with genome-wide association study (GWAS) data. However, most existing TWAS of Alzheimer's disease (AD) dementia have primarily focused on cis-eQTL, disregarding potential trans-eQTL. To overcome this limitation, we applied the Bayesian Genome-wide TWAS (BGW-TWAS) method, which incorporates both cis- and trans-eQTL of brain and blood tissues, to enhance the mapping of risk genes for AD dementia. Methods: We first applied BGW-TWAS to the Genotype-Tissue Expression (GTEx) V8 dataset to estimate cis- and trans-eQTL effect sizes of the prefrontal cortex, cortex, and whole blood tissues. Subsequently, estimated eQTL effect sizes were integrated with the summary data of the most recent GWAS of AD dementia to obtain BGW-TWAS (i.e., gene-based association test) p-values of AD dementia per tissue type. Finally, we used the aggregated Cauchy association test to combine TWAS p-values across three tissues to obtain omnibus TWAS p-values per gene. Results: We identified 37 genes in prefrontal cortex, 55 in cortex, and 51 in whole blood that were significantly associated with AD dementia. By combining BGW-TWAS p-values across these three tissues, we obtained 93 significant risk genes, including 29 genes primarily due to trans-eQTL and 50 novel genes. Utilizing protein-protein interaction network and phenotype enrichment analyses with these 93 significant risk genes, we detected 5 functional clusters composed of both known and novel AD risk genes and 7 enriched phenotypes. Conclusion: We applied BGW-TWAS and aggregated Cauchy test methods to integrate both cis- and trans-eQTL data of brain and blood tissues with GWAS summary data to identify risk genes of AD dementia. The risk genes we identified provide novel insights into the underlying biological pathways implicated in AD dementia.
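As a rough illustration of the cross-tissue combination step described in this abstract, the aggregated Cauchy association test can be sketched as follows. This is a minimal sketch assuming equal weights across tissues; the function name `acat` and the example p-values are hypothetical and not taken from the study, and p-values of exactly 0 or 1 are not handled.

```python
import math

def acat(pvalues, weights=None):
    """Combine p-values with the aggregated Cauchy association test.

    Assumption: weights are non-negative and sum to 1 (equal weights by default).
    """
    if weights is None:
        weights = [1.0 / len(pvalues)] * len(pvalues)
    # Cauchy-transform each p-value; a small p-value maps to a large positive value.
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, pvalues))
    # Under the null, the weighted sum is approximately standard Cauchy,
    # so the combined p-value is the upper tail of that distribution.
    return 0.5 - math.atan(statistic) / math.pi

# Hypothetical per-tissue p-values for one gene (not from the study).
omnibus_p = acat([0.001, 0.4, 0.6])
```

One commonly cited reason for using this kind of combination is that the Cauchy approximation does not require an estimate of the correlation between the per-tissue tests.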
|
39
|
Wang YH, Luo PP, Geng AY, Li X, Liu TH, He YJ, Huang L, Tang YQ. Identification of highly reliable risk genes for Alzheimer's disease through joint-tissue integrative analysis. Front Aging Neurosci 2023; 15:1183119. [PMID: 37416324 PMCID: PMC10320295 DOI: 10.3389/fnagi.2023.1183119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Accepted: 05/30/2023] [Indexed: 07/08/2023] Open
Abstract
Numerous genetic variants associated with Alzheimer's disease (AD) have been identified through genome-wide association studies (GWAS), but their interpretation is hindered by the strong linkage disequilibrium (LD) among the variants, making it difficult to identify the causal variants directly. To address this issue, the transcriptome-wide association study (TWAS) was employed to infer the association between gene expression and a trait at the genetic level using expression quantitative trait locus (eQTL) cohorts. In this study, we applied the TWAS theory and utilized the improved Joint-Tissue Imputation (JTI) approach and Mendelian Randomization (MR) framework (MR-JTI) to identify potential AD-associated genes. By integrating LD score, GTEx eQTL data, and GWAS summary statistic data from a large cohort using MR-JTI, a total of 415 AD-associated genes were identified. Then, 2873 differentially expressed genes from 11 AD-related datasets were used for the Fisher test of these AD-associated genes. We finally obtained 36 highly reliable AD-associated genes, including APOC1, CR1, ERBB2, and RIN3. Moreover, the GO and KEGG enrichment analysis revealed that these genes are primarily involved in antigen processing and presentation, amyloid-beta formation, tau protein binding, and response to oxidative stress. The identification of these potential AD-associated genes not only provides insights into the pathogenesis of AD but also offers biomarkers for early diagnosis of the disease.
Affiliation(s)
- Yong Heng Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Joint International Research Laboratory of Reproduction and Development, Chongqing Medical University, Chongqing, China
| | - Pan Pan Luo
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Ao Yi Geng
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Xinwei Li
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, China
| | - Tai-Hang Liu
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
- Joint International Research Laboratory of Reproduction and Development, Chongqing Medical University, Chongqing, China
| | - Yi Jie He
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Lin Huang
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
| | - Ya Qin Tang
- Department of Bioinformatics, School of Basic Medical Sciences, Chongqing Medical University, Chongqing, China
|
40
|
Dai Q, Zhou G, Zhao H, Võsa U, Franke L, Battle A, Teumer A, Lehtimäki T, Raitakari OT, Esko T, Epstein MP, Yang J. OTTERS: a powerful TWAS framework leveraging summary-level reference data. Nat Commun 2023; 14:1271. [PMID: 36882394 PMCID: PMC9992663 DOI: 10.1038/s41467-023-36862-w] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 07/07/2022] [Accepted: 02/20/2023] [Indexed: 03/09/2023] Open
Abstract
Most existing TWAS tools require individual-level eQTL reference data and thus are not applicable to summary-level reference eQTL datasets. The development of TWAS methods that can harness summary-level reference data is valuable to enable TWAS in broader settings and enhance power due to increased reference sample size. Thus, we develop a TWAS framework called OTTERS (Omnibus Transcriptome Test using Expression Reference Summary data) that adapts multiple polygenic risk score (PRS) methods to estimate eQTL weights from summary-level eQTL reference data and conducts an omnibus TWAS. We show that OTTERS is a practical and powerful TWAS tool by both simulations and application studies.
Affiliation(s)
- Qile Dai
- Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA, 30322, USA
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA
| | - Geyu Zhou
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
| | - Hongyu Zhao
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06520, USA
| | - Urmo Võsa
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 50090, Tartu, Estonia
| | - Lude Franke
- Department of Genetics, University of Groningen, University Medical Center Groningen, 9700 RB, Groningen, The Netherlands
- Oncode Institute, 3521 AL, Utrecht, The Netherlands
| | - Alexis Battle
- Department of Computer Science, and Departments of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
| | - Alexander Teumer
- Institute for Community Medicine, University Medicine Greifswald, 17489, Greifswald, Germany
| | - Terho Lehtimäki
- Department of Clinical Chemistry, Fimlab Laboratories and Finnish Centre for Cardiovascular Disease Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, 33520, Finland
| | - Olli T Raitakari
- Centre for Population Health Research, University of Turku and Turku University Hospital, 20520, Turku, Finland
- Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, 20520, Turku, Finland
- Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, 20521, Turku, Finland
| | - Tõnu Esko
- Estonian Genome Centre, Institute of Genomics, University of Tartu, 50090, Tartu, Estonia
| | - Michael P Epstein
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
| | - Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
|
41
|
Liang M, An B, Deng T, Du L, Li K, Cao S, Du Y, Xu L, Zhang L, Gao X, Cao Y, Zhao Y, Li J, Gao H. Incorporating genome-wide and transcriptome-wide association studies to identify genetic elements of longissimus dorsi muscle in Huaxi cattle. Front Genet 2023; 13:982433. [PMID: 36685878 PMCID: PMC9852892 DOI: 10.3389/fgene.2022.982433] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/30/2022] [Accepted: 12/07/2022] [Indexed: 01/07/2023] Open
Abstract
Locating the genetic variation underlying economically important livestock and poultry traits is essential for genetic improvement in breeding programs. Identifying candidate genes for the productive ability of Huaxi cattle is a crucial element of practical breeding. Based on the genotype and phenotype data of 1,478 individuals and RNA-seq data from a subset of 120 of these individuals, we implemented genome-wide association studies (GWAS), transcriptome-wide association studies (TWAS), and Fisher's combined test (FCT) to identify candidate genes for a carcass trait, the weight of the longissimus dorsi muscle (LDM). The results indicated that GWAS, TWAS, and FCT together identified seven candidate genes for LDM: PENK was located by GWAS and FCT, PPAT was located by TWAS and FCT, and XKR4, MTMR3, FGFRL1, DHRS4, and LAP3 were each located by only one of the methods. After functional analysis of these candidate genes and reference to previously reported studies, we found that they mainly function in body development and the growth of muscle cells. Combining advanced breeding techniques such as gene editing with our study will significantly accelerate genetic improvement in the future breeding of Huaxi cattle.
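For readers unfamiliar with Fisher's combined test, the sketch below shows the standard statistic, X = -2 Σ ln p, compared against a chi-square distribution with 2k degrees of freedom. It is a minimal sketch assuming independent p-values; the function name `fisher_combined` and the example values are hypothetical, and how the study paired GWAS and TWAS p-values per gene is not specified in the abstract.

```python
import math

def fisher_combined(pvalues):
    """Fisher's combined probability test for k independent p-values."""
    k = len(pvalues)
    # Half of the test statistic X = -2 * sum(ln p).
    half_stat = -sum(math.log(p) for p in pvalues)
    # The chi-square survival function with an even number of degrees of
    # freedom (2k) has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# Hypothetical GWAS and TWAS p-values for a single gene.
combined_p = fisher_combined([0.05, 0.05])
```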
Affiliation(s)
- Mang Liang
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Bingxing An
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Tianyu Deng
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lili Du
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Keanning Li
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Sheng Cao
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Yueying Du
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lingyang Xu
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lupei Zhang
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Xue Gao
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Yang Cao
- Jilin Academy of Agricultural Sciences, Changchun, China
| | - Yuming Zhao
- Jilin Academy of Agricultural Sciences, Changchun, China
| | - Junya Li
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Huijiang Gao
- Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China (correspondence: Huijiang Gao)
|
42
|
Gedik H, Peterson RE, Riley BP, Vladimirov VI, Bacanu SA. Integrative Post-Genome-Wide Association Study Analyses Relevant to Psychiatric Disorders: Imputing Transcriptome and Proteome Signals. Complex Psychiatry 2023; 9:130-144. [PMID: 37588130 PMCID: PMC10425719 DOI: 10.1159/000530223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 03/09/2023] [Indexed: 08/18/2023] Open
Abstract
Background: The genome-wide association study (GWAS) is a common tool to identify genetic variants associated with complex traits, including psychiatric disorders (PDs). However, post-GWAS analyses are needed to extend the statistical inference to biologically relevant entities, e.g., genes, proteins, and pathways. To achieve this goal, researchers developed methods that incorporate biologically relevant intermediate molecular phenotypes, such as gene expression and protein abundance, which are posited to mediate the variant-trait association. Transcriptome-wide association study (TWAS) and proteome-wide association study (PWAS) are commonly used methods to test the association between these molecular mediators and the trait. Summary: In this review, we discuss the most recent developments in TWAS and PWAS. These methods integrate existing "omic" information with the GWAS summary statistics for trait(s) of interest. Specifically, they impute transcript/protein data and test the association between imputed gene expression/protein level and the phenotype of interest by using (i) GWAS summary statistics and (ii) reference transcriptomic/proteomic/genomic datasets. TWAS and PWAS are suitable as analysis tools for (i) primary association scans and (ii) fine-mapping to identify potentially causal genes for PDs. Key Messages: As post-GWAS analyses, TWAS and PWAS have the potential to highlight causal genes for PDs. These prioritized genes could indicate targets for the development of novel drug therapies. For researchers attempting such analyses, we recommend Mendelian randomization tools that use GWAS statistics for both trait and reference datasets, e.g., summary Mendelian randomization (SMR). We base our recommendation on (i) being able to use the same tool for both TWAS and PWAS, (ii) not requiring pre-computed weights (which makes it easier to update for larger reference datasets), and (iii) most larger transcriptome reference datasets being publicly available and easy to transform into a compatible format for SMR analysis.
Affiliation(s)
- Huseyin Gedik
- Integrative Life Sciences, Virginia Institute of Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA
| | - Roseann E. Peterson
- Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| | - Brien P. Riley
- Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| | - Vladimir I. Vladimirov
- Department of Psychiatry, College of Medicine-Phoenix, University of Arizona, Phoenix, AZ, USA
| | - Silviu-Alin Bacanu
- Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
|
43
|
Chen J, Wang L, De Jager PL, Bennett DA, Buchman AS, Yang J. A scalable Bayesian functional GWAS method accounting for multivariate quantitative functional annotations with applications for studying Alzheimer disease. HGG Adv 2022; 3:100143. [PMID: 36204489 PMCID: PMC9530673 DOI: 10.1016/j.xhgg.2022.100143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Accepted: 09/14/2022] [Indexed: 11/30/2022] Open
Abstract
Existing methods for integrating functional annotations in genome-wide association studies (GWASs) to fine-map and prioritize potential causal variants are limited to using non-overlapping categorical annotations or limited by the computational burden of modeling genome-wide variants. To overcome these limitations, we propose a scalable Bayesian functional GWAS method to account for multivariate quantitative functional annotations (BFGWAS_QUANT), accompanied by a scalable computation algorithm enabling joint modeling of genome-wide variants. Simulation studies validated the performance of BFGWAS_QUANT for accurately quantifying annotation enrichment and improving GWAS power. Applying BFGWAS_QUANT to study five Alzheimer disease (AD)-related phenotypes using individual-level GWAS data (n = ∼1,000), we found that histone modification annotations have higher enrichment than expression quantitative trait locus (eQTL) annotations for all considered phenotypes, with the highest enrichment in H3K27me3 (polycomb repression). We also found that cis-eQTLs in microglia had higher enrichment than eQTLs of bulk brain frontal cortex tissue for all considered phenotypes. A similar enrichment pattern was also identified using the International Genomics of Alzheimer's Project (IGAP) summary-level GWAS data of AD (n = ∼54,000). The strongest known APOE E4 risk allele was identified for all five phenotypes, and the APOE locus was validated using the IGAP data. BFGWAS_QUANT fine-mapped 32 significant variants from 1,073 genome-wide significant variants in the IGAP data. We also demonstrated that the polygenic risk scores (PRSs) using effect size estimates by BFGWAS_QUANT had a prediction accuracy similar to that of other methods assuming a sparse causal model. Overall, BFGWAS_QUANT is a useful GWAS tool for quantifying annotation enrichment and prioritizing potential causal variants.
Affiliation(s)
- Junyu Chen
- Department of Epidemiology, Emory University School of Public Health, Atlanta, GA 30322, USA
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Lei Wang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
| | - Philip L. De Jager
- Center for Translational and Computational Neuroimmunology, Department of Neurology and Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY 10032, USA
| | - David A. Bennett
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
| | - Aron S. Buchman
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
| | - Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
|
44
|
Bhattacharya A, Hirbo JB, Zhou D, Zhou W, Zheng J, Kanai M, Pasaniuc B, Gamazon ER, Cox NJ. Best practices for multi-ancestry, meta-analytic transcriptome-wide association studies: Lessons from the Global Biobank Meta-analysis Initiative. Cell Genom 2022; 2:100180. [PMID: 36341024 PMCID: PMC9631681 DOI: 10.1016/j.xgen.2022.100180] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/28/2021] [Revised: 08/09/2022] [Accepted: 09/01/2022] [Indexed: 12/13/2022]
Abstract
The Global Biobank Meta-analysis Initiative (GBMI), through its diversity, provides a valuable opportunity to study population-wide and ancestry-specific genetic associations. However, with multiple ascertainment strategies and multi-ancestry study populations across biobanks, GBMI presents unique challenges in implementing statistical genetics methods. Transcriptome-wide association studies (TWASs) boost detection power for and provide biological context to genetic associations by integrating genetic variant-to-trait associations from genome-wide association studies (GWASs) with predictive models of gene expression. TWASs present unique challenges beyond GWASs, especially in a multi-biobank, meta-analytic setting. Here, we present the GBMI TWAS pipeline, outlining practical considerations for ancestry and tissue specificity, meta-analytic strategies, and open challenges at every step of the framework. We advise conducting ancestry-stratified TWASs using ancestry-specific expression models and meta-analyzing results using inverse-variance weighting, which showed the least test statistic inflation. Our work provides a foundation for adding transcriptomic context to biobank-linked GWASs, allowing for ancestry-aware discovery to accelerate genomic medicine.
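The inverse-variance weighting recommended in this abstract can be illustrated with a fixed-effect meta-analysis of per-ancestry effect estimates. This is a minimal sketch of the generic formula only, not the GBMI pipeline; the function name `ivw_meta` and the input values are hypothetical.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis.

    betas: per-stratum effect estimates; ses: their standard errors.
    Returns the pooled effect, its standard error, the Z-score, and a
    two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, z, p

# Hypothetical ancestry-stratified TWAS effect sizes for one gene.
beta, se, z, p = ivw_meta([0.2, 0.4], [0.1, 0.1])
```

Strata with smaller standard errors receive larger weights, so larger or more precisely estimated strata dominate the pooled estimate.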
Affiliation(s)
- Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Institute of Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Jibril B. Hirbo
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Dan Zhou
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Wei Zhou
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jie Zheng
- MRC Integrative Epidemiology Unit (IEU), Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
| | - the Global Biobank Meta-analysis Initiative
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Institute of Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- MRC Integrative Epidemiology Unit (IEU), Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
| | - Bogdan Pasaniuc
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Eric R. Gamazon
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
| | - Nancy J. Cox
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
|
45
|
Shao Z, Wang T, Qiao J, Zhang Y, Huang S, Zeng P. A comprehensive comparison of multilocus association methods with summary statistics in genome-wide association studies. BMC Bioinformatics 2022; 23:359. [PMID: 36042399 PMCID: PMC9429742 DOI: 10.1186/s12859-022-04897-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/27/2022] [Accepted: 08/22/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Multilocus analysis on a set of single nucleotide polymorphisms (SNPs) pre-assigned within a gene constitutes a valuable complement to single-marker analysis by aggregating data on complex traits in a biologically meaningful way. However, despite the existence of a wide variety of SNP-set methods, few comprehensive comparison studies have previously been performed to evaluate the effectiveness of these methods. RESULTS: We herein sought to fill this knowledge gap by conducting a comprehensive empirical comparison of 22 commonly used summary-statistics-based SNP-set methods. We showed that only seven methods could effectively control the type I error, and that these well-calibrated approaches had varying power performance under the simulation scenarios. Overall, we confirmed that the burden test was generally underpowered and that score-based variance component tests (e.g., the sequence kernel association test) were much more powerful under the polygenic genetic architecture in both common and rare variant association analyses. We further revealed that two linkage-disequilibrium-free P value combination methods (the harmonic mean P value method and the aggregated Cauchy association test) behaved very well under the sparse genetic architecture in simulations and in real-data applications to common and rare variant association analyses as well as to expression quantitative trait loci weighted integrative analysis. We also assessed the scalability of these approaches by recording computational time and found that all these methods can scale to biobank-scale data, although some might be relatively slow. CONCLUSION: We hope that our findings can offer important guidance on how to choose appropriate multilocus association analysis methods in the post-GWAS era. All the SNP-set methods are implemented in the R package called MCA, which is freely available at https://github.com/biostatpzeng/.
Affiliation(s)
- Zhonghe Shao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ting Wang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Jiahao Qiao
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Yuchen Zhang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Shuiping Huang
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
- Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China
| | - Ping Zeng
- Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Human Genetics and Environmental Medicine, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Key Laboratory of Environment and Health, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
- Engineering Research Innovation Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, Jiangsu, China.
|
46
|
Jin X, Zhang L, Ji J, Ju T, Zhao J, Yuan Z. Network regression analysis in transcriptome-wide association studies. BMC Genomics 2022; 23:562. [PMID: 35933330 PMCID: PMC9356418 DOI: 10.1186/s12864-022-08809-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/12/2022] [Accepted: 08/02/2022] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND: Transcriptome-wide association studies (TWASs) have shown great promise in interpreting the findings from genome-wide association studies (GWASs) and exploring disease mechanisms by integrating GWAS and eQTL mapping studies. Almost all TWAS methods focus on one gene at a time; the only two published multiple-gene methods nevertheless fail to account for the inter-dependence and network structure among multiple genes, which may lead to power loss in TWAS analysis because complex diseases often arise from multiple genes that interact with each other as a biological network. We therefore developed a Network Regression method in a two-stage TWAS framework (NeRiT) to detect whether a given network is associated with the traits of interest. NeRiT adopts the flexible Bayesian Dirichlet process regression to obtain the gene expression prediction weights in the first stage, uses pointwise mutual information to represent the general between-node correlation in the second stage, and can effectively take the network structure among different gene nodes into account. RESULTS: Comprehensive and realistic simulations indicated that NeRiT had calibrated type I error control for testing both the node effect and the edge effect, and yielded higher power than existing methods, especially in testing the edge effect. The results were consistent regardless of the GWAS sample size, the gene expression prediction model in the first step of TWAS, the network structure, and the correlation pattern among different gene nodes. Real data applications analyzing systolic blood pressure and diastolic blood pressure from UK Biobank showed that NeRiT can simultaneously identify the trait-related nodes as well as the trait-related edges. CONCLUSIONS: NeRiT is a powerful and efficient network regression method in TWAS.
Affiliation(s)
- Xiuyuan Jin
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
- Liye Zhang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
- Jiadong Ji
  - Institute for Financial Studies, Shandong University, Jinan, 250100, Shandong, China
- Tao Ju
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
- Jinghua Zhao
  - Department of Public Health and Primary Care, Cardiovascular Epidemiology Unit, University of Cambridge, Cambridge, UK
- Zhongshang Yuan
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Shandong University, Jinan, 250003, Shandong, China
|
47
|
Rowland B, Venkatesh S, Tardaguila M, Wen J, Rosen JD, Tapia AL, Sun Q, Graff M, Vuckovic D, Lettre G, Sankaran VG, Voloudakis G, Roussos P, Huffman JE, Reiner AP, Soranzo N, Raffield LM, Li Y. Transcriptome-wide association study in UK Biobank Europeans identifies associations with blood cell traits. Hum Mol Genet 2022; 31:2333-2347. [PMID: 35138379 PMCID: PMC9307312 DOI: 10.1093/hmg/ddac011] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/01/2021] [Revised: 12/15/2021] [Accepted: 01/04/2022] [Indexed: 11/13/2022]
Abstract
Previous genome-wide association studies (GWAS) of hematological traits have identified over 10 000 distinct trait-specific risk loci. However, at these loci, the underlying causal mechanisms remain incompletely characterized. To elucidate novel biology and better understand causal mechanisms at known loci, we performed a transcriptome-wide association study (TWAS) of 29 hematological traits in 399 835 UK Biobank (UKB) participants of European ancestry using gene expression prediction models trained from whole blood RNA-seq data in 922 individuals. We discovered 557 gene-trait associations for hematological traits distinct from previously reported GWAS variants in European populations. Among the 557 associations, 301 were available for replication in a cohort of 141 286 participants of European ancestry from the Million Veteran Program. Of these 301 associations, 108 replicated at a strict Bonferroni-adjusted threshold ($\alpha = 0.05/301$). Using our TWAS results, we systematically assigned 4261 out of 16 900 previously identified hematological trait GWAS variants to putative target genes. Compared to coloc, our TWAS results show reduced specificity and increased sensitivity in external datasets to assign variants to target genes.
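The Bonferroni threshold quoted in the abstract above is simple arithmetic, sketched here for concreteness. The counts 301 and 108 and the family-wise level 0.05 come from the abstract; the helper name `replicates` and the example p-values are hypothetical.

```python
# Numbers taken from the abstract: 301 associations tested for replication,
# 108 of which passed the Bonferroni-adjusted threshold.
n_tests = 301
n_replicated = 108
alpha = 0.05

# Bonferroni: divide the family-wise error rate by the number of tests.
threshold = alpha / n_tests            # roughly 1.66e-4 per test
replication_rate = n_replicated / n_tests  # roughly 0.36

def replicates(p_value, cutoff=threshold):
    """Hypothetical helper: True if a replication p-value passes the cutoff."""
    return p_value < cutoff

# Illustrative p-values only, not results from the study.
example_calls = [replicates(1e-5), replicates(1e-3)]
```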
Affiliation(s)
- Bryce Rowland
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Sanan Venkatesh
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
  - Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY 10468, USA
- Manuel Tardaguila
  - Department of Human Genetics, Wellcome Sanger Institute, Hinxton CB10 1SA, UK
- Jia Wen
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Jonathan D Rosen
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Amanda L Tapia
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Quan Sun
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Mariaelisa Graff
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Dragana Vuckovic
  - Department of Epidemiology and Biostatistics, School of Public Health, Faculty of Medicine, Imperial College London, London SW7 2AZ, UK
- Guillaume Lettre
  - Montreal Heart Institute, Université de Montréal, Montreal, Quebec, Canada
- Vijay G Sankaran
  - Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Georgios Voloudakis
  - Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY 10468, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
- Panos Roussos
  - Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY 10468, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
- Jennifer E Huffman
  - Center for Population Genomics, Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02130, USA
- Alexander P Reiner
  - Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
- Nicole Soranzo
  - Department of Human Genetics, Wellcome Sanger Institute, Hinxton CB10 1SA, UK
- Laura M Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
48
|
Transcriptomic Insight into Viviparous Growth in Water Lily. Biomed Res Int 2022; 2022:8445484. [PMID: 35845943 PMCID: PMC9283058 DOI: 10.1155/2022/8445484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2021] [Revised: 04/30/2022] [Accepted: 05/18/2022] [Indexed: 11/17/2022]
Abstract
Water lily is an important ornamental flowering plant capable of viviparous plantlet development, but no study has reported the molecular basis of viviparity in water lily. Hence, we performed a comparative transcriptome study between the viviparous water lily Nymphaea micrantha and the nonviviparous species Nymphaea colorata at four developmental stages. The higher expression of the highly conserved AUX/IAA, ARF, GH3, and SAUR gene families in N. micrantha compared to N. colorata is predicted to have a major impact on the development and evolution of viviparity in water lily. Likewise, differential regulation of hormone signaling, brassinosteroid, photosynthesis, and energy-related pathways in the two species provides clues to their involvement in viviparity. This study revealed the complex mechanism of the viviparity trait in water lily. The transcriptomic signatures identified are an important basis for future breeding and research on viviparity in water lily and other plant species.
|
49
|
Ji Y, Wei Q, Chen R, Wang Q, Tao R, Li B. Integration of multidimensional splicing data and GWAS summary statistics for risk gene discovery. PLoS Genet 2022; 18:e1009814. [PMID: 35771864 PMCID: PMC9278751 DOI: 10.1371/journal.pgen.1009814] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2021] [Revised: 07/13/2022] [Accepted: 05/26/2022] [Indexed: 12/30/2022]
Abstract
A common strategy for the functional interpretation of genome-wide association study (GWAS) findings has been the integrative analysis of GWAS and expression data. Using this strategy, many association methods (e.g., PrediXcan and FUSION) have been successful in identifying trait-associated genes via mediating effects on RNA expression. However, these approaches often ignore the effects of splicing, which can carry as much disease risk as expression. Compared to expression data, one challenge in detecting associations with splicing data is the large multiple testing burden due to multidimensional splicing events within genes. Here, we introduce a multidimensional splicing gene (MSG) approach, which consists of two stages: 1) we use sparse canonical correlation analysis (sCCA) to construct latent canonical vectors (CVs) by identifying sparse linear combinations of genetic variants and splicing events that are maximally correlated with each other; and 2) we test for the association between the genetically regulated splicing CVs and the trait of interest using GWAS summary statistics. Simulations show that MSG has proper type I error control and substantial power gains over existing multidimensional expression analysis methods (i.e., S-MultiXcan, UTMOST, and sCCA+ACAT) under diverse scenarios. When applied to the Genotype-Tissue Expression Project data and GWAS summary statistics of 14 complex human traits, MSG identified on average 83%, 115%, and 223% more significant genes than sCCA+ACAT, S-MultiXcan, and UTMOST, respectively. We highlight MSG's applications to Alzheimer's disease, low-density lipoprotein cholesterol, and schizophrenia, and find that the majority of MSG-identified genes would have been missed by expression-based analyses. Our results demonstrate that aggregating splicing data through MSG can improve power in identifying gene-trait associations and help better understand the genetic risk of complex traits.
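The second stage described above, testing a genetically regulated feature against a trait using only GWAS summary statistics, is commonly done with a weighted burden statistic of the form z = wᵀz_GWAS / sqrt(wᵀRw), where w holds per-variant prediction weights and R is the LD correlation matrix. The sketch below shows that generic statistic only; it is an assumption that MSG's test resembles it, and the function name, weights, z-scores, and LD matrix are all hypothetical.

```python
import math

def twas_z(weights, gwas_z, ld):
    """Generic summary-statistic gene-level z-score: w'z / sqrt(w'Rw).

    weights: per-variant prediction weights (hypothetical values below)
    gwas_z:  per-variant GWAS z-scores for the trait
    ld:      LD correlation matrix among the same variants (list of lists)
    """
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the weighted sum under the null, accounting for LD.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("weights and LD matrix give a non-positive variance")
    return numerator / math.sqrt(variance)

# Illustrative inputs only; none of these numbers come from the paper.
weights = [0.5, -0.2, 0.1]
gwas_z = [3.1, -1.2, 0.4]
ld = [
    [1.0, 0.3, 0.1],
    [0.3, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
z_gene = twas_z(weights, gwas_z, ld)
```

Dividing by sqrt(wᵀRw) is what lets the test run without individual-level genotypes: the LD matrix from a reference panel stands in for the correlation among variants.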
Affiliation(s)
- Ying Ji
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Qiang Wei
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Rui Chen
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Quan Wang
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Ran Tao
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - * E-mail: (RT); (BL)
- Bingshan Li
  - Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee, United States of America
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
  - * E-mail: (RT); (BL)
|
50
|
Wolc A, Dekkers JCM. Application of Bayesian genomic prediction methods to genome-wide association analyses. Genet Sel Evol 2022; 54:31. [PMID: 35562659 PMCID: PMC9103490 DOI: 10.1186/s12711-022-00724-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/28/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022]
Abstract
Background Bayesian genomic prediction methods were developed to simultaneously fit all genotyped markers to a set of available phenotypes for prediction of breeding values for quantitative traits, allowing for differences in the genetic architecture (distribution of marker effects) of traits. These methods also provide a flexible and reliable framework for genome-wide association (GWA) studies. The objective here was to review developments in Bayesian hierarchical and variable selection models for GWA analyses. Results By fitting all genotyped markers simultaneously, Bayesian GWA methods implicitly account for population structure and the multiple-testing problem of classical single-marker GWA. Implemented using Markov chain Monte Carlo methods, Bayesian GWA methods allow for control of error rates using probabilities obtained from posterior distributions. Power of GWA studies using Bayesian methods can be enhanced by using informative priors based on previous association studies, gene expression analyses, or functional annotation information. Applied to multiple traits, Bayesian GWA analyses can give insight into pleiotropic effects by multi-trait, structural equation, or graphical models. Bayesian methods can also be used to combine genomic, transcriptomic, proteomic, and other -omics data to infer causal genotype to phenotype relationships and to suggest external interventions that can improve performance. Conclusions Bayesian hierarchical and variable selection methods provide a unified and powerful framework for genomic prediction, GWA, integration of prior information, and integration of information from other -omics platforms to identify causal mutations for complex quantitative traits.
Affiliation(s)
- Anna Wolc
  - Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA
  - Hy-Line International, 2583 240th Street, Dallas Center, IA, 50063, USA
- Jack C M Dekkers
  - Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA
|