1. Mews MA, Naj AC, Griswold AJ, Below JE, Bush WS. Brain and blood transcriptome-wide association studies identify five novel genes associated with Alzheimer's disease. J Alzheimers Dis 2025; 105:228-244. [PMID: 40111921 DOI: 10.1177/13872877251326288] [Indexed: 03/22/2025]
Abstract
Background: Genome-wide association studies (GWAS) have identified numerous genetic variants associated with Alzheimer's disease (AD), but their functional implications remain unclear. Transcriptome-wide association studies (TWAS) offer enhanced statistical power by analyzing genetic associations at the gene level rather than at the variant level, enabling assessment of how genetically-regulated gene expression influences AD risk. However, previous AD-TWAS have been limited by small expression quantitative trait loci (eQTL) reference datasets or reliance on AD-by-proxy phenotypes.
Objective: To perform the most powerful AD-TWAS to date using summary statistics from the largest available brain and blood cis-eQTL meta-analyses applied to the largest clinically-adjudicated AD GWAS.
Methods: We implemented the OTTERS TWAS pipeline to predict gene expression using the largest available cis-eQTL data from cortical brain tissue (MetaBrain; N = 2683) and blood (eQTLGen; N = 31,684), and then applied these models to AD-GWAS data (Cases = 21,982; Controls = 44,944).
Results: We identified and validated five novel gene associations in cortical brain tissue (PRKAG1, C3orf62, LYSMD4, ZNF439, SLC11A2) and six genes proximal to known AD-related GWAS loci (Blood: MYBPC3; Brain: MTCH2, CYB561, MADD, PSMA5, ANXA11). Further, using causal eQTL fine-mapping, we generated sparse models that retained the strength of the AD-TWAS association for MTCH2, MADD, ZNF439, CYB561, and MYBPC3.
Conclusions: Our comprehensive AD-TWAS discovered new gene associations and provided insights into the functional relevance of previously associated variants, which enables us to further understand the genetic architecture underlying AD risk.
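The second stage described in the Methods, applying expression-prediction models to GWAS summary statistics, is commonly carried out as a weighted burden Z-score, Z = w'z / sqrt(w'Rw), where w holds the eQTL-derived SNP weights, z the GWAS Z-scores, and R the LD correlation matrix. The sketch below illustrates that generic test only; it is not the OTTERS implementation, and the function name `twas_z` and its arguments are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Generic summary-statistic TWAS burden test: Z = w'z / sqrt(w' R w).

    weights : per-SNP expression-prediction weights (hypothetical input)
    gwas_z  : per-SNP GWAS Z-scores, in the same SNP order
    ld      : square LD correlation matrix (list of lists), same SNP order
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must describe the same SNPs")

    # Numerator: weighted sum of the GWAS Z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))

    # Denominator: variance of that weighted sum under the null, w' R w.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w' R w must be positive (check weights and LD matrix)")

    return numerator / math.sqrt(variance)
```

With two SNPs in perfect LD and equal weights, the gene-level statistic equals the shared SNP-level Z-score, which is a quick sanity check on the normalization.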
Affiliation(s)
- Makaela A Mews
- System Biology and Bioinformatics, Department of Nutrition, Case Western Reserve University School of Medicine, Cleveland, OH, USA
- Adam C Naj
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Anthony J Griswold
- John P. Hussman Institute for Human Genomics, University of Miami, Miami, FL, USA
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami, Miami, FL, USA
- Jennifer E Below
- Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- William S Bush
- Department of Population and Quantitative Health Sciences, Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, OH, USA
2. Wang L, Markus H, Chen D, Chen S, Zhang F, Gao S, Khunsriraksakul C, Chen F, Olsen N, Foulke G, Jiang B, Carrel L, Liu DJ. An atlas of single-cell eQTLs dissects autoimmune disease genes and identifies novel drug classes for treatment. Cell Genomics 2025; 5:100820. [PMID: 40154479 PMCID: PMC12008810 DOI: 10.1016/j.xgen.2025.100820] [Received: 02/27/2024] [Revised: 11/05/2024] [Accepted: 03/04/2025] [Indexed: 04/01/2025]
Abstract
Most variants identified from genome-wide association studies (GWASs) are non-coding and regulate gene expression. However, many risk loci fail to colocalize with expression quantitative trait loci (eQTLs), potentially due to limited GWAS and eQTL analysis power or cellular heterogeneity. Population-scale single-cell RNA-sequencing (scRNA-seq) datasets are emerging, enabling mapping of eQTLs in different cell types (sc-eQTLs). Compared to eQTL data from bulk tissues (bk-eQTLs), sc-eQTL datasets are smaller. We propose a joint model of bk-eQTLs as a weighted sum of sc-eQTLs (JOBS) from constituent cell types to improve power. Applying JOBS to OneK1K and eQTLGen data, we identify 586% more eQTLs, matching the power of 4× the sample size of OneK1K. Integrating sc-eQTLs with GWAS data creates an atlas for 14 immune-mediated disorders, colocalizing 29.9% or 32.2% more loci than using sc-eQTL or bk-eQTL alone. Extending JOBS, we develop a drug-repurposing pipeline and identify novel drugs validated by real-world data.
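The weighted-sum idea behind JOBS can be written schematically as below. This is a sketch based only on the abstract's wording: the notation is assumed, not the authors' own, with β denoting the eQTL effect of variant v on gene g, π_c the weight of cell type c, C the number of constituent cell types, and ε a residual term.

```latex
\beta^{\mathrm{bulk}}_{g,v} \;=\; \sum_{c=1}^{C} \pi_{c}\, \beta^{(c)}_{g,v} \;+\; \varepsilon_{g,v}
```

Under this reading, the large bulk dataset constrains the weighted combination of cell-type effects, which is how it could lend power to the smaller single-cell datasets.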
Affiliation(s)
- Lida Wang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Havell Markus
- Bioinformatics and Genomics PhD Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA; Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Dieyi Chen
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Siyuan Chen
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Fan Zhang
- Bioinformatics and Genomics PhD Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA; Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Shuang Gao
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Chachrit Khunsriraksakul
- Bioinformatics and Genomics PhD Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA; Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Fang Chen
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Nancy Olsen
- Department of Medicine, Penn State University, College of Medicine, Hershey, PA 17033, USA
- Galen Foulke
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA; Department of Dermatology, Penn State University College of Medicine, Hershey, PA 17033, USA
- Bibo Jiang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA.
- Laura Carrel
- Bioinformatics and Genomics PhD Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA.
- Dajiang J Liu
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA; Bioinformatics and Genomics PhD Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA.
3. Qi G, Lila E, Ji Z, Shojaie A, Battle A, Sun W. Transcriptome-wide association studies at cell state level using single-cell eQTL data. medRxiv 2025:2025.03.17.25324128. [PMID: 40166533 PMCID: PMC11957072 DOI: 10.1101/2025.03.17.25324128] [Indexed: 04/02/2025]
Abstract
Transcriptome-wide association studies (TWAS) have been widely used to prioritize relevant genes for diseases. Current methods for TWAS test gene-disease associations at bulk tissue or cell-type-specific pseudobulk level, which do not account for the heterogeneity within cell types. We present TWiST, a statistical method for TWAS analysis at cell state resolution using single-cell expression quantitative trait loci (eQTL) data. Our method uses pseudotime to represent cell states and models the effect of gene expression on a trait as a continuous pseudotemporal curve. Therefore, it allows flexible hypothesis testing of global, dynamic, and nonlinear effects. Through simulation studies and real data analysis, we demonstrated that TWiST leads to significantly improved power compared to pseudobulk methods that ignore heterogeneity due to cell states. Application to the OneK1K study identified hundreds of genes with dynamic effects on autoimmune diseases along the trajectory of immune cell differentiation. TWiST presents great promise to understand disease genetics using single-cell eQTL studies.
4. Cao C, Shao M, Wang J, Li Z, Chen H, You T, Li MJ, Ding Y, Zou Q. webTWAS 2.0: update platform for identifying complex disease susceptibility genes through transcriptome-wide association study. Nucleic Acids Res 2025; 53:D1261-D1269. [PMID: 39526380 PMCID: PMC11701649 DOI: 10.1093/nar/gkae1022] [Received: 09/15/2024] [Revised: 10/14/2024] [Accepted: 10/17/2024] [Indexed: 11/16/2024]
Abstract
Transcriptome-wide association study (TWAS) has successfully identified numerous complex disease susceptibility genes in the post-genome-wide association study (GWAS) era. Over the past 3 years, the focus of TWAS algorithms has shifted from merely identifying associations to understanding how single nucleotide polymorphisms (SNPs) regulate gene expression, with a growing emphasis on incorporating fine-mapping techniques. Additionally, the rapid increase in GWAS summary statistics, driven largely by the UK Biobank and other consortia, has made it essential to update our webTWAS resource. To address these challenges and meet the growing needs of researchers, we developed webTWAS 2.0, an updated platform for identifying susceptibility genes for human complex diseases using TWAS. Additionally, webTWAS 2.0 provides an online TWAS analysis tool that simplifies conducting TWAS analyses. The updated resource includes 7247 GWAS summary statistics covering 1588 complex human diseases from 192 publications. It also incorporates multiple TWAS methods, such as sTF-TWAS, 3'aTWAS and GIFT, along with an updated interactive visualization tool that allows users to easily explore significant associations across different methods. Other upgrades include a personalized online analysis tool for user-submitted GWAS data and a refined search function that makes it easier to identify relevant associations and meet diverse user needs more efficiently. webTWAS 2.0 is freely accessible at http://www.webtwas.net.
Affiliation(s)
- Chen Cao
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University,101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Mengting Shao
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University,101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Jianhua Wang
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania—Perelman School of Medicine, 421 Curie Blvd, Philadelphia, PA 19104, USA
- Zhenghui Li
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University,101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Haoran Chen
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University,101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Tianyi You
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, 22 Qixiangtai Road, Tianjin 300203, China
- Mulin Jun Li
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, 22 Qixiangtai Road, Tianjin 300203, China
- Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324003, China
- Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324003, China
5. King A, Wu C. Integrative Multi-Omics Approach for Improving Causal Gene Identification. Genet Epidemiol 2025; 49:e22601. [PMID: 39444114 DOI: 10.1002/gepi.22601] [Received: 04/14/2024] [Revised: 10/01/2024] [Accepted: 10/04/2024] [Indexed: 10/25/2024]
Abstract
Transcriptome-wide association studies (TWAS) have been widely used to identify thousands of likely causal genes for diseases and complex traits using predicted expression models. However, most existing TWAS methods rely on gene expression alone and overlook other regulatory mechanisms of gene expression, including DNA methylation and splicing, that contribute to the genetic basis of these complex traits and diseases. Here we introduce a multi-omics method that integrates gene expression, DNA methylation, and splicing data to improve the identification of genes associated with traits of interest. Through simulations and by analyzing genome-wide association study (GWAS) summary statistics for 24 complex traits, we show that our integrated method, which leverages these complementary omics biomarkers, achieves higher statistical power and improves the accuracy of likely causal gene identification in blood tissues over individual omics methods. Finally, we apply our integrated model to a lung cancer GWAS data set, demonstrating the integrated model's improved identification of prioritized genes for lung cancer risk.
Affiliation(s)
- Austin King
- Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Chong Wu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
6. Shao M, Chen K, Zhang S, Tian M, Shen Y, Cao C, Gu N. Multiome-wide Association Studies: Novel Approaches for Understanding Diseases. Genomics, Proteomics & Bioinformatics 2024; 22:qzae077. [PMID: 39471467 PMCID: PMC11630051 DOI: 10.1093/gpbjnl/qzae077] [Received: 08/16/2023] [Revised: 10/06/2024] [Accepted: 10/23/2024] [Indexed: 11/01/2024]
Abstract
The rapid development of multiome (transcriptome, proteome, cistrome, imaging, and regulome)-wide association study methods has opened new avenues for biologists to understand the susceptibility genes underlying complex diseases. Thorough comparisons of these methods are essential for selecting the most appropriate tool for a given research objective. This review provides a detailed categorization and summary of the statistical models, use cases, and advantages of recent multiome-wide association studies. In addition, to illustrate gene-disease association studies based on transcriptome-wide association study (TWAS), we collected 478 disease entries across 22 categories from 235 manually reviewed publications. Our analysis reveals that mental disorders are the most frequently studied diseases by TWAS, indicating its potential to deepen our understanding of the genetic architecture of complex diseases. In summary, this review underscores the importance of multiome-wide association studies in elucidating complex diseases and highlights the significance of selecting the appropriate method for each study.
Affiliation(s)
- Mengting Shao
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Kaiyang Chen
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Shuting Zhang
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Min Tian
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Yan Shen
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chen Cao
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Ning Gu
- Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Nanjing Key Laboratory for Cardiovascular Information and Health Engineering Medicine, Institute of Clinical Medicine, Nanjing Drum Tower Hospital, Medical School, Nanjing University, Nanjing 210093, China
7. Melton HJ, Zhang Z, Deng HW, Wu L, Wu C. MIMOSA: a resource consisting of improved methylome prediction models increases power to identify DNA methylation-phenotype associations. Epigenetics 2024; 19:2370542. [PMID: 38963888 PMCID: PMC11225927 DOI: 10.1080/15592294.2024.2370542] [Received: 10/06/2023] [Revised: 04/29/2024] [Accepted: 06/12/2024] [Indexed: 07/06/2024]
Abstract
Although DNA methylation (DNAm) has been implicated in the pathogenesis of numerous complex diseases, from cancer to cardiovascular disease to autoimmune disease, the exact methylation sites that play key roles in these processes remain elusive. One strategy to identify putative causal CpG sites and enhance disease etiology understanding is to conduct methylome-wide association studies (MWASs), in which predicted DNA methylation that is associated with complex diseases can be identified. However, current MWAS models are primarily trained using the data from single studies, thereby limiting the methylation prediction accuracy and the power of subsequent association studies. Here, we introduce a new resource, MWAS Imputing Methylome Obliging Summary-level mQTLs and Associated LD matrices (MIMOSA), a set of models that substantially improve the prediction accuracy of DNA methylation and subsequent MWAS power through the use of a large summary-level mQTL dataset provided by the Genetics of DNA Methylation Consortium (GoDMC). Through the analyses of GWAS (genome-wide association study) summary statistics for 28 complex traits and diseases, we demonstrate that MIMOSA considerably increases the accuracy of DNA methylation prediction in whole blood, crafts fruitful prediction models for low heritability CpG sites, and determines markedly more CpG site-phenotype associations than preceding methods. Finally, we use MIMOSA to conduct a case study on high cholesterol, pinpointing 146 putatively causal CpG sites.
Affiliation(s)
- Hunter J. Melton
- Department of Statistics, Florida State University, Tallahassee, FL, USA
- Zichen Zhang
- Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Hong-Wen Deng
- Center of Bioinformatics and Genomics, Tulane University, New Orleans, LA, USA
- Lang Wu
- Center of Bioinformatics and Genomics, Tulane University, New Orleans, LA, USA
- Chong Wu
- Cancer Epidemiology Division, University of Hawaii Cancer Center, Honolulu, HI, USA
- Institute for Data Science in Oncology, The UT MD Anderson Cancer Center
8. Kontou PI, Bagos PG. The goldmine of GWAS summary statistics: a systematic review of methods and tools. BioData Min 2024; 17:31. [PMID: 39238044 PMCID: PMC11375927 DOI: 10.1186/s13040-024-00385-x] [Received: 02/09/2024] [Accepted: 08/27/2024] [Indexed: 09/07/2024]
Abstract
Genome-wide association studies (GWAS) have revolutionized our understanding of the genetic architecture of complex traits and diseases. GWAS summary statistics have become essential tools for various genetic analyses, including meta-analysis, fine-mapping, and risk prediction. However, the increasing number of GWAS summary statistics and the diversity of software tools available for their analysis can make it challenging for researchers to select the most appropriate tools for their specific needs. This systematic review aims to provide a comprehensive overview of the currently available software tools and databases for GWAS summary statistics analysis. We conducted a comprehensive literature search to identify relevant software tools and databases. We categorized the tools and databases by their functionality, including data management, quality control, single-trait analysis, and multiple-trait analysis. We also compared the tools and databases based on their features, limitations, and user-friendliness. Our review identified a total of 305 functioning software tools and databases dedicated to GWAS summary statistics, each with unique strengths and limitations. We provide descriptions of the key features of each tool and database, including their input/output formats, data types, and computational requirements. We also discuss the overall usability and applicability of each tool for different research scenarios. This comprehensive review will serve as a valuable resource for researchers who are interested in using GWAS summary statistics to investigate the genetic basis of complex traits and diseases. By providing a detailed overview of the available tools and databases, we aim to facilitate informed tool selection and maximize the effectiveness of GWAS summary statistics analysis.
Affiliation(s)
- Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, 35131, Lamia, Greece.
9. Hu T, Parrish RL, Dai Q, Buchman AS, Tasaki S, Bennett DA, Seyfried NT, Epstein MP, Yang J. Omnibus proteome-wide association study identifies 43 risk genes for Alzheimer disease dementia. Am J Hum Genet 2024; 111:1848-1863. [PMID: 39079537 PMCID: PMC11393696 DOI: 10.1016/j.ajhg.2024.07.001] [Received: 01/18/2024] [Revised: 06/28/2024] [Accepted: 07/02/2024] [Indexed: 09/08/2024]
Abstract
Transcriptome-wide association study (TWAS) tools have been applied to conduct proteome-wide association studies (PWASs) by integrating proteomics data with genome-wide association study (GWAS) summary data. The genetic effects of PWAS-identified significant genes are potentially mediated through genetically regulated protein abundance, thus informing the underlying disease mechanisms better than GWAS loci. However, existing TWAS/PWAS tools are limited by considering only one statistical model. We propose an omnibus PWAS pipeline to account for multiple statistical models and demonstrate improved performance by simulation and application studies of Alzheimer disease (AD) dementia. We employ the Aggregated Cauchy Association Test to derive omnibus PWAS (PWAS-O) p values from PWAS p values obtained by three existing tools assuming complementary statistical models: TIGAR, PrediXcan, and FUSION. Our simulation studies demonstrated improved power, with well-calibrated type I error, for PWAS-O over all three individual tools. We applied PWAS-O to studying AD dementia with reference proteomic data profiled from dorsolateral prefrontal cortex of postmortem brains from individuals of European ancestry. We identified 43 risk genes, including 5 not identified by previous studies, which are interconnected through a protein-protein interaction network that includes the well-known AD risk genes TOMM40, APOC1, and APOC2. We also validated causal genetic effects mediated through the proteome for 27 (63%) PWAS-O risk genes, providing insights into the underlying biological mechanisms of AD dementia and highlighting promising targets for therapeutic development. PWAS-O can be easily applied to studying other complex diseases.
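The Aggregated Cauchy Association Test mentioned above combines several p values by mapping each to a Cauchy-distributed quantity, averaging, and mapping back. The sketch below shows that combination rule for p values from three tools; the equal weights, the function name `acat`, and the input validation are assumptions made for illustration, since the abstract does not specify how PWAS-O weights the individual tools.

```python
import math


def acat(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule used by ACAT.

    T = sum_i w_i * tan((0.5 - p_i) * pi), with the weights normalized to
    sum to 1, and the combined p-value is 0.5 - atan(T) / pi.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * n  # assumption: every tool gets equal weight
    if len(weights) != n or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative, one per p-value")

    total = sum(weights)
    # Each transformed p-value is standard Cauchy under the null hypothesis,
    # and a weighted average of Cauchy variables is again standard Cauchy.
    statistic = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical p-values for one gene from three PWAS tools.
omnibus_p = acat([0.001, 0.5, 0.5])
```

Because the tangent transform is heavy-tailed, one very small p value dominates the statistic, so a gene detected strongly by only one of the complementary models can still reach a small omnibus p value.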
Affiliation(s)
- Tingyang Hu
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA; Division of Biostatistics and Bioinformatics, Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Randy L Parrish
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA; Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA 30322, USA
- Qile Dai
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA; Department of Biostatistics and Bioinformatics, Emory University School of Public Health, Atlanta, GA 30322, USA
- Aron S Buchman
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
- Shinya Tasaki
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
- David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA
- Nicholas T Seyfried
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Michael P Epstein
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Jingjing Yang
- Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA.
10. Wang L, Khunsriraksakul C, Markus H, Chen D, Zhang F, Chen F, Zhan X, Carrel L, Liu DJ, Jiang B. Integrating single cell expression quantitative trait loci summary statistics to understand complex trait risk genes. Nat Commun 2024; 15:4260. [PMID: 38769300 PMCID: PMC11519974 DOI: 10.1038/s41467-024-48143-1] [Received: 02/09/2023] [Accepted: 04/22/2024] [Indexed: 05/22/2024]
Abstract
Transcriptome-wide association study (TWAS) is a popular approach to dissect the functional consequence of disease-associated non-coding variants. Most existing TWAS use bulk tissues and may not have the resolution to reveal cell-type-specific target genes. Single-cell expression quantitative trait loci (sc-eQTL) datasets are emerging. The largest bulk- and sc-eQTL datasets are most conveniently available as summary statistics, but have not been broadly utilized in TWAS. Here, we present a new method, EXPRESSO (EXpression PREdiction with Summary Statistics Only), to analyze sc-eQTL summary statistics, which also integrates 3D genomic data and epigenomic annotation to prioritize causal variants. EXPRESSO substantially improves on existing methods. We apply EXPRESSO to analyze multi-ancestry GWAS datasets for 14 autoimmune diseases. EXPRESSO uniquely identifies 958 novel gene × trait associations, which is 26% more than the second-best method. Among them, 492 are unique to cell-type-level analysis and missed by TWAS using whole blood. We also develop a cell-type-aware drug repurposing pipeline, which leverages EXPRESSO results to identify drug compounds that can reverse disease gene expression in relevant cell types. Our results point to multiple drugs with therapeutic potential, including metformin for type 1 diabetes and vitamin K for ulcerative colitis.
Affiliation(s)
- Lida Wang
- Department of Public Health Sciences; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Chachrit Khunsriraksakul
- Bioinformatics and Genomics PhD Program; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Institute for Personalized Medicine; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Havell Markus
- Bioinformatics and Genomics PhD Program; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Institute for Personalized Medicine; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Dieyi Chen
- Department of Public Health Sciences; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Fan Zhang
- Bioinformatics and Genomics PhD Program; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Fang Chen
- Department of Public Health Sciences; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA
- Xiaowei Zhan
- Department of Statistical Science, Southern Methodist University, Dallas, TX, US
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, US
- Center for Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, US
- Laura Carrel
- Department of Biochemistry and Molecular Biology; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA.
- Dajiang J Liu
- Department of Public Health Sciences; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA.
- Bioinformatics and Genomics PhD Program; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA.
- Department of Statistical Science, Southern Methodist University, Dallas, TX, US.
- Bibo Jiang
- Department of Public Health Sciences; Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA.
11. Mews MA, Naj AC, Griswold AJ, Below JE, Bush WS. Brain and Blood Transcriptome-Wide Association Studies Identify Five Novel Genes Associated with Alzheimer's Disease. medRxiv 2024:2024.04.17.24305737. [PMID: 38699333 PMCID: PMC11065015 DOI: 10.1101/2024.04.17.24305737] [Indexed: 05/05/2024]
Abstract
INTRODUCTION: Transcriptome-wide Association Studies (TWAS) extend genome-wide association studies (GWAS) by integrating genetically-regulated gene expression models. We performed the most powerful AD-TWAS to date, using summary statistics from cis-eQTL meta-analyses and the largest clinically-adjudicated Alzheimer's Disease (AD) GWAS.
METHODS: We implemented the OTTERS TWAS pipeline, leveraging cis-eQTL data from cortical brain tissue (MetaBrain; N=2,683) and blood (eQTLGen; N=31,684) to predict gene expression, then applied these models to AD-GWAS data (Cases=21,982; Controls=44,944).
RESULTS: We identified and validated five novel gene associations in cortical brain tissue (PRKAG1, C3orf62, LYSMD4, ZNF439, SLC11A2) and six genes proximal to known AD-related GWAS loci (Blood: MYBPC3; Brain: MTCH2, CYB561, MADD, PSMA5, ANXA11). Further, using causal eQTL fine-mapping, we generated sparse models that retained the strength of the AD-TWAS association for MTCH2, MADD, ZNF439, CYB561, and MYBPC3.
DISCUSSION: Our comprehensive AD-TWAS discovered new gene associations and provided insights into the functional relevance of previously associated variants.
|
12
|
He J, Antonyan L, Zhu H, Ardila K, Li Q, Enoma D, Zhang W, Liu A, Chekouo T, Cao B, MacDonald ME, Arnold PD, Long Q. A statistical method for image-mediated association studies discovers genes and pathways associated with four brain disorders. Am J Hum Genet 2024; 111:48-69. [PMID: 38118447 PMCID: PMC10806749 DOI: 10.1016/j.ajhg.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 11/04/2023] [Accepted: 11/16/2023] [Indexed: 12/22/2023] Open
Abstract
Brain imaging and genomics are critical tools enabling characterization of the genetic basis of brain disorders. However, imaging large cohorts is expensive and may be unavailable for legacy datasets used for genome-wide association studies (GWASs). Using an integrated feature selection/aggregation model, we developed an image-mediated association study (IMAS), which utilizes borrowed imaging/genomics data to conduct association mapping in legacy GWAS cohorts. By leveraging the UK Biobank image-derived phenotypes (IDPs), the IMAS discovered genetic bases underlying four neuropsychiatric disorders and verified them by analyzing annotations, pathways, and expression quantitative trait loci (eQTLs). A cerebellar-mediated mechanism was identified to be common to the four disorders. Simulations show that, if the goal is identifying genetic risk, our IMAS is more powerful than a hypothetical protocol in which the imaging results were available in the GWAS dataset. This implies the feasibility of reanalyzing legacy GWAS datasets without conducting additional imaging, yielding cost savings for integrated analysis of genetics and imaging.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Lilit Antonyan
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Harold Zhu
- Department of Biological Sciences, Faculty of Science, University of Calgary, Calgary, AB, Canada
| | - Karen Ardila
- Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
| | - Qing Li
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - David Enoma
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | | | - Andy Liu
- Sir Winston Churchill High School, Calgary, AB, Canada; College of Letters and Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Thierry Chekouo
- Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada; Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Bo Cao
- Department of Psychiatry, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
| | - M Ethan MacDonald
- The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Paul D Arnold
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
| | - Quan Long
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada.
| |
|
13
|
Zhu Z, Chen X, Zhang S, Yu R, Qi C, Cheng L, Zhang X. Leveraging molecular quantitative trait loci to comprehend complex diseases/traits from the omics perspective. Hum Genet 2023; 142:1543-1560. [PMID: 37755483 DOI: 10.1007/s00439-023-02602-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/15/2023] [Accepted: 09/14/2023] [Indexed: 09/28/2023]
Abstract
Comprehending the molecular basis of quantitative genetic variation is a principal goal in the study of complex diseases and traits. Molecular quantitative trait loci (molQTLs) have made it possible to investigate the effects of genetic variants hidden behind large-scale omics data. As omics data become increasingly multi-dimensional, a deeper understanding of molQTLs is urgently required to more fully elucidate the pertinent biological mechanisms. Herein, we reviewed molQTLs and the corresponding resources from the omics perspective and further discussed integrative GWAS-molQTL strategies for inferring their causal effects. Subsequently, we described the opportunities and challenges facing molQTL research. The case studies showed that molQTLs, whether single- or multi-omics, are essential for understanding complex diseases and traits. Overall, we highlighted the functional significance of genetic variants in order to promote the use of molQTL discovery in complex diseases and traits.
Affiliation(s)
- Zijun Zhu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Xinyu Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Sainan Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Rui Yu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Changlu Qi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
| | - Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China.
- NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China.
| | - Xue Zhang
- NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China
- McKusick-Zhang Center for Genetic Medicine, State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100005, China
| |
|
14
|
Mai J, Lu M, Gao Q, Zeng J, Xiao J. Transcriptome-wide association studies: recent advances in methods, applications and available databases. Commun Biol 2023; 6:899. [PMID: 37658226 PMCID: PMC10474133 DOI: 10.1038/s42003-023-05279-y] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 03/21/2023] [Accepted: 08/24/2023] [Indexed: 09/03/2023] Open
Abstract
Genome-wide association studies have identified many variants impacting heritable traits. Nevertheless, identifying the critical genes underlying those significant variants has been a major challenge. Transcriptome-wide association study (TWAS) is an instrumental post-analysis for detecting significant gene-trait associations by modeling transcription-level regulation, and it has made considerable progress in recent years. By leveraging expression quantitative trait loci (eQTL) regulation information, TWAS has advantages in detecting functional genes regulated by disease-associated variants, thus providing insight into the mechanisms of diseases and other phenotypes. Considering its vast potential, this review article comprehensively summarizes TWAS, including its methodology, applications and available resources.
Affiliation(s)
- Jialin Mai
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Mingming Lu
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Qianwen Gao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Jingyao Zeng
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China.
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China.
| | - Jingfa Xiao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China.
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China.
- University of Chinese Academy of Sciences, Beijing, 100049, China.
| |
|