51
Khunsriraksakul C, Li Q, Markus H, Patrick MT, Sauteraud R, McGuire D, Wang X, Wang C, Wang L, Chen S, Shenoy G, Li B, Zhong X, Olsen NJ, Carrel L, Tsoi LC, Jiang B, Liu DJ. Multi-ancestry and multi-trait genome-wide association meta-analyses inform clinical risk prediction for systemic lupus erythematosus. Nat Commun 2023; 14:668. [PMID: 36750564 PMCID: PMC9905560 DOI: 10.1038/s41467-023-36306-5] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 04/21/2022] [Accepted: 01/25/2023] [Indexed: 02/09/2023]
Abstract
Systemic lupus erythematosus is a heritable autoimmune disease that predominantly affects young women. To improve our understanding of genetic etiology, we conduct multi-ancestry and multi-trait meta-analysis of genome-wide association studies, encompassing 12 systemic lupus erythematosus cohorts from 3 different ancestries and 10 genetically correlated autoimmune diseases, and identify 16 novel loci. We also perform transcriptome-wide association studies, computational drug repurposing analysis, and cell type enrichment analysis. We discover putative drug classes, including a histone deacetylase inhibitor that could be repurposed to treat lupus. We also identify multiple cell types enriched with putative target genes, such as non-classical monocytes and B cells, which may be targeted for future therapeutics. Using this newly assembled result, we further construct polygenic risk score models and demonstrate that integrating polygenic risk score with clinical lab biomarkers improves the diagnostic accuracy of systemic lupus erythematosus using the Vanderbilt BioVU and Michigan Genomics Initiative biobanks.
Affiliation(s)
- Chachrit Khunsriraksakul
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Qinmengge Li
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Havell Markus
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Matthew T Patrick
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Renan Sauteraud
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Daniel McGuire
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Xingyan Wang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Chen Wang
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Lida Wang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Siyuan Chen
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Ganesh Shenoy
- Department of Neurosurgery, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Bingshan Li
- Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, TN, 37235, USA
| | - Xue Zhong
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
| | - Nancy J Olsen
- Department of Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Laura Carrel
- Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Lam C Tsoi
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Bibo Jiang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
| | - Dajiang J Liu
- Program in Bioinformatics and Genomics, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA, 17033, USA
52
Lafferty MJ, Aygün N, Patel NK, Krupa O, Liang D, Wolter JM, Geschwind DH, de la Torre-Ubieta L, Stein JL. MicroRNA-eQTLs in the developing human neocortex link miR-4707-3p expression to brain size. eLife 2023; 12:e79488. [PMID: 36629315 PMCID: PMC9859047 DOI: 10.7554/elife.79488] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2022] [Accepted: 01/10/2023] [Indexed: 01/12/2023]
Abstract
Expression quantitative trait loci (eQTL) data have proven important for linking non-coding loci to protein-coding genes. But eQTL studies rarely measure microRNAs (miRNAs), small non-coding RNAs known to play a role in human brain development and neurogenesis. Here, we performed small-RNA sequencing across 212 mid-gestation human neocortical tissue samples, measured 907 expressed miRNAs, 111 of which were novel, and identified 85 local-miRNA-eQTLs. Colocalization of miRNA-eQTLs with GWAS summary statistics yielded one robust colocalization of miR-4707-3p expression with educational attainment and brain size phenotypes, where the miRNA expression-increasing allele was associated with decreased brain size. Exogenous expression of miR-4707-3p in primary human neural progenitor cells decreased expression of predicted targets and increased cell proliferation, indicating that miR-4707-3p modulates progenitor gene regulation and cell fate decisions. Integrating miRNA-eQTLs with existing GWAS yielded evidence of a miRNA that may influence human brain size and function via modulation of neocortical brain development.
Affiliation(s)
- Michael J Lafferty
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
| | - Niyanta K Patel
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
| | - Oleh Krupa
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
| | - Dan Liang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
| | - Justin M Wolter
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Department of Cell Biology and Physiology, The University of North Carolina at Chapel Hill, Chapel Hill, United States
- Carolina Institute for Developmental Disabilities, The University of North Carolina at Chapel Hill, Chapel Hill, United States
| | - Daniel H Geschwind
- Neurogenetics Program, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Center for Autism Research and Treatment, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
| | - Luis de la Torre-Ubieta
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Carolina Institute for Developmental Disabilities, The University of North Carolina at Chapel Hill, Chapel Hill, United States
53
Gedik H, Peterson RE, Riley BP, Vladimirov VI, Bacanu SA. Integrative Post-Genome-Wide Association Study Analyses Relevant to Psychiatric Disorders: Imputing Transcriptome and Proteome Signals. Complex Psychiatry 2023; 9:130-144. [PMID: 37588130 PMCID: PMC10425719 DOI: 10.1159/000530223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 03/09/2023] [Indexed: 08/18/2023]
Abstract
Background: The genome-wide association study (GWAS) is a common tool for identifying genetic variants associated with complex traits, including psychiatric disorders (PDs). However, post-GWAS analyses are needed to extend the statistical inference to biologically relevant entities, e.g., genes, proteins, and pathways. To achieve this goal, researchers developed methods that incorporate biologically relevant intermediate molecular phenotypes, such as gene expression and protein abundance, which are posited to mediate the variant-trait association. Transcriptome-wide association study (TWAS) and proteome-wide association study (PWAS) are commonly used methods to test the association between these molecular mediators and the trait.
Summary: In this review, we discuss the most recent developments in TWAS and PWAS. These methods integrate existing "omic" information with the GWAS summary statistics for trait(s) of interest. Specifically, they impute transcript/protein data and test the association between imputed gene expression/protein level and the phenotype of interest by using (i) GWAS summary statistics and (ii) reference transcriptomic/proteomic/genomic datasets. TWAS and PWAS are suitable as analysis tools for (i) primary association scans and (ii) fine-mapping to identify potentially causal genes for PDs.
Key Messages: As post-GWAS analyses, TWAS and PWAS have the potential to highlight causal genes for PDs. These prioritized genes could indicate targets for the development of novel drug therapies. For researchers attempting such analyses, we recommend Mendelian randomization tools that use GWAS statistics for both trait and reference datasets, e.g., summary Mendelian randomization (SMR). We base our recommendation on three points: (i) the same tool can be used for both TWAS and PWAS, (ii) no pre-computed weights are required (which makes it easier to update for larger reference datasets), and (iii) most larger transcriptome reference datasets are publicly available and easy to transform into a compatible format for SMR analysis.
Affiliation(s)
- Huseyin Gedik
- Integrative Life Sciences, Virginia Institute of Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA
| | - Roseann E. Peterson
- Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| | - Brien P. Riley
- Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
| | - Vladimir I. Vladimirov
- Department of Psychiatry, College of Medicine-Phoenix, University of Arizona, Phoenix, AZ, USA
| | - Silviu-Alin Bacanu
- Institute for Genomics in Health, SUNY Downstate Health Sciences University, Brooklyn, NY, USA
54
Long-read transcriptome sequencing reveals allele-specific variants at high resolution. Trends Genet 2023; 39:31-33. [PMID: 36207147 DOI: 10.1016/j.tig.2022.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Revised: 09/11/2022] [Accepted: 09/21/2022] [Indexed: 11/05/2022]
Abstract
Disturbance in the regulation of transcript structure plays a crucial role in human disease. In a recent study, Glinos et al. characterized allele-specific transcript alterations in long-read RNA sequencing (RNA-seq) data derived from multiple human tissues and provided a high-resolution view of how disease-associated genetic variants affect transcript structure.
55
Fernandez-Rozadilla C, Timofeeva M, Chen Z, Law P, Thomas M, Schmit S, Díez-Obrero V, Hsu L, Fernandez-Tajes J, Palles C, Sherwood K, Briggs S, Svinti V, Donnelly K, Farrington S, Blackmur J, Vaughan-Shaw P, Shu XO, Long J, Cai Q, Guo X, Lu Y, Broderick P, Studd J, Huyghe J, Harrison T, Conti D, Dampier C, Devall M, Schumacher F, Melas M, Rennert G, Obón-Santacana M, Martín-Sánchez V, Moratalla-Navarro F, Oh JH, Kim J, Jee SH, Jung KJ, Kweon SS, Shin MH, Shin A, Ahn YO, Kim DH, Oze I, Wen W, Matsuo K, Matsuda K, Tanikawa C, Ren Z, Gao YT, Jia WH, Hopper J, Jenkins M, Win AK, Pai R, Figueiredo J, Haile R, Gallinger S, Woods M, Newcomb P, Duggan D, Cheadle J, Kaplan R, Maughan T, Kerr R, Kerr D, Kirac I, Böhm J, Mecklin LP, Jousilahti P, Knekt P, Aaltonen L, Rissanen H, Pukkala E, Eriksson J, Cajuso T, Hänninen U, Kondelin J, Palin K, Tanskanen T, Renkonen-Sinisalo L, Zanke B, Männistö S, Albanes D, Weinstein S, Ruiz-Narvaez E, Palmer J, Buchanan D, Platz E, Visvanathan K, Ulrich C, Siegel E, Brezina S, Gsur A, Campbell P, Chang-Claude J, Hoffmeister M, Brenner H, Slattery M, Potter J, Tsilidis K, Schulze M, Gunter M, Murphy N, Castells A, Castellví-Bel S, Moreira L, Arndt V, Shcherbina A, Stern M, Pardamean B, Bishop T, Giles G, Southey M, Idos G, McDonnell K, Abu-Ful Z, Greenson J, Shulman K, Lejbkowicz F, Offit K, Su YR, Steinfelder R, Keku T, van Guelpen B, Hudson T, Hampel H, Pearlman R, Berndt S, Hayes R, Martinez ME, Thomas S, Corley D, Pharoah P, Larsson S, Yen Y, Lenz HJ, White E, Li L, Doheny K, Pugh E, Shelford T, Chan A, Cruz-Correa M, Lindblom A, Hunter D, Joshi A, Schafmayer C, Scacheri P, Kundaje A, Nickerson D, Schoen R, Hampe J, Stadler Z, Vodicka P, Vodickova L, Vymetalkova V, Papadopoulos N, Edlund C, Gauderman W, Thomas D, Shibata D, Toland A, Markowitz S, Kim A, Chanock S, van Duijnhoven F, Feskens E, Sakoda L, Gago-Dominguez M, Wolk A, Naccarati A, Pardini B, FitzGerald L, Lee SC, Ogino S, Bien S, Kooperberg C, Li C, Lin Y, Prentice R, Qu C, Bézieau S, Tangen C, Mardis E, Yamaji T, Sawada N, Iwasaki M, Haiman C, Le Marchand L, Wu A, Qu C, McNeil C, Coetzee G, Hayward C, Deary I, Harris S, Theodoratou E, Reid S, Walker M, Ooi LY, Moreno V, Casey G, Gruber S, Tomlinson I, Zheng W, Dunlop M, Houlston R, Peters U. Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and east Asian ancestries. Nat Genet 2023; 55:89-99. [PMID: 36539618 PMCID: PMC10094749 DOI: 10.1038/s41588-022-01222-9] [Citation(s) in RCA: 92] [Impact Index Per Article: 46.0] [Received: 11/24/2021] [Accepted: 10/09/2022] [Indexed: 12/24/2022]
Abstract
Colorectal cancer (CRC) is a leading cause of mortality worldwide. We conducted a genome-wide association study meta-analysis of 100,204 CRC cases and 154,587 controls of European and east Asian ancestry, identifying 205 independent risk associations, of which 50 were unreported. We performed integrative genomic, transcriptomic and methylomic analyses across large bowel mucosa and other tissues. Transcriptome- and methylome-wide association studies revealed an additional 53 risk associations. We identified 155 high-confidence effector genes functionally linked to CRC risk, many of which had no previously established role in CRC. These have multiple different functions and specifically indicate that variation in normal colorectal homeostasis, proliferation, cell adhesion, migration, immunity and microbial interactions determines CRC risk. Cross-tissue analyses indicated that over a third of effector genes most probably act outside the colonic mucosa. Our findings provide insights into colorectal oncogenesis and highlight potential targets across tissues for new CRC treatment and chemoprevention strategies.
Affiliation(s)
- Ceres Fernandez-Rozadilla
- Edinburgh Cancer Research Centre, Institute of Genomics and Cancer, University of Edinburgh, Edinburgh, UK
- Genomic Medicine Group, Instituto de Investigacion Sanitaria de Santiago, Santiago de Compostela, Spain
| | - Maria Timofeeva
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Danish Institute for Advanced Study, Department of Public Health, University of Southern Denmark, Odense, Denmark
| | - Zhishan Chen
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Philip Law
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Minta Thomas
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Stephanie Schmit
- Genomic Medicine Institute, Cleveland Clinic, Cleveland, OH, USA
- Population and Cancer Prevention Program, Case Comprehensive Cancer Center, Cleveland, OH, USA
| | - Virginia Díez-Obrero
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute, Barcelona, Spain
- Oncology Data Analytics Program, Catalan Institute of Oncology, Barcelona, Spain
- Consortium for Biomedical Research in Epidemiology and Public Health, Madrid, Madrid, Spain
- Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
| | - Li Hsu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
| | - Juan Fernandez-Tajes
- Edinburgh Cancer Research Centre, Institute of Genomics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Claire Palles
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
| | - Kitty Sherwood
- Edinburgh Cancer Research Centre, Institute of Genomics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Sarah Briggs
- Department of Public Health, Richard Doll Building, University of Oxford, Oxford, UK
| | - Victoria Svinti
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Kevin Donnelly
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Susan Farrington
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - James Blackmur
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Peter Vaughan-Shaw
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Qiuyin Cai
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Yingchang Lu
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Peter Broderick
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - James Studd
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Jeroen Huyghe
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Tabitha Harrison
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - David Conti
- Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Christopher Dampier
- Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Mathew Devall
- Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Fredrick Schumacher
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
- Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH, USA
| | - Marilena Melas
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
| | - Gad Rennert
- Department of Community Medicine and Epidemiology, Lady Davis Carmel Medical Center, Haifa, Israel
- Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Clalit National Cancer Control Center, Haifa, Israel
| | - Mireia Obón-Santacana
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute, Barcelona, Spain
- Oncology Data Analytics Program, Catalan Institute of Oncology, Barcelona, Spain
- Consortium for Biomedical Research in Epidemiology and Public Health, Madrid, Spain
| | - Vicente Martín-Sánchez
- Consortium for Biomedical Research in Epidemiology and Public Health, Madrid, Madrid, Spain
- Biomedicine Institute, University of León, León, Spain
| | - Ferran Moratalla-Navarro
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute, Barcelona, Spain
- Oncology Data Analytics Program, Catalan Institute of Oncology, Barcelona, Spain
- Consortium for Biomedical Research in Epidemiology and Public Health, Madrid, Madrid, Spain
- Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
| | - Jae Hwan Oh
- Center for Colorectal Cancer, National Cancer Center Hospital, National Cancer Center, Gyeonggi-do, South Korea
| | - Jeongseon Kim
- Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, National Cancer Center, Gyeonggi-do, South Korea
| | - Sun Ha Jee
- Department of Epidemiology and Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, South Korea
| | - Keum Ji Jung
- Department of Epidemiology and Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, South Korea
| | - Sun-Seog Kweon
- Department of Preventive Medicine, Chonnam National University Medical School, Gwangju, South Korea
| | - Min-Ho Shin
- Department of Preventive Medicine, Chonnam National University Medical School, Gwangju, South Korea
| | - Aesun Shin
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, South Korea
- Cancer Research Institute, Seoul National University, Seoul, South Korea
| | - Yoon-Ok Ahn
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, South Korea
| | - Dong-Hyun Kim
- Department of Social and Preventive Medicine, Hallym University College of Medicine, Okcheon-dong, South Korea
| | - Isao Oze
- Division of Cancer Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya, Japan
| | - Wanqing Wen
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Keitaro Matsuo
- Division of Molecular and Clinical Epidemiology, Aichi Cancer Center Research Institute, Nagoya, Japan
- Department of Epidemiology, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Koichi Matsuda
- Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
| | - Chizu Tanikawa
- Laboratory of Genome Technology, Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
| | - Zefang Ren
- School of Public Health, Sun Yat-sen University, Guangzhou, China
| | - Yu-Tang Gao
- State Key Laboratory of Oncogenes and Related Genes and Department of Epidemiology, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
| | - Wei-Hua Jia
- State Key Laboratory of Oncology in South China, Cancer Center, Sun Yat-sen University, Guangzhou, China
| | - John Hopper
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
- Department of Epidemiology, School of Public Health and Institute of Health and Environment, Seoul National University, Seoul, South Korea
| | - Mark Jenkins
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
| | - Aung Ko Win
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
| | - Rish Pai
- Department of Laboratory Medicine and Pathology, Mayo Clinic Arizona, Scottsdale, AZ, USA
| | - Jane Figueiredo
- Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Robert Haile
- Division of Oncology, Department of Medicine, Cedars-Sinai Cancer Research Center for Health Equity, Los Angeles, CA, USA
| | - Steven Gallinger
- Lunenfeld Tanenbaum Research Institute, Mount Sinai Hospital, University of Toronto, Toronto, Ontario, Canada
| | - Michael Woods
- Division of Biomedical Sciences, Memorial University of Newfoundland, St. John, Ontario, Canada
| | - Polly Newcomb
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- School of Public Health, University of Washington, Seattle, WA, USA
| | - David Duggan
- City of Hope National Medical Center, Translational Genomics Research Institute, Phoenix, AZ, USA
| | - Jeremy Cheadle
- Institute of Medical Genetics, Cardiff University, Cardiff, UK
| | - Richard Kaplan
- MRC Clinical Trials Unit, Medical Research Council, Cardiff, UK
| | - Timothy Maughan
- MRC Institute for Radiation Oncology, University of Oxford, Oxford, UK
| | - Rachel Kerr
- Department of Oncology, University of Oxford, Oxford, UK
| | - David Kerr
- Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Iva Kirac
- Department of Surgical Oncology, University Hospital for Tumors, Sestre milosrdnice University Hospital Center, Zagreb, Croatia
| | - Jan Böhm
- Department of Pathology, Central Finland Health Care District, Jyväskylä, Finland
| | | | - Pekka Jousilahti
- Department of Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Paul Knekt
- Department of Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Lauri Aaltonen
- Department of Medical and Clinical Genetics, University of Helsinki, Helsinki, Finland
- Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
| | - Harri Rissanen
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Eero Pukkala
- Finnish Cancer Registry, Institute for Statistical and Epidemiological Cancer Research, Helsinki, Finland
- Faculty of Social Sciences, Tampere University, Tampere, Finland
| | - Johan Eriksson
- Folkhälsan Research Centre, University of Helsinki, Helsinki, Finland
- Human Potential Translational Research Programme, National University of Singapore, Singapore, Singapore
- Unit of General Practice and Primary Health Care, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Tatiana Cajuso
- Department of Medical and Clinical Genetics, University of Helsinki, Helsinki, Finland
- Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
| | - Ulrika Hänninen
- Department of Medical and Clinical Genetics, University of Helsinki, Helsinki, Finland
- Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
| | - Johanna Kondelin
- Department of Medical and Clinical Genetics, University of Helsinki, Helsinki, Finland
- Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
| | - Kimmo Palin
- Department of Medical and Clinical Genetics, University of Helsinki, Helsinki, Finland
- Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
| | - Tomas Tanskanen
- Department of Medical and Clinical Genetics, University of Helsinki, Helsinki, Finland
- Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
| | | | - Brent Zanke
- Department of Oncology, University of Toronto, Toronto, Ontario, Canada
| | - Satu Männistö
- Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Demetrius Albanes
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Stephanie Weinstein
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Edward Ruiz-Narvaez
- Department of Nutritional Sciences, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Julie Palmer
- Slone Epidemiology Center at Boston University, Boston, MA, USA
- Department of Medicine, Boston University School of Medicine, Boston, MA, USA
| | - Daniel Buchanan
- Colorectal Oncogenomics Group, Department of Clinical Pathology, University of Melbourne, Parkville, Victoria, Australia
- University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Parkville, Victoria, Australia
- Genomic Medicine and Family Cancer Clinic, Royal Melbourne Hospital, Parkville, Victoria, Australia
| | - Elizabeth Platz
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Kala Visvanathan
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Cornelia Ulrich
- Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT, USA
| | - Erin Siegel
- Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
| | - Stefanie Brezina
- Institute of Cancer Research, Department of Medicine I, Medical University Vienna, Vienna, Austria
| | - Andrea Gsur
- Institute of Cancer Research, Department of Medicine I, Medical University Vienna, Vienna, Austria
| | - Peter Campbell
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, NY, USA
| | - Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany
- University Medical Centre Hamburg-Eppendorf, University Cancer Centre Hamburg, Hamburg, Germany
| | - Michael Hoffmeister
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany
- Division of Preventive Oncology, German Cancer Research Center and National Center for Tumor Diseases, Heidelberg, Germany
- German Cancer Consortium, German Cancer Research Center, Heidelberg, Germany
| | - Martha Slattery
- Department of Internal Medicine, University of Utah, Salt Lake City, UT, USA
| | - John Potter
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Research Centre for Hauora and Health, Massey University, Wellington, New Zealand
| | - Konstantinos Tsilidis
- Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
- Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
| | - Matthias Schulze
- Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany
- Institute of Nutritional Science, University of Potsdam, Potsdam, Germany
| | - Marc Gunter
- Nutrition and Metabolism Branch, International Agency for Research on Cancer, World Health Organization, Lyon, France
| | - Neil Murphy
- Nutrition and Metabolism Branch, International Agency for Research on Cancer, World Health Organization, Lyon, France
| | - Antoni Castells
- Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas, University of Barcelona, Barcelona, Spain
| | - Sergi Castellví-Bel
- Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas, University of Barcelona, Barcelona, Spain
| | - Leticia Moreira
- Gastroenterology Department, Hospital Clínic, Institut d'Investigacions Biomèdiques August Pi i Sunyer, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas, University of Barcelona, Barcelona, Spain
| | - Volker Arndt
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany
| | - Anna Shcherbina
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Mariana Stern
- Department of Population and Public Health Sciences, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Jeonnam Regional Cancer Center, Chonnam National University Hwasun Hospital, Hwasun, South Korea
| | - Bens Pardamean
- Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia
| | - Timothy Bishop
- Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
| | - Graham Giles
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia
- Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, Victoria, Australia
| | - Melissa Southey
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia
- Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, Victoria, Australia
- Department of Clinical Pathology, University of Melbourne, Melbourne, Victoria, Australia
| | - Gregory Idos
- Department of Medical Oncology and Center For Precision Medicine, City of Hope National Medical Center, Duarte, CA, USA
| | - Kevin McDonnell
- Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Clalit National Cancer Control Center, Haifa, Israel
- Department of Medical Oncology and Center For Precision Medicine, City of Hope National Medical Center, Duarte, CA, USA
| | - Zomoroda Abu-Ful
- Department of Community Medicine and Epidemiology, Lady Davis Carmel Medical Center, Haifa, Israel
| | - Joel Greenson
- Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Clalit National Cancer Control Center, Haifa, Israel
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Katerina Shulman
- Department of Community Medicine and Epidemiology, Lady Davis Carmel Medical Center, Haifa, Israel
| | - Flavio Lejbkowicz
- Department of Community Medicine and Epidemiology, Lady Davis Carmel Medical Center, Haifa, Israel
- Clalit National Cancer Control Center, Haifa, Israel
- Clalit Health Services, Personalized Genomic Service, Lady Davis Carmel Medical Center, Haifa, Israel
| | - Kenneth Offit
- Clinical Genetics Service, Department of Medicine, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
- Department of Medicine, Weill Cornell Medical College, New York, NY, USA
| | - Yu-Ru Su
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
| | - Robert Steinfelder
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Temitope Keku
- Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, NC, USA
| | - Bethany van Guelpen
- Department of Radiation Sciences, Oncology Unit, Umeå University, Umeå, Sweden
- Wallenberg Centre for Molecular Medicine, Umeå University, Umeå, Sweden
| | - Thomas Hudson
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Heather Hampel
- Division of Human Genetics, Department of Internal Medicine, Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
| | - Rachel Pearlman
- Division of Human Genetics, Department of Internal Medicine, Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
| | - Sonja Berndt
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Richard Hayes
- Division of Epidemiology, Department of Population Health, New York University School of Medicine, New York, NY, USA
| | - Marie Elena Martinez
- Population Sciences, Disparities and Community Engagement, University of California San Diego Moores Cancer Center, La Jolla, CA, USA
- Department of Family Medicine and Public Health, University of California San Diego, La Jolla, CA, USA
| | - Sushma Thomas
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Douglas Corley
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
- Department of Gastroenterology, Kaiser Permanente Medical Center, San Francisco, CA, USA
| | - Paul Pharoah
- Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Susanna Larsson
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Yun Yen
- Taipei Medical University, Taipei, Taiwan
| | - Heinz-Josef Lenz
- Department of Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Emily White
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington School of Public Health, Seattle, WA, USA
| | - Li Li
- Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH, USA
| | - Kimberly Doheny
- Center for Inherited Disease Research, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Elizabeth Pugh
- Center for Inherited Disease Research, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Tameka Shelford
- Center for Inherited Disease Research, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Andrew Chan
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
| | - Marcia Cruz-Correa
- Comprehensive Cancer Center, University of Puerto Rico, San Juan, Puerto Rico
| | - Annika Lindblom
- Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
- Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
| | - David Hunter
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - Amit Joshi
- Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
| | - Clemens Schafmayer
- Department of General Surgery, University Hospital Rostock, Rostock, Germany
| | - Peter Scacheri
- Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, OH, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Deborah Nickerson
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Robert Schoen
- Department of Medicine and Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
| | - Jochen Hampe
- Department of Medicine I, University Hospital Dresden, Technische Universität Dresden, Dresden, Germany
| | - Zsofia Stadler
- Department of Medicine, Weill Cornell Medical College, New York, NY, USA
- Department of Medicine, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
| | - Pavel Vodicka
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Faculty of Medicine and Biomedical Center in Pilsen, Charles University, Pilsen, Czech Republic
| | - Ludmila Vodickova
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Faculty of Medicine and Biomedical Center in Pilsen, Charles University, Pilsen, Czech Republic
| | - Veronika Vymetalkova
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Faculty of Medicine and Biomedical Center in Pilsen, Charles University, Pilsen, Czech Republic
| | - Nickolas Papadopoulos
- Department of Oncology Ludwig Center at the Sidney Kimmel Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Pathology, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Christopher Edlund
- Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - William Gauderman
- Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Duncan Thomas
- Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - David Shibata
- Department of Surgery, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Amanda Toland
- Departments of Cancer Biology and Genetics and Internal Medicine, Comprehensive Cancer Center, Ohio State University, Columbus, OH, USA
| | - Sanford Markowitz
- Departments of Medicine and Genetics, Case Comprehensive Cancer Center, Case Western Reserve University and University Hospitals of Cleveland, Cleveland, OH, USA
| | - Andre Kim
- Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Stephen Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Franzel van Duijnhoven
- Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, The Netherlands
| | - Edith Feskens
- Division of Human Nutrition, Wageningen University and Research, Wageningen, The Netherlands
| | - Lori Sakoda
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Manuela Gago-Dominguez
- Genomic Medicine Group, Galician Public Foundation of Genomic Medicine, Servicio Galego de Saude, Santiago de Compostela, Spain
- Instituto de Investigación Sanitaria de Santiago de Compostela, Santiago de Compostela, Spain
| | - Alicja Wolk
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Alessio Naccarati
- Italian Institute for Genomic Medicine, Candiolo Cancer Institute FPO-IRCCS, Candiolo (TO), Italy
- Candiolo Cancer Institute FPO-IRCCS, Candiolo (TO), Italy
| | - Barbara Pardini
- Italian Institute for Genomic Medicine, Candiolo Cancer Institute FPO-IRCCS, Candiolo (TO), Italy
- Candiolo Cancer Institute FPO-IRCCS, Candiolo (TO), Italy
| | - Liesel FitzGerald
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia
- Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
| | - Soo Chin Lee
- National University Cancer Institute, Singapore, Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
| | - Shuji Ogino
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Immunology Program, Dana-Farber Harvard Cancer Center, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Stephanie Bien
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Charles Kooperberg
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Christopher Li
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Yi Lin
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Ross Prentice
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Conghui Qu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Stéphane Bézieau
- Service de Génétique Médicale, Centre Hospitalier Universitaire Nantes, Nantes, France
| | - Catherine Tangen
- SWOG Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Elaine Mardis
- Department of Pediatrics, Nationwide Children's Hospital, The Steve and Cindy Rasmussen Institute for Genomic Medicine, Columbus, OH, USA
| | - Taiki Yamaji
- Division of Epidemiology, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
| | - Norie Sawada
- Division of Cohort Research, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
| | - Motoki Iwasaki
- Division of Epidemiology, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
- Division of Cohort Research, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
| | - Christopher Haiman
- Department of Preventive Medicine, Center for Genetic Epidemiology, University of Southern California, Los Angeles, CA, USA
| | - Anna Wu
- Preventative Medicine, University of Southern California, Los Angeles, CA, USA
| | - Chenxu Qu
- USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Caroline McNeil
- USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Caroline Hayward
- MRC Human Genetics Unit, Institute of Genomics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Ian Deary
- Lothian Birth Cohorts group, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Sarah Harris
- Lothian Birth Cohorts group, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Evropi Theodoratou
- Centre for Global Health, Usher Institute, University of Edinburgh, Edinburgh, UK
| | - Stuart Reid
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Marion Walker
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Li Yin Ooi
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Department of Pathology, National University Hospital, National University Health System, Singapore, Singapore
| | - Victor Moreno
- Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute, Barcelona, Spain
- Oncology Data Analytics Program, Catalan Institute of Oncology, Barcelona, Spain
- Consortium for Biomedical Research in Epidemiology and Public Health, Madrid, Madrid, Spain
- Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
| | - Graham Casey
- Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Stephen Gruber
- Department of Medical Oncology and Center For Precision Medicine, City of Hope National Medical Center, Duarte, CA, USA
| | - Ian Tomlinson
- Edinburgh Cancer Research Centre, Institute of Genomics and Cancer, University of Edinburgh, Edinburgh, UK
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Malcolm Dunlop
- Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Richard Houlston
- Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
|
56
|
Curti N, Levi G, Giampieri E, Castellani G, Remondini D. A network approach for low dimensional signatures from high throughput data. Sci Rep 2022; 12:22253. [PMID: 36564421 PMCID: PMC9789141 DOI: 10.1038/s41598-022-25549-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables (a signature) for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by an up/down regulation behavior, for which discriminant-based methods could perform with high accuracy and easy interpretability. To get the most out of these methods, feature selection is even more critical, but it is known to be an NP-hard problem, and thus most feature selection approaches focus on one feature at a time (k-best, Sequential Feature Selection, recursive feature elimination). We propose DNetPRO, Discriminant Analysis with Network PROcessing, a supervised network-based signature identification method. This method implements a network-based heuristic to generate one or more signatures out of the best performing feature pairs. The algorithm is easily scalable, allowing efficient computing for a high number of observables ([Formula: see text]-[Formula: see text]). We show applications on real high-throughput genomic datasets in which our method outperforms existing results, or is compatible with them but with a smaller number of selected features. Moreover, the geometrical simplicity of the resulting class-separation surfaces allows a clearer interpretation of the obtained signatures in comparison to nonlinear classification models.
Affiliation(s)
- Nico Curti
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy; INFN Bologna, Bologna, Italy
| | - Giuseppe Levi
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy; INFN Bologna, Bologna, Italy
| | - Enrico Giampieri
- INFN Bologna, Bologna, Italy; Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Gastone Castellani
- INFN Bologna, Bologna, Italy; Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, Bologna, Italy
| | - Daniel Remondini
- Department of Physics and Astronomy, University of Bologna, Bologna, Italy; INFN Bologna, Bologna, Italy
|
57
|
Vlk D, Trněný O, Řepková J. Genes Associated with Biological Nitrogen Fixation Efficiency Identified Using RNA Sequencing in Red Clover (Trifolium pratense L.). Life (Basel, Switzerland) 2022; 12:life12121975. [PMID: 36556339 PMCID: PMC9785344 DOI: 10.3390/life12121975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 11/22/2022] [Accepted: 11/22/2022] [Indexed: 11/29/2022]
Abstract
Commonly studied in the context of legume-rhizobia symbiosis, biological nitrogen fixation (BNF) is a key component of the nitrogen cycle in nature. Despite its potential in plant breeding and many years of research, information is still lacking as to the regulation of hundreds of genes connected with plant-bacteria interaction, nodulation, and nitrogen fixation. Here, we compared root nodule transcriptomes of red clover (Trifolium pratense L.) genotypes with contrasting nitrogen fixation efficiency, and we found 491 differentially expressed genes (DEGs) between plants with high and low BNF efficiency. The annotation of genes expressed in nodules revealed more than 800 genes not yet experimentally confirmed. Among genes mediating nodule development, four nodule-specific cysteine-rich (NCR) peptides were confirmed in the nodule transcriptome. Gene duplication analyses revealed that genes originating from tandem and dispersed duplication are significantly over-represented among DEGs. Weighted correlation network analysis (WGCNA) organized expression profiles of the transcripts into 16 modules linked to the analyzed traits, such as nitrogen fixation efficiency or sample-specific modules. Overall, the results obtained broaden our knowledge about transcriptomic landscapes of red clover's root nodules and shift the phenotypic description of BNF efficiency to the level of gene expression in situ.
Affiliation(s)
- David Vlk
- Department of Experimental Biology, Faculty of Sciences, Masaryk University, 611 37 Brno, Czech Republic
| | - Oldřich Trněný
- Agricultural Research, Ltd., Zahradní 1, 664 41 Troubsko, Czech Republic
| | - Jana Řepková
- Department of Experimental Biology, Faculty of Sciences, Masaryk University, 611 37 Brno, Czech Republic
- Correspondence: ; Tel.: +420-549-49-6895
|
58
|
Liu S, Won H, Clarke D, Matoba N, Khullar S, Mu Y, Wang D, Gerstein M. Illuminating links between cis-regulators and trans-acting variants in the human prefrontal cortex. Genome Med 2022; 14:133. [PMID: 36424644 PMCID: PMC9685876 DOI: 10.1186/s13073-022-01133-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/17/2021] [Accepted: 10/25/2022] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND: Neuropsychiatric disorders afflict a large portion of the global population and constitute a significant source of disability worldwide. Although Genome-wide Association Studies (GWAS) have identified many disorder-associated variants, the underlying regulatory mechanisms linking them to disorders remain elusive, especially those involving distant genomic elements. Expression quantitative trait loci (eQTLs) constitute a powerful means of providing this missing link. However, most eQTL studies in human brains have focused exclusively on cis-eQTLs, which link variants to nearby genes (i.e., those within 1 Mb of a variant). A complete understanding of disease etiology requires a clearer understanding of trans-regulatory mechanisms, which, in turn, entails a detailed analysis of the relationships between variants and expression changes in distant genes.
METHODS: By leveraging large datasets from the PsychENCODE consortium, we conducted a genome-wide survey of trans-eQTLs in the human dorsolateral prefrontal cortex. We also performed colocalization and mediation analyses to identify mediators in trans-regulation and use trans-eQTLs to link GWAS loci to schizophrenia risk genes.
RESULTS: We identified ~80,000 candidate trans-eQTLs (at FDR<0.25) that influence the expression of ~10K target genes (i.e., "trans-eGenes"). We found that many variants associated with these candidate trans-eQTLs overlap with known cis-eQTLs. Moreover, for >60% of these variants (by colocalization), the cis-eQTL's target gene acts as a mediator for the trans-eQTL SNP's effect on the trans-eGene, highlighting examples of cis-mediation as essential for trans-regulation. Furthermore, many of these colocalized variants fall into a discernable pattern wherein cis-eQTL's target is a transcription factor or RNA-binding protein, which, in turn, targets the gene associated with the candidate trans-eQTL. Finally, we show that trans-regulatory mechanisms provide valuable insights into psychiatric disorders: beyond what had been possible using only cis-eQTLs, we link an additional 23 GWAS loci and 90 risk genes (using colocalization between candidate trans-eQTLs and schizophrenia GWAS loci).
CONCLUSIONS: We demonstrate that the transcriptional architecture of the human brain is orchestrated by both cis- and trans-regulatory variants and found that trans-eQTLs provide insights into brain-disease biology.
Affiliation(s)
- Shuang Liu
- Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA
| | - Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA; Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Declan Clarke
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA
| | - Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA; Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Saniya Khullar
- Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI, 53706, USA
| | - Yudi Mu
- Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA
| | - Daifeng Wang
- Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI, 53706, USA; Department of Computer Sciences, University of Wisconsin - Madison, Madison, WI, 53706, USA
| | - Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA; Department of Computer Science, Yale University, New Haven, CT, 06520, USA; Department of Statistics and Data Science, Yale University, New Haven, CT, 06520, USA
|
59
|
Whole genome DNA and RNA sequencing of whole blood elucidates the genetic architecture of gene expression underlying a wide range of diseases. Sci Rep 2022; 12:20167. [PMID: 36424512 PMCID: PMC9686236 DOI: 10.1038/s41598-022-24611-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/26/2022] [Accepted: 11/17/2022] [Indexed: 11/27/2022] Open
Abstract
To create a scientific resource of expression quantitative trait loci (eQTL), we conducted a genome-wide association study (GWAS) using genotypes obtained from whole genome sequencing (WGS) of DNA and gene expression levels from RNA sequencing (RNA-seq) of whole blood in 2622 participants in the Framingham Heart Study. We identified 6,778,286 cis-eQTL variant-gene transcript (eGene) pairs at p < 5 × 10⁻⁸ (2,855,111 unique cis-eQTL variants and 15,982 unique eGenes) and 1,469,754 trans-eQTL variant-eGene pairs at p < 1 × 10⁻¹² (526,056 unique trans-eQTL variants and 7233 unique eGenes). In addition, 442,379 cis-eQTL variants were associated with expression of 1518 long non-protein coding RNAs (lncRNAs). Gene Ontology (GO) analyses revealed that the top GO terms for cis-eGenes are enriched for immune functions (FDR < 0.05). The cis-eQTL variants are enriched for SNPs reported to be associated with 815 traits in prior GWAS, including cardiovascular disease risk factors. As proof of concept, we used this eQTL resource in conjunction with genetic variants from public GWAS databases in causal inference testing (e.g., COVID-19 severity). After Bonferroni correction, Mendelian randomization analyses identified putative causal associations of 60 eGenes with systolic blood pressure, 13 genes with coronary artery disease, and seven genes with COVID-19 severity. This study created a comprehensive eQTL resource via BioData Catalyst that will be made available to the scientific community. This will advance understanding of the genetic architecture of gene expression underlying a wide range of diseases.
|
60
|
Sauerwald N, Zhang Z, Ramos I, Nair VD, Soares-Schanoski A, Ge Y, Mao W, Alshammary H, Gonzalez-Reiche AS, van de Guchte A, Goforth CW, Lizewski RA, Lizewski SE, Amper MAS, Vasoya M, Seenarine N, Guevara K, Marjanovic N, Miller CM, Nudelman G, Schilling MA, Sealfon RSG, Termini MS, Vangeti S, Weir DL, Zaslavsky E, Chikina M, Wu YN, Van Bakel H, Letizia AG, Sealfon SC, Troyanskaya OG. Pre-infection antiviral innate immunity contributes to sex differences in SARS-CoV-2 infection. Cell Syst 2022; 13:924-931.e4. [PMID: 36323307 PMCID: PMC9623453 DOI: 10.1016/j.cels.2022.10.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/11/2022] [Revised: 07/21/2022] [Accepted: 10/18/2022] [Indexed: 11/05/2022]
Abstract
Male sex is a major risk factor for SARS-CoV-2 infection severity. To understand the basis for this sex difference, we studied SARS-CoV-2 infection in a young adult cohort of United States Marine recruits. Among 2,641 male and 244 female unvaccinated and seronegative recruits studied longitudinally, SARS-CoV-2 infections occurred in 1,033 males and 137 females. We identified sex differences in symptoms, viral load, blood transcriptome, RNA splicing, and proteomic signatures. Females had higher pre-infection expression of antiviral interferon-stimulated gene (ISG) programs. Causal mediation analysis implicated ISG differences in number of symptoms, levels of ISGs, and differential splicing of CD45 lymphocyte phosphatase during infection. Our results indicate that the antiviral innate immunity set point causally contributes to sex differences in response to SARS-CoV-2 infection. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Natalie Sauerwald
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
| | - Zijun Zhang
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
| | - Irene Ramos
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Venugopalan D Nair
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | | | - Yongchao Ge
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Weiguang Mao
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Hala Alshammary
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Ana S Gonzalez-Reiche
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Adriana van de Guchte
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Carl W Goforth
- Naval Medical Research Center, Silver Spring, MD 20910, USA
| | | | | | - Mary Anne S Amper
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Mital Vasoya
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Nitish Seenarine
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Kristy Guevara
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Nada Marjanovic
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Clare M Miller
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - German Nudelman
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | | | - Rachel S G Sealfon
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA
| | - Michael S Termini
- Navy Medicine Readiness and Training Command Beaufort, Beaufort, SC 29902, USA
| | - Sindhu Vangeti
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Dawn L Weir
- Naval Medical Research Center, Silver Spring, MD 20910, USA
| | - Elena Zaslavsky
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Maria Chikina
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Ying Nian Wu
- Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Harm Van Bakel
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | | | - Stuart C Sealfon
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
| | - Olga G Troyanskaya
- Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA; Department of Computer Science, Princeton University, Princeton, NJ 08540, USA.
|
61
|
Bruscadin JJ, Cardoso TF, da Silva Diniz WJ, Afonso J, de Souza MM, Petrini J, Nascimento Andrade BG, da Silva VH, Ferraz JBS, Zerlotini A, Mourão GB, Coutinho LL, de Almeida Regitano LC. Allele-specific expression reveals functional SNPs affecting muscle-related genes in bovine. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 2022; 1865:194886. [DOI: 10.1016/j.bbagrm.2022.194886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Revised: 09/27/2022] [Accepted: 10/12/2022] [Indexed: 11/09/2022]
|
62
|
DNA methyltransferase 3A controls intestinal epithelial barrier function and regeneration in the colon. Nat Commun 2022; 13:6266. [PMID: 36271073 PMCID: PMC9587301 DOI: 10.1038/s41467-022-33844-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/10/2021] [Accepted: 10/05/2022] [Indexed: 12/25/2022]
Abstract
Genetic variants in the DNA methyltransferase 3A (DNMT3A) locus have been associated with inflammatory bowel disease (IBD). DNMT3A is part of the epigenetic machinery physiologically involved in DNA methylation. We show that DNMT3A plays a critical role in maintaining intestinal homeostasis and gut barrier function. DNMT3A expression is downregulated in intestinal epithelial cells from IBD patients and upon tumor necrosis factor treatment in murine intestinal organoids. Ablation of DNMT3A in Caco-2 cells results in global DNA hypomethylation, which is linked to impaired regenerative capacity, transepithelial resistance and intercellular junction formation. Genetic deletion of Dnmt3a in intestinal epithelial cells (Dnmt3aΔIEC) in mice confirms the phenotype of an altered epithelial ultrastructure with shortened apical-junctional complexes, reduced Goblet cell numbers and increased intestinal permeability in the colon in vivo. Dnmt3aΔIEC mice suffer from increased susceptibility to experimental colitis, characterized by reduced epithelial regeneration. These data demonstrate a critical role for DNMT3A in orchestrating intestinal epithelial homeostasis and response to tissue damage and suggest an involvement of impaired epithelial DNMT3A function in the etiology of IBD.
|
63
|
Zhou HJ, Li L, Li Y, Li W, Li JJ. PCA outperforms popular hidden variable inference methods for molecular QTL mapping. Genome Biol 2022; 23:210. [PMID: 36221136 PMCID: PMC9552461 DOI: 10.1186/s13059-022-02761-4] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 04/11/2022] [Accepted: 08/26/2022] [Indexed: 02/07/2023]
Abstract
BACKGROUND Estimating and accounting for hidden variables is widely practiced as an important step in molecular quantitative trait locus (molecular QTL, henceforth "QTL") analysis for improving the power of QTL identification. However, few benchmark studies have been performed to evaluate the efficacy of the various methods developed for this purpose. RESULTS Here we benchmark popular hidden variable inference methods including surrogate variable analysis (SVA), probabilistic estimation of expression residuals (PEER), and hidden covariates with prior (HCP) against principal component analysis (PCA), a well-established dimension reduction and factor discovery method, via 362 synthetic and 110 real data sets. We show that PCA not only underlies the statistical methodology behind the popular methods but is also orders of magnitude faster, better-performing, and much easier to interpret and use. CONCLUSIONS To help researchers use PCA in their QTL analysis, we provide an R package PCAForQTL along with a detailed guide, both of which are freely available at https://github.com/heatherjzhou/PCAForQTL. We believe that using PCA rather than SVA, PEER, or HCP will substantially improve and simplify hidden variable inference in QTL mapping as well as increase the transparency and reproducibility of QTL research.
Affiliation(s)
- Heather J Zhou
- Department of Statistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Lei Li
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
| | - Yumei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
| | - Wei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
- Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
|
64
|
Pudjihartono M, Perry JK, Print C, O'Sullivan JM, Schierding W. Interpretation of the role of germline and somatic non-coding mutations in cancer: expression and chromatin conformation informed analysis. Clin Epigenetics 2022; 14:120. [PMID: 36171609 PMCID: PMC9520844 DOI: 10.1186/s13148-022-01342-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 09/21/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND There has been extensive scrutiny of cancer driving mutations within the exome (especially amino acid altering mutations) as these are more likely to have a clear impact on protein functions, and thus on cell biology. However, this has come at the neglect of systematic identification of regulatory (non-coding) variants, which have recently been identified as putative somatic drivers and key germline risk factors for cancer development. Comprehensive understanding of non-coding mutations requires understanding their role in the disruption of regulatory elements, which then disrupt key biological functions such as gene expression. MAIN BODY We describe how advancements in sequencing technologies have led to the identification of a large number of non-coding mutations with uncharacterized biological significance. We summarize the strategies that have been developed to interpret and prioritize the biological mechanisms impacted by non-coding mutations, focusing on recent annotation of cancer non-coding variants utilizing chromatin states, eQTLs, and chromatin conformation data. CONCLUSION We believe that a better understanding of how to apply different regulatory data types into the study of non-coding mutations will enhance the discovery of novel mechanisms driving cancer.
Affiliation(s)
| | - Jo K Perry
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
| | - Cris Print
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Department of Molecular Medicine and Pathology, School of Medical Sciences, University of Auckland, Auckland, 1142, New Zealand
| | - Justin M O'Sullivan
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Australian Parkinson's Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
- MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK
| | - William Schierding
- Liggins Institute, The University of Auckland, Auckland, New Zealand.
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand.
|
65
|
Fontanillas P, Kless A, 23andMe Research Team, Bothmer J, Tung JY. Genome-wide association study of pain sensitivity assessed by questionnaire and the cold pressor test. Pain 2022; 163:1763-1776. [PMID: 34924555 PMCID: PMC9393798 DOI: 10.1097/j.pain.0000000000002568] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Revised: 11/24/2021] [Accepted: 11/29/2021] [Indexed: 11/26/2022]
Abstract
We deployed an online pain sensitivity questionnaire (PSQ) and an at-home version of the cold pressor test (CPT) in a large genotyped cohort. We performed genome-wide association studies on the PSQ score (25,321 participants) and CPT duration (6853). We identified one new genome-wide significant locus associated with the PSQ score, which was located in the TSSC1 (also known as EIPR1) gene (rs58194899, OR = 0.950 [0.933-0.967], P-value = 1.9 × 10⁻⁸). Although high pain sensitivity measured by both PSQ and CPT was associated with individual history of chronic and acute pains, genetic correlation analyses surprisingly suggested an opposite direction: PSQ score was inversely genetically correlated with neck and shoulder pain (rg = -0.71), rheumatoid arthritis (-0.68), and osteoarthritis (-0.38), and with known risk factors, such as the length of working week (-0.65), smoking (-0.36), or extreme BMI (-0.23). Gene-based analysis followed by pathway analysis showed that genome-wide association studies results were enriched for genes expressed in the brain and involved in neuronal development and glutamatergic synapse signaling pathways. Finally, we confirmed that females with red hair were more sensitive to pain and found that genetic variation in the MC1R gene was associated with an increase in self-perceived pain sensitivity as assessed by the PSQ.
Affiliation(s)
| | - Achim Kless
- Grünenthal Innovation, Grünenthal GmbH, Aachen, Germany. Kless is now with the Neuroscience Genetics, Eli Lilly and Company, United Kingdom
| | | | - John Bothmer
- Grünenthal Innovation, Grünenthal GmbH, Aachen, Germany. Kless is now with the Neuroscience Genetics, Eli Lilly and Company, United Kingdom
|
66
|
Splicing QTL analysis focusing on coding sequences reveals mechanisms for disease susceptibility loci. Nat Commun 2022; 13:4659. [PMID: 36002455 PMCID: PMC9402578 DOI: 10.1038/s41467-022-32358-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 01/06/2022] [Accepted: 07/26/2022] [Indexed: 12/26/2022]
Abstract
Splicing quantitative trait loci (sQTLs) are one of the major causal mechanisms in genome-wide association study (GWAS) loci, but their role in disease pathogenesis is poorly understood. One reason is the complexity of alternative splicing events producing many unknown isoforms. Here, we propose two approaches, namely integration and selection, for this complexity by focusing on protein-structure of isoforms. First, we integrate isoforms with the same coding sequence (CDS) and identify 369-601 integrated-isoform ratio QTLs (i2-rQTLs), which altered protein-structure, in six immune subsets. Second, we select CDS incomplete isoforms annotated in GENCODE and identify 175-337 isoform-ratio QTL (i-rQTL). By comprehensive long-read capture RNA-sequencing among these incomplete isoforms, we reveal 29 full-length isoforms with unannotated CDSs associated with GWAS traits. Furthermore, we show that disease-causal sQTL genes can be identified by evaluating their trans-eQTL effects. Our approaches highlight the understudied role of protein-altering sQTLs and are broadly applicable to other tissues and diseases. Splicing QTL (sQTL), genetic variants regulating alternative splicing, can be biologically important, but complex to detect and interpret. Here, the authors identify sQTL by focusing on protein coding sequences, as an alternative to junction-based approaches.
|
67
|
Genetic control of RNA splicing and its distinct role in complex trait variation. Nat Genet 2022; 54:1355-1363. [PMID: 35982161 PMCID: PMC9470536 DOI: 10.1038/s41588-022-01154-4] [Citation(s) in RCA: 88] [Impact Index Per Article: 29.3] [Received: 01/25/2021] [Accepted: 07/08/2022] [Indexed: 12/11/2022]
Abstract
Most genetic variants identified from genome-wide association studies (GWAS) in humans are noncoding, indicating their role in gene regulation. Previous studies have shown considerable links of GWAS signals to expression quantitative trait loci (eQTLs) but the links to other genetic regulatory mechanisms, such as splicing QTLs (sQTLs), are underexplored. Here, we introduce an sQTL mapping method, testing for heterogeneity between isoform-eQTL effects (THISTLE), with improved power over competing methods. Applying THISTLE together with a complementary sQTL mapping strategy to brain transcriptomic (n = 2,865) and genotype data, we identified 12,794 genes with cis-sQTLs at P < 5 × 10⁻⁸, approximately 61% of which were distinct from eQTLs. Integrating the sQTL data into GWAS for 12 brain-related complex traits (including diseases), we identified 244 genes associated with the traits through cis-sQTLs, approximately 61% of which could not be discovered using the corresponding eQTL data. Our study demonstrates the distinct role of most sQTLs in the genetic regulation of transcription and complex trait variation. A powerful method for splicing quantitative trait loci (sQTL) mapping, THISTLE, is presented and applied to a collection of 2,865 brain samples. Integration with GWAS identifies 244 genes associated via cis-sQTLs, of which 61% were not identified using expression QTLs.
|
68
|
Hine E, Runcie DE, Allen SL, Wang Y, Chenoweth SF, Blows MW, McGuigan K. Maintenance of quantitative genetic variance in complex, multi-trait phenotypes: The contribution of rare, large effect variants in two Drosophila species. Genetics 2022; 222:6663993. [PMID: 35961029 PMCID: PMC9526065 DOI: 10.1093/genetics/iyac122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 08/02/2022] [Indexed: 11/29/2022]
Abstract
The interaction of evolutionary processes to determine quantitative genetic variation has implications for contemporary and future phenotypic evolution, as well as for our ability to detect causal genetic variants. While theoretical studies have provided robust predictions to discriminate among competing models, empirical assessment of these has been limited. In particular, theory highlights the importance of pleiotropy in resolving observations of selection and mutation, but empirical investigations have typically been limited to few traits. Here, we applied high-dimensional Bayesian Sparse Factor Genetic modeling to gene expression datasets in 2 species, Drosophila melanogaster and Drosophila serrata, to explore the distributions of genetic variance across high-dimensional phenotypic space. Surprisingly, most of the heritable trait covariation was due to few lines (genotypes) with extreme [>3 interquartile ranges (IQR) from the median] values. Intriguingly, while genotypes extreme for a multivariate factor also tended to have a higher proportion of individual traits that were extreme, we also observed genotypes that were extreme for multivariate factors but not for any individual trait. We observed other consistent differences between heritable multivariate factors with outlier lines vs those factors without extreme values, including differences in gene functions. We use these observations to identify further data required to advance our understanding of the evolutionary dynamics and nature of standing genetic variation for quantitative traits.
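The outlier criterion quoted in this abstract (values more than 3 IQR from the median) can be illustrated with a minimal sketch. The function name `extreme_lines`, the quartile convention (Python's default "exclusive" method), and the sample values are assumptions made for illustration only; they are not taken from the study and may differ from the authors' implementation.

```python
import statistics


def extreme_lines(values, n_iqr=3.0):
    """Return indices of values lying more than n_iqr IQRs from the median.

    Hypothetical helper: quartiles use statistics.quantiles with its
    default ("exclusive") method, which is an assumption here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    median = statistics.median(values)
    return [i for i, v in enumerate(values) if abs(v - median) > n_iqr * iqr]


# Illustrative factor scores for seven lines (genotypes); made-up numbers.
scores = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 5.0]
flagged = extreme_lines(scores)
print(flagged)
```

With these made-up scores only the last line is flagged; a different quartile convention could shift the boundary slightly for values near the cutoff.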
Affiliation(s)
- Emma Hine
- School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
| | - Daniel E Runcie
- Department of Plant Sciences, University of California Davis, Davis, CA 95616, USA
| | - Scott L Allen
- School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
| | - Yiguan Wang
- School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia; Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3FL, UK
| | - Stephen F Chenoweth
- School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
| | - Mark W Blows
- School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
| | - Katrina McGuigan
- School of Biological Sciences, The University of Queensland, Brisbane 4072 Australia
|
69
|
Glinos DA, Garborcauskas G, Hoffman P, Ehsan N, Jiang L, Gokden A, Dai X, Aguet F, Brown KL, Garimella K, Bowers T, Costello M, Ardlie K, Jian R, Tucker NR, Ellinor PT, Harrington ED, Tang H, Snyder M, Juul S, Mohammadi P, MacArthur DG, Lappalainen T, Cummings BB. Transcriptome variation in human tissues revealed by long-read sequencing. Nature 2022; 608:353-359. [PMID: 35922509 PMCID: PMC10337767 DOI: 10.1038/s41586-022-05035-y] [Citation(s) in RCA: 160] [Impact Index Per Article: 53.3] [Received: 02/25/2021] [Accepted: 06/28/2022] [Indexed: 12/12/2022]
Abstract
Regulation of transcript structure generates transcript diversity and plays an important role in human disease (refs. 1-7). The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure (refs. 8-16). In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.
Affiliation(s)
- Dafni A Glinos
- New York Genome Center, New York, NY, USA.
- Department of Systems Biology, Columbia University, New York, NY, USA.
| | - Garrett Garborcauskas
- Medical and Population Genetics Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| | | | - Nava Ehsan
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
| | - Lihua Jiang
- Department of Genetics, Stanford University, Stanford, CA, USA
| | | | | | | | - Kathleen L Brown
- New York Genome Center, New York, NY, USA
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
| | | | - Tera Bowers
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | | | - Ruiqi Jian
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Nathan R Tucker
- Masonic Medical Research Institute, Utica, NY, USA
- Cardiovascular Disease Initiative, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Patrick T Ellinor
- Cardiovascular Disease Initiative, The Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | | | - Hua Tang
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Michael Snyder
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Sissel Juul
- Oxford Nanopore Technology, New York, NY, USA
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
- Scripps Research Translational Institute, La Jolla, CA, USA
| | - Daniel G MacArthur
- Medical and Population Genetics Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Centre for Population Genomics, Garvan Institute of Medical Research, and UNSW Sydney, Sydney, New South Wales, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA.
- Department of Systems Biology, Columbia University, New York, NY, USA.
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden.
| | - Beryl B Cummings
- Medical and Population Genetics Program, The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
70
|
Kubota N, Suyama M. Mapping of promoter usage QTL using RNA-seq data reveals their contributions to complex traits. PLoS Comput Biol 2022; 18:e1010436. [PMID: 36037215 PMCID: PMC9462676 DOI: 10.1371/journal.pcbi.1010436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 09/09/2022] [Accepted: 07/25/2022] [Indexed: 11/18/2022]
Abstract
Genomic variations are associated with gene expression levels, which are called expression quantitative trait loci (eQTL). Most eQTL may affect the total gene expression levels by regulating transcriptional activities of a specific promoter. However, the direct exploration of genomic loci associated with promoter activities using RNA-seq data has been challenging because eQTL analyses treat the total expression levels estimated by summing those of all isoforms transcribed from distinct promoters. Here we propose a new method for identifying genomic loci associated with promoter activities, called promoter usage quantitative trait loci (puQTL), using conventional RNA-seq data. By leveraging public RNA-seq datasets from the lymphoblastoid cell lines of 438 individuals from the GEUVADIS project, we obtained promoter activity estimates and mapped 2,592 puQTL at the 10% FDR level. The results of puQTL mapping enabled us to interpret the manner in which genomic variations regulate gene expression. We found that 310 puQTL genes (16.1%) were not detected by eQTL analysis, suggesting that our pipeline can identify novel variant-gene associations. Furthermore, we identified genomic loci associated with the activity of "hidden" promoters, which the standard eQTL studies have ignored. We found that most puQTL signals were concordant with at least one genome-wide association study (GWAS) signal, enabling novel interpretations of the molecular mechanisms of complex traits. Our results emphasize the importance of the re-analysis of public RNA-seq datasets to obtain novel insights into gene regulation by genomic variations and their contributions to complex traits.
Affiliation(s)
- Naoto Kubota
- Division of Bioinformatics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan
| | - Mikita Suyama
- Division of Bioinformatics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan
|
71
|
Dutta D, He Y, Saha A, Arvanitis M, Battle A, Chatterjee N. Aggregative trans-eQTL analysis detects trait-specific target gene sets in whole blood. Nat Commun 2022; 13:4323. [PMID: 35882830 PMCID: PMC9325868 DOI: 10.1038/s41467-022-31845-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/13/2021] [Accepted: 07/06/2022] [Indexed: 01/13/2023]
Abstract
Large scale genetic association studies have identified many trait-associated variants and understanding the role of these variants in the downstream regulation of gene-expressions can uncover important mediating biological mechanisms. Here we propose ARCHIE, a summary statistic based sparse canonical correlation analysis method to identify sets of gene-expressions trans-regulated by sets of known trait-related genetic variants. Simulation studies show that compared to standard methods, ARCHIE is better suited to identify "core"-like genes through which effects of many other genes may be mediated and can capture disease-specific patterns of genetic associations. By applying ARCHIE to publicly available summary statistics from the eQTLGen consortium, we identify gene sets which have significant evidence of trans-association with groups of known genetic variants across 29 complex traits. Around half (50.7%) of the selected genes do not have any strong trans-associations and are not detected by standard methods. We provide further evidence for causal basis of the target genes through a series of follow-up analyses. These results show ARCHIE is a powerful tool for identifying sets of genes whose trans-regulation may be related to specific complex traits.
Affiliation(s)
- Diptavo Dutta
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
| | - Yuan He
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Ashis Saha
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Marios Arvanitis
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Cardiology, Johns Hopkins University, Baltimore, MD, USA
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Nilanjan Chatterjee
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Department of Oncology, Johns Hopkins University, Baltimore, MD, USA
| |
|
72
|
Rowland B, Venkatesh S, Tardaguila M, Wen J, Rosen JD, Tapia AL, Sun Q, Graff M, Vuckovic D, Lettre G, Sankaran VG, Voloudakis G, Roussos P, Huffman JE, Reiner AP, Soranzo N, Raffield LM, Li Y. Transcriptome-wide association study in UK Biobank Europeans identifies associations with blood cell traits. Hum Mol Genet 2022; 31:2333-2347. [PMID: 35138379 PMCID: PMC9307312 DOI: 10.1093/hmg/ddac011] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/01/2021] [Revised: 12/15/2021] [Accepted: 01/04/2022] [Indexed: 11/13/2022] Open
Abstract
Previous genome-wide association studies (GWAS) of hematological traits have identified over 10 000 distinct trait-specific risk loci. However, at these loci, the underlying causal mechanisms remain incompletely characterized. To elucidate novel biology and better understand causal mechanisms at known loci, we performed a transcriptome-wide association study (TWAS) of 29 hematological traits in 399 835 UK Biobank (UKB) participants of European ancestry using gene expression prediction models trained from whole blood RNA-seq data in 922 individuals. We discovered 557 gene-trait associations for hematological traits distinct from previously reported GWAS variants in European populations. Among the 557 associations, 301 were available for replication in a cohort of 141 286 participants of European ancestry from the Million Veteran Program. Of these 301 associations, 108 replicated at a strict Bonferroni adjusted threshold ($\alpha = 0.05/301$). Using our TWAS results, we systematically assigned 4261 out of 16 900 previously identified hematological trait GWAS variants to putative target genes. Compared to coloc, our TWAS results show reduced specificity and increased sensitivity in external datasets to assign variants to target genes.
Affiliation(s)
- Bryce Rowland
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Sanan Venkatesh
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
- Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY 10468, USA
| | - Manuel Tardaguila
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton CB10 1SA, UK
| | - Jia Wen
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jonathan D Rosen
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Amanda L Tapia
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Quan Sun
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Mariaelisa Graff
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Dragana Vuckovic
- Department of Epidemiology and Biostatistics, School of Public Health, Faculty of Medicine, Imperial College London, London SW7 2AZ, UK
| | - Guillaume Lettre
- Montreal Heart Institute, Université de Montréal, Montreal, Quebec, Canada
| | - Vijay G Sankaran
- Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Georgios Voloudakis
- Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY 10468, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
| | - Panos Roussos
- Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY 10468, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
| | - Jennifer E Huffman
- Center for Population Genomics, Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02130, USA
| | - Alexander P Reiner
- Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
| | - Nicole Soranzo
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton CB10 1SA, UK
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| |
|
73
|
Natural Killer cells demonstrate distinct eQTL and transcriptome-wide disease associations, highlighting their role in autoimmunity. Nat Commun 2022; 13:4073. [PMID: 35835762 PMCID: PMC9283523 DOI: 10.1038/s41467-022-31626-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/11/2021] [Accepted: 06/24/2022] [Indexed: 12/13/2022] Open
Abstract
Natural Killer cells are innate lymphocytes with central roles in immunosurveillance and are implicated in autoimmune pathogenesis. The degree to which regulatory variants affect Natural Killer cell gene expression is poorly understood. Here we perform expression quantitative trait locus mapping of negatively selected Natural Killer cells from a population of healthy Europeans (n = 245). We find a significant subset of genes demonstrate expression quantitative trait loci specific to Natural Killer cells and these are highly informative of human disease, in particular autoimmunity. A Natural Killer cell transcriptome-wide association study across five common autoimmune diseases identifies further novel associations at 27 genes. In addition to these cis observations, we find novel master-regulatory regions impacting expression of trans gene networks at regions including 19q13.4, the Killer cell Immunoglobulin-like Receptor region, GNLY, MC1R and UVSSA. Our findings provide new insights into the unique biology of Natural Killer cells, demonstrating markedly different expression quantitative trait loci from other immune cells, with implications for disease mechanisms. Natural Killer cells are key mediators of anti-tumour immunosurveillance and anti-viral immunity. Here, the authors map regulatory genetic variation in primary Natural Killer cells, providing new insights into their role in human health and disease.
|
74
|
Wright CJ, Smith CWJ, Jiggins CD. Alternative splicing as a source of phenotypic diversity. Nat Rev Genet 2022; 23:697-710. [PMID: 35821097 DOI: 10.1038/s41576-022-00514-4] [Citation(s) in RCA: 190] [Impact Index Per Article: 63.3] [Accepted: 06/13/2022] [Indexed: 12/27/2022]
Abstract
A major goal of evolutionary genetics is to understand the genetic processes that give rise to phenotypic diversity in multicellular organisms. Alternative splicing generates multiple transcripts from a single gene, enriching the diversity of proteins and phenotypic traits. It is well established that alternative splicing contributes to key innovations over long evolutionary timescales, such as brain development in bilaterians. However, recent developments in long-read sequencing and the generation of high-quality genome assemblies for diverse organisms have facilitated comparisons of splicing profiles between closely related species, providing insights into how alternative splicing evolves over shorter timescales. Although most splicing variants are probably non-functional, alternative splicing is nonetheless emerging as a dynamic, evolutionarily labile process that can facilitate adaptation and contribute to species divergence.
Affiliation(s)
- Charlotte J Wright
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK.
- Department of Zoology, University of Cambridge, Cambridge, UK.
| | | | - Chris D Jiggins
- Department of Zoology, University of Cambridge, Cambridge, UK.
| |
|
75
|
MicrobiomeGWAS: A Tool for Identifying Host Genetic Variants Associated with Microbiome Composition. Genes (Basel) 2022; 13:genes13071224. [PMID: 35886007 PMCID: PMC9317577 DOI: 10.3390/genes13071224] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/02/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 11/17/2022] Open
Abstract
The microbiome is the collection of all microbial genes and can be investigated by sequencing highly variable regions of 16S ribosomal RNA (rRNA) genes. Evidence suggests that environmental factors and host genetics may interact to impact human microbiome composition. Identifying host genetic variants associated with human microbiome composition not only provides clues for characterizing microbiome variation but also helps to elucidate biological mechanisms of genetic associations, prioritize genetic variants, and improve genetic risk prediction. Since a microbiota functions as a community, it is best characterized by β diversity; that is, a pairwise distance matrix. We develop a statistical framework and a computationally efficient software package, microbiomeGWAS, for identifying host genetic variants associated with microbiome β diversity with or without interacting with an environmental factor. We show that the score statistics have positive skewness and kurtosis due to the dependent nature of the pairwise data, which makes p-value approximations based on asymptotic distributions unacceptably liberal. By correcting for skewness and kurtosis, we develop accurate p-value approximations, whose accuracy was verified by extensive simulations. We exemplify our methods by analyzing a set of 147 genotyped subjects with 16S rRNA microbiome profiles from non-malignant lung tissues. Correcting for skewness and kurtosis eliminated the dramatic deviation in the quantile–quantile plots. We provided preliminary evidence that six established lung cancer risk SNPs were collectively associated with microbiome composition for both unweighted (p = 0.0032) and weighted (p = 0.011) UniFrac distance matrices. In summary, our methods will facilitate analyzing large-scale genome-wide association studies of the human microbiome.
|
76
|
Large-Scale Multi-Omics Studies Provide New Insights into Blood Pressure Regulation. Int J Mol Sci 2022; 23:ijms23147557. [PMID: 35886906 PMCID: PMC9323755 DOI: 10.3390/ijms23147557] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/06/2022] [Revised: 06/27/2022] [Accepted: 06/29/2022] [Indexed: 12/04/2022] Open
Abstract
Recent genome-wide association studies uncovered part of blood pressure’s heritability. However, there is still a vast gap between genetics and biology that needs to be bridged. Here, we followed up blood pressure genome-wide summary statistics of over 750,000 individuals, leveraging comprehensive epigenomic and transcriptomic data from blood with a follow-up in cardiovascular tissues to prioritise likely causal genes and underlying blood pressure mechanisms. We first prioritised genes based on coding consequences, multilayer molecular associations, blood pressure-associated expression levels, and coregulation evidence. Next, we followed up the prioritised genes in multilayer studies of genomics, epigenomics, and transcriptomics, functional enrichment, and their potential suitability as drug targets. Our analyses yielded 1880 likely causal genes for blood pressure, tens of which are targets of the available licensed drugs. We identified 34 novel genes for blood pressure, supported by more than one source of biological evidence. Twenty-eight (82%) of these new genes were successfully replicated by transcriptome-wide association analyses in a large independent cohort (n = ~220,000). We also found a substantial mediating role for epigenetic regulation of the prioritised genes. Our results provide new insights into genetic regulation of blood pressure in terms of likely causal genes and involved biological pathways offering opportunities for future translation into clinical practice.
|
77
|
Chou SP, Alexander AK, Rice EJ, Choate LA, Danko CG. Genetic dissection of the RNA polymerase II transcription cycle. eLife 2022; 11:e78458. [PMID: 35775732 PMCID: PMC9286732 DOI: 10.7554/elife.78458] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/08/2022] [Accepted: 06/30/2022] [Indexed: 11/20/2022] Open
Abstract
How DNA sequence affects the dynamics and position of RNA Polymerase II (Pol II) during transcription remains poorly understood. Here, we used naturally occurring genetic variation in F1 hybrid mice to explore how DNA sequence differences affect the genome-wide distribution of Pol II. We measured the position and orientation of Pol II in eight organs collected from heterozygous F1 hybrid mice using ChRO-seq. Our data revealed a strong genetic basis for the precise coordinates of transcription initiation and promoter proximal pause, allowing us to redefine molecular models of core transcriptional processes. Our results implicate DNA sequence, including both known and novel DNA sequence motifs, as key determinants of the position of Pol II initiation and pause. We report evidence that initiation site selection follows a stochastic process similar to Brownian motion along the DNA template. We found widespread differences in the position of transcription termination, which impact the primary structure and stability of mature mRNA. Finally, we report evidence that allelic changes in transcription often affect mRNA and ncRNA expression across broad genomic domains. Collectively, we reveal how DNA sequences shape core transcriptional processes at single nucleotide resolution in mammals.
Affiliation(s)
- Shao-Pei Chou
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, United States
| | - Adriana K Alexander
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, United States
- Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University, Ithaca, United States
| | - Edward J Rice
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, United States
| | - Lauren A Choate
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, United States
| | - Charles G Danko
- Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, United States
- Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University, Ithaca, United States
| |
|
78
|
Bioinformatic Prioritization and Functional Annotation of GWAS-Based Candidate Genes for Primary Open-Angle Glaucoma. Genes (Basel) 2022; 13:genes13061055. [PMID: 35741817 PMCID: PMC9222386 DOI: 10.3390/genes13061055] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/17/2022] [Revised: 05/29/2022] [Accepted: 05/30/2022] [Indexed: 12/19/2022] Open
Abstract
Background: Primary open-angle glaucoma (POAG) is the most prevalent glaucoma subtype, but its exact etiology is still unknown. In this study, we aimed to prioritize the most likely ‘causal’ genes and identify functional characteristics and underlying biological pathways of POAG candidate genes. Methods: We used the results of a large POAG genome-wide association analysis study from GERA and UK Biobank cohorts. First, we performed systematic gene-prioritization analyses based on: (i) nearest genes; (ii) nonsynonymous single-nucleotide polymorphisms; (iii) co-regulation analysis; (iv) transcriptome-wide association studies; and (v) epigenomic data. Next, we performed functional enrichment analyses to find overrepresented functional pathways and tissues. Results: We identified 142 prioritized genes, of which 64 were novel for POAG. BICC1, AFAP1, and ABCA1 were the most highly prioritized genes based on four or more lines of evidence. The most significant pathways were related to extracellular matrix turnover, transforming growth factor-β, blood vessel development, and retinoic acid receptor signaling. Ocular tissues such as sclera and trabecular meshwork showed enrichment in prioritized gene expression (>1.5 fold). We found pleiotropy of POAG with intraocular pressure and optic-disc parameters, as well as genetic correlation with hypertension and diabetes-related eye disease. Conclusions: Our findings contribute to a better understanding of the molecular mechanisms underlying glaucoma pathogenesis and have prioritized many novel candidate genes for functional follow-up studies.
|
79
|
Khunsriraksakul C, McGuire D, Sauteraud R, Chen F, Yang L, Wang L, Hughey J, Eckert S, Dylan Weissenkampen J, Shenoy G, Marx O, Carrel L, Jiang B, Liu DJ. Integrating 3D genomic and epigenomic data to enhance target gene discovery and drug repurposing in transcriptome-wide association studies. Nat Commun 2022; 13:3258. [PMID: 35672318 PMCID: PMC9171100 DOI: 10.1038/s41467-022-30956-7] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 05/20/2021] [Accepted: 05/25/2022] [Indexed: 02/08/2023] Open
Abstract
Transcriptome-wide association studies (TWAS) are popular approaches to test for association between imputed gene expression levels and traits of interest. Here, we propose an integrative method PUMICE (Prediction Using Models Informed by Chromatin conformations and Epigenomics) to integrate 3D genomic and epigenomic data with expression quantitative trait loci (eQTL) to more accurately predict gene expressions. PUMICE helps define and prioritize regions that harbor cis-regulatory variants, which outperforms competing methods. We further describe an extension to our method PUMICE+, which jointly combines TWAS results from single- and multi-tissue models. Across 79 traits, PUMICE+ identifies 22% more independent novel genes and increases median chi-square statistics values at known loci by 35% compared to the second-best method, as well as achieves the narrowest credible interval size. Lastly, we perform computational drug repurposing and confirm that PUMICE+ outperforms other TWAS methods.
Affiliation(s)
- Chachrit Khunsriraksakul
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Daniel McGuire
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Renan Sauteraud
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Fang Chen
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Lina Yang
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Lida Wang
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Jordan Hughey
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Scott Eckert
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - J. Dylan Weissenkampen
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Ganesh Shenoy
- Department of Neurosurgery, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Olivia Marx
- Biomedical Science Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Laura Carrel
- Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Bibo Jiang
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Dajiang J. Liu
- Bioinformatics and Genomics Graduate Program, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Institute for Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Department of Public Health Sciences, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| |
|
80
|
Grishin D, Gusev A. Allelic imbalance of chromatin accessibility in cancer identifies candidate causal risk variants and their mechanisms. Nat Genet 2022; 54:837-849. [PMID: 35697866 PMCID: PMC9886437 DOI: 10.1038/s41588-022-01075-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/27/2021] [Accepted: 04/08/2022] [Indexed: 02/02/2023]
Abstract
While many germline cancer risk variants have been identified through genome-wide association studies (GWAS), the mechanisms by which these variants operate remain largely unknown. Here we used 406 cancer ATAC-Seq samples across 23 cancer types to identify 7,262 germline allele-specific accessibility QTLs (as-aQTLs). Cancer as-aQTLs had stronger enrichment for cancer risk heritability (up to 145 fold) than any other functional annotation across seven cancer GWAS. Most cancer as-aQTLs directly altered transcription factor (TF) motifs and exhibited differential TF binding and gene expression in functional screens. To connect as-aQTLs to putative risk mechanisms, we introduced the regulome-wide association study (RWAS). RWAS identified genetically associated accessible peaks at >70% of known breast and prostate loci and discovered new risk loci in all examined cancer types. Integrating as-aQTL discovery, motif analysis and RWAS identified candidate causal regulatory elements and their probable upstream regulators. Our work establishes cancer as-aQTLs and RWAS analysis as powerful tools to study the genetic architecture of cancer risk.
Affiliation(s)
- Dennis Grishin
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA.
- The Eli and Edythe L. Broad Institute, Cambridge, MA, USA.
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA.
| |
|
81
|
Liu C, Joehanes R, Ma J, Wang Y, Sun X, Keshawarz A, Sooda M, Huan T, Hwang SJ, Bui H, Tejada B, Munson PJ, Cumhur D, Heard-Costa NL, Pitsillides AN, Peloso GM, Feolo M, Sharopova N, Vasan RS, Levy D. Whole Genome DNA and RNA Sequencing of Whole Blood Elucidates the Genetic Architecture of Gene Expression Underlying a Wide Range of Diseases. RESEARCH SQUARE 2022:rs.3.rs-1598646. [PMID: 35664994 PMCID: PMC9164515 DOI: 10.21203/rs.3.rs-1598646/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
Abstract
To create a scientific resource of expression quantitative trait loci (eQTL), we conducted a genome-wide association study (GWAS) using genotypes obtained from whole genome sequencing (WGS) of DNA and gene expression levels from RNA sequencing (RNA-seq) of whole blood in 2622 participants in the Framingham Heart Study. We identified 6,778,286 cis-eQTL variant-gene transcript (eGene) pairs at p < 5 × 10⁻⁸ (2,855,111 unique cis-eQTL variants and 15,982 unique eGenes) and 1,469,754 trans-eQTL variant-eGene pairs at p < 1 × 10⁻¹² (526,056 unique trans-eQTL variants and 7,233 unique eGenes). In addition, 442,379 cis-eQTL variants were associated with expression of 1518 long non-protein coding RNAs (lncRNAs). Gene Ontology (GO) analyses revealed that the top GO terms for cis-eGenes are enriched for immune functions (FDR < 0.05). The cis-eQTL variants are enriched for SNPs reported to be associated with 815 traits in prior GWAS, including cardiovascular disease risk factors. As proof of concept, we used this eQTL resource in conjunction with genetic variants from public GWAS databases in causal inference testing (e.g., COVID-19 severity). After Bonferroni correction, Mendelian randomization analyses identified putative causal associations of 60 eGenes with systolic blood pressure, 13 genes with coronary artery disease, and seven genes with COVID-19 severity. This study created a comprehensive eQTL resource via BioData Catalyst that will be made available to the scientific community. This will advance understanding of the genetic architecture of gene expression underlying a wide range of diseases.
|
82
|
Tian Y, Soupir A, Liu Q, Wu L, Huang CC, Park JY, Wang L. Novel role of prostate cancer risk variant rs7247241 on PPP1R14A isoform transition through allelic TF binding and CpG methylation. Hum Mol Genet 2022; 31:1610-1621. [PMID: 34849858 PMCID: PMC9122641 DOI: 10.1093/hmg/ddab347] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/12/2021] [Revised: 11/22/2021] [Accepted: 11/23/2021] [Indexed: 12/21/2022] Open
Abstract
Although previous studies identified numerous single nucleotide polymorphisms (SNPs) and their target genes predisposed to prostate cancer (PrCa) risks, SNP-related splicing associations are rarely reported. In this study, we applied distance-based sQTL analysis (sQTLseekeR) using RNA-seq and SNP genotype data from benign prostate tissue (n = 467) and identified significant associations in 3344 SNP-transcript pairs (P ≤ 0.05) at PrCa risk loci. We characterized a common SNP (rs7247241) and its target gene (PPP1R14A) located in chr19q13, an sQTL with risk allele T associated with upregulation of long isoform (P = 9.99E-7). We confirmed the associations in both TCGA (P = 2.42E-24) and GTEX prostate cohorts (P = 9.08E-78). To functionally characterize this SNP, we performed chromatin immunoprecipitation qPCR and confirmed stronger CTCF and PLAGL2 binding in rs7247241 C than T allele. We found that CTCF binding enrichment was negatively associated with methylation level at the SNP site in human cell lines (r = -0.58). Bisulfite sequencing showed consistent association of rs7247241-T allele with nearby sequence CpG hypermethylation in prostate cell lines and tissues. Moreover, the methylation level at CpG sites nearest to the CTCF binding and first exon splice-in (ψ) of PPP1R14A was significantly associated with aggressive phenotype in the TCGA PrCa cohort. Meanwhile, the long isoform of the gene also promoted cell proliferation. Taken together, with the most updated gene annotations, we reported a set of sQTL associated with multiple traits related to human prostate diseases and revealed a unique role of PrCa risk SNP rs7247241 on PPP1R14A isoform transition.
Affiliation(s)
- Yijun Tian
- Department of Tumor Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA
| | - Alex Soupir
- Department of Tumor Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA
| | - Qian Liu
- Department of Tumor Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA
| | - Lang Wu
- Division of Cancer Epidemiology, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Hawaii, HI 96822, USA
| | - Chiang-Ching Huang
- Joseph J. Zilber School of Public Health, University of Wisconsin, Milwaukee, WI 53226, USA
| | - Jong Y Park
- Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA
| | - Liang Wang
- Department of Tumor Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA
| |
|
83
|
Wang H, Yang J, Zhang Y, Qian J, Wang J. Reconstruct high-resolution 3D genome structures for diverse cell-types using FLAMINGO. Nat Commun 2022; 13:2645. [PMID: 35551182 PMCID: PMC9098643 DOI: 10.1038/s41467-022-30270-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/19/2021] [Accepted: 04/22/2022] [Indexed: 11/30/2022] Open
Abstract
High-resolution reconstruction of spatial chromosome organization from chromatin contact maps is in high demand but is hindered by extensive pairwise constraints, substantial missing data, and limited resolution and cell-type availability. Here, we present FLAMINGO, a computational method that addresses these challenges by compressing inter-dependent Hi-C interactions to delineate the underlying low-rank structures in 3D space, based on the low-rank matrix completion technique. FLAMINGO successfully generates 5 kb- and 1 kb-resolution spatial conformations for all chromosomes in the human genome across multiple cell-types, the largest such resource to date. Compared to other methods using various experimental metrics, FLAMINGO consistently demonstrates superior accuracy in recapitulating observed structures, with scalability improved by orders of magnitude. The reconstructed 3D structures efficiently facilitate discoveries of higher-order multi-way interactions, imply biological interpretations of long-range QTLs, reveal geometrical properties of chromatin, and provide high-resolution references to understand structural variabilities. Importantly, FLAMINGO achieves robust predictions against high rates of missing data and significantly boosts 3D structure resolutions. Moreover, FLAMINGO shows vigorous cross cell-type structure predictions that capture cell-type specific spatial configurations via integration of 1D epigenomic signals. FLAMINGO can be widely applied to large-scale chromatin contact maps and expand high-resolution spatial genome conformations for diverse cell-types.
Affiliation(s)
- Hao Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
| | - Jiaxin Yang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
| | - Yu Zhang
- Center for Immunobiology, Department of Investigative Medicine, Western Michigan University Homer Stryker M.D. School of Medicine, Kalamazoo, MI, 49007, USA
| | - Jianliang Qian
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
- Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA.
| | - Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA.
| |
|
84
|
Liu C, Joehanes R, Ma J, Wang Y, Sun X, Keshawarz A, Sooda M, Huan T, Hwang SJ, Bui H, Tejada B, Munson PJ, Cumhur D, Heard-Costa NL, Pitsillides AN, Peloso GM, Feolo M, Sharopova N, Vasan RS, Levy D. Whole Genome DNA and RNA Sequencing of Whole Blood Elucidates the Genetic Architecture of Gene Expression Underlying a Wide Range of Diseases. medRxiv 2022:2022.04.13.22273841. [PMID: 35547845 PMCID: PMC9094109 DOI: 10.1101/2022.04.13.22273841] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023]
Abstract
To create a scientific resource of expression quantitative trait loci (eQTL), we conducted a genome-wide association study (GWAS) using genotypes obtained from whole genome sequencing (WGS) of DNA and gene expression levels from RNA sequencing (RNA-seq) of whole blood in 2622 participants in the Framingham Heart Study. We identified 6,778,286 cis-eQTL variant-gene transcript (eGene) pairs at p < 5×10⁻⁸ (2,855,111 unique cis-eQTL variants and 15,982 unique eGenes) and 1,469,754 trans-eQTL variant-eGene pairs at p < 1e-12 (526,056 unique trans-eQTL variants and 7,233 unique eGenes). In addition, 442,379 cis-eQTL variants were associated with expression of 1518 long non-protein coding RNAs (lncRNAs). Gene Ontology (GO) analyses revealed that the top GO terms for cis-eGenes are enriched for immune functions (FDR < 0.05). The cis-eQTL variants are enriched for SNPs reported to be associated with 815 traits in prior GWAS, including cardiovascular disease risk factors. As proof of concept, we used this eQTL resource in conjunction with genetic variants from public GWAS databases in causal inference testing (e.g., COVID-19 severity). After Bonferroni correction, Mendelian randomization analyses identified putative causal associations of 60 eGenes with systolic blood pressure, 13 genes with coronary artery disease, and seven genes with COVID-19 severity. This study created a comprehensive eQTL resource via BioData Catalyst that will be made available to the scientific community. This will advance understanding of the genetic architecture of gene expression underlying a wide range of diseases.
Affiliation(s)
- Chunyu Liu
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
| | - Roby Joehanes
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jiantao Ma
- Nutrition Epidemiology and Data Science, Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, USA
| | - Yuxuan Wang
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA
| | - Xianbang Sun
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA
| | - Amena Keshawarz
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Meera Sooda
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Tianxiao Huan
- University of Massachusetts Medical School, Worcester, MA, USA
| | - Shih-Jen Hwang
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Helena Bui
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Brandon Tejada
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Peter J Munson
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Nancy L Heard-Costa
- Framingham Heart Study, Framingham, MA, USA
- Departments of Medicine and Epidemiology, Boston University Schools of Medicine and Public Health, Boston, MA, USA
| | | | - Gina M Peloso
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA
| | - Michael Feolo
- University of Massachusetts Medical School, Worcester, MA, USA
| | | | - Ramachandran S Vasan
- Framingham Heart Study, Framingham, MA, USA
- Departments of Medicine and Epidemiology, Boston University Schools of Medicine and Public Health, Boston, MA, USA
| | - Daniel Levy
- Framingham Heart Study, Framingham, MA, USA
- Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| |
|
85
|
Flynn E, Lappalainen T. Functional Characterization of Genetic Variant Effects on Expression. Annu Rev Biomed Data Sci 2022; 5:119-139. [PMID: 35483347 DOI: 10.1146/annurev-biodatasci-122120-010010] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/09/2022]
Abstract
Thousands of common genetic variants in the human population have been associated with disease risk and phenotypic variation by genome-wide association studies (GWAS). However, the majority of GWAS variants fall into noncoding regions of the genome, complicating our understanding of their regulatory functions, and few molecular mechanisms of GWAS variant effects have been clearly elucidated. Here, we set out to review genetic variant effects, focusing on expression quantitative trait loci (eQTLs), including their utility in interpreting GWAS variant mechanisms. We discuss the interrelated challenges and opportunities for eQTL analysis, covering determining causal variants, elucidating molecular mechanisms of action, and understanding context variability. Addressing these questions can enable better functional characterization of disease-associated loci and provide insights into fundamental biological questions of the noncoding genetic regulatory code and its control of gene expression. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Elise Flynn
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
| |
|
86
|
Abdalla M, Abdalla M. A general framework for predicting the transcriptomic consequences of non-coding variation and small molecules. PLoS Comput Biol 2022; 18:e1010028. [PMID: 35421087 PMCID: PMC9041867 DOI: 10.1371/journal.pcbi.1010028] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/03/2021] [Revised: 04/26/2022] [Accepted: 03/16/2022] [Indexed: 11/18/2022] Open
Abstract
Genome wide association studies (GWASs) for complex traits have implicated thousands of genetic loci. Most GWAS-nominated variants lie in noncoding regions, complicating the systematic translation of these findings into functional understanding. Here, we leverage convolutional neural networks to assist in this challenge. Our computational framework, peaBrain, models the transcriptional machinery of a tissue as a two-stage process: first, predicting the mean tissue specific abundance of all genes and second, incorporating the transcriptomic consequences of genotype variation to predict individual abundance on a subject-by-subject basis. We demonstrate that peaBrain accounts for the majority (>50%) of variance observed in mean transcript abundance across most tissues and outperforms regularized linear models in predicting the consequences of individual genotype variation. We highlight the validity of the peaBrain model by calculating non-coding impact scores that correlate with nucleotide evolutionary constraint that are also predictive of disease-associated variation and allele-specific transcription factor binding. We further show how these tissue-specific peaBrain scores can be leveraged to pinpoint functional tissues underlying complex traits, outperforming methods that depend on colocalization of eQTL and GWAS signals. We subsequently: (a) derive continuous dense embeddings of genes for downstream applications; (b) highlight the utility of the model in predicting transcriptomic impact of small molecules and shRNA (on par with in vitro experimental replication of external test sets); (c) explore how peaBrain can be used to model difficult-to-study processes (such as neural induction); and (d) identify putatively functional eQTLs that are missed by high-throughput experimental approaches.
Affiliation(s)
- Moustafa Abdalla
- Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, United Kingdom
- Computational Statistics and Machine Learning, Department of Statistics, University of Oxford, Oxford, United Kingdom
- Department of Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Mohamed Abdalla
- Vector Institute for Artificial Intelligence, Toronto, Canada
- Department of Computer Science, University of Toronto, Toronto, Canada
| |
|
87
|
Tian J, Chen C, Rao M, Zhang M, Lu Z, Cai Y, Ying P, Li B, Wang H, Wang L, Li Y, Huang J, Fan L, Cai X, Ning C, Li Y, Zhang F, Wang W, Jiang Y, Liu Y, Wang M, Li H, Huang C, Yang Z, Chang J, Zhu Y, Yang X, Miao X. Aberrant RNA splicing is a primary link between genetic variation and pancreatic cancer risk. Cancer Res 2022; 82:2084-2096. [PMID: 35363263 DOI: 10.1158/0008-5472.can-21-4367] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/20/2021] [Revised: 02/15/2022] [Accepted: 03/30/2022] [Indexed: 11/16/2022]
Abstract
Understanding the genetic variation underlying transcript splicing is essential for fully dissecting the molecular mechanisms of common diseases. The available evidence from splicing quantitative trait locus (sQTL) studies using pancreatic ductal adenocarcinoma (PDAC) tissues has been limited to small sample sizes. Here we present a genome-wide sQTL analysis to identify single nucleotide polymorphisms (SNPs) that control mRNA splicing in 176 PDAC samples from TCGA. From this analysis, 16,175 sQTLs were found to be significantly enriched in RNA binding protein (RBP) binding sites and chromatin regulatory elements and overlapped with known loci from PDAC genome-wide association studies (GWAS). sQTLs and expression QTLs (eQTL) showed mostly non-overlapping patterns, suggesting sQTLs provide additional insights into the etiology of disease. Target genes affected by sQTLs were closely related to cancer signaling pathways, high mutational burden, immune infiltration, and pharmaceutical targets, which will be helpful for clinical applications. Integration of a large-scale population consisting of 2,782 PDAC patients and 7,983 healthy controls identified an sQTL variant rs1785932-T allele that promotes alternative splicing of ELP2 exon 6 and leads to a lower level of the ELP2 full-length isoform (ELP2_V1) and a higher level of a truncated ELP2 isoform (ELP2_V2), resulting in decreased risk of PDAC (OR=0.83, 95%CI=0.77-0.89, P=1.16×10⁻⁶). The ELP2_V2 isoform functioned as a potential tumor suppressor, inhibiting PDAC cell proliferation by exhibiting stronger binding affinity to JAK1/STAT3 than ELP2_V1 and subsequently blocking the pathological activation of the p-STAT3 pathway. Collectively, these findings provide an informative sQTL resource and insights into the regulatory mechanisms linking splicing variants to PDAC risk.
Affiliation(s)
| | | | | | - Ming Zhang
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Zequn Lu
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Yimin Cai
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Pingting Ying
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Bin Li
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Haoxue Wang
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Lu Wang
- Department of Epidemiology and Biostatistics, Key Laboratory for Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Sciences and Technology, Wuhan, Hubei, China
| | - Yao Li
- Department of Epidemiology and Biostatistics, Key Laboratory for Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Sciences and Technology, China
| | - Jinyu Huang
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Linyun Fan
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Xiaomin Cai
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Caibo Ning
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Yanmin Li
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Fuwei Zhang
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Wenzhuo Wang
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | | | | | - Min Wang
- Affiliated Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Heng Li
- Tongji Hospital, Wuhan, Hubei, China
| | | | | | - Jiang Chang
- School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, China
| | | | | | | |
|
88
|
Zhu Q, Schultz E, Long J, Roh JM, Valice E, Laurent CA, Radimer KH, Yan L, Ergas IJ, Davis W, Ranatunga D, Gandhi S, Kwan ML, Bao PP, Zheng W, Shu XO, Ambrosone C, Yao S, Kushi LH. UACA locus is associated with breast cancer chemoresistance and survival. NPJ Breast Cancer 2022; 8:39. [PMID: 35322040 PMCID: PMC8943134 DOI: 10.1038/s41523-022-00401-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/29/2021] [Accepted: 02/16/2022] [Indexed: 12/13/2022] Open
Abstract
Few germline genetic variants have been robustly linked with breast cancer outcomes. We conducted trans-ethnic meta genome-wide association study (GWAS) of overall survival (OS) in 3973 breast cancer patients from the Pathways Study, one of the largest prospective breast cancer survivor cohorts. A locus spanning the UACA gene, a key regulator of tumor suppressor Par-4, was associated with OS in patients taking Par-4 dependent chemotherapies, including anthracyclines and anti-HER2 therapy, at a genome-wide significance level (P = 1.27 × 10⁻⁹). This association was confirmed in meta-analysis across four independent prospective breast cancer cohorts (combined hazard ratio = 1.84, P = 1.28 × 10⁻¹¹). Transcriptome-wide association study revealed higher UACA gene expression was significantly associated with worse OS (P = 4.68 × 10⁻⁷). Our study identified the UACA locus as a genetic predictor of patient outcome following treatment with anthracyclines and/or anti-HER2 therapy, which may have clinical utility in formulating appropriate treatment strategies for breast cancer patients based on their genetic makeup.
Affiliation(s)
- Qianqian Zhu
- Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA.
| | - Emily Schultz
- Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | - Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Janise M Roh
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Emily Valice
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Cecile A Laurent
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Kelly H Radimer
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Li Yan
- Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | - Isaac J Ergas
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Warren Davis
- Department of Cancer Prevention and Control, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | - Dilrini Ranatunga
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Shipra Gandhi
- Department of Medicine, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | - Marilyn L Kwan
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Ping-Ping Bao
- Shanghai Municipal Center for Disease Prevention and Control, Shanghai, China
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Christine Ambrosone
- Department of Cancer Prevention and Control, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
| | - Song Yao
- Department of Cancer Prevention and Control, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA.
| | - Lawrence H Kushi
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA.
| |
|
89
|
Kundu K, Tardaguila M, Mann AL, Watt S, Ponstingl H, Vasquez L, Von Schiller D, Morrell NW, Stegle O, Pastinen T, Sawcer SJ, Anderson CA, Walter K, Soranzo N. Genetic associations at regulatory phenotypes improve fine-mapping of causal variants for 12 immune-mediated diseases. Nat Genet 2022; 54:251-262. [PMID: 35288711 DOI: 10.1038/s41588-022-01025-y] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 12/20/2019] [Accepted: 01/31/2022] [Indexed: 12/11/2022]
Abstract
The resolution of causal genetic variants informs understanding of disease biology. We used regulatory quantitative trait loci (QTLs) from the BLUEPRINT, GTEx and eQTLGen projects to fine-map putative causal variants for 12 immune-mediated diseases. We identify 340 unique loci that colocalize with high posterior probability (≥98%) with regulatory QTLs and apply Bayesian frameworks to fine-map associations at each locus. We show that fine-mapping credible sets derived from regulatory QTLs are smaller compared to disease summary statistics. Further, they are enriched for more functionally interpretable candidate causal variants and for putatively causal insertion/deletion (INDEL) polymorphisms. Finally, we use massively parallel reporter assays to evaluate candidate causal variants at the ITGA4 locus associated with inflammatory bowel disease. Overall, our findings suggest that fine-mapping applied to disease-colocalizing regulatory QTLs can enhance the discovery of putative causal disease variants and enhance insights into the underlying causal genes and molecular mechanisms.
Affiliation(s)
- Kousik Kundu
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
| | - Manuel Tardaguila
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Alice L Mann
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Stephen Watt
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Hannes Ponstingl
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Louella Vasquez
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Dominique Von Schiller
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Nicholas W Morrell
- Division of Respiratory Medicine, Department of Medicine, University of Cambridge School of Clinical Medicine, Addenbrooke's and Papworth Hospitals, Cambridge, UK
| | - Oliver Stegle
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center, Heidelberg, Germany
- Cellular Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Tomi Pastinen
- Genomic Medicine Center, Children's Mercy Kansas City and Children's Mercy Research Institute, Kansas City, MO, USA
| | - Stephen J Sawcer
- Department of Clinical Neurosciences, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
| | - Carl A Anderson
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Klaudia Walter
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Nicole Soranzo
- Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- National Institute for Health Research Blood and Transplant Research Unit in Donor Health and Genomics, University of Cambridge, Cambridge, UK
- Genomics Research Centre, Human Technopole, Milan, Italy
| |
|
90
|
Tapia AL, Rowland BT, Rosen JD, Preuss M, Young K, Graff M, Choquet H, Couper DJ, Buyske S, Bien SA, Jorgenson E, Kooperberg C, Loos RJF, Morrison AC, North KE, Yu B, Reiner AP, Li Y, Raffield LM. A large-scale transcriptome-wide association study (TWAS) of 10 blood cell phenotypes reveals complexities of TWAS fine-mapping. Genet Epidemiol 2022; 46:3-16. [PMID: 34779012 PMCID: PMC8887641 DOI: 10.1002/gepi.22436] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/02/2021] [Revised: 08/26/2021] [Accepted: 10/18/2021] [Indexed: 02/03/2023]
Abstract
Hematological measures are important intermediate clinical phenotypes for many acute and chronic diseases and are highly heritable. Although genome-wide association studies (GWAS) have identified thousands of loci containing trait-associated variants, the causal genes underlying these associations are often uncertain. To better understand the underlying genetic regulatory mechanisms, we performed a transcriptome-wide association study (TWAS) to systematically investigate the association between genetically predicted gene expression and hematological measures in 54,542 Europeans from the Genetic Epidemiology Research on Aging cohort. We found 239 significant gene-trait associations with hematological measures; we replicated 71 associations at p < 0.05 in a TWAS meta-analysis consisting of up to 35,900 Europeans from the Women's Health Initiative, Atherosclerosis Risk in Communities Study, and BioMe Biobank. Additionally, we attempted to refine this list of candidate genes by performing conditional analyses, adjusting for individual variants previously associated with hematological measures, and performed further fine-mapping of TWAS loci. To facilitate interpretation of our findings, we designed an R Shiny application to interactively visualize our TWAS results by integrating them with additional genetic data sources (GWAS, TWAS from multiple reference panels, conditional analyses, known GWAS variants, etc.). Our results and application highlight frequently overlooked TWAS challenges and illustrate the complexity of TWAS fine-mapping.
Affiliation(s)
- Amanda L. Tapia
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Bryce T. Rowland
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Jonathan D. Rosen
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Michael Preuss
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Kris Young
- Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Misa Graff
- Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Hélène Choquet
- Division of Research, Kaiser Permanente Northern California, Oakland, California, USA
| | - David J. Couper
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Steve Buyske
- Department of Statistics, Rutgers University, Piscataway, New Jersey, USA
| | - Stephanie A. Bien
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Eric Jorgenson
- Division of Research, Kaiser Permanente Northern California, Oakland, California, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Ruth J. F. Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Alanna C. Morrison
- Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Kari E. North
- Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Bing Yu
- Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Alexander P. Reiner
- Department of Epidemiology, University of Washington, Seattle, Washington, USA
| | - Yun Li
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Laura M. Raffield
- Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, USA
| |
|
91
|
Meta-imputation of transcriptome from genotypes across multiple datasets by leveraging publicly available summary-level data. PLoS Genet 2022; 18:e1009571. [PMID: 35100255 PMCID: PMC8830793 DOI: 10.1371/journal.pgen.1009571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/26/2021] [Revised: 02/10/2022] [Accepted: 01/07/2022] [Indexed: 11/22/2022] Open
Abstract
Transcriptome-wide association studies (TWAS) are a powerful method for identifying and interpreting the biological mechanisms underlying GWAS findings by associating gene expression levels with phenotypes. In TWAS, gene expression is often imputed from individual-level genotypes of regulatory variants identified from external resources, such as the Genotype-Tissue Expression (GTEx) Project. In this setting, a straightforward approach to imputing expression levels of a specific tissue is to use a model trained on the same tissue type. When multiple tissues are available for the same subjects, it has been demonstrated that training imputation models on multiple tissue types improves accuracy because of shared eQTLs between tissues and the increase in effective sample size. However, existing joint-tissue methods require access to genotype and expression data across all tissues. Moreover, they cannot leverage the abundance of expression datasets across various tissues for non-overlapping individuals. Here, we explore the optimal way to combine imputed levels across training models from multiple tissues and datasets in a flexible manner using summary-level data. Our proposed method (SWAM) combines an arbitrary number of transcriptome imputation models to linearly optimize the imputation accuracy for a given target tissue. By integrating models across tissues and/or individuals, SWAM can improve the accuracy of transcriptome imputation or the power of TWAS while requiring individual-level data from only a single reference cohort. To evaluate the accuracy of SWAM, we combined 49 tissue-specific gene expression imputation models from the GTEx Project as well as from a large eQTL study, the Depression Susceptibility Genes and Networks (DGN) Project, and tested imputation accuracy in GEUVADIS lymphoblastoid cell line samples. We also extend our meta-imputation method to meta-TWAS to leverage multiple tissues in TWAS analysis with summary-level statistics.
Our results underscore the importance of integrating multiple tissues to unravel the regulatory impacts of genetic variants on complex traits. The gene expression levels within a cell are affected by various factors, including DNA variation, cell type, cellular microenvironment, disease status, and other environmental factors surrounding the individual. The genetic component of gene expression is known to explain a substantial fraction of transcriptional variation among individuals and can be imputed from genotypes in a tissue-specific manner, by training on population-scale transcriptomic profiles designed to identify expression quantitative trait loci (eQTLs). Imputing gene expression levels has been shown to help understand the genetic basis of human disease through transcriptome-wide association analysis (TWAS) and Mendelian randomization (MR). However, it has been unclear how to integrate multiple imputation models trained on individual datasets to maximize their accuracy without having to access individual genotypes and expression levels, which are often protected because of privacy concerns. We developed SWAM (Smartly Weighted Averaging across Multiple datasets), a meta-imputation framework that can accurately impute gene expression levels from genotypes by integrating multiple imputation models without requiring individual-level data. Our method examines the similarities and differences between resources and borrows the information most relevant to the tissue of interest. We demonstrate that SWAM outperforms existing single-tissue and multi-tissue imputation models and continues to increase in accuracy when additional imputation models are integrated.
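The idea of linearly combining several imputation models for one target tissue can be illustrated with a minimal sketch. It assumes ordinary least squares as the weighting criterion and uses hypothetical function names (`fit_model_weights`, `meta_impute`); the published SWAM procedure may estimate its weights differently, so this is an illustration of weighted averaging in general, not the authors' implementation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_model_weights(predictions, measured):
    """Least-squares weights, one per imputation model (assumed criterion).

    predictions[k][i] is model k's imputed expression for reference
    individual i; measured[i] is the observed target-tissue expression.
    """
    k = len(predictions)
    # Normal equations: (P^T P) w = P^T y.
    gram = [[sum(p * q for p, q in zip(predictions[a], predictions[b]))
             for b in range(k)] for a in range(k)]
    rhs = [sum(p * y for p, y in zip(predictions[a], measured))
           for a in range(k)]
    return solve_linear_system(gram, rhs)


def meta_impute(predictions, weights):
    """Weighted sum of the per-model predictions for each individual."""
    n = len(predictions[0])
    return [sum(w * model[i] for w, model in zip(weights, predictions))
            for i in range(n)]
```

In this sketch the weights are fitted once on the reference cohort, the only place where measured expression is needed, and then reused to combine model outputs for new individuals.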
|
92
|
Khaliq SA, Umair Z, Yoon MS. Role of ARHGEF3 as a GEF and mTORC2 Regulator. Front Cell Dev Biol 2022; 9:806258. [PMID: 35174167 PMCID: PMC8841341 DOI: 10.3389/fcell.2021.806258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Accepted: 12/24/2021] [Indexed: 11/26/2022] Open
Abstract
Guanine nucleotide exchange factors (GEFs) activate GTPases by stimulating the release of guanosine diphosphate to permit the binding of guanosine triphosphate. ARHGEF3, or XPLN (exchange factor found in platelets, leukemic, and neuronal tissues), is a selective guanine nucleotide exchange factor for Rho GTPases (RhoGEF) that activates RhoA and RhoB but not RhoC, RhoG, Rac1, or Cdc42. ARHGEF3 contains the diffuse B-cell lymphoma homology and pleckstrin homology domains but lacks similarity with other known functional domains. ARHGEF3 also binds the mammalian target of rapamycin complex 2 (mTORC2) and subsequently inhibits mTORC2 and Akt. In vivo investigation has also indicated a link between ARHGEF3 and autophagy-related muscle pathologies. Moreover, studies on genetic variation in ARHGEF3 and genome-wide association studies have predicted exciting novel roles of ARHGEF3 in controlling bone mineral density, platelet formation and differentiation, and Hirschsprung disease. In conclusion, we propose that additional biochemical and functional studies are required to elucidate the detailed mechanism of ARHGEF3-related pathologies and therapeutics.
Affiliation(s)
- Sana Abdul Khaliq
- Department of Molecular Medicine, Gachon University College of Medicine, Incheon, South Korea
- Department of Health Sciences and Technology, GAIHST, Gachon University, Incheon, South Korea
| | - Zobia Umair
- Department of Molecular Medicine, Gachon University College of Medicine, Incheon, South Korea
- *Correspondence: Zobia Umair; Mee-Sup Yoon
| | - Mee-Sup Yoon
- Department of Molecular Medicine, Gachon University College of Medicine, Incheon, South Korea
- Department of Health Sciences and Technology, GAIHST, Gachon University, Incheon, South Korea
- Lee Gil Ya Cancer and Diabetes Institute, Gachon University, Incheon, South Korea
- *Correspondence: Zobia Umair; Mee-Sup Yoon
| |
|
93
|
Korsching E, Matschke J, Hotfilder M. Splice variants denote differences between a cancer stem cell side population of EWSR1‑ERG‑based Ewing sarcoma cells, its main population and EWSR1‑FLI‑based cells. Int J Mol Med 2022; 49:39. [PMID: 35088879 PMCID: PMC8815407 DOI: 10.3892/ijmm.2022.5094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Accepted: 12/17/2021] [Indexed: 11/06/2022] Open
Abstract
Ewing sarcoma is a challenging cancer entity, which, besides the characteristic presence of a fusion gene, is driven by multiple alternative splicing events. So far, splice variants in Ewing sarcoma cells were mainly analyzed for EWSR1‑FLI1. The present study provided a comprehensive alternative splicing study on CADO‑ES1, an Ewing model cell line for an EWSR1‑ERG fusion gene. Based on a well‑characterized RNA‑sequencing dataset with extensive control mechanisms across all levels of analysis, the differentially spliced genes in Ewing cancer stem cells were ATP13A3 and EPB41, while the main population was defined by ACADVL, NOP58 and TSPAN3. All alternatively spliced genes were further characterized by their Gene Ontology (GO) terms and by their membership in known protein complexes. These results confirm and extend previous studies towards a systematic whole‑transcriptome analysis. A highlight is the striking segregation of GO terms associated with five basic splice events. This mechanistic insight, together with a coherent integration of all observations with prior knowledge, indicates that EWSR1‑ERG is truly a close twin to EWSR1‑FLI1, but still exhibits certain individuality. Thus, the present study provided a measure of variability in Ewing sarcoma, whose understanding is essential both for clinical procedures and basic mechanistic insight.
Affiliation(s)
- Eberhard Korsching
- Institute of Bioinformatics, Faculty of Medicine, University of Münster, D‑48149 Münster, Germany
| | - Julian Matschke
- Institute of Bioinformatics, Faculty of Medicine, University of Münster, D‑48149 Münster, Germany
| | - Marc Hotfilder
- Department of Pediatric Hematology and Oncology, University Hospital Münster, D‑48149 Münster, Germany
| |
|
94
|
Mortlock S, McKinnon B, Montgomery GW. Genetic Regulation of Transcription in the Endometrium in Health and Disease. Front Reprod Health 2022; 3:795464. [PMID: 36304015 PMCID: PMC9580733 DOI: 10.3389/frph.2021.795464] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/15/2021] [Accepted: 12/06/2021] [Indexed: 11/25/2023] Open
Abstract
The endometrium is a complex and dynamic tissue essential for fertility and implicated in many reproductive disorders. The tissue consists of glandular epithelium and vascularised stroma and is unique because it is constantly shed and regrown with each menstrual cycle, generating up to 10 mm of new mucosa. Consequently, there are marked changes in cell composition and gene expression across the menstrual cycle. Recent evidence shows that expression of many genes is influenced by genetic variation between individuals. We and others have reported evidence for genetic effects on hundreds of genes in endometrium. The genetic factors influencing endometrial gene expression are highly correlated with the genetic effects on expression in other reproductive (e.g., uterus and ovary) and digestive tissues (e.g., salivary gland and stomach), supporting a shared genetic regulation of gene expression in biologically similar tissues. There is also increasing evidence for cell-specific genetic effects for some genes. Sample sizes for studies in endometrium are modest, and results from the larger studies of gene expression in blood report genetic effects for a much higher proportion of genes than currently reported for endometrium. There is also emerging evidence for the importance of genetic variation in RNA splicing. Gene mapping studies for common disease, including diseases associated with endometrium, show that most variation maps to intergenic regulatory regions. It is likely that genetic risk factors for disease function by modifying the program of cell-specific gene expression. The emerging evidence from our gene mapping studies, coupled with tissue-specific studies and the GTEx, eQTLGen and EpiMap projects, shows that we need to expand our understanding of the complex regulation of gene expression. These data also help to link disease genetic risk factors to specific target genes.
Combining our data on genetic regulation of gene expression in endometrium and in cell types within the endometrium with gene mapping data for endometriosis and related diseases is beginning to uncover the specific genes and pathways responsible for increased risk of these diseases.
Affiliation(s)
| | | | - Grant W. Montgomery
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
| |
|
95
|
Gokuladhas S, Zaied RE, Schierding W, Farrow S, Fadason T, O'Sullivan JM. Integrating Multimorbidity into a Whole-Body Understanding of Disease Using Spatial Genomics. Results Probl Cell Differ 2022; 70:157-187. [PMID: 36348107 DOI: 10.1007/978-3-031-06573-6_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Multimorbidity is characterized by multidimensional complexity emerging from interactions between multiple diseases across levels of biological (including genetic) and environmental determinants and the complex array of interactions between and within cells, tissues and organ systems. Advances in spatial genomic research have led to an unprecedented expansion in our ability to link alterations in genome folding with changes that are associated with human disease. Studying disease-associated genetic variants in the context of the spatial genome has enabled the discovery of transcriptional regulatory programmes that potentially link dysregulated genes to disease development. However, the approaches that have been used have typically been applied to uncover pathological molecular mechanisms occurring in a specific disease-relevant tissue. These forms of reductionist, targeted investigations are not appropriate for the molecular dissection of multimorbidity that typically involves contributions from multiple tissues. In this perspective, we emphasize the importance of a whole-body understanding of multimorbidity and discuss how spatial genomics, when integrated with additional omic datasets, could provide novel insights into the molecular underpinnings of multimorbidity.
Affiliation(s)
| | - Roan E Zaied
- Liggins Institute, The University of Auckland, Auckland, New Zealand
| | - William Schierding
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
| | - Sophie Farrow
- Liggins Institute, The University of Auckland, Auckland, New Zealand
| | - Tayaza Fadason
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
| | - Justin M O'Sullivan
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Australian Parkinson's Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
- MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK
| |
|
96
|
Elorbany R, Popp JM, Rhodes K, Strober BJ, Barr K, Qi G, Gilad Y, Battle A. Single-cell sequencing reveals lineage-specific dynamic genetic regulation of gene expression during human cardiomyocyte differentiation. PLoS Genet 2022; 18:e1009666. [PMID: 35061661 PMCID: PMC8809621 DOI: 10.1371/journal.pgen.1009666] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 06/30/2021] [Revised: 02/02/2022] [Accepted: 12/21/2021] [Indexed: 12/13/2022] Open
Abstract
Dynamic and temporally specific gene regulatory changes may underlie unexplained genetic associations with complex disease. During a dynamic process such as cellular differentiation, the overall cell type composition of a tissue (or an in vitro culture) and the gene regulatory profile of each cell can both experience significant changes over time. To identify these dynamic effects in high resolution, we collected single-cell RNA-sequencing data over a differentiation time course from induced pluripotent stem cells to cardiomyocytes, sampled at 7 unique time points in 19 human cell lines. We employed a flexible approach to map dynamic eQTLs whose effects vary significantly over the course of bifurcating differentiation trajectories, including many whose effects are specific to one of these two lineages. Our study design allowed us to distinguish true dynamic eQTLs affecting a specific cell lineage from expression changes driven by potentially non-genetic differences between cell lines such as cell composition. Additionally, we used the cell type profiles learned from single-cell data to deconvolve and re-analyze data from matched bulk RNA-seq samples. Using this approach, we were able to identify a large number of novel dynamic eQTLs in single cell data while also attributing dynamic effects in bulk to a particular lineage. Overall, we found that using single cell data to uncover dynamic eQTLs can provide new insight into the gene regulatory changes that occur among heterogeneous cell types during cardiomyocyte differentiation.
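A dynamic eQTL is one whose effect on expression changes over time. As a rough illustration only, the sketch below assumes a simplified two-stage estimate: a per-time-point eQTL effect (slope of expression on genotype dosage), followed by the slope of those effects on time. The function names and data layout are hypothetical, significance testing is omitted, and the abstract does not specify the study's own model, which is described only as a flexible approach.

```python
def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def eqtl_effect_by_time(genotypes, expression_by_time):
    """Per-time-point eQTL effect: slope of expression on genotype dosage.

    genotypes[i] is the allele dosage (0, 1 or 2) of cell line i;
    expression_by_time[t][i] is that line's expression at time point t.
    """
    return {t: slope(genotypes, expr) for t, expr in expression_by_time.items()}


def dynamic_effect(genotypes, expression_by_time):
    """Change in eQTL effect per unit time (slope of effect sizes on time)."""
    effects = eqtl_effect_by_time(genotypes, expression_by_time)
    times = sorted(effects)
    return slope(times, [effects[t] for t in times])
```

A value near zero from `dynamic_effect` would correspond to a static eQTL in this toy setting, while a clearly non-zero value indicates an effect that strengthens or weakens along the time course.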
Affiliation(s)
- Reem Elorbany
- Interdisciplinary Scientist Training Program, University of Chicago, Chicago, Illinois, United States of America
| | - Joshua M. Popp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Katherine Rhodes
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Benjamin J. Strober
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Kenneth Barr
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Guanghao Qi
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
| |
|
97
|
Yang Z, Yang J, Mao Y, Li MD. Investigation of the genetic effect of 56 tobacco-smoking susceptibility genes on DNA methylation and RNA expression in human brain. Front Psychiatry 2022; 13:924062. [PMID: 36061282 PMCID: PMC9433921 DOI: 10.3389/fpsyt.2022.924062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 07/29/2022] [Indexed: 11/24/2022] Open
Abstract
Although various susceptibility genes have been revealed to influence tobacco smoking, the underlying regulatory mechanisms between genetic variants and smoking are poorly understood. In this study, we investigated cis-expression quantitative trait loci (cis-eQTLs) and methylation quantitative trait loci (mQTLs) for 56 candidate smoking-linked genes using the BrainCloud cohort samples. An eQTL was revealed to significantly affect EGLN2 expression in the European sample, and two mQTLs were detected at CpG sites in NRXN1 and CYP2A7, respectively. Interestingly, we found for the first time that the minor allele of the single nucleotide polymorphism (SNP) rs3745277 located in CYP2A7P1 (downstream of CYP2B6) significantly decreased methylation at the CpG site for CYP2A7 (cg25427638; P = 5.31 × 10⁻⁷), reduced expression of CYP2B6 (P = 0.03), and lowered the percentage of smokers (8.8% vs. 42.3%; Odds Ratio (OR) = 0.14, 95% Confidence Interval (CI): 0.02-0.62; P = 4.47 × 10⁻³) in a dominant way for the same cohort sample. Taken together, our findings, obtained by analyzing genetic variation, DNA methylation, mRNA expression, and smoking status together in the same participants, revealed a regulatory mechanism linking mQTLs to the smoking phenotype. Moreover, we demonstrated the presence of different regulatory effects of low-frequency and common variants on mRNA expression and DNA methylation.
Affiliation(s)
- Zhongli Yang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Jiekun Yang
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Ying Mao
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Ming D Li
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Research Center for Air Pollution and Health, Zhejiang University, Hangzhou, China
| |
|
98
|
Portella AK, Papantoni A, Joseph AT, Chen L, Lee RS, Silveira PP, Dube L, Carnell S. Genetically-predicted prefrontal DRD4 gene expression modulates differentiated brain responses to food cues in adolescent girls and boys. Sci Rep 2021; 11:24094. [PMID: 34916545 PMCID: PMC8677785 DOI: 10.1038/s41598-021-02797-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/13/2021] [Accepted: 11/09/2021] [Indexed: 11/18/2022] Open
Abstract
The dopamine receptor 4 (DRD4) in the prefrontal cortex (PFC) acts to modulate behaviors including cognitive control and motivation, and has been implicated in behavioral inhibition and responsivity to food cues. Adolescence is a sensitive period for the development of habitual eating behaviors and obesity risk, with potential mediation by development of the PFC. We previously found that genetic variations influencing DRD4 function or expression were associated with measures of laboratory and real-world eating behavior in girls and boys. Here we investigated brain responses to high energy–density (ED) and low-ED food cues using an fMRI task conducted in the satiated state. We used the gene-based association method PrediXcan to estimate tissue-specific DRD4 gene expression in prefrontal brain areas from individual genotypes. Among girls, those with lower vs. higher predicted prefrontal DRD4 expression showed lesser activation to high-ED and low-ED vs. non-food cues in a distributed network of regions implicated in attention and sensorimotor processing including middle frontal gyrus, and lesser activation to low-ED vs. non-food cues in key regions implicated in valuation including orbitofrontal cortex and ventromedial PFC. In contrast, males with lower vs. higher predicted prefrontal DRD4 expression showed minimal differences in food cue response, namely relatively greater activation to high-ED and low-ED vs. non-food cues in the inferior parietal lobule. Our data suggest sex-specific effects of prefrontal DRD4 on brain food responsiveness in adolescence, with modulation of distributed regions relevant to cognitive control and motivation observable in female adolescents.
Affiliation(s)
- Andre K Portella
- Desautels Faculty of Management, McGill Center for the Convergence of Health and Economics, McGill University, Montreal, QC, Canada
- Postgraduate Program in Pediatrics, Universidade Federal de Ciencias da Saude de Porto Alegre, Porto Alegre, RS, Brazil
| | - Afroditi Papantoni
- Department of Nutrition, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Antoneta T Joseph
- McGill Centre for the Convergence of Health and Economics (MCCHE), McGill University, Montreal, Canada
| | - Liuyi Chen
- Department of Psychiatry and Behavioral Sciences, Division of Psychiatric Neuroimaging, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Richard S Lee
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Patricia P Silveira
- Ludmer Centre for Neuroinformatics and Mental Health, Montreal, QC, Canada
- Department of Psychiatry, McGill University, Montreal, QC, Canada
| | - Laurette Dube
- Desautels Faculty of Management, McGill Center for the Convergence of Health and Economics, McGill University, Montreal, QC, Canada
| | - Susan Carnell
- Department of Psychiatry and Behavioral Sciences, Division of Child and Adolescent Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| |
|
99
|
Genetically regulated expression in late-onset Alzheimer's disease implicates risk genes within known and novel loci. Transl Psychiatry 2021; 11:618. [PMID: 34873149 PMCID: PMC8648734 DOI: 10.1038/s41398-021-01677-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 03/05/2021] [Revised: 09/27/2021] [Accepted: 10/06/2021] [Indexed: 12/22/2022] Open
Abstract
Late-onset Alzheimer disease (LOAD) is highly polygenic, with a heritability estimated between 40 and 80%, yet risk variants identified in genome-wide studies explain only ~8% of phenotypic variance. Due to its increased power and interpretability, genetically regulated expression (GReX) analysis is an emerging approach to investigate the genetic mechanisms of complex diseases. Here, we conducted GReX analysis within and across 51 tissues on 39 LOAD GWAS data sets comprising 58,713 cases and controls from the Alzheimer's Disease Genetics Consortium (ADGC) and the International Genomics of Alzheimer's Project (IGAP). Meta-analysis across studies identified 216 unique significant genes, including 72 with no previously reported LOAD GWAS associations. Cross-brain-tissue and cross-GTEx models revealed eight additional genes significantly associated with LOAD. Conditional analysis of previously reported loci using established LOAD-risk variants identified eight genes reaching genome-wide significance independent of known signals. Moreover, the proportion of SNP-based heritability is highly enriched in genes identified by GReX analysis. In summary, GReX-based meta-analysis in LOAD identifies 216 genes (including 72 novel genes), illuminating the role of gene regulatory models in LOAD.
|
100
|
Ng B, Casazza W, Kim NH, Wang C, Farhadi F, Tasaki S, Bennett DA, De Jager PL, Gaiteri C, Mostafavi S. Cascading epigenomic analysis for identifying disease genes from the regulatory landscape of GWAS variants. PLoS Genet 2021; 17:e1009918. [PMID: 34807913 PMCID: PMC8648125 DOI: 10.1371/journal.pgen.1009918] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/23/2021] [Revised: 12/06/2021] [Accepted: 10/30/2021] [Indexed: 12/14/2022] Open
Abstract
The majority of genetic variants detected in genome-wide association studies (GWAS) exert their effects on phenotypes through gene regulation. Motivated by this observation, we propose a multi-omic integration method that models the cascading effects of genetic variants from epigenome to transcriptome and eventually to the phenome in identifying target genes influenced by risk alleles. This cascading epigenomic analysis for GWAS, which we refer to as CEWAS, combines the effects of genetic variants on DNA methylation and gene expression. It comprises two types of models: one for linking cis genetic effects to epigenomic variation and another for linking cis epigenomic variation to gene expression. Applying these models in cascade to GWAS summary statistics generates gene-level statistics that reflect genetically driven epigenomic effects. We show on sixteen brain-related GWAS that CEWAS provides a higher gene detection rate than related methods and finds disease-relevant genes and gene sets that point toward less explored biological processes. CEWAS thus presents a novel means for exploring the regulatory landscape of GWAS variants in uncovering disease mechanisms.
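Applying two linear models in cascade to summary statistics can be sketched as follows. The sketch assumes that the SNP-to-CpG and CpG-to-gene models are both linear weight sets and that the gene-level statistic takes the standard summary-statistic form z = wᵀz_GWAS / sqrt(wᵀRw) used in summary-based TWAS, with R the SNP correlation (LD) matrix. The function names, data layout, and choice of statistic are assumptions for illustration; the abstract does not state the exact CEWAS formulation.

```python
import math


def compose_weights(snp_to_cpg, cpg_to_gene):
    """Chain SNP->CpG and CpG->gene weights into one SNP->gene weight vector.

    snp_to_cpg[c][j] is the weight of SNP j in the model for CpG c;
    cpg_to_gene[c] is the weight of CpG c in the model for the gene.
    """
    n_snps = len(snp_to_cpg[0])
    return [sum(cpg_w * row[j] for cpg_w, row in zip(cpg_to_gene, snp_to_cpg))
            for j in range(n_snps)]


def gene_z_score(weights, gwas_z, ld):
    """Gene-level z-score: w.z / sqrt(w' R w) (assumed standard form)."""
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(len(weights))
                   for j in range(len(weights)))
    return numerator / math.sqrt(variance)
```

Composing the weights first means only SNP-level GWAS z-scores and an LD reference are needed at analysis time, which matches the summary-statistics setting described in the abstract.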
Affiliation(s)
- Bernard Ng
- Department of Statistics and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Vancouver, British Columbia, Canada
| | - William Casazza
- Department of Statistics and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Vancouver, British Columbia, Canada
| | - Nam Hee Kim
- Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
| | - Chendi Wang
- Department of Statistics and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Centre for Molecular Medicine and Therapeutics, Vancouver, British Columbia, Canada
| | - Farnush Farhadi
- Centre for Molecular Medicine and Therapeutics, Vancouver, British Columbia, Canada
| | - Shinya Tasaki
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, Illinois, United States of America
| | - David A. Bennett
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, Illinois, United States of America
| | - Philip L. De Jager
- Center for Translational & Computational Neuroimmunology, Department of Neurology and the Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Columbia University Irving Medical Center, New York, New York, United States of America
| | - Christopher Gaiteri
- Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, Illinois, United States of America
| | - Sara Mostafavi
- Department of Statistics and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Paul G. Allen School for Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
| |
|