1
Martins Rodrigues F, Terekhanova NV, Imbach KJ, Clauser KR, Esai Selvan M, Mendizabal I, Geffen Y, Akiyama Y, Maynard M, Yaron TM, Li Y, Cao S, Storrs EP, Gonda OS, Gaite-Reguero A, Govindan A, Kawaler EA, Wyczalkowski MA, Klein RJ, Turhan B, Krug K, Mani DR, Leprevost FDV, Nesvizhskii AI, Carr SA, Fenyö D, Gillette MA, Colaprico A, Iavarone A, Robles AI, Huang KL, Kumar-Sinha C, Aguet F, Lazar AJ, Cantley LC, Marigorta UM, Gümüş ZH, Bailey MH, Getz G, Porta-Pardo E, Ding L. Precision proteogenomics reveals pan-cancer impact of germline variants. Cell 2025; 188:2312-2335.e26. [PMID: 40233739 DOI: 10.1016/j.cell.2025.03.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 04/29/2024] [Accepted: 03/13/2025] [Indexed: 04/17/2025]
Abstract
We investigate the impact of germline variants on cancer patients' proteomes, encompassing 1,064 individuals across 10 cancer types. We introduced an approach, "precision peptidomics," mapping 337,469 coding germline variants onto peptides from patients' mass spectrometry data, revealing their potential impact on post-translational modifications, protein stability, allele-specific expression, and protein structure by leveraging the relevant protein databases. We identified rare pathogenic and common germline variants in cancer genes potentially affecting proteomic features, including variants altering protein abundance and structure and variants in kinases (ERBB2 and MAP2K2) impacting phosphorylation. Precision peptidome analysis predicted destabilizing events in signal-regulatory protein alpha (SIRPA) and glial fibrillary acid protein (GFAP), relevant to immunomodulation and glioblastoma diagnostics, respectively. Genome-wide association studies identified quantitative trait loci for gene expression and protein levels, spanning millions of SNPs and thousands of proteins. Polygenic risk scores correlated with distal effects from risk variants. Our findings emphasize the contribution of germline genetics to cancer heterogeneity and high-throughput precision peptidomics.
Affiliation(s)
- Fernanda Martins Rodrigues
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
- Nadezhda V Terekhanova
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
- Kathleen J Imbach
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Barcelona, Spain; Universitat Autonoma de Barcelona, Barcelona, Spain
- Myvizhi Esai Selvan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Center for Thoracic Oncology, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Isabel Mendizabal
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Bizkaia Technology Park, Derio, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain; Translational Prostate Cancer Research Lab, CIC bioGUNE-Basurto, Biocruces Bizkaia Health Research Institute, Derio, Spain
- Yifat Geffen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Yo Akiyama
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Tomer M Yaron
- Meyer Cancer Center, Department of Medicine, Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Yize Li
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
- Song Cao
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
- Erik P Storrs
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
- Olivia S Gonda
- Department of Biology, Brigham Young University, Salt Lake City, UT, USA
- Adrian Gaite-Reguero
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Bizkaia Technology Park, Derio, Spain
- Akshay Govindan
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
- Emily A Kawaler
- Applied Bioinformatics Laboratories, New York University Langone Health, New York City, NY, USA
- Matthew A Wyczalkowski
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
- Robert J Klein
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Berk Turhan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Karsten Krug
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- D R Mani
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Alexey I Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Steven A Carr
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- David Fenyö
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
- Antonio Colaprico
- Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL, USA; Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA
- Antonio Iavarone
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA; Department of Neurological Surgery, Department of Biochemistry and Molecular Biology, University of Miami, Miller School of Medicine, Miami, FL, USA
- Ana I Robles
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD, USA
- Kuan-Lin Huang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Center for Transformative Disease Modeling, Tisch Cancer Institute, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Chandan Kumar-Sinha
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA; Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI, USA
- Alexander J Lazar
- Departments of Pathology and Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Urko M Marigorta
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Bizkaia Technology Park, Derio, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Zeynep H Gümüş
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Center for Thoracic Oncology, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Matthew H Bailey
- Department of Biology, Brigham Young University, Salt Lake City, UT, USA.
- Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA.
- Eduard Porta-Pardo
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Barcelona, Spain; Barcelona Supercomputing Center (BSC), Barcelona, Spain.
- Li Ding
- Department of Medicine, Washington University in St. Louis, Saint Louis, MO, USA; McDonnell Genome Institute, Washington University in St. Louis, Saint Louis, MO, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University in St. Louis, Saint Louis, MO, USA.
2
Park D, Cenik C. Long-read RNA sequencing reveals allele-specific N6-methyladenosine modifications. Genome Res 2025; 35:999-1011. [PMID: 39472020 PMCID: PMC12047277 DOI: 10.1101/gr.279270.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 10/23/2024] [Indexed: 11/06/2024]
Abstract
Long-read sequencing technology enables highly accurate detection of allele-specific RNA expression, providing insights into the effects of genetic variation on splicing and RNA abundance. Furthermore, the ability to directly sequence RNA enables the detection of RNA modifications in tandem with ascertaining the allelic origin of each molecule. Here, we leverage these advantages to determine allele-biased patterns of N6-methyladenosine (m6A) modifications in native mRNA. We used human and mouse cells with known genetic variants to assign the allelic origin of each mRNA molecule combined with a supervised machine learning model to detect read-level m6A modification ratios. Our analyses reveal the importance of sequences adjacent to the DRACH motif in determining m6A deposition, in addition to allelic differences that directly alter the motif. Moreover, we discover allele-specific m6A modification events with no genetic variants in close proximity to the differentially modified nucleotide, demonstrating the unique advantage of using long-reads and surpassing the capabilities of antibody-based short-read approaches. This technological advance will further our understanding of the role of genetics in determining mRNA modifications.
Affiliation(s)
- Dayea Park
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, USA
- Can Cenik
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, USA
3
Buyan A, Meshcheryakov G, Safronov V, Abramov S, Boytsov A, Nozdrin V, Baulin EF, Kolmykov S, Vierstra J, Kolpakov F, Makeev VJ, Kulakovskiy IV. Statistical framework for calling allelic imbalance in high-throughput sequencing data. Nat Commun 2025; 16:1739. [PMID: 39966391 PMCID: PMC11836314 DOI: 10.1038/s41467-024-55513-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 12/16/2024] [Indexed: 02/20/2025] [Open Access]
Abstract
High-throughput sequencing facilitates large-scale studies of gene regulation and allows tracing the associations of individual genomic variants with changes in gene regulation and expression. Compared to classic association studies, the assessment of an allelic imbalance at heterozygous variants captures functional variant effects with smaller sample sizes, higher sensitivity, and better resolution. Yet, identification of allele-specific variants from allelic read counts remains challenging due to data-dependent biases and overdispersion arising from technical and biological variability. We present MIXALIME, a novel computational framework for calling allele-specific variants in diverse omics data with a repertoire of statistical models accounting for read mapping bias and copy number variation. We benchmark MIXALIME with DNase-Seq, ATAC-Seq, and CAGE-Seq data, and we demonstrate that the allelic imbalance highlights causal variants in GWAS results. Finally, as a showcase of the large-scale practical application of MIXALIME, we present an atlas of variants exhibiting allele-specific chromatin accessibility, built from thousands of available datasets obtained from diverse cell types.
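As a rough illustration of what calling allelic imbalance from allelic read counts involves, the sketch below applies a plain two-sided exact binomial test to the reference and alternative read counts at one heterozygous site. It is a deliberately simplified baseline and not MIXALIME's method: the abstract describes statistical models that account for read mapping bias, copy number variation, and overdispersion, none of which this test handles. The function name `allelic_imbalance_pvalue` and the `p_ref` parameter are hypothetical and chosen only for this example.

```python
from math import comb


def allelic_imbalance_pvalue(ref_count: int, alt_count: int, p_ref: float = 0.5) -> float:
    """Two-sided exact binomial p-value for allelic read counts at one site.

    Assumption (not taken from the paper): under the null hypothesis each
    read originates from the reference allele with probability p_ref,
    independently of all other reads.
    """
    n = ref_count + alt_count
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p_ref**k * (1.0 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Sum the outcomes that are no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1.0 + 1e-9))
    return min(1.0, p_value)


balanced = allelic_imbalance_pvalue(10, 10)  # equal support for both alleles
skewed = allelic_imbalance_pvalue(20, 0)     # all reads from one allele
print(balanced, skewed)
```

This direct enumeration is only suitable for modest read depths; deeper coverage would call for log-space arithmetic or a library routine, and a bias-aware analysis would at least replace the default `p_ref = 0.5` with a value estimated from the data.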
Affiliation(s)
- Andrey Buyan
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
- Life Improvement by Future Technologies (LIFT) Center, Moscow, Russia
- Viacheslav Safronov
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Sergey Abramov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Moscow Center for Advanced Studies, Moscow, Russia
- Alexandr Boytsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Moscow Center for Advanced Studies, Moscow, Russia
- Vladimir Nozdrin
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Eugene F Baulin
- Moscow Center for Advanced Studies, Moscow, Russia
- International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Semyon Kolmykov
- Department of Computational Biology, Sirius University of Science and Technology, Sirius, Krasnodar region, Russia
- Jeff Vierstra
- Altius Institute for Biomedical Sciences, Seattle, WA, USA
- Fedor Kolpakov
- Department of Computational Biology, Sirius University of Science and Technology, Sirius, Krasnodar region, Russia
- Bioinformatics Laboratory, Federal Research Center for Information and Computational Technologies, Novosibirsk, Russia
- Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia.
- Moscow Center for Advanced Studies, Moscow, Russia.
- Institute of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, Russia.
- Cancer Research UK National Biomarker Centre, University of Manchester, Manchester, UK.
- Ivan V Kulakovskiy
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia.
- Life Improvement by Future Technologies (LIFT) Center, Moscow, Russia.
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia.
4
Niharika, Asthana S, Narayan Yadav H, Sharma N, Kumar Singh V. A compendium of methods: Searching allele specific expression via RNA sequencing. Gene 2025; 936:149102. [PMID: 39561903 DOI: 10.1016/j.gene.2024.149102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2024] [Revised: 11/04/2024] [Accepted: 11/14/2024] [Indexed: 11/21/2024]
Abstract
Diploid mammalian genome has paired alleles for each gene; typically allowing for equal expression of the two alleles within the cell/tissue. However, genetic regulatory elements and epigenetic modifications can disrupt this equality, leading to preferential expression of one allele. Examining high-confidence allele-specific expression (ASE) is vital for understanding genetic variations and their impact on major diseases like cancers and diabetes. ASE analysis not only aids in disease prognosis and diagnosis but also helps to identify regulatory mechanisms operating within the genome. While advances in sequencing technologies have greatly improved our understanding of ASE, challenges remain in estimating it accurately. In this article, we reviewed methods for detecting ASE using both bulk RNASeq and single-cell RNASeq data to provide deeper insights beyond the mere prediction of ASE genes. Fundamentally, ASE detection methods are data-driven and can be classified according to type of data used. Some methods utilize both, DNA genotyping information and RNASeq while others rely solely on RNASeq data. This article offers a comparative analysis of these methods and compilation of repositories providing valuable insights.
Affiliation(s)
- Niharika
- Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar 824236, India
- Shailendra Asthana
- Computational and Mathematical Biology Centre, Translational Health Science and Technology Institute, NCR Biotech Science Cluster 3rd 15 Milestone, Faridabad-Gurugram 16 expressway, PO Box # 4. Faridabad, Haryana 121001, India
- Harlokesh Narayan Yadav
- Department of Pharmacology, All India Institute of Medical Sciences, Ansari Nagar, New Delhi 110029, India
- Nanaocha Sharma
- Institute of Bioresources and Sustainable Development, Takyelpat, Manipur 795001 Imphal, India.
- Vijay Kumar Singh
- Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar 824236, India.
5
Park D, Cenik C. Long-read RNA sequencing reveals allele-specific N6-methyladenosine modifications. bioRxiv 2024:2024.07.08.602538. [PMID: 39026828 PMCID: PMC11257478 DOI: 10.1101/2024.07.08.602538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
Long-read sequencing technology enables highly accurate detection of allele-specific RNA expression, providing insights into the effects of genetic variation on splicing and RNA abundance. Furthermore, the ability to directly sequence RNA promises the detection of RNA modifications in tandem with ascertaining the allelic origin of each molecule. Here, we leverage these advantages to determine allele-biased patterns of N6-methyladenosine (m6A) modifications in native mRNA. We utilized human and mouse cells with known genetic variants to assign allelic origin of each mRNA molecule combined with a supervised machine learning model to detect read-level m6A modification ratios. Our analyses revealed the importance of sequences adjacent to the DRACH-motif in determining m6A deposition, in addition to allelic differences that directly alter the motif. Moreover, we discovered allele-specific m6A modification (ASM) events with no genetic variants in close proximity to the differentially modified nucleotide, demonstrating the unique advantage of using long reads and surpassing the capabilities of antibody-based short-read approaches. This technological advancement promises to advance our understanding of the role of genetics in determining mRNA modifications.
Affiliation(s)
- Dayea Park
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
- Can Cenik
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
6
Solomon BD, Zheng H, Dillon LW, Goldman J, Hourigan CS, Heath J, Khatri P. Prediction of HLA genotypes from single-cell transcriptome data. Front Immunol 2023; 14:1146826. [PMID: 37180102 PMCID: PMC10167300 DOI: 10.3389/fimmu.2023.1146826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 04/04/2023] [Indexed: 05/15/2023] [Open Access]
Abstract
The human leukocyte antigen (HLA) locus plays a central role in adaptive immune function and has significant clinical implications for tissue transplant compatibility and allelic disease associations. Studies using bulk-cell RNA sequencing have demonstrated that HLA transcription may be regulated in an allele-specific manner and single-cell RNA sequencing (scRNA-seq) has the potential to better characterize these expression patterns. However, quantification of allele-specific expression (ASE) for HLA loci requires sample-specific reference genotyping due to extensive polymorphism. While genotype prediction from bulk RNA sequencing is well described, the feasibility of predicting HLA genotypes directly from single-cell data is unknown. Here we evaluate and expand upon several computational HLA genotyping tools by comparing predictions from human single-cell data to gold-standard, molecular genotyping. The highest 2-field accuracy averaged across all loci was 76% by arcasHLA and increased to 86% using a composite model of multiple genotyping tools. We also developed a highly accurate model (AUC 0.93) for predicting HLA-DRB345 copy number in order to improve genotyping accuracy of the HLA-DRB locus. Genotyping accuracy improved with read depth and was reproducible at repeat sampling. Using a metanalytic approach, we also show that HLA genotypes from PHLAT and OptiType can generate ASE ratios that are highly correlated (R² = 0.8 and 0.94, respectively) with those derived from gold-standard genotyping.
Affiliation(s)
- Hong Zheng
- Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, United States
- Center for Biomedical Informatics Research, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, United States
- Laura W. Dillon
- Laboratory of Myeloid Malignancies, National Heart Lung and Blood Institute, Bethesda, MD, United States
- Jason D. Goldman
- Swedish Center for Research and Innovation, Swedish Medical Center, Seattle, WA, United States
- Providence St. Joseph Health, Renton, WA, United States
- Division of Allergy & Infectious Diseases, University of Washington, Seattle, WA, United States
- Christopher S. Hourigan
- Laboratory of Myeloid Malignancies, National Heart Lung and Blood Institute, Bethesda, MD, United States
- James R. Heath
- Institute for Systems Biology, Seattle, WA, United States
- Department of Bioengineering, University of Washington, Seattle, WA, United States
- Purvesh Khatri
- Institute for Immunity, Transplantation and Infection, School of Medicine, Stanford University, Stanford, CA, United States
- Center for Biomedical Informatics Research, Department of Medicine, School of Medicine, Stanford University, Stanford, CA, United States
7
Rozowsky J, Gao J, Borsari B, Yang YT, Galeev T, Gürsoy G, Epstein CB, Xiong K, Xu J, Li T, Liu J, Yu K, Berthel A, Chen Z, Navarro F, Sun MS, Wright J, Chang J, Cameron CJF, Shoresh N, Gaskell E, Drenkow J, Adrian J, Aganezov S, Aguet F, Balderrama-Gutierrez G, Banskota S, Corona GB, Chee S, Chhetri SB, Cortez Martins GC, Danyko C, Davis CA, Farid D, Farrell NP, Gabdank I, Gofin Y, Gorkin DU, Gu M, Hecht V, Hitz BC, Issner R, Jiang Y, Kirsche M, Kong X, Lam BR, Li S, Li B, Li X, Lin KZ, Luo R, Mackiewicz M, Meng R, Moore JE, Mudge J, Nelson N, Nusbaum C, Popov I, Pratt HE, Qiu Y, Ramakrishnan S, Raymond J, Salichos L, Scavelli A, Schreiber JM, Sedlazeck FJ, See LH, Sherman RM, Shi X, Shi M, Sloan CA, Strattan JS, Tan Z, Tanaka FY, Vlasova A, Wang J, Werner J, Williams B, Xu M, Yan C, Yu L, Zaleski C, Zhang J, Ardlie K, Cherry JM, Mendenhall EM, Noble WS, Weng Z, Levine ME, Dobin A, Wold B, Mortazavi A, Ren B, Gillis J, Myers RM, Snyder MP, Choudhary J, Milosavljevic A, Schatz MC, Bernstein BE, Guigó R, Gingeras TR, Gerstein M. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell 2023; 186:1493-1511.e40. [PMID: 37001506 PMCID: PMC10074325 DOI: 10.1016/j.cell.2023.02.018] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 04/03/2022] [Revised: 10/16/2022] [Accepted: 02/10/2023] [Indexed: 04/03/2023]
Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
Affiliation(s)
- Joel Rozowsky
- Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jiahao Gao
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Beatrice Borsari
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Yucheng T Yang
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Timur Galeev
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Gamze Gürsoy
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Kun Xiong
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jinrui Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Tianxiao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Jason Liu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Keyang Yu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Ana Berthel
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Zhanlin Chen
- Department of Statistics and Data Science, Yale University, New Haven, CT, USA
- Fabio Navarro
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Maxwell S Sun
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Justin Chang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Christopher J F Cameron
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Noam Shoresh
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jorg Drenkow
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jessika Adrian
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Sergey Aganezov
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
- Sora Chee
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Surya B Chhetri
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Gabriel Conte Cortez Martins
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Cassidy Danyko
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Carrie A Davis
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Daniel Farid
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Idan Gabdank
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Yoel Gofin
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- David U Gorkin
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
- Mengting Gu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Vivian Hecht
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Benjamin C Hitz
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Robbyn Issner
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Yunzhe Jiang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Melanie Kirsche
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
- Xiangmeng Kong
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Bonita R Lam
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Shantao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Bian Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Xiqi Li
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Khine Zin Lin
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
- Ruibang Luo
- Department of Computer Science, The University of Hong Kong, Hong Kong, CHN
- Mark Mackiewicz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Ran Meng
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Jonathan Mudge
- European Bioinformatics Institute, Cambridge, Cambridgeshire, GB
| | | | - Chad Nusbaum
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ioann Popov
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Henry E Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Yunjiang Qiu
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Srividya Ramakrishnan
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Joe Raymond
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Leonidas Salichos
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Biological and Chemical Sciences, New York Institute of Technology, Old Westbury, NY, USA
| | - Alexandra Scavelli
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jacob M Schreiber
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Fritz J Sedlazeck
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Lei Hoon See
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Rachel M Sherman
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Xu Shi
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Minyi Shi
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Cricket Alicia Sloan
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - J Seth Strattan
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Zhen Tan
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Forrest Y Tanaka
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | - Anna Vlasova
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Comparative Genomics Group, Life Science Programme, Barcelona Supercomputing Centre, Barcelona, Spain; Institute of Research in Biomedicine, Barcelona, Spain
| | - Jun Wang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Jonathan Werner
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Brian Williams
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Min Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Chengfei Yan
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Lu Yu
- Institute of Cancer Research, London, UK
| | - Christopher Zaleski
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jing Zhang
- Department of Computer Science, University of California, Irvine, Irvine, CA, USA
| | | | - J Michael Cherry
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | | | - William S Noble
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA
| | - Morgan E Levine
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
| | - Alexander Dobin
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Barbara Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA
| | - Bing Ren
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, CA, USA
| | - Jesse Gillis
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA; Department of Physiology, University of Toronto, Toronto, ON, Canada
| | - Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
| | - Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Palo Alto, CA, USA
| | | | | | - Michael C Schatz
- Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA; Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
| | - Bradley E Bernstein
- Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
| | - Roderic Guigó
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain; Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.
| | - Thomas R Gingeras
- Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
| | - Mark Gerstein
- Section on Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA; Department of Statistics and Data Science, Yale University, New Haven, CT, USA; Department of Computer Science, Yale University, New Haven, CT, USA.
| |
|
8
|
Deviatiiarov RM, Gams A, Kulakovskiy IV, Buyan A, Meshcheryakov G, Syunyaev R, Singh R, Shah P, Tatarinova TV, Gusev O, Efimov IR. An atlas of transcribed human cardiac promoters and enhancers reveals an important role of regulatory elements in heart failure. Nature Cardiovascular Research 2023; 2:58-75. [PMID: 39196209 DOI: 10.1038/s44161-022-00182-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/23/2021] [Accepted: 11/02/2022] [Indexed: 08/29/2024]
Abstract
A deeper knowledge of the dynamic transcriptional activity of promoters and enhancers is needed to improve mechanistic understanding of the pathogenesis of heart failure and heart diseases. In this study, we used cap analysis of gene expression (CAGE) to identify and quantify the activity of transcribed regulatory elements (TREs) in the four cardiac chambers of 21 healthy and ten failing adult human hearts. We identified 17,668 promoters and 14,920 enhancers associated with the expression of 14,519 genes. We showed how these regulatory elements are alternatively transcribed in different heart regions, in healthy versus failing hearts and in ischemic versus non-ischemic heart failure samples. Cardiac-disease-related single-nucleotide polymorphisms (SNPs) appeared to be enriched in TREs, potentially affecting the allele-specific transcription factor binding. To conclude, our open-source heart CAGE atlas will serve the cardiovascular community in improving the understanding of the role of the cardiac gene regulatory networks in cardiovascular disease and therapy.
Affiliation(s)
- Ruslan M Deviatiiarov
- Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
| | - Anna Gams
- Department of Biomedical Engineering, The George Washington University, Washington, DC, USA
| | - Ivan V Kulakovskiy
- Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Andrey Buyan
- Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
| | | | - Roman Syunyaev
- Department of Biomedical Engineering, The George Washington University, Washington, DC, USA
- I.M. Sechenov First Moscow State Medical University, Moscow, Russia
| | - Ramesh Singh
- Inova Heart and Vascular Institute, Falls Church, VA, USA
| | - Palak Shah
- Department of Biomedical Engineering, The George Washington University, Washington, DC, USA
- Inova Heart and Vascular Institute, Falls Church, VA, USA
| | - Tatiana V Tatarinova
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia.
- Department of Biology, University of La Verne, La Verne, CA, USA.
| | - Oleg Gusev
- Laboratory of Regulatory Genomics, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia.
- Graduate School of Medicine, Juntendo University, Tokyo, Japan.
- RIKEN Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan.
- Endocrinology Research Center, Moscow, Russia.
| | - Igor R Efimov
- Department of Biomedical Engineering, The George Washington University, Washington, DC, USA.
- Department of Biomedical Engineering, Northwestern University, Chicago, IL, USA.
- Department of Medicine, Northwestern University, Chicago, IL, USA.
| |
|
9
|
Her L, Shi J, Wang X, He B, Smith LS, Jiang H, Zhu H. Identification of regulatory variants of carboxylesterase 1 (CES1): A proof-of-concept study for the application of the Allele-Specific Protein Expression (ASPE) assay in identifying cis-acting regulatory genetic polymorphisms. Proteomics 2023; 23:e2200176. [PMID: 36413357 PMCID: PMC10077986 DOI: 10.1002/pmic.202200176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 11/15/2022] [Accepted: 11/16/2022] [Indexed: 11/24/2022]
Abstract
It is challenging to study regulatory genetic variants as gene expression is affected by both genetic polymorphisms and non-genetic regulators. The mRNA allele-specific expression (ASE) assay has been increasingly used for the study of cis-acting regulatory variants because cis-acting variants affect gene expression in an allele-specific manner. However, poor correlations between mRNA and protein expressions were observed for many genes, highlighting the importance of studying gene expression regulation at the protein level. In the present study, we conducted a proof-of-concept study to utilize a recently developed allele-specific protein expression (ASPE) assay to identify the cis-acting regulatory variants of CES1 using a large set of human liver samples. The CES1 gene encodes for carboxylesterase 1 (CES1), the most abundant hepatic hydrolase in humans. Two cis-acting regulatory variants were found to be significantly associated with CES1 ASPE, CES1 protein expression, and its catalytic activity on enalapril hydrolysis in human livers. Compared to conventional gene expression-based approaches, ASPE demonstrated an improved statistical power to detect regulatory variants with small effect sizes since allelic protein expression ratios are less prone to the influence of non-genetic regulators (e.g., diseases and inducers). This study suggests that the ASPE approach is a powerful tool for identifying cis-regulatory variants.
Affiliation(s)
- Lucy Her
- Eli Lilly and Company, Indianapolis, Indiana, USA
| | - Jian Shi
- Alliance Pharma, Inc, Malvern, Pennsylvania, USA
| | - Xinwen Wang
- Department of Pharmaceutical Sciences, Northeast Ohio Medical University, Rootstown, Ohio, USA
| | - Bing He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Logan S. Smith
- Department of Clinical Pharmacy, University of Michigan, Ann Arbor, Michigan, USA
| | - Hui Jiang
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
| | - Hao‐Jie Zhu
- Department of Clinical Pharmacy, University of Michigan, Ann Arbor, Michigan, USA
| |
|
10
|
Maximizing Small Biopsy Patient Samples: Unified RNA-Seq Platform Assessment of over 120,000 Patient Biopsies. J Pers Med 2022; 13:jpm13010024. [PMID: 36675685 PMCID: PMC9866839 DOI: 10.3390/jpm13010024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 12/06/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022] [Open Access]
Abstract
Despite its wide-ranging benefits, whole-transcriptome or RNA exome profiling is challenging to implement in a clinical diagnostic setting. The Unified Assay is a comprehensive workflow wherein exome-enriched RNA-sequencing (RNA-Seq) assays are performed on clinical samples and analyzed by a series of advanced machine learning-based classifiers. Gene expression signatures and rare and/or novel genomic events, including fusions, mitochondrial variants, and loss of heterozygosity were assessed using RNA-Seq data generated from 120,313 clinical samples across three clinical indications (thyroid cancer, lung cancer, and interstitial lung disease). Since its implementation, the data derived from the Unified Assay have allowed significantly more patients to avoid unnecessary diagnostic surgery and have played an important role in guiding follow-up decisions regarding treatment. Collectively, data from the Unified Assay show the utility of RNA-Seq and RNA expression signatures in the clinical laboratory, and their importance to the future of precision medicine.
|
11
|
Zhou T, Afzal R, Haroon M, Ma Y, Zhang H, Li L. Dominant complementation of biological pathways in maize hybrid lines is associated with heterosis. Planta 2022; 256:111. [PMID: 36352050 DOI: 10.1007/s00425-022-04028-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 11/03/2022] [Indexed: 06/16/2023]
Abstract
Allele-specific expressed genes (ASEGs) are widespread in maize hybrid lines and play important roles in the complementation of biological pathways in heterosis. Heterosis (hybrid vigor) is an important phenomenon with both theoretical and practical value. However, our understanding of the genetic and molecular mechanisms behind heterosis is still limited. Here, we analyzed a comprehensive dataset of maize (Zea mays L.), including RNA-seq data from three hybrid-parent triplets (HPTs) and acetylated protein data from one HPT. The gene expression patterns exhibited extensive variation between the hybrids and their parents, and a substantial number of ASEGs were identified in the hybrids. Notably, ASEGs from different HPTs were significantly enriched in various conserved pathways. The parental alleles of ASEGs with fewer deleterious single-nucleotide polymorphisms were more likely to be expressed in hybrid lines than other parental alleles. ASEGs were mainly enriched in the functional gene ontology terms protein biosynthesis, photosynthesis, and metabolism. In addition, the ASEGs across the three HPTs were involved in key photosynthetic pathways and might enhance the photosynthetic efficiency of the hybrids. These findings suggest that ASEGs involved in complementary biological pathways in maize hybrids contribute to heterosis, shedding new light on the molecular mechanism of heterosis.
Affiliation(s)
- Tao Zhou
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Rabail Afzal
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Muhammad Haroon
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China
| | - Yuting Ma
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Hongwei Zhang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
| | - Lin Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China.
| |
|
12
|
Long Q, Yuan Y, Li M. RNA-SSNV: A Reliable Somatic Single Nucleotide Variant Identification Framework for Bulk RNA-Seq Data. Front Genet 2022; 13:865313. [PMID: 35846154 PMCID: PMC9279659 DOI: 10.3389/fgene.2022.865313] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/29/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
The use of expressed somatic mutations may have a unique advantage in identifying active cancer driver mutations. However, accurately calling mutations from RNA-seq data is difficult due to confounding factors such as RNA editing, reverse transcription, and gap alignment. In the present study, we proposed a framework (named RNA-SSNV, https://github.com/pmglab/RNA-SSNV) to call somatic single nucleotide variants (SSNVs) from tumor bulk RNA-seq data. Based on a comprehensive multi-filtering strategy and a machine-learning classification model trained with comprehensively curated features, RNA-SSNV achieved the best precision–recall rate (0.880–0.884) in a testing dataset and robustly retained 0.94 AUC for the precision–recall curve in three validation adult-based TCGA (The Cancer Genome Atlas) datasets. We further showed that the somatic mutations called by RNA-SSNV tended to have a higher functional impact and therapeutic power in known driver genes. Furthermore, VAF (variant allele fraction) analysis revealed that subclones harboring expressed mutations had an evolutionary selection advantage and that RNA had higher detection power to rescue DNA-omitted mutations. In sum, RNA-SSNV will be a useful approach to accurately call expressed somatic mutations for a more insightful analysis of cancer driver genes and carcinogenic mechanisms.
Affiliation(s)
- Qihan Long
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
| | - Yangyang Yuan
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
| | - Miaoxin Li
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
- Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
- Key Laboratory of Tropical Disease Control (SYSU), Ministry of Education, Guangzhou, China
- *Correspondence: Miaoxin Li,
| |
|
13
|
Harwood MP, Alves I, Edgington H, Agbessi M, Bruat V, Soave D, Lamaze FC, Favé MJ, Awadalla P. Recombination affects allele-specific expression of deleterious variants in human populations. Science Advances 2022; 8:eabl3819. [PMID: 35559670 PMCID: PMC9106294 DOI: 10.1126/sciadv.abl3819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Accepted: 03/29/2022] [Indexed: 06/15/2023]
Abstract
How the genetic composition of a population changes through stochastic processes, such as genetic drift, in combination with deterministic processes, such as selection, is critical to understanding how phenotypes vary in space and time. Here, we show how evolutionary forces affecting selection, including recombination and effective population size, drive genomic patterns of allele-specific expression (ASE). Integrating tissue-specific genotypic and transcriptomic data from 1500 individuals from two different cohorts, we demonstrate that ASE is less often observed in regions of low recombination, and loci in high or normal recombination regions are more efficient at using ASE to underexpress harmful mutations. By tracking genetic ancestry, we discriminate between ASE variability due to past demographic effects, including subsequent bottlenecks, versus local environment. We observe that ASE is not randomly distributed along the genome and that population parameters influencing the efficacy of natural selection alter ASE levels genome wide.
Affiliation(s)
- Michelle P. Harwood
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Isabel Alves
- Université de Nantes, CHU Nantes, CNRS, INSERM, L’Institut du thorax, F-44000 Nantes, France
| | | | | | - Vanessa Bruat
- Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - David Soave
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Department of Mathematics, Wilfrid Laurier University, Waterloo, ON, Canada
| | - Fabien C. Lamaze
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Institut universitaire de cardiologie et de pneumologie de Québec, Université Laval, Québec, QC, Canada
| | | | - Philip Awadalla
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
| |
|
14
|
Marwaha S, Knowles JW, Ashley EA. A guide for the diagnosis of rare and undiagnosed disease: beyond the exome. Genome Med 2022; 14:23. [PMID: 35220969 PMCID: PMC8883622 DOI: 10.1186/s13073-022-01026-w] [Citation(s) in RCA: 167] [Impact Index Per Article: 55.7] [Received: 06/09/2021] [Accepted: 02/10/2022] [Indexed: 02/07/2023] [Open Access]
Abstract
Rare diseases affect 30 million people in the USA and more than 300-400 million worldwide, often causing chronic illness, disability, and premature death. Traditional diagnostic techniques rely heavily on heuristic approaches, coupling clinical experience from prior rare disease presentations with the medical literature. A large number of rare disease patients remain undiagnosed for years and many even die without an accurate diagnosis. In recent years, gene panels, microarrays, and exome sequencing have helped to identify the molecular cause of such rare and undiagnosed diseases. These technologies have allowed diagnoses for a sizable proportion (25-35%) of undiagnosed patients, often with actionable findings. However, a large proportion of these patients remain undiagnosed. In this review, we focus on technologies that can be adopted if exome sequencing is unrevealing. We discuss the benefits of sequencing the whole genome and the additional benefit that may be offered by long-read technology, pan-genome reference, transcriptomics, metabolomics, proteomics, and methyl profiling. We highlight computational methods to help identify regionally distant patients with similar phenotypes or similar genetic mutations. Finally, we describe approaches to automate and accelerate genomic analysis. The strategies discussed here are intended to serve as a guide for clinicians and researchers in the next steps when encountering patients with non-diagnostic exomes.
Affiliation(s)
- Shruti Marwaha
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA.
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA.
| | - Joshua W Knowles
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA
- Department of Medicine, Diabetes Research Center, Cardiovascular Institute and Prevention Research Center, Stanford, CA, USA
| | - Euan A Ashley
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA.
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA.
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA.
| |
|
15
|
Sherbina K, León-Novelo LG, Nuzhdin SV, McIntyre LM, Marroni F. Power calculator for detecting allelic imbalance using hierarchical Bayesian model. BMC Res Notes 2021; 14:436. [PMID: 34838135 PMCID: PMC8626927 DOI: 10.1186/s13104-021-05851-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Accepted: 11/15/2021] [Indexed: 11/10/2022] [Open Access]
Abstract
OBJECTIVE: Allelic imbalance (AI) is the differential expression of the two alleles in a diploid. AI can vary between tissues, treatments, and environments. Methods for testing AI exist, but methods are needed to estimate type I error and power for detecting AI and difference of AI between conditions. As the costs of the technology plummet, what is more important: reads or replicates? RESULTS: We find that a minimum of 2400, 480, and 240 allele specific reads divided equally among 12, 5, and 3 replicates is needed to detect a 10, 20, and 30%, respectively, deviation from allelic balance in a condition with power > 80%. A minimum of 960 and 240 allele specific reads divided equally among 8 replicates is needed to detect a 20 or 30% difference in AI between conditions with comparable power. Higher numbers of replicates increase power more than adding coverage without affecting type I error. We provide a Python package that enables simulation of AI scenarios and enables individuals to estimate type I error and power in detecting AI and differences in AI between conditions.
Affiliation(s)
- Katrina Sherbina
- Quantitative and Computational Biology Section, University of Southern California, Los Angeles, CA, 90046, USA
| | - Luis G León-Novelo
- Department of Biostatistics and Data Science, The University of Texas Health Science Center at Houston-School of Public Health, Houston, TX, 77030, USA
| | - Sergey V Nuzhdin
- Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90046, USA
| | - Lauren M McIntyre
- Genetics Institute and Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32603, USA
| | - Fabio Marroni
- Dipartimento di Scienze Agroalimentari, Ambientali e Animali, Università di Udine, 33100, Udine, Italy.
| |
|
16
|
Ye Z, Jiang X, Pfrender ME, Lynch M. Genome-Wide Allele-Specific Expression in Obligately Asexual Daphnia pulex and the Implications for the Genetic Basis of Asexuality. Genome Biol Evol 2021; 13:6415829. [PMID: 34726699 PMCID: PMC8598174 DOI: 10.1093/gbe/evab243] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 10/24/2021] [Indexed: 01/17/2023] [Open Access]
Abstract
Although obligately asexual lineages are thought to experience selective disadvantages associated with reduced efficiency of fixing beneficial mutations and purging deleterious mutations, such lineages are phylogenetically and geographically widespread. However, despite several genome-wide association studies, little is known about the genetic elements underlying the origin of obligate asexuality and how they spread. Because many obligately asexual lineages have hybrid origins, it has been suggested that asexuality is caused by the unbalanced expression of alleles from the hybridizing species. Here, we investigate this idea by identifying genes with allele-specific expression (ASE) in a Daphnia pulex population, in which obligate parthenogens (OP) and cyclical parthenogens (CP) coexist, with the OP clones having been originally derived from hybridization between CP D. pulex and its sister species, Daphnia pulicaria. OP D. pulex have significantly more ASE genes (ASEGs) than do CP D. pulex. Whole-genomic comparison of OP and CP clones revealed ∼15,000 OP-specific markers and 42 consistent ASEGs enriched in marker-defined regions. Ten of the 42 ASEGs have alleles coding for different protein sequences, suggesting functional differences between the products of the two parental alleles. At least three of these ten genes appear to be directly involved in meiosis-related processes, for example, RanBP2 can cause abnormal chromosome segregation in anaphase I, and the presence of Wee1 in immature oocytes leads to failure to enter meiosis II. These results provide a guide for future molecular resolution of the genetic basis of the transition to ameiotic parthenogenesis.
Affiliation(s)
- Zhiqiang Ye
- Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona
- Michael E Pfrender
- Department of Biological Sciences and Environmental Change Initiative, University of Notre Dame, Notre Dame, Indiana
- Michael Lynch
- Center for Mechanisms of Evolution, Arizona State University, Tempe, Arizona
|
17
|
Du Q, Smith GC, Luu PL, Ferguson JM, Armstrong NJ, Caldon CE, Campbell EM, Nair SS, Zotenko E, Gould CM, Buckley M, Chia KM, Portman N, Lim E, Kaczorowski D, Chan CL, Barton K, Deveson IW, Smith MA, Powell JE, Skvortsova K, Stirzaker C, Achinger-Kawecka J, Clark SJ. DNA methylation is required to maintain both DNA replication timing precision and 3D genome organization integrity. Cell Rep 2021; 36:109722. [PMID: 34551299 DOI: 10.1016/j.celrep.2021.109722] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 10/23/2020] [Revised: 06/22/2021] [Accepted: 08/25/2021] [Indexed: 02/08/2023] Open
Abstract
DNA replication timing and three-dimensional (3D) genome organization are associated with distinct epigenome patterns across large domains. However, whether alterations in the epigenome, in particular cancer-related DNA hypomethylation, affect higher-order levels of genome architecture is still unclear. Here, using Repli-Seq, single-cell Repli-Seq, and Hi-C, we show that genome-wide methylation loss is associated with both concordant loss of replication timing precision and deregulation of 3D genome organization. Notably, we find distinct disruption in 3D genome compartmentalization, striking gains in cell-to-cell replication timing heterogeneity and loss of allelic replication timing in cancer hypomethylation models, potentially through the gene deregulation of DNA replication and genome organization pathways. Finally, we identify ectopic H3K4me3-H3K9me3 domains from across large hypomethylated domains, where late replication is maintained, which we purport serves to protect against catastrophic genome reorganization and aberrant gene transcription. Our results highlight a potential role for the methylome in the maintenance of 3D genome regulation.
Affiliation(s)
- Qian Du
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- Grady C Smith
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Phuc Loi Luu
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- James M Ferguson
- The Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Nicola J Armstrong
- Mathematics and Statistics, Murdoch University, Murdoch, WA 6150, Australia
- C Elizabeth Caldon
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- Shalima S Nair
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- Elena Zotenko
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Cathryn M Gould
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Michael Buckley
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Kee-Ming Chia
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Neil Portman
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Elgene Lim
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- Dominik Kaczorowski
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Chia-Ling Chan
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Kirston Barton
- The Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Ira W Deveson
- St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia; The Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Martin A Smith
- St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia; The Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia
- Joseph E Powell
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; UNSW Cellular Genomics Futures Institute, School of Medical Sciences, UNSW Sydney, NSW 2010, Australia
- Ksenia Skvortsova
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- Clare Stirzaker
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- Joanna Achinger-Kawecka
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
- Susan J Clark
- Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, NSW 2010, Australia
|
18
|
Allele-specific expression of GATA2 due to epigenetic dysregulation in CEBPA double-mutant AML. Blood 2021; 138:160-177. [PMID: 33831168 DOI: 10.1182/blood.2020009244] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/28/2020] [Accepted: 03/24/2021] [Indexed: 12/11/2022] Open
Abstract
Transcriptional deregulation is a central event in the development of acute myeloid leukemia (AML). To identify potential disturbances in gene regulation, we conducted an unbiased screen of allele-specific expression (ASE) in 209 AML cases. The gene encoding GATA binding protein 2 (GATA2) displayed ASE more often than any other myeloid- or cancer-related gene. GATA2 ASE was strongly associated with CEBPA double mutations (DMs), with 95% of cases presenting GATA2 ASE. In CEBPA DM AML with GATA2 mutations, the mutated allele was preferentially expressed. We found that GATA2 ASE was a somatic event lost in complete remission, supporting the notion that it plays a role in CEBPA DM AML. Acquisition of GATA2 ASE involved silencing of 1 allele via promoter methylation and concurrent overactivation of the other allele, thereby preserving expression levels. Notably, promoter methylation was also lost in remission along with GATA2 ASE. In summary, we propose that GATA2 ASE is acquired by epigenetic mechanisms and is a prerequisite for the development of AML with CEBPA DMs. This finding constitutes a novel example of an epigenetic hit cooperating with a genetic hit in the pathogenesis of AML.
|
19
|
Niu G, Bak A, Nusselt M, Zhang Y, Pausch H, Flisikowska T, Schnieke AE, Flisikowski K. Allelic Expression Imbalance Analysis Identified YAP1 Amplification in p53-Dependent Osteosarcoma. Cancers (Basel) 2021; 13:cancers13061364. [PMID: 33803512 PMCID: PMC8002920 DOI: 10.3390/cancers13061364] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/22/2021] [Revised: 03/15/2021] [Accepted: 03/16/2021] [Indexed: 12/12/2022] Open
Abstract
Simple Summary: Osteosarcoma (OS) is a highly heterogenous cancer, making the identification of genetic driving factors difficult. Genetic factors, such as heritable mutations of Rb1 and TP53, are associated with an increased risk of OS. We previously generated pigs carrying a mutated TP53 gene, which develop OS at high frequency. RNA sequencing and allelic expression imbalance analysis identified an amplification of YAP1 involved in p53-dependent OS progression. The inactivation of YAP1 inhibits proliferation, migration, and invasion, and leads to the silencing of TP63 and reconstruction of p16 expression in p53-deficient porcine OS cells. This study confirms the importance of p53/YAP1 network in cancer.
Abstract: Osteosarcoma (OS) is a primary bone malignancy that mainly occurs during adolescent growth, suggesting that bone growth plays an important role in the aetiology of the disease. Genetic factors, such as heritable mutations of Rb1 and TP53, are associated with an increased risk of OS. Identifying driver mutations for OS has been challenging due to the complexity of bone growth-related pathways and the extensive intra-tumoral heterogeneity of this cancer. We previously generated pigs carrying a mutated TP53 gene, which develop OS at high frequency. RNA sequencing and allele expression imbalance (AEI) analysis of OS and matched healthy control samples revealed a highly significant AEI (p = 2.14 × 10^-39) for SNPs in the BIRC3-YAP1 locus on pig chromosome 9. Analysis of copy number variation showed that YAP1 amplification is associated with the AEI and the progression of OS. Accordingly, the inactivation of YAP1 inhibits proliferation, migration, and invasion, and leads to the silencing of TP63 and reconstruction of p16 expression in p53-deficient porcine OS cells. Increased p16 mRNA expression correlated with lower methylation of its promoter. Altogether, our study provides molecular evidence for the role of YAP1 amplification in the progression of p53-dependent OS.
Affiliation(s)
- Guanglin Niu
- Chair of Livestock Biotechnology, Technical University of Munich, 85354 Freising, Germany
- Agnieszka Bak
- Chair of Livestock Biotechnology, Technical University of Munich, 85354 Freising, Germany
- Melanie Nusselt
- Chair of Livestock Biotechnology, Technical University of Munich, 85354 Freising, Germany
- Yue Zhang
- Chair of Livestock Biotechnology, Technical University of Munich, 85354 Freising, Germany
- Hubert Pausch
- Institute of Agricultural Sciences, ETH Zurich, 8092 Zurich, Switzerland
- Tatiana Flisikowska
- Chair of Livestock Biotechnology, Technical University of Munich, 85354 Freising, Germany
- Angelika E. Schnieke
- Chair of Livestock Biotechnology, Technical University of Munich, 85354 Freising, Germany
- Krzysztof Flisikowski
- Chair of Livestock Biotechnology, Technical University of Munich, 85354 Freising, Germany
|
20
|
Robles-Espinoza CD, Mohammadi P, Bonilla X, Gutierrez-Arcelus M. Allele-specific expression: applications in cancer and technical considerations. Curr Opin Genet Dev 2021; 66:10-19. [PMID: 33383480 PMCID: PMC7985293 DOI: 10.1016/j.gde.2020.10.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/16/2020] [Revised: 10/26/2020] [Accepted: 10/31/2020] [Indexed: 11/18/2022]
Abstract
Allele-specific gene expression can influence disease traits. Non-coding germline genetic variants that alter regulatory elements can cause allele-specific gene expression and contribute to cancer susceptibility. In tumors, both somatic copy number alterations and somatic single nucleotide variants have been shown to lead to allele-specific expression of genes, many of which are considered drivers of tumor growth. Here, we review recent studies revealing the pervasive presence of this phenomenon in cancer susceptibility and progression. Furthermore, we underscore the importance of careful experimental design and computational analysis for accurate allelic expression quantification and avoidance of false positives. Finally, we discuss additional methodological challenges encountered in cancer studies and in the burgeoning field of single-cell transcriptomics.
Affiliation(s)
- Carla Daniela Robles-Espinoza
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Boulevard Juriquilla 3001, Santiago de Querétaro 76230, Mexico; Wellcome Sanger Institute, Hinxton, Cambridgeshire CB10 1SA, UK
- Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA; Scripps Translational Science Institute, The Scripps Research Institute, La Jolla, CA, USA
- Ximena Bonilla
- Department of Computer Science, ETH Zurich, Universitätsstr. 6, 8092 Zürich, Switzerland; Swiss Institute of Bioinformatics, Quartier Sorge - Bâtiment Amphipôle, Lausanne 1015, Switzerland; University Hospital Zurich, Rämistrasse 100, 8091 Zürich, Switzerland
- Maria Gutierrez-Arcelus
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Division of Rheumatology, Inflammation and Immunity, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142, USA; Division of Immunology, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA
|
21
|
Hodel KP, Sun MJS, Ungerleider N, Park VS, Williams LG, Bauer DL, Immethun VE, Wang J, Suo Z, Lu H, McLachlan JB, Pursell ZF. POLE Mutation Spectra Are Shaped by the Mutant Allele Identity, Its Abundance, and Mismatch Repair Status. Mol Cell 2020; 78:1166-1177.e6. [PMID: 32497495 PMCID: PMC8177757 DOI: 10.1016/j.molcel.2020.05.012] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 11/13/2019] [Revised: 04/10/2020] [Accepted: 05/11/2020] [Indexed: 12/11/2022]
Abstract
Human tumors with exonuclease domain mutations in the gene encoding DNA polymerase ε (POLE) have incredibly high mutation burdens. These errors arise in four unique mutation signatures occurring in different relative amounts, the etiologies of which remain poorly understood. We used CRISPR-Cas9 to engineer human cell lines expressing POLE tumor variants, with and without mismatch repair (MMR). Whole-exome sequencing of these cells after defined numbers of population doublings permitted analysis of nascent mutation accumulation. Unlike an exonuclease active site mutant that we previously characterized, POLE cancer mutants readily drive signature mutagenesis in the presence of functional MMR. Comparison of cell line and human patient data suggests that the relative abundance of mutation signatures partitions POLE tumors into distinct subgroups dependent on the nature of the POLE allele, its expression level, and MMR status. These results suggest that different POLE mutants have previously unappreciated differences in replication fidelity and mutagenesis.
Affiliation(s)
- Karl P Hodel
- Department of Biochemistry and Molecular Biology, Tulane University School of Medicine, New Orleans, LA 70112, USA
- Meijuan J S Sun
- Department of Biochemistry and Molecular Biology, Tulane University School of Medicine, New Orleans, LA 70112, USA
- Nathan Ungerleider
- Department of Pathology, Tulane University School of Medicine, New Orleans, LA 70112, USA; Tulane Cancer Center, Tulane University School of Medicine, 1430 Tulane Ave., New Orleans, LA 70112, USA
- Vivian S Park
- Department of Biochemistry and Molecular Biology, Tulane University School of Medicine, New Orleans, LA 70112, USA
- Leonard G Williams
- Department of Biochemistry and Molecular Biology, Tulane University School of Medicine, New Orleans, LA 70112, USA; BioInnovation Program, Tulane University, New Orleans, LA 70112, USA
- David L Bauer
- Department of Microbiology and Immunology, Tulane University School of Medicine, New Orleans, LA 70112, USA
- Victoria E Immethun
- Department of Microbiology and Immunology, Tulane University School of Medicine, New Orleans, LA 70112, USA
- Jieqiong Wang
- Department of Biochemistry and Molecular Biology, Tulane University School of Medicine, New Orleans, LA 70112, USA; Tulane Cancer Center, Tulane University School of Medicine, 1430 Tulane Ave., New Orleans, LA 70112, USA
- Zucai Suo
- Department of Biomedical Sciences, Florida State University, Tallahassee, FL 32306, USA
- Hua Lu
- Department of Biochemistry and Molecular Biology, Tulane University School of Medicine, New Orleans, LA 70112, USA; Tulane Cancer Center, Tulane University School of Medicine, 1430 Tulane Ave., New Orleans, LA 70112, USA
- James B McLachlan
- Department of Microbiology and Immunology, Tulane University School of Medicine, New Orleans, LA 70112, USA
- Zachary F Pursell
- Department of Biochemistry and Molecular Biology, Tulane University School of Medicine, New Orleans, LA 70112, USA; Tulane Cancer Center, Tulane University School of Medicine, 1430 Tulane Ave., New Orleans, LA 70112, USA
|
22
|
Clayton EA, Khalid S, Ban D, Wang L, Jordan IK, McDonald JF. Tumor suppressor genes and allele-specific expression: mechanisms and significance. Oncotarget 2020; 11:462-479. [PMID: 32064050 PMCID: PMC6996918 DOI: 10.18632/oncotarget.27468] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/21/2019] [Accepted: 01/13/2020] [Indexed: 12/12/2022] Open
Abstract
Recent findings indicate that allele-specific expression (ASE) at specific cancer driver gene loci may be of importance in onset/progression of the disease. Of particular interest are loss-of-function (LOF) tumor suppressor gene (TSG) alleles. While LOF tumor suppressor mutations are typically considered to be recessive, if these mutant alleles can be significantly differentially expressed relative to wild-type alleles in heterozygotes, the clinical consequences could be significant. LOF TSG alleles are shown to be segregating at high frequencies in world-wide populations of normal/healthy individuals. Matched sets of normal and tumor tissues isolated from 233 cancer patients representing four diverse tumor types demonstrate functionally important changes in patterns of ASE in individuals heterozygous for LOF TSG alleles associated with cancer onset/progression. While a variety of molecular mechanisms were identified as potentially contributing to changes in ASE patterns in cancer, changes in DNA copy number and allele-specific alternative splicing possibly mediated by antisense RNA emerged as predominant factors. In conclusion, LOF TSGs are segregating in human populations at significant frequencies, indicating that many otherwise healthy individuals are at elevated risk of developing cancer. Changes in ASE between normal and cancer tissues indicate that LOF TSG alleles may contribute to cancer onset/progression even when heterozygous with wild-type functional alleles.
Affiliation(s)
- Evan A. Clayton
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Shareef Khalid
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Dongjo Ban
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Lu Wang
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- PanAmerican Bioinformatics Institute, Cali, Colombia
- I. King Jordan
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- PanAmerican Bioinformatics Institute, Cali, Colombia
- Applied Bioinformatics Laboratory, Atlanta, GA, USA
- John F. McDonald
- Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
|
23
|
Kravitz SN, Gregg C. New subtypes of allele-specific epigenetic effects: implications for brain development, function and disease. Curr Opin Neurobiol 2019; 59:69-78. [PMID: 31153086 PMCID: PMC7476552 DOI: 10.1016/j.conb.2019.04.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/27/2019] [Accepted: 04/24/2019] [Indexed: 01/15/2023]
Abstract
Typically, it is assumed that the maternal and paternal alleles for most genes are equally expressed. Known exceptions include canonical imprinted genes, random X-chromosome inactivation, olfactory receptors and clustered protocadherins. Here, we highlight recent studies showing that allele-specific expression is frequent in the genome and involves subtypes of epigenetic allelic effects that differ in terms of heritability, clonality and stability over time. Different forms of epigenetic allele regulation could have different roles in brain development, function, and disease. An emerging area involves understanding allelic effects in a cell-type and developmental stage-specific manner and determining how these effects influence the impact of genetic variants and mutations on the brain. A deeper understanding of epigenetics at the allele and cellular level in the brain could help clarify the mechanisms underlying phenotypic variance.
Affiliation(s)
- Stephanie N Kravitz
- Department of Neurobiology & Anatomy, University of Utah, Salt Lake City, UT 84132-3401, USA; Department of Human Genetics, University of Utah, Salt Lake City, UT 84132-3401, USA
- Christopher Gregg
- Department of Neurobiology & Anatomy, University of Utah, Salt Lake City, UT 84132-3401, USA; Department of Human Genetics, University of Utah, Salt Lake City, UT 84132-3401, USA
|