1
|
Nanni A, Titus-McQuillan J, Bankole KS, Pardo-Palacios F, Signor S, Vlaho S, Moskalenko O, Morse A, Rogers RL, Conesa A, McIntyre LM. Nucleotide-level distance metrics to quantify alternative splicing implemented in TranD. Nucleic Acids Res 2024; 52:e28. [PMID: 38340337 PMCID: PMC10954468 DOI: 10.1093/nar/gkae056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 11/29/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024] Open
Abstract
Advances in affordable transcriptome sequencing combined with better exon and gene prediction have motivated many to compare transcription across the tree of life. We develop a mathematical framework to calculate complexity and compare transcript models. Structural features, i.e. intron retention (IR), donor/acceptor site variation, alternative exon cassettes, alternative 5'/3' UTRs, are compared and the distance between transcript models is calculated with nucleotide-level precision. All metrics are implemented in a PyPI package, TranD, and output can be used to summarize splicing patterns for a transcriptome (1GTF) and between transcriptomes (2GTF). TranD output enables quantitative comparisons between: annotations augmented by empirical RNA-seq data and the original transcript models; transcript model prediction tools for long-read RNA-seq (e.g. FLAIR versus Isoseq3); alternate annotations for a species (e.g. RefSeq vs Ensembl); and between closely related species. In C. elegans, Z. mays, D. melanogaster, D. simulans and H. sapiens, alternative exons were observed more frequently in combination with an alternative donor/acceptor than alone. Transcript models in RefSeq and Ensembl are linked and both have unique transcript models with empirical support. D. melanogaster and D. simulans share many transcript models, and long-read RNA-seq data suggest that both species are under-annotated. We recommend combined references.
Affiliation(s)
- Adalena Nanni
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - James Titus-McQuillan
- University of North Carolina at Charlotte, Department of Bioinformatics and Genomics, Charlotte, NC, USA
| | - Kinfeosioluwa S Bankole
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | | | - Sarah Signor
- Department of Biological Sciences, North Dakota State University, Fargo, ND, USA
| | - Srna Vlaho
- Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
| | - Oleksandr Moskalenko
- University of Florida Research Computing, University of Florida, Gainesville, FL 32611, USA
| | - Alison M Morse
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Rebekah L Rogers
- University of North Carolina at Charlotte, Department of Bioinformatics and Genomics, Charlotte, NC, USA
| | - Ana Conesa
- Institute for Integrative Systems Biology, Spanish National Research Council, Paterna, Spain
| | - Lauren M McIntyre
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- University of Florida Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| |
|
2
|
Karousis ED, Gypas F, Zavolan M, Mühlemann O. Nanopore sequencing reveals endogenous NMD-targeted isoforms in human cells. Genome Biol 2021; 22:223. [PMID: 34389041 PMCID: PMC8361881 DOI: 10.1186/s13059-021-02439-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/30/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Nonsense-mediated mRNA decay (NMD) is a eukaryotic, translation-dependent degradation pathway that targets mRNAs with premature termination codons and also regulates the expression of some mRNAs that encode full-length proteins. Although many genes express NMD-sensitive transcripts, identifying them based on short-read sequencing data remains a challenge. RESULTS To identify and analyze endogenous targets of NMD, we apply cDNA Nanopore sequencing and short-read sequencing to human cells with varying expression levels of NMD factors. Our approach detects full-length NMD substrates that are highly unstable and increase in levels or even only appear when NMD is inhibited. Among the many new NMD-targeted isoforms that our analysis identifies, most derive from alternative exon usage. The isoform-aware analysis reveals many genes with significant changes in splicing but no significant changes in overall expression levels upon NMD knockdown. NMD-sensitive mRNAs have more exons in the 3′ UTR and, for those mRNAs with a termination codon in the last exon, the length of the 3′ UTR per se does not correlate with NMD sensitivity. Analysis of splicing signals reveals isoforms where NMD has been co-opted in the regulation of gene expression, though the main function of NMD seems to be ridding the transcriptome of isoforms resulting from spurious splicing events. CONCLUSIONS Long-read sequencing enables the identification of many novel NMD-sensitive mRNAs and reveals both known and unexpected features concerning their biogenesis and their biological role. Our data provide a highly valuable resource of human NMD transcript targets for future genomic and transcriptomic applications.
Affiliation(s)
- Evangelos D Karousis
- Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
| | - Foivos Gypas
- Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, 4058, Basel, Switzerland
| | - Mihaela Zavolan
- Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Klingelbergstrasse 50-70, 4056, Basel, Switzerland
| | - Oliver Mühlemann
- Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland.
| |
|
3
|
Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. Opportunities and challenges in long-read sequencing data analysis. Genome Biol 2020; 21:30. [PMID: 32033565 PMCID: PMC7006217 DOI: 10.1186/s13059-020-1935-5] [Citation(s) in RCA: 915] [Impact Index Per Article: 183.0] [Received: 08/28/2019] [Accepted: 01/15/2020] [Indexed: 12/11/2022] Open
Abstract
Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characteristics of long-read data are thus required, but the fast pace of development of such tools can be overwhelming. To assist in the design and analysis of long-read sequencing projects, we review the current landscape of available tools and present an online interactive database, long-read-tools.org, to facilitate their browsing. We further focus on the principles of error correction, base modification detection, and long-read transcriptomics analysis and highlight the challenges that remain.
Affiliation(s)
- Shanika L. Amarasinghe
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Shian Su
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Xueyi Dong
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| | - Luke Zappia
- Bioinformatics, Murdoch Children’s Research Institute, Parkville, 3052 Australia
- School of Biosciences, Faculty of Science, The University of Melbourne, Parkville, 3010 Australia
| | - Matthew E. Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville, 3010 Australia
| | - Quentin Gouil
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, 3052 Australia
- Department of Medical Biology, The University of Melbourne, Parkville, 3010 Australia
| |
|
4
|
Re-annotation of 191 developmental and epileptic encephalopathy-associated genes unmasks de novo variants in SCN1A. NPJ Genom Med 2019; 4:31. [PMID: 31814998 PMCID: PMC6889285 DOI: 10.1038/s41525-019-0106-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 05/31/2019] [Accepted: 11/01/2019] [Indexed: 12/21/2022] Open
Abstract
The developmental and epileptic encephalopathies (DEE) are a group of rare, severe neurodevelopmental disorders, where even the most thorough sequencing studies leave 60-65% of patients without a molecular diagnosis. Here, we explore the incompleteness of transcript models used for exome and genome analysis as one potential explanation for a lack of current diagnoses. Therefore, we have updated the GENCODE gene annotation for 191 epilepsy-associated genes, using human brain-derived transcriptomic libraries and other data to build 3,550 putative transcript models. Our annotations increase the transcriptional 'footprint' of these genes by over 674 kb. Using SCN1A as a case study, due to its close phenotype/genotype correlation with Dravet syndrome, we screened 122 people with Dravet syndrome or a similar phenotype with a panel of exon sequences representing eight established genes and identified two de novo SCN1A variants that now - through improved gene annotation - are ascribed to residing among our exons. These two (from 122 screened people, 1.6%) molecular diagnoses carry significant clinical implications. Furthermore, we identified a previously classified SCN1A intronic Dravet syndrome-associated variant that now lies within a deeply conserved exon. Our findings illustrate the potential gains of thorough gene annotation in improving diagnostic yields for genetic disorders.
|
5
|
Li D, Harlan-Williams LM, Kumaraswamy E, Jensen RA. BRCA1-No Matter How You Splice It. Cancer Res 2019; 79:2091-2098. [PMID: 30992324 PMCID: PMC6497576 DOI: 10.1158/0008-5472.can-18-3190] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 10/11/2018] [Revised: 02/09/2019] [Accepted: 03/05/2019] [Indexed: 02/07/2023]
Abstract
BRCA1 (breast cancer 1, early onset), a well-known breast cancer susceptibility gene, is a highly alternatively spliced gene. BRCA1 alternative splicing may serve as an alternative regulatory mechanism for the inactivation of the BRCA1 gene in both hereditary and sporadic breast cancers, and other BRCA1-associated cancers. The alternative transcripts of BRCA1 can mimic known functions, possess unique functions compared with the full-length BRCA1 transcript, and in some cases, appear to function in opposition to full-length BRCA1. In this review, we will summarize the functional "naturally occurring" alternative splicing transcripts of BRCA1 and then discuss the latest next-generation sequencing-based detection methods and techniques to detect alternative BRCA1 splicing patterns and their potential use in cancer diagnosis, prognosis, and therapy.
Affiliation(s)
- Dan Li
- The University of Kansas Cancer Center, Kansas City, Kansas
| | - Lisa M Harlan-Williams
- The University of Kansas Cancer Center, Kansas City, Kansas
- Department of Anatomy and Cell Biology, University of Kansas Medical Center, Kansas City, Kansas
| | - Easwari Kumaraswamy
- The University of Kansas Cancer Center, Kansas City, Kansas
- Department of Pathology and Laboratory Medicine, University of Kansas Medical Center, Kansas City, Kansas
| | - Roy A Jensen
- The University of Kansas Cancer Center, Kansas City, Kansas.
- Department of Anatomy and Cell Biology, University of Kansas Medical Center, Kansas City, Kansas
- Department of Pathology and Laboratory Medicine, University of Kansas Medical Center, Kansas City, Kansas
- Department of Cancer Biology, University of Kansas Medical Center, Kansas City, Kansas
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas
| |
|
6
|
Bhuiyan SA, Ly S, Phan M, Huntington B, Hogan E, Liu CC, Liu J, Pavlidis P. Systematic evaluation of isoform function in literature reports of alternative splicing. BMC Genomics 2018; 19:637. [PMID: 30153812 PMCID: PMC6114036 DOI: 10.1186/s12864-018-5013-2] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 06/06/2018] [Accepted: 08/14/2018] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Although most genes in mammalian genomes have multiple isoforms, an ongoing debate is whether these isoforms are all functional as well as the extent to which they increase the functional repertoire of the genome. To ground this debate in data, it would be helpful to have a corpus of experimentally-verified cases of genes which have functionally distinct splice isoforms (FDSIs). RESULTS We established a curation framework for evaluating experimental evidence of FDSIs, and analyzed over 700 human and mouse genes, strongly biased towards genes that are prominent in the alternative splicing literature. Despite this bias, we found experimental evidence meeting the classical definition for functionally distinct isoforms for ~ 5% of the curated genes. If we relax our criteria for inclusion to include weaker forms of evidence, the fraction of genes with evidence of FDSIs remains low (~ 13%). We provide evidence that this picture will not change substantially with further curation and conclude there is a large gap between the presumed impact of splicing on gene function and the experimental evidence. Furthermore, many functionally distinct isoforms were not traceable to a specific isoform in Ensembl, a database that forms the basis for much computational research. CONCLUSIONS We conclude that the claim that alternative splicing vastly increases the functional repertoire of the genome is an extrapolation from a limited number of empirically supported cases. We also conclude that more work is needed to integrate experimental evidence and genome annotation databases. Our work should help shape research around the role of splicing on gene function from presuming large general effects to acknowledging the need for stronger experimental evidence.
Affiliation(s)
- Shamsuddin A. Bhuiyan
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
- Department of Psychiatry, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
- Graduate Program in Bioinformatics, University of British Columbia, Vancouver, Canada
| | - Sophia Ly
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
| | - Minh Phan
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
| | - Brandon Huntington
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
| | - Ellie Hogan
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
| | - Chao Chun Liu
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
| | - James Liu
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
| | - Paul Pavlidis
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
- Department of Psychiatry, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
| |
|
7
|
Tardaguila M, de la Fuente L, Marti C, Pereira C, Pardo-Palacios FJ, Del Risco H, Ferrell M, Mellado M, Macchietto M, Verheggen K, Edelmann M, Ezkurdia I, Vazquez J, Tress M, Mortazavi A, Martens L, Rodriguez-Navarro S, Moreno-Manzano V, Conesa A. SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification. Genome Res 2018; 28:396-411. [PMID: 29440222 PMCID: PMC5848618 DOI: 10.1101/gr.222976.117] [Citation(s) in RCA: 264] [Impact Index Per Article: 37.7] [Received: 03/17/2017] [Accepted: 01/08/2018] [Indexed: 01/15/2023]
Abstract
High-throughput sequencing of full-length transcripts using long reads has paved the way for the discovery of thousands of novel transcripts, even in well-annotated mammalian species. The advances in sequencing technology have created a need for studies and tools that can characterize these novel variants. Here, we present SQANTI, an automated pipeline for the classification of long-read transcripts that can assess the quality of data and the preprocessing pipeline using 47 unique descriptors. We apply SQANTI to a neuronal mouse transcriptome using Pacific Biosciences (PacBio) long reads and illustrate how the tool is effective in characterizing and describing the composition of the full-length transcriptome. We perform extensive evaluation of ToFU PacBio transcripts by PCR to reveal that an important number of the novel transcripts are technical artifacts of the sequencing approach and that SQANTI quality descriptors can be used to engineer a filtering strategy to remove them. Most novel transcripts in this curated transcriptome are novel combinations of existing splice sites, resulting more frequently in novel ORFs than novel UTRs, and are enriched in both general metabolic and neural-specific functions. We show that these new transcripts have a major impact in the correct quantification of transcript levels by state-of-the-art short-read-based quantification algorithms. By comparing our iso-transcriptome with public proteomics databases, we find that alternative isoforms are elusive to proteogenomics detection. SQANTI allows the user to maximize the analytical outcome of long-read technologies by providing the tools to deliver quality-evaluated and curated full-length transcriptomes.
Affiliation(s)
- Manuel Tardaguila
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | - Lorena de la Fuente
- Genomics of Gene Expression Laboratory, Centro de Investigaciones Principe Felipe (CIPF), 46012 Valencia, Spain
| | - Cristina Marti
- Genomics of Gene Expression Laboratory, Centro de Investigaciones Principe Felipe (CIPF), 46012 Valencia, Spain
| | - Cécile Pereira
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | | | - Hector Del Risco
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | - Marc Ferrell
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | | | - Marissa Macchietto
- Department of Developmental and Cell Biology, University of California, Irvine, California 92617, USA
| | - Kenneth Verheggen
- VIB-UGent Center for Medical Biotechnology, VIB, B-9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
| | - Mariola Edelmann
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | - Iakes Ezkurdia
- Centro Nacional de Investigaciones Cardiovasculares CNIC, 28029 Madrid, Spain
| | - Jesus Vazquez
- Centro Nacional de Investigaciones Cardiovasculares CNIC, 28029 Madrid, Spain
| | - Michael Tress
- Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, California 92617, USA
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, B-9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
| | - Susana Rodriguez-Navarro
- Gene Expression and mRNA Metabolism Laboratory, CSIC, IBV, 46010 Valencia, Spain
- Gene Expression and mRNA Metabolism Laboratory, CIPF, 46012 Valencia, Spain
| | | | - Ana Conesa
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
- Genomics of Gene Expression Laboratory, Centro de Investigaciones Principe Felipe (CIPF), 46012 Valencia, Spain
| |
|
8
|
Charton K, Suel L, Henriques SF, Moussu JP, Bovolenta M, Taillepierre M, Becker C, Lipson K, Richard I. Exploiting the CRISPR/Cas9 system to study alternative splicing in vivo: application to titin. Hum Mol Genet 2018; 25:4518-4532. [PMID: 28173117 DOI: 10.1093/hmg/ddw280] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 05/13/2016] [Revised: 07/29/2016] [Accepted: 08/18/2016] [Indexed: 11/12/2022] Open
Abstract
The giant protein titin is the third most abundant protein in striated muscle. Mutations in its gene are responsible for diseases affecting the cardiac and/or the skeletal muscle. Titin has been reported to be expressed in multiple isoforms with considerable variability in the I-band, ensuring the modulation of the passive mechanical properties of the sarcomere. In the M-line, only the penultimate Mex5 exon coding for the specific is7 domain has been reported to be subjected to alternative splicing. Using the CRISPR-Cas9 editing technology, we generated a mouse model where we stably prevent the expression of alternative spliced variant(s) carrying the corresponding domain. Interestingly, the suppression of the domain induces a phenotype mostly in tissues usually expressing the isoform that has been suppressed, indicating that it fulfills (a) specific function(s) in these tissues allowing a perfect adaptation of the M-line to physiological demands of different muscles.
Affiliation(s)
- Karine Charton
- INSERM, U951, INTEGRARE research unit, Evry, France; Généthon, Evry, France
| | - Laurence Suel
- INSERM, U951, INTEGRARE research unit, Evry, France; Généthon, Evry, France
| | - Sara F Henriques
- INSERM, U951, INTEGRARE research unit, Evry, France; Généthon, Evry, France; University of Evry-Val-D’Essone, Evry, France
| | - Jean-Paul Moussu
- SEAT - SErvice des Animaux Transgéniques CNRS -TAAM -phenomin UPS44 Bâtiment G 7, rue Guy Môquet 94800 Villejuif, France
| | - Matteo Bovolenta
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Miguel Taillepierre
- SEAT - SErvice des Animaux Transgéniques CNRS -TAAM -phenomin UPS44 Bâtiment G 7, rue Guy Môquet 94800 Villejuif, France
| | - Céline Becker
- SEAT - SErvice des Animaux Transgéniques CNRS -TAAM -phenomin UPS44 Bâtiment G 7, rue Guy Môquet 94800 Villejuif, France
| | - Karelia Lipson
- SEAT - SErvice des Animaux Transgéniques CNRS -TAAM -phenomin UPS44 Bâtiment G 7, rue Guy Môquet 94800 Villejuif, France
| | - Isabelle Richard
- INSERM, U951, INTEGRARE research unit, Evry, France; Généthon, Evry, France
| |
|
9
|
López-Urrutia E, Campos-Parra A, Herrera LA, Pérez-Plasencia C. Alternative splicing regulation in tumor necrosis factor-mediated inflammation. Oncol Lett 2017; 14:5114-5120. [PMID: 29113151 PMCID: PMC5656035 DOI: 10.3892/ol.2017.6905] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/27/2016] [Accepted: 07/07/2017] [Indexed: 02/06/2023] Open
Abstract
It is generally accepted that alternative splicing has an effect on disease when it leads to conspicuous changes in relevant proteins, but that the combinatorial effect of several small modifications can have marked outcomes as well. Inflammation is a complex process involving numerous signaling pathways, among which the tumor necrosis factor (TNF) pathway is one of the most studied. Signaling pathways are commonly represented as intricate cascades of molecular interactions that eventually lead to the activation of one or several genes. Alternative splicing is a common means of controlling protein expression in time and space; therefore, it can modulate the outcome of signaling pathways through small changes in their elements. Notably, the overall process is tightly regulated, which is easily overlooked when analyzing the pathway as a whole. The present review summarizes recent studies of the alternative splicing of key players of the TNF pathway leading to inflammation, and hypothesizes on the cumulative results of those modifications and the impact on cancer development.
Affiliation(s)
- Eduardo López-Urrutia
- Genomics Laboratory, UBIMED, Faculty of Higher Studies-Iztacala, National Autonomous University, Tlalnepantla, 54090 State of Mexico, Mexico
| | - Alma Campos-Parra
- Genomics Laboratory, National Cancer Institute of Mexico, Tlalpan, 14680 Mexico City, Mexico
| | - Luis Alonso Herrera
- Epigenetics Laboratory, National Cancer Institute of Mexico, Tlalpan, 14680 Mexico City, Mexico
| | - Carlos Pérez-Plasencia
- Genomics Laboratory, UBIMED, Faculty of Higher Studies-Iztacala, National Autonomous University, Tlalnepantla, 54090 State of Mexico, Mexico; Genomics Laboratory, National Cancer Institute of Mexico, Tlalpan, 14680 Mexico City, Mexico
| |
|
10
|
Steward CA, Parker APJ, Minassian BA, Sisodiya SM, Frankish A, Harrow J. Genome annotation for clinical genomic diagnostics: strengths and weaknesses. Genome Med 2017; 9:49. [PMID: 28558813 PMCID: PMC5448149 DOI: 10.1186/s13073-017-0441-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 02/08/2023] Open
Abstract
The Human Genome Project and advances in DNA sequencing technologies have revolutionized the identification of genetic disorders through the use of clinical exome sequencing. However, in a considerable number of patients, the genetic basis remains unclear. As clinicians begin to consider whole-genome sequencing, an understanding of the processes and tools involved and the factors to consider in the annotation of the structure and function of genomic elements that might influence variant identification is crucial. Here, we discuss and illustrate the strengths and weaknesses of approaches for the annotation and classification of important elements of protein-coding genes, other genomic elements such as pseudogenes and the non-coding genome, comparative-genomic approaches for inferring gene function, and new technologies for aiding genome annotation, as a practical guide for clinicians when considering pathogenic sequence variation. Complete and accurate annotation of structure and function of genome features has the potential to reduce both false-negative (from missing annotation) and false-positive (from incorrect annotation) errors in causal variant identification in exome and genome sequences. Re-analysis of unsolved cases will be necessary as newer technology improves genome annotation, potentially improving the rate of diagnosis.
Affiliation(s)
- Charles A Steward
- Congenica Ltd, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1DR, UK; The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | | | - Berge A Minassian
- Department of Pediatrics (Neurology), University of Texas Southwestern, Dallas, TX, USA; Program in Genetics and Genome Biology and Department of Paediatrics (Neurology), The Hospital for Sick Children and University of Toronto, Toronto, Canada
| | - Sanjay M Sisodiya
- Department of Clinical and Experimental Epilepsy, UCL Institute of Neurology, London, WC1N 3BG, UK; Chalfont Centre for Epilepsy, Chesham Lane, Chalfont St Peter, Buckinghamshire, SL9 0RJ, UK
| | - Adam Frankish
- The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Jennifer Harrow
- The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK; Illumina Inc, Great Chesterford, Essex, CB10 1XL, UK
| |
|
11
|
Hu Z, Scott HS, Qin G, Zheng G, Chu X, Xie L, Adelson DL, Oftedal BE, Venugopal P, Babic M, Hahn CN, Zhang B, Wang X, Li N, Wei C. Revealing Missing Human Protein Isoforms Based on Ab Initio Prediction, RNA-seq and Proteomics. Sci Rep 2015; 5:10940. [PMID: 26156868 PMCID: PMC4496727 DOI: 10.1038/srep10940] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 01/21/2015] [Accepted: 05/05/2015] [Indexed: 01/02/2023] Open
Abstract
Biological and biomedical research relies on comprehensive understanding of protein-coding transcripts. However, the total number of human proteins is still unknown due to the prevalence of alternative splicing. In this paper, we detected 31,566 novel transcripts with coding potential by filtering our ab initio predictions with 50 RNA-seq datasets from diverse tissues/cell lines. PCR followed by MiSeq sequencing showed that at least 84.1% of these predicted novel splice sites could be validated. In contrast to known transcripts, the expression of these novel transcripts was highly tissue-specific. Based on these novel transcripts, at least 36 novel proteins were detected from shotgun proteomics data of 41 breast samples. We also showed L1 retrotransposons have a more significant impact on the origin of new transcripts/genes than previously thought. Furthermore, we found that alternative splicing is extraordinarily widespread for genes involved in specific biological functions like protein binding, nucleoside binding, neuron projection, membrane organization and cell adhesion. In the end, the total number of human transcripts with protein-coding potential was estimated to be at least 204,950.
Affiliation(s)
- Zhiqiang Hu
- [1] School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China [2] Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China
| | - Hamish S Scott
- 1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000 Australia [2] School of Biological Sciences, University of Adelaide, SA 5005, Australia [3] School of Medicine, University of Adelaide, North Terrace, Adelaide, SA 5000, Australia [4] School of Pharmacy and Medical Sciences, Division of Health Sciences, University of South Australia, SA, Australia [5] ACRF Cancer Genomics Facility, Centre for Cancer Biology, SA Pathology, Frome Road, Adelaide, SA 5000, Australia
| | - Guangrong Qin
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China
| | - Guangyong Zheng
- 1] Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China [2] CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
| | - Xixia Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
| | - Lu Xie
- Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China
| | - David L Adelson
- School of Biological Sciences, University of Adelaide, SA 5005, Australia
| | - Bergithe E Oftedal
- 1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000 Australia [2] Department of Biomedical Informatics (DBMI), Vanderbilt University Medical Center (VUMC), 2525 West End Ave, Suite 800, Nashville, TN 37203, USA
| | - Parvathy Venugopal
- 1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000 Australia [2] School of Biological Sciences, University of Adelaide, SA 5005, Australia
| | - Milena Babic
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000 Australia
| | - Christopher N Hahn
- 1] Department of Genetics and Molecular Pathology, Centre for Cancer Biology, Frome Road, Adelaide, SA 5000 Australia [2] School of Biological Sciences, University of Adelaide, SA 5005, Australia [3] School of Medicine, University of Adelaide, North Terrace, Adelaide, SA 5000, Australia
| | - Bing Zhang
- Department of Biomedical Informatics (DBMI), Vanderbilt University Medical Center (VUMC), 2525 West End Ave, Suite 800, Nashville, TN 37203, USA
| | - Xiaojing Wang
- Department of Biomedical Informatics (DBMI), Vanderbilt University Medical Center (VUMC), 2525 West End Ave, Suite 800, Nashville, TN 37203, USA
| | - Nan Li
- Institute of Immunology, Second Military Medical University, 800 Xiangyin Road, Shanghai 200433, China
| | - Chaochun Wei
- 1] School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China [2] Shanghai Center for Bioinformation Technology, 1278 Keyuan Road, Pudong District, Shanghai 201203, China
|
12
|
Rodriguez JM, Carro A, Valencia A, Tress ML. APPRIS WebServer and WebServices. Nucleic Acids Res 2015; 43:W455-9. [PMID: 25990727 PMCID: PMC4489225 DOI: 10.1093/nar/gkv512] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 02/27/2015] [Accepted: 05/05/2015] [Indexed: 01/08/2023] Open
Abstract
This paper introduces the APPRIS WebServer (http://appris.bioinfo.cnio.es) and WebServices (http://apprisws.bioinfo.cnio.es). Both the web servers and the web services are based around the APPRIS Database, a database that presently houses annotations of splice isoforms for five different vertebrate genomes. The APPRIS WebServer and WebServices provide access to the computational methods implemented in the APPRIS Database, while the APPRIS WebServices also allow retrieval of the annotations. The APPRIS WebServer and WebServices annotate splice isoforms with protein structural and functional features, and with data from cross-species alignments. In addition, they can use the annotations of structure, function and conservation to select a single reference isoform for each protein-coding gene (the principal protein isoform). APPRIS principal isoforms have been shown to agree overwhelmingly with the main protein isoform detected in proteomics experiments. The APPRIS WebServer allows for the annotation of splice isoforms for individual genes, and provides a range of visual representations and tools to allow researchers to identify the likely effect of splicing events. The APPRIS WebServices permit users to generate annotations automatically in high throughput mode and to interrogate the annotations in the APPRIS Database. The APPRIS WebServices have been implemented using REST architecture to be flexible, modular and automatic.
Affiliation(s)
- Jose Manuel Rodriguez
- Spanish National Bioinformatics Institute (INB), Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Angel Carro
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Alfonso Valencia
- Spanish National Bioinformatics Institute (INB), Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain; Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Michael L Tress
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
|
13
|
Hudson WH, Pickard MR, de Vera IMS, Kuiper EG, Mourtada-Maarabouni M, Conn GL, Kojetin DJ, Williams GT, Ortlund EA. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate. Nat Commun 2014; 5:5395. [PMID: 25377354 DOI: 10.1038/ncomms6395] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Received: 03/11/2014] [Accepted: 09/26/2014] [Indexed: 01/01/2023] Open
Abstract
The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.
Affiliation(s)
- William H Hudson
- [1] Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322, USA [2] Discovery and Developmental Therapeutics, Winship Cancer Institute, Atlanta, Georgia 30322, USA
- Mark R Pickard
- Institute of Science and Technology in Medicine, School of Life Sciences, Keele University, Keele ST5 5BG, UK
- Ian Mitchelle S de Vera
- Department of Molecular Therapeutics, Scripps Research Institute, Jupiter, Florida 33458 USA
- Emily G Kuiper
- Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322, USA
- Mirna Mourtada-Maarabouni
- Institute of Science and Technology in Medicine, School of Life Sciences, Keele University, Keele ST5 5BG, UK
- Graeme L Conn
- Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322, USA
- Douglas J Kojetin
- Department of Molecular Therapeutics, Scripps Research Institute, Jupiter, Florida 33458 USA
- Gwyn T Williams
- Institute of Science and Technology in Medicine, School of Life Sciences, Keele University, Keele ST5 5BG, UK
- Eric A Ortlund
- [1] Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322, USA [2] Discovery and Developmental Therapeutics, Winship Cancer Institute, Atlanta, Georgia 30322, USA
|
14
|
Sinha A, Nagarajaram HA. Nodes occupying central positions in human tissue specific PPI networks are enriched with many splice variants. Proteomics 2014; 14:2242-8. [PMID: 25092398 DOI: 10.1002/pmic.201400249] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/02/2014] [Revised: 07/04/2014] [Accepted: 08/01/2014] [Indexed: 12/22/2022]
Abstract
The functional repertoire of genes in eukaryotic organisms is enhanced by the phenomenon of alternative splicing. Hence, a node in a tissue-specific protein-protein interaction network (TS PPIN) can be thought of as an ensemble of various spliced protein products of the corresponding gene expressed in that tissue. Here we demonstrate that the nodes that occupy topologically central positions, characterized by high degree, betweenness, closeness, and eigenvector centrality values, in TS PPINs of Homo sapiens are associated with a high number of splice variants. We also show that the high "centrality" of these genes/nodes could in part be explained by the presence of a large number of promiscuous domains.
Affiliation(s)
- Anupam Sinha
- Laboratory of Computational Biology, Centre for DNA Fingerprinting & Diagnostics (CDFD), Hyderabad, Telangana, India
|
15
|
Colombo M, Blok MJ, Whiley P, Santamariña M, Gutiérrez-Enríquez S, Romero A, Garre P, Becker A, Smith LD, De Vecchi G, Brandão RD, Tserpelis D, Brown M, Blanco A, Bonache S, Menéndez M, Houdayer C, Foglia C, Fackenthal JD, Baralle D, Wappenschmidt B, Díaz-Rubio E, Caldés T, Walker L, Díez O, Vega A, Spurdle AB, Radice P, De La Hoya M. Comprehensive annotation of splice junctions supports pervasive alternative splicing at the BRCA1 locus: a report from the ENIGMA consortium. Hum Mol Genet 2014; 23:3666-80. [DOI: 10.1093/hmg/ddu075] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.7] [Indexed: 01/17/2023] Open
Affiliation(s)
- Mara Colombo
- Department of Preventive and Predictive Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milano, Italy
- Marinus J. Blok
- Department of Clinical Genetics, Maastricht University Medical Center, Maastricht, The Netherlands
- Phillip Whiley
- Molecular Cancer Epidemiology Laboratory, Genetics and Computational Division, QIMR Berghofer Medical Research Institute, Brisbane, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Marta Santamariña
- Grupo de Medicina Xenómica-USC, Universidad de Santiago de Compostela, CIBERER, IDIS, Santiago de Compostela, Spain
- Atocha Romero
- Laboratorio de Oncología Molecular, Instituto de Investigación Sanitaria San Carlos (IdISSC), Hospital Clínico San Carlos, Madrid, Spain
- Pilar Garre
- Laboratorio de Oncología Molecular, Instituto de Investigación Sanitaria San Carlos (IdISSC), Hospital Clínico San Carlos, Madrid, Spain
- Alexandra Becker
- Center of Familial Breast and Ovarian Cancer, University Hospital Cologne, Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, Cologne, Germany
- Lindsay Denise Smith
- Human Development and Health Academic Unit, Faculty of Medicine, University of Southampton, Southampton General Hospital, Southampton, UK
- Giovanna De Vecchi
- Department of Preventive and Predictive Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milano, Italy
- Rita D. Brandão
- Department of Clinical Genetics, Maastricht University Medical Center, Maastricht, The Netherlands
- Demis Tserpelis
- Department of Clinical Genetics, Maastricht University Medical Center, Maastricht, The Netherlands
- Melissa Brown
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Ana Blanco
- Fundación Pública Galega de Medicina Xenómica-SERGAS, Grupo de Medicina Xenómica-USC, CIBERER, IDIS, Santiago de Compostela, Spain
- Sandra Bonache
- Oncogenetics Group, Vall d'Hebron Institute of Oncology (VHIO)
- Oncogenetics Group, Vall d'Hebron Research Institute (VHIR), Universitat Autonoma de Barcelona, Barcelona, Spain
- Mireia Menéndez
- Genetic Diagnosis Unit, Hereditary Cancer Program, Institut Català d'Oncologia, Barcelona, Spain
- Claude Houdayer
- Service de Génétique and INSERM U830, Institut Curie and Université Paris Descartes, Sorbonne Paris Cité, Paris, France
- Claudia Foglia
- Department of Preventive and Predictive Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milano, Italy
- James D. Fackenthal
- Department of Medicine, The University of Chicago Medical Center, Chicago, IL, USA
- Diana Baralle
- Human Development and Health Academic Unit, Faculty of Medicine, University of Southampton, Southampton General Hospital, Southampton, UK
- Barbara Wappenschmidt
- Center of Familial Breast and Ovarian Cancer, University Hospital Cologne, Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), University of Cologne, Cologne, Germany
- Eduardo Díaz-Rubio
- Laboratorio de Oncología Molecular, Instituto de Investigación Sanitaria San Carlos (IdISSC), Hospital Clínico San Carlos, Madrid, Spain
- Servicio de Oncología Médica, Hospital Clínico San Carlos, Madrid, Spain
- Trinidad Caldés
- Laboratorio de Oncología Molecular, Instituto de Investigación Sanitaria San Carlos (IdISSC), Hospital Clínico San Carlos, Madrid, Spain
- Logan Walker
- Department of Pathology, University of Otago, Christchurch, New Zealand
- Orland Díez
- Oncogenetics Group, Vall d'Hebron Institute of Oncology (VHIO)
- Oncogenetics Group, Vall d'Hebron Research Institute (VHIR), Universitat Autonoma de Barcelona, Barcelona, Spain
- Oncogenetics Group, University Hospital of Vall d'Hebron, Barcelona, Spain
- Ana Vega
- Fundación Pública Galega de Medicina Xenómica-SERGAS, Grupo de Medicina Xenómica-USC, CIBERER, IDIS, Santiago de Compostela, Spain
- Amanda B. Spurdle
- Molecular Cancer Epidemiology Laboratory, Genetics and Computational Division, QIMR Berghofer Medical Research Institute, Brisbane, Australia
- Paolo Radice
- Department of Preventive and Predictive Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milano, Italy
- Miguel De La Hoya
- Laboratorio de Oncología Molecular, Instituto de Investigación Sanitaria San Carlos (IdISSC), Hospital Clínico San Carlos, Madrid, Spain
|
16
|
Genomics of alternative splicing: evolution, development and pathophysiology. Hum Genet 2014; 133:679-87. [PMID: 24378600 DOI: 10.1007/s00439-013-1411-3] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Received: 09/10/2013] [Accepted: 12/15/2013] [Indexed: 12/11/2022]
Abstract
Alternative splicing is a major cellular mechanism in metazoans for generating proteomic diversity. A large proportion of protein-coding genes in multicellular organisms undergo alternative splicing, and in humans, it has been estimated that nearly 90% of protein-coding genes (much larger than expected) are subject to alternative splicing. Genomic analyses of alternative splicing have illuminated its universal role in shaping the evolution of genomes, in the control of developmental processes, and in the dynamic regulation of the transcriptome to influence phenotype. Disruption of the splicing machinery has been found to drive pathophysiology, and indeed reprogramming of aberrant splicing can provide novel approaches to the development of molecular therapy. This review focuses on the recent progress in our understanding of alternative splicing brought about by the unprecedented explosive growth of genomic data and highlights the relevance of human splicing variation to disease and therapy.
|
17
|
Abstract
Alternative pre-mRNA splicing is an integral part of gene regulation in eukaryotes. Here we provide a basic overview of the various types of alternative splicing, as well as its functional role, highlighting how alternative splicing varies across phylogeny. Regulated alternative splicing can affect protein function and ultimately impact biological outcomes. We examine the possibility that portions of alternatively spliced transcripts are the result of stochastic processes rather than regulation. We discuss the implications of misregulated alternative splicing and explore the role of alternative splicing in human disease.
Affiliation(s)
- Stacey D Wagner
- Department of Chemistry and Institute of Molecular Biology, University of Oregon, Eugene, OR, USA
|
18
|
Abstract
The last decade has seen tremendous effort committed to the annotation of the human genome sequence, most notably perhaps in the form of the ENCODE project. One of the major findings of ENCODE, and other genome analysis projects, is that the human transcriptome is far larger and more complex than previously thought. This complexity manifests, for example, as alternative splicing within protein-coding genes, as well as in the discovery of thousands of long noncoding RNAs. It is also possible that significant numbers of human transcripts have not yet been described by annotation projects, while existing transcript models are frequently incomplete. The question as to what proportion of this complexity is truly functional remains open, however, and this ambiguity presents a serious challenge to genome scientists. In this article, we will discuss the current state of human transcriptome annotation, drawing on our experience gained in generating the GENCODE gene annotation set. We highlight the gaps in our knowledge of transcript functionality that remain, and consider the potential computational and experimental strategies that can be used to help close them. We propose that an understanding of the true overlap between transcriptional complexity and functionality will not be gained in the short term. However, significant steps toward obtaining this knowledge can now be taken by using an integrated strategy, combining all of the experimental resources at our disposal.
Affiliation(s)
- Jonathan M Mudge
- Department of Informatics, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, United Kingdom
|
19
|
Buckberry S, Bianco-Miotto T, Roberts CT. Imprinted and X-linked non-coding RNAs as potential regulators of human placental function. Epigenetics 2013; 9:81-9. [PMID: 24081302 DOI: 10.4161/epi.26197] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/28/2022] Open
Abstract
Pregnancy outcome is inextricably linked to placental development, which is strictly controlled temporally and spatially through mechanisms that are only partially understood. However, increasing evidence suggests non-coding RNAs (ncRNAs) direct and regulate a considerable number of biological processes and therefore may constitute a previously hidden layer of regulatory information in the placenta. Many ncRNAs, including both microRNAs and long non-coding transcripts, show almost exclusive or predominant expression in the placenta compared with other somatic tissues and display altered expression patterns in placentas from complicated pregnancies. In this review, we explore the results of recent genome-scale and single gene expression studies using human placental tissue, but include studies in the mouse where human data are lacking. Our review focuses on the ncRNAs epigenetically regulated through genomic imprinting or X-chromosome inactivation and includes recent evidence surrounding the H19 lincRNA, the imprinted C19MC cluster microRNAs, and X-linked miRNAs associated with pregnancy complications.
Affiliation(s)
- Sam Buckberry
- The Robinson Institute; Research Centre for Reproductive Health; School of Paediatrics and Reproductive Health; The University of Adelaide; Adelaide, SA Australia
- Tina Bianco-Miotto
- The Robinson Institute; Research Centre for Reproductive Health; School of Paediatrics and Reproductive Health; The University of Adelaide; Adelaide, SA Australia; School of Agriculture Food & Wine; The University of Adelaide; Adelaide, SA Australia
- Claire T Roberts
- The Robinson Institute; Research Centre for Reproductive Health; School of Paediatrics and Reproductive Health; The University of Adelaide; Adelaide, SA Australia
|
20
|
Light S, Elofsson A. The impact of splicing on protein domain architecture. Curr Opin Struct Biol 2013; 23:451-8. [PMID: 23562110 DOI: 10.1016/j.sbi.2013.02.013] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 01/14/2013] [Revised: 02/22/2013] [Accepted: 02/28/2013] [Indexed: 10/27/2022]
Abstract
Many proteins are composed of protein domains, functional units of common descent. Multidomain forms are common in all eukaryotes, making up more than half of the proteome, and the evolution of novel domain architecture has been accelerated in metazoans. It is also becoming increasingly clear that alternative splicing is prevalent among vertebrates. Given that protein domains are defined as structurally, functionally and evolutionarily distinct units, one may speculate that some alternative splicing events may lead to clean excisions of protein domains, thus generating a number of different domain architectures from one gene template. However, recent findings indicate that smaller alternative splicing events, in particular in disordered regions, might be more prominent than domain architectural changes. The problem of identifying protein isoforms is, however, still not resolved. Clearly, many splice forms identified through detection of mRNA sequences appear to produce 'nonfunctional' proteins, such as proteins with missing internal secondary structure elements. Here, we review the state-of-the-art methods for identification of functional isoforms and present a summary of what is known, thus far, about alternative splicing with regard to protein domain architectures.
Affiliation(s)
- Sara Light
- Science for Life Laboratory, Stockholm University, Box 1031 SE-171 21 Solna, Sweden
|
21
|
Rodriguez JM, Maietta P, Ezkurdia I, Pietrelli A, Wesselink JJ, Lopez G, Valencia A, Tress ML. APPRIS: annotation of principal and alternative splice isoforms. Nucleic Acids Res 2012; 41:D110-7. [PMID: 23161672 PMCID: PMC3531113 DOI: 10.1093/nar/gks1058] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Indexed: 12/03/2022] Open
Abstract
Here, we present APPRIS (http://appris.bioinfo.cnio.es), a database that houses annotations of human splice isoforms. APPRIS has been designed to provide value to manual annotations of the human genome by adding reliable protein structural and functional data and information from cross-species conservation. The visual representation of the annotations provided by APPRIS for each gene allows annotators and researchers alike to easily identify functional changes brought about by splicing events. In addition to collecting, integrating and analyzing reliable predictions of the effect of splicing events, APPRIS also selects a single reference sequence for each gene, here termed the principal isoform, based on the annotations of structure, function and conservation for each transcript. APPRIS identifies a principal isoform for 85% of the protein-coding genes in the GENCODE 7 release for ENSEMBL. Analysis of the APPRIS data shows that at least 70% of the alternative (non-principal) variants would lose important functional or structural information relative to the principal isoform.
|
22
|
Gaudet P, Arighi C, Bastian F, Bateman A, Blake JA, Cherry MJ, D'Eustachio P, Finn R, Giglio M, Hirschman L, Kania R, Klimke W, Martin MJ, Karsch-Mizrachi I, Munoz-Torres M, Natale D, O'Donovan C, Ouellette F, Pruitt KD, Robinson-Rechavi M, Sansone SA, Schofield P, Sutton G, Van Auken K, Vasudevan S, Wu C, Young J, Mazumder R. Recent advances in biocuration: meeting report from the Fifth International Biocuration Conference. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2012; 2012:bas036. [PMID: 23110974 PMCID: PMC3483532 DOI: 10.1093/database/bas036] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 01/17/2023]
Abstract
The 5th International Biocuration Conference brought together over 300 scientists to exchange ideas on their work, as well as discuss issues relevant to the International Society for Biocuration's (ISB) mission. Recurring themes this year included the creation and promotion of gold standards, the need for more ontologies, and more formal interactions with journals. The conference is an essential part of the ISB's goal to support exchanges among members of the biocuration community. Next year's conference will be held in Cambridge, UK, from 7 to 10 April 2013. In the meantime, the ISB website provides information about the society's activities (http://biocurator.org), as well as related events of interest.
Affiliation(s)
- Pascale Gaudet
- International Society for Biocuration and CALIPHO Group, Swiss Institute of Bioinformatics, 1 Rue Michel Servet, Geneva, Switzerland.
|