1
Bénitière F, Necsulea A, Duret L. Random genetic drift sets an upper limit on mRNA splicing accuracy in metazoans. eLife 2024; 13:RP93629. [PMID: 38470242 DOI: 10.7554/elife.93629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024] Open
Abstract
Most eukaryotic genes undergo alternative splicing (AS), but the overall functional significance of this process remains a controversial issue. It has been noticed that the complexity of organisms (assayed by the number of distinct cell types) correlates positively with their genome-wide AS rate. This has been interpreted as evidence that AS plays an important role in adaptive evolution by increasing the functional repertoires of genomes. However, this observation also fits with a totally opposite interpretation: given that 'complex' organisms tend to have small effective population sizes (Ne), they are expected to be more affected by genetic drift, and hence more prone to accumulate deleterious mutations that decrease splicing accuracy. Thus, according to this 'drift barrier' theory, the elevated AS rate in complex organisms might simply result from a higher splicing error rate. To test this hypothesis, we analyzed 3496 transcriptome sequencing samples to quantify AS in 53 metazoan species spanning a wide range of Ne values. Our results show a negative correlation between Ne proxies and the genome-wide AS rates among species, consistent with the drift barrier hypothesis. This pattern is dominated by low abundance isoforms, which represent the vast majority of the splice variant repertoire. We show that these low abundance isoforms are depleted in functional AS events, and most likely correspond to errors. Conversely, the AS rate of abundant isoforms, which are relatively enriched in functional AS events, tends to be lower in more complex species. All these observations are consistent with the hypothesis that variation in AS rates across metazoans reflects the limits set by drift on the capacity of selection to prevent gene expression errors.
Affiliation(s)
- Florian Bénitière
  - Laboratoire de Biometrie et Biologie Evolutive, CNRS, Universite Lyon 1, Villeurbanne, France
- Anamaria Necsulea
  - Laboratoire de Biometrie et Biologie Evolutive, CNRS, Universite Lyon 1, Villeurbanne, France
- Laurent Duret
  - Laboratoire de Biometrie et Biologie Evolutive, CNRS, Universite Lyon 1, Villeurbanne, France
2
Song Y, Zhang C, Omenn GS, O’Meara MJ, Welch JD. Predicting the Structural Impact of Human Alternative Splicing. bioRxiv 2023:2023.12.21.572928. [PMID: 38187531 PMCID: PMC10769328 DOI: 10.1101/2023.12.21.572928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Protein structure prediction with neural networks is a powerful new method for linking protein sequence, structure, and function, but structures have generally been predicted for only a single isoform of each gene, neglecting splice variants. To investigate the structural implications of alternative splicing, we used AlphaFold2 to predict the structures of more than 11,000 human isoforms. We employed multiple metrics to identify splicing-induced structural alterations, including template matching score, secondary structure composition, surface charge distribution, radius of gyration, accessibility of post-translational modification sites, and structure-based function prediction. We identified examples of how alternative splicing induced clear changes in each of these properties. Structural similarity between isoforms largely correlated with degree of sequence identity, but we identified a subset of isoforms with low structural similarity despite high sequence similarity. Exon skipping and alternative last exons tended to increase the surface charge and radius of gyration. Splicing also buried or exposed numerous post-translational modification sites, most notably among the isoforms of BAX. Functional prediction nominated numerous functional differences among isoforms of the same gene, with loss of function compared to the reference predominating. Finally, we used single-cell RNA-seq data from the Tabula Sapiens to determine the cell types in which each structure is expressed. Our work represents an important resource for studying the structure and function of splice isoforms across the cell types of the human body.
Affiliation(s)
- Yuxuan Song
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Gilbert S. Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Matthew J. O’Meara
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Joshua D. Welch
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA
3
Carrion SA, Michal JJ, Jiang Z. Alternative Transcripts Diversify Genome Function for Phenome Relevance to Health and Diseases. Genes (Basel) 2023; 14:2051. [PMID: 38002994 PMCID: PMC10671453 DOI: 10.3390/genes14112051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023] Open
Abstract
Manipulation using alternative exon splicing (AES), alternative transcription start (ATS), and alternative polyadenylation (APA) sites are key to transcript diversity underlying health and disease. All three are pervasive in organisms, present in at least 50% of human protein-coding genes. In fact, ATS and APA site use has the highest impact on protein identity, with their ability to alter which first and last exons are utilized as well as impacting stability and translation efficiency. These RNA variants have been shown to be highly specific, both in tissue type and stage, with demonstrated importance to cell proliferation, differentiation and the transition from fetal to adult cells. While alternative exon splicing has a limited effect on protein identity, its ubiquity highlights the importance of these minor alterations, which can alter other features such as localization. The three processes are also highly interwoven, with overlapping, complementary, and competing factors, RNA polymerase II and its CTD (C-terminal domain) chief among them. Their role in development means dysregulation leads to a wide variety of disorders and cancers, with some forms of disease disproportionately affected by specific mechanisms (AES, ATS, or APA). Challenges associated with the genome-wide profiling of RNA variants and their potential solutions are also discussed in this review.
Affiliation(s)
- Zhihua Jiang
  - Department of Animal Sciences and Center for Reproductive Biology, Washington State University, Pullman, WA 99164-7620, USA; (S.A.C.); (J.J.M.)
4
Goldtzvik Y, Sen N, Lam SD, Orengo C. Protein diversification through post-translational modifications, alternative splicing, and gene duplication. Curr Opin Struct Biol 2023; 81:102640. [PMID: 37354790 DOI: 10.1016/j.sbi.2023.102640] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/13/2023] [Revised: 05/05/2023] [Accepted: 05/24/2023] [Indexed: 06/26/2023]
Abstract
Proteins provide the basis for cellular function. Having multiple versions of the same protein within a single organism provides a way of regulating its activity or developing novel functions. Post-translational modifications of proteins, by means of adding/removing chemical groups to amino acids, allow for a well-regulated and controlled way of generating functionally distinct protein species. Alternative splicing is another method with which organisms possibly generate new isoforms. Additionally, gene duplication events throughout evolution generate multiple paralogs of the same genes, resulting in multiple versions of the same protein within an organism. In this review, we discuss recent advancements in the study of these three methods of protein diversification and provide illustrative examples of how they affect protein structure and function.
Affiliation(s)
- Yonathan Goldtzvik
  - Department of Structural and Molecular Biology, University College London, London, United Kingdom
- Neeladri Sen
  - Department of Structural and Molecular Biology, University College London, London, United Kingdom. https://twitter.com/@NeeladriSen
- Su Datt Lam
  - Department of Structural and Molecular Biology, University College London, London, United Kingdom; Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
- Christine Orengo
  - Department of Structural and Molecular Biology, University College London, London, United Kingdom
5
Angira D, Chaudhary S, Abiramasundari A, Thiruvenkatam V. To Explore the Binding Affinity of Human γ-Secretase Activating Protein (GSAP) Isoform 4 with APP-C99 Peptides. ACS Omega 2023; 8:13435-13443. [PMID: 37065030 PMCID: PMC10099435 DOI: 10.1021/acsomega.3c01117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 03/14/2023] [Indexed: 06/19/2023]
Abstract
γ-Secretase activating protein (GSAP) is known to play an important role in the β-amyloid pathway. It acts as a modulator and accentuates the truncation of the amyloid precursor protein C-99 fragment through the γ-secretase complex. GSAP has four isoforms, out of which canonical isoform 1, a 16 kDa C-terminal portion, has been extensively studied, whereas the function of the other three isoforms remains unknown. Here, we explore the GSAP isoform 4 (GSAP_I4) expression and purification from inclusion bodies followed by the refolding of the protein. The secondary structure of GSAP_I4 is predicted using circular dichroism. The protein is further characterized by western blotting and mass spectroscopy analysis. Additionally, biochemical assays and in silico molecular docking and molecular simulation are performed to investigate the binding of GSAP_I4 and APP-C99 peptide fragments. The results reflect that although GSAP_I1 and GSAP_I4 share high sequence similarity, isoform 4 does not show any affinity toward APP-C99 peptide fragments. This hints that GSAP_I4 might have a different role in the living system that is as yet unexplored.
Affiliation(s)
- Deekshi Angira
  - Discipline of Chemistry, Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat 382355, India
- Sonali Chaudhary
  - Discipline of Chemistry, Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat 382355, India
- Arumugam Abiramasundari
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat 382355, India
- Vijay Thiruvenkatam
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat 382355, India
6
Fackenthal JD. Alternative mRNA Splicing and Promising Therapies in Cancer. Biomolecules 2023; 13:biom13030561. [PMID: 36979496 PMCID: PMC10046298 DOI: 10.3390/biom13030561] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/15/2023] [Revised: 03/09/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023] Open
Abstract
Cancer is among the leading causes of mortality worldwide. While considerable attention has been given to genetic and epigenetic sources of cancer-specific cellular activities, the role of alternative mRNA splicing has only recently received attention as a major contributor to cancer initiation and progression. The distribution of alternate mRNA splicing variants in cancer cells is different from their non-cancer counterparts, and cancer cells are more sensitive than non-cancer cells to drugs that target components of the splicing regulatory network. While many of the alternatively spliced mRNAs in cancer cells may represent "noise" from splicing dysregulation, certain recurring splicing variants have been shown to contribute to tumor progression. Some pathogenic splicing disruption events result from mutations in cis-acting splicing regulatory sequences in disease-associated genes, while others may result from shifts in balance among naturally occurring alternate splicing variants among mRNAs that participate in cell cycle progression and the regulation of apoptosis. This review provides examples of cancer-related alternate splicing events resulting from each step of mRNA processing and the promising therapies that may be used to address them.
Affiliation(s)
- James D Fackenthal
  - Department of Biological Sciences, College of Science and Health, Benedictine University, Lisle, IL 60532, USA
7
Martinez-Gomez L, Cerdán-Vélez D, Abascal F, Tress ML. Origins and Evolution of Human Tandem Duplicated Exon Substitution Events. Genome Biol Evol 2022; 14:6809199. [PMID: 36346145 PMCID: PMC9741552 DOI: 10.1093/gbe/evac162] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/06/2022] [Revised: 10/25/2022] [Accepted: 10/29/2022] [Indexed: 11/10/2022] Open
Abstract
The mutually exclusive splicing of tandem duplicated exons produces protein isoforms that are identical save for a homologous region that allows for the fine tuning of protein function. Tandem duplicated exon substitution events are rare, yet highly important alternative splicing events. Most events are ancient, their isoforms are highly expressed, and they have significantly more pathogenic mutations than other splice events. Here, we analyzed the physicochemical properties and functional roles of the homologous polypeptide regions produced by the 236 tandem duplicated exon substitutions annotated in the human gene set. We find that the most important structural and functional residues in these homologous regions are maintained, and that most changes are conservative rather than drastic. Three quarters of the isoforms produced from tandem duplicated exon substitution events are tissue-specific, particularly in nervous and cardiac tissues, and tandem duplicated exon substitution events are enriched in functional terms related to structures in the brain and skeletal muscle. We find considerable evidence for the convergent evolution of tandem duplicated exon substitution events in vertebrates, arthropods, and nematodes. Twelve human gene families have orthologues with tandem duplicated exon substitution events in both Drosophila melanogaster and Caenorhabditis elegans. Six of these gene families are ion transporters, suggesting that tandem exon duplication in genes that control the flow of ions into the cell has an adaptive benefit. The ancient origins, the strong indications of tissue-specific functions, and the evidence of convergent evolution suggest that these events may have played important roles in the evolution of animal tissues and organs.
Affiliation(s)
- Laura Martinez-Gomez
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), C. Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
- Daniel Cerdán-Vélez
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), C. Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
- Federico Abascal
  - Somatic Evolution Group, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom
8
Pozo F, Rodriguez JM, Martínez Gómez L, Vázquez J, Tress ML. APPRIS principal isoforms and MANE Select transcripts define reference splice variants. Bioinformatics 2022; 38:ii89-ii94. [PMID: 36124785 PMCID: PMC9486585 DOI: 10.1093/bioinformatics/btac473] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Selecting the splice variant that best represents a coding gene is a crucial first step in many experimental analyses, and vital for mapping clinically relevant variants. This study compares the longest isoforms, MANE Select transcripts, APPRIS principal isoforms, and expression data, and aims to determine which method is best for selecting biologically important reference splice variants for large-scale analyses.
RESULTS: Proteomics analyses and human genetic variation data suggest that most coding genes have a single main protein isoform. We show that APPRIS principal isoforms and MANE Select transcripts best describe these main cellular isoforms, and find that using the longest splice variant as the representative is a poor strategy. Exons unique to the longest splice isoforms are not under selective pressure, and so are unlikely to be functionally relevant. Expression data are also a poor means of selecting the main splice variant. APPRIS principal and MANE Select exons are under purifying selection, while exons specific to alternative transcripts are not. There are MANE and APPRIS representatives for almost 95% of genes, and where they agree they are particularly effective, coinciding with the main proteomics isoform for over 98.2% of genes.
AVAILABILITY AND IMPLEMENTATION: APPRIS principal isoforms for human, mouse and other model species can be downloaded from the APPRIS database (https://appris.bioinfo.cnio.es), GENCODE genes (https://www.gencodegenes.org/) and the Ensembl website (https://www.ensembl.org). MANE Select transcripts for the human reference set are available from the Ensembl, GENCODE and RefSeq databases (https://www.ncbi.nlm.nih.gov/refseq/). Lists of splice variants where MANE and APPRIS coincide are available from the APPRIS database.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fernando Pozo
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
- José Manuel Rodriguez
  - Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- Laura Martínez Gómez
  - Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
- Jesús Vázquez
  - Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain; CIBER de Investigaciones Cardiovasculares (CIBERCV), 28029 Madrid, Spain
9
Wright CJ, Smith CWJ, Jiggins CD. Alternative splicing as a source of phenotypic diversity. Nat Rev Genet 2022; 23:697-710. [PMID: 35821097 DOI: 10.1038/s41576-022-00514-4] [Citation(s) in RCA: 96] [Impact Index Per Article: 48.0] [Accepted: 06/13/2022] [Indexed: 12/27/2022]
Abstract
A major goal of evolutionary genetics is to understand the genetic processes that give rise to phenotypic diversity in multicellular organisms. Alternative splicing generates multiple transcripts from a single gene, enriching the diversity of proteins and phenotypic traits. It is well established that alternative splicing contributes to key innovations over long evolutionary timescales, such as brain development in bilaterians. However, recent developments in long-read sequencing and the generation of high-quality genome assemblies for diverse organisms have facilitated comparisons of splicing profiles between closely related species, providing insights into how alternative splicing evolves over shorter timescales. Although most splicing variants are probably non-functional, alternative splicing is nonetheless emerging as a dynamic, evolutionarily labile process that can facilitate adaptation and contribute to species divergence.
Affiliation(s)
- Charlotte J Wright
  - Tree of Life, Wellcome Sanger Institute, Cambridge, UK; Department of Zoology, University of Cambridge, Cambridge, UK
- Chris D Jiggins
  - Department of Zoology, University of Cambridge, Cambridge, UK
10
Deng W, Mou T, Pawitan Y, Vu TN. Quantification of mutant–allele expression at isoform level in cancer from RNA-seq data. NAR Genom Bioinform 2022; 4:lqac052. [PMID: 35855322 PMCID: PMC9278039 DOI: 10.1093/nargab/lqac052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2021] [Revised: 06/26/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022] Open
Abstract
Even though the role of DNA mutations in cancer is well recognized, current quantification of the RNA expression, performed either at gene or isoform level, typically ignores the mutation status. Standard methods for estimating allele-specific expression (ASE) consider gene-level expression, but the functional impact of a mutation is best assessed at isoform level. Hence our goal is to quantify the mutant–allele expression at isoform level. We have developed and implemented a method, named MAX, for quantifying mutant–allele expression given a list of mutations. For a gene of interest, a mutant reference is constructed by incorporating all possible mutant versions of the wild-type isoforms in the transcriptome annotation. The mutant reference is then used for the RNA-seq reads mapping, which in principle works similarly for any quantification tool. We apply an alternating EM algorithm to the read-count data from the mapping step. In a simulation study, MAX performs well against standard isoform-quantification methods. Also, MAX achieves higher accuracy than conventional gene-based ASE methods such as ASEP. An analysis of a real dataset of acute myeloid leukemia reveals a subgroup of NPM1-mutated patients responding well to a kinase inhibitor. Our findings indicate that quantification of mutant–allele expression at isoform level is feasible and has potential added values for assessing the functional impact of DNA mutations in cancers.
Affiliation(s)
- Wenjiang Deng
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Tian Mou
  - School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Yudi Pawitan
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Trung Nghia Vu
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
11
Zhou D, Tran Y, Abou Elela S, Scott MS. SAPFIR: A webserver for the identification of alternative protein features. BMC Bioinformatics 2022; 23:250. [PMID: 35751026 PMCID: PMC9229502 DOI: 10.1186/s12859-022-04804-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Accepted: 06/20/2022] [Indexed: 11/29/2022] Open
Abstract
Background: Alternative splicing can increase the diversity of gene functions by generating multiple isoforms with different sequences and functions. However, the extent to which splicing events have functional consequences remains unclear and predicting the impact of splicing events on protein activity is limited to gene-specific analysis.
Results: To accelerate the identification of functionally relevant alternative splicing events we created SAPFIR, a predictor of protein features associated with alternative splicing events. This webserver tool uses InterProScan to predict protein features such as functional domains, motifs and sites in the human and mouse genomes and link them to alternative splicing events. Alternative protein features are displayed as functions of the transcripts and splice sites. SAPFIR could be used to analyze proteins generated from a single gene or a group of genes and can directly identify alternative protein features in large sequence data sets. The accuracy and utility of SAPFIR were validated by its ability to rediscover previously validated alternative protein domains. In addition, our de novo analysis of public datasets using SAPFIR indicated that only a small portion of alternative protein domains was conserved between human and mouse, and that in human, genes involved in nervous system process, regulation of DNA-templated transcription and aging are more likely to produce isoforms missing functional domains due to alternative splicing.
Conclusion: Overall SAPFIR represents a new tool for the rapid identification of functional alternative splicing events and enables the identification of cellular functions affected by a defined splicing program. SAPFIR is freely available at https://bioinfo-scottgroup.med.usherbrooke.ca/sapfir/, a website implemented in Python, with all major browsers supported. The source code is available at https://github.com/DelongZHOU/SAPFIR.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04804-w.
Affiliation(s)
- Delong Zhou
  - Département de Microbiologie et d'infectiologie, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, QC, J1E 4K8, Canada
- Yvan Tran
  - Département de Biochimie et Génomique Fonctionnelle, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, QC, J1E 4K8, Canada
- Sherif Abou Elela
  - Département de Microbiologie et d'infectiologie, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, QC, J1E 4K8, Canada
- Michelle S Scott
  - Département de Biochimie et Génomique Fonctionnelle, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, QC, J1E 4K8, Canada
12
Ilieva M, Uchida S. Long Non-Coding RNAs in Induced Pluripotent Stem Cells and Their Differentiation. Am J Physiol Cell Physiol 2022; 322:C769-C774. [PMID: 35235428 DOI: 10.1152/ajpcell.00059.2022] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022]
Abstract
The breakthrough technology for reprogramming somatic cells into induced pluripotent stem cells (iPSC) has created a new path for science and medicine. The iPSC technology provides a powerful tool for elucidating the mechanisms of cellular differentiation and cell fate decision as well as to study targets and pathways relevant to pathological processes. Since they can be generated from any person, iPSC are a promising resource for regenerative medicine potentiating the possibility to discover new drugs in a high-throughput screening format and treat diseases through personalized cell therapy-based strategies. However, the reprogramming process is complex, and its regulation needs fine tuning. The regulatory mechanisms of cell reprogramming and differentiation are still not elucidated, but significant results show that multiple long non-coding RNAs (lncRNAs) play essential roles. In this mini review, we discuss the latest research on lncRNAs in iPSC stemness, neuronal and cardiac differentiation.
Affiliation(s)
- Mirolyuba Ilieva
  - Center for RNA Medicine, Department of Clinical Medicine, Aalborg University, Copenhagen SV, Denmark
- Shizuka Uchida
  - Center for RNA Medicine, Department of Clinical Medicine, Aalborg University, Copenhagen SV, Denmark
13
Gohr A, Mantica F, Hermoso-Pulido A, Tapial J, Márquez Y, Irimia M. Computational Analysis of Alternative Splicing Using VAST-TOOLS and the VastDB Framework. Methods Mol Biol 2022; 2537:97-128. [PMID: 35895261 DOI: 10.1007/978-1-0716-2521-7_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Alternative splicing (AS) can vastly expand animal transcriptomes and proteomes. Two main open questions in the field are how AS is regulated across cell/tissue types and disease, and what roles different AS events play. To facilitate AS research, we have created the computational VastDB framework, which comprises a series of complementary software and resources that we describe in this chapter. The VastDB framework is especially designed to aid biomedical researchers without a strong computational background. It offers tools and resources to: (a) quantify AS and identify differentially spliced AS events using RNA-seq data (vast-tools), (b) perform multiple genomic and sequence analyses for investigating AS events (Matt), (c) identify AS events with genomic and regulatory conservation among species (ExOrthist), and (d) help with the biological interpretation of the results, and, ultimately, with the identification of interesting AS events to design wet-lab experiments (VastDB and PastDB).
Affiliation(s)
- André Gohr
  - Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Federica Mantica
  - Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Antonio Hermoso-Pulido
  - Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Javier Tapial
  - Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Yamile Márquez
  - Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Manuel Irimia
  - Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
  - ICREA, Barcelona, Spain
14
Mazin PV, Khaitovich P, Cardoso-Moreira M, Kaessmann H. Alternative splicing during mammalian organ development. Nat Genet 2021; 53:925-934. [PMID: 33941934 PMCID: PMC8187152 DOI: 10.1038/s41588-021-00851-w] [Citation(s) in RCA: 68] [Impact Index Per Article: 22.7] [Received: 01/07/2021] [Accepted: 03/19/2021] [Indexed: 12/30/2022]
Abstract
Alternative splicing (AS) is pervasive in mammalian genomes, yet cross-species comparisons have been largely restricted to adult tissues and the functionality of most AS events remains unclear. We assessed AS patterns across pre- and postnatal development of seven organs in six mammals and a bird. Our analyses revealed that developmentally dynamic AS events, which are especially prevalent in the brain, are substantially more conserved than nondynamic ones. Cassette exons with increasing inclusion frequencies during development show the strongest signals of conserved and regulated AS. Newly emerged cassette exons are typically incorporated late in testis development, but those retained during evolution are predominantly brain specific. Our work suggests that an intricate interplay of programs controlling gene expression levels and AS is fundamental to organ development, especially for the brain and heart. In these regulatory networks, AS affords substantial functional diversification of genes through the generation of tissue- and time-specific isoforms from broadly expressed genes.
Affiliation(s)
- Pavel V Mazin
- V. Zelman Center for Neurobiology and Brain Restoration, Skolkovo Institute of Science and Technology, Moscow, Russia
- Philipp Khaitovich
- V. Zelman Center for Neurobiology and Brain Restoration, Skolkovo Institute of Science and Technology, Moscow, Russia
- Margarida Cardoso-Moreira
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, Heidelberg, Germany.
- Evolutionary Developmental Biology Laboratory, The Francis Crick Institute, London, UK.
- Henrik Kaessmann
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH Alliance, Heidelberg, Germany.
15
Pozo F, Martinez-Gomez L, Walsh TA, Rodriguez JM, Di Domenico T, Abascal F, Vazquez J, Tress ML. Assessing the functional relevance of splice isoforms. NAR Genom Bioinform 2021; 3:lqab044. [PMID: 34046593 PMCID: PMC8140736 DOI: 10.1093/nargab/lqab044] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/27/2020] [Revised: 04/22/2021] [Accepted: 05/17/2021] [Indexed: 12/20/2022] Open
Abstract
Alternative splicing of messenger RNA can generate an array of mature transcripts, but it is not clear how many go on to produce functionally relevant protein isoforms. There is only limited evidence for alternative proteins in proteomics analyses and data from population genetic variation studies indicate that most alternative exons are evolving neutrally. Determining which transcripts produce biologically important isoforms is key to understanding isoform function and to interpreting the real impact of somatic mutations and germline variations. Here we have developed a method, TRIFID, to classify the functional importance of splice isoforms. TRIFID was trained on isoforms detected in large-scale proteomics analyses and distinguishes these biologically important splice isoforms with high confidence. Isoforms predicted as functionally important by the algorithm had measurable cross species conservation and significantly fewer broken functional domains. Additionally, exons that code for these functionally important protein isoforms are under purifying selection, while exons from low scoring transcripts largely appear to be evolving neutrally. TRIFID has been developed for the human genome, but it could in principle be applied to other well-annotated species. We believe that this method will generate valuable insights into the cellular importance of alternative splicing.
Affiliation(s)
- Fernando Pozo
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Laura Martinez-Gomez
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Thomas A Walsh
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- José Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain
- Tomas Di Domenico
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Federico Abascal
- Somatic Evolution Group, Wellcome Sanger Institute, Hinxton CB10 1SA, UK
- Jesús Vazquez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain
- Michael L Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
16
Farkas C, Quiroz A, Alvarez C, Hermosilla V, Aylwin CF, Lomniczi A, Castro AF, Hepp MI, Pincheira R. Characterization of SALL2 Gene Isoforms and Targets Across Cell Types Reveals Highly Conserved Networks. Front Genet 2021; 12:613808. [PMID: 33692826 PMCID: PMC7937961 DOI: 10.3389/fgene.2021.613808] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/03/2020] [Accepted: 01/28/2021] [Indexed: 12/21/2022] Open
Abstract
The SALL2 transcription factor, an evolutionarily conserved gene through vertebrates, is involved in normal development and neuronal differentiation. In disease, SALL2 is associated with eye, kidney, and brain disorders, but mainly is related to cancer. Some studies support a tumor suppressor role and others an oncogenic role for SALL2, which seems to depend on the cancer type. An additional consideration is tissue-dependent expression of different SALL2 isoforms. Human and mouse SALL2 gene loci contain two promoters, each controlling the expression of a different protein isoform (E1 and E1A). Also, several improvements on the human genome assembly and gene annotation through next-generation sequencing technologies reveal correction and annotation of additional isoforms, obscuring dissection of SALL2 isoform-specific transcriptional targets and functions. We here integrated current data of normal/tumor gene expression databases along with ChIP-seq binding profiles to analyze SALL2 isoforms expression distribution and infer isoform-specific SALL2 targets. We found that the canonical SALL2 E1 isoform is one of the lowest expressed, while the E1A isoform is highly predominant across cell types. To dissect SALL2 isoform-specific targets, we analyzed publicly available ChIP-seq data from Glioblastoma tumor-propagating cells and in-house ChIP-seq datasets performed in SALL2 wild-type and E1A isoform knockout HEK293 cells. Another available ChIP-seq data in HEK293 cells (ENCODE Consortium Phase III) overexpressing a non-canonical SALL2 isoform (short_E1A) was also analyzed. Regardless of cell type, our analysis indicates that the SALL2 long E1 and E1A isoforms, but not short_E1A, are mostly contributing to transcriptional control, and reveals a highly conserved network of brain-specific transcription factors (i.e., SALL3, POU3F2, and NPAS3). 
Our data integration identified a conserved molecular network in which SALL2 regulates genes associated with neural function, cell differentiation, development, and cell adhesion, among others. Also, we identified PODXL as a gene that is likely regulated by SALL2 across tissues. Our study encourages the validation of publicly available ChIP-seq datasets to assess a specific gene/isoform’s transcriptional targets. The knowledge of SALL2 isoforms expression and function in different tissue contexts is relevant to understanding its role in disease.
Affiliation(s)
- Carlos Farkas
- Laboratorio de Transducción de Señales y Cáncer, Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
- Aracelly Quiroz
- Laboratorio de Transducción de Señales y Cáncer, Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
- Claudia Alvarez
- Laboratorio de Transducción de Señales y Cáncer, Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
- Viviana Hermosilla
- Laboratorio de Transducción de Señales y Cáncer, Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
- Carlos F Aylwin
- Division of Neuroscience, Oregon National Primate Research Center, Oregon Health and Science University, Portland, OR, United States
- Alejandro Lomniczi
- Division of Neuroscience, Oregon National Primate Research Center, Oregon Health and Science University, Portland, OR, United States
- Ariel F Castro
- Laboratorio de Transducción de Señales y Cáncer, Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
- Matias I Hepp
- Laboratorio de Transducción de Señales y Cáncer, Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile; Laboratorio de Investigación en Ciencias Biomédicas, Departamento de Ciencias Básicas y Morfología, Facultad de Medicina, Universidad Católica de la Santísima Concepción, Concepción, Chile
- Roxana Pincheira
- Laboratorio de Transducción de Señales y Cáncer, Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
17
Characterization of the rat Acetylcholinesterase readthrough (AChE-R) splice variant: Implications for toxicological studies. Biochem Biophys Res Commun 2020; 532:528-534. [PMID: 32896378 DOI: 10.1016/j.bbrc.2020.08.065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/20/2020] [Accepted: 08/21/2020] [Indexed: 11/23/2022]
Abstract
Exposure to chemicals and other environmental stressors can differentially impact the expression of Acetylcholinesterase (AChE) splice variants. Surprisingly, despite the widespread use of the rat model in toxicological studies and the wealth of literature on this important biomarker of neurotoxicity, AChE coding exons and splice variants are not yet fully annotated in this species. To address this knowledge gap, a short problematic region of the rat AChE genomic DNA present in GenBank was first re-sequenced. This revised genomic sequence was then aligned to rat AChE RefSeq mRNA and compared to orthologous mammalian sequences, in order to map the coding exon and intron boundaries of the rat AChE gene. Based on these bioinformatics analyses, a sequence was predicted for the yet-unannotated rat Acetylcholinesterase readthrough (AChE-R) splice variant. PCR primers designed to specifically amplify rat AChE-R were used to confirm its expression in rat PC12 cells. Compared to the canonical AChE-S splice variant, AChE-R was expressed at much lower levels but presented distinct regulation patterns in PC12 cells and rat primary cerebral granule cells (CGCs) following exposure to Chlorpyrifos (a well-known neurotoxic organophosphate pesticide). Taken together, these observations point to the evolutionary conservation of the AChE-R splicing event between rodents and human and to the distinct regulation of AChE splice variants in response to toxicological challenges.
18
Rodriguez JM, Pozo F, di Domenico T, Vazquez J, Tress ML. An analysis of tissue-specific alternative splicing at the protein level. PLoS Comput Biol 2020; 16:e1008287. [PMID: 33017396 PMCID: PMC7561204 DOI: 10.1371/journal.pcbi.1008287] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 02/10/2020] [Revised: 10/15/2020] [Accepted: 08/25/2020] [Indexed: 01/09/2023] Open
Abstract
The role of alternative splicing is one of the great unanswered questions in cellular biology. There is strong evidence for alternative splicing at the transcript level, and transcriptomics experiments show that many splice events are tissue specific. It has been suggested that alternative splicing evolved in order to remodel tissue-specific protein-protein networks. Here we investigated the evidence for tissue-specific splicing among splice isoforms detected in a large-scale proteomics analysis. Although the data supporting alternative splicing is limited at the protein level, clear patterns emerged among the small numbers of alternative splice events that we could detect in the proteomics data. More than a third of these splice events were tissue-specific and most were ancient: over 95% of splice events that were tissue-specific in both proteomics and RNAseq analyses evolved prior to the ancestors of lobe-finned fish, at least 400 million years ago. By way of contrast, three in four alternative exons in the human gene set arose in the primate lineage, so our results cannot be extrapolated to the whole genome. Tissue-specific alternative protein forms in the proteomics analysis were particularly abundant in nervous and muscle tissues and their genes had roles related to the cytoskeleton and either the structure of muscle fibres or cell-cell connections. Our results suggest that this conserved tissue-specific alternative splicing may have played a role in the development of the vertebrate brain and heart. We manually curated a set of 255 splice events detected in a large-scale tissue-based proteomics experiment and found that more than a third had evidence of significant tissue-specific differences. Events that were significantly tissue-specific at the protein level were highly conserved; almost 75% evolved over 400 million years ago. The tissues in which we found most evidence for tissue-specific splicing were nervous tissues and cardiac tissues. 
Genes with tissue-specific events in these two tissues had functions related to important cellular structures in brain and heart tissues. These splice events may have been essential for the development of vertebrate heart and muscle. However, our data set may not be representative of alternative exons as a whole. We found that most tissue specific splicing was strongly conserved, but just 5% of annotated alternative exons in the human gene set are ancient. More than three quarters of alternative exons are primate-derived. Although the analysis does not provide a definitive answer to the question of the functional role of alternative splicing, our results do indicate that alternative splice variants may have played a significant part in the evolution of brain and heart tissues in vertebrates.
Affiliation(s)
- Jose Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Calle Melchor Fernandez, Madrid, Spain
- Fernando Pozo
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez, Madrid, Spain
- Tomas di Domenico
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez, Madrid, Spain
- Jesus Vazquez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Calle Melchor Fernandez, Madrid, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBERCV), Madrid, Spain
- Michael L. Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez, Madrid, Spain
19
Abstract
Systematics is described for annotation of variations in RNA molecules. The conceptual framework is part of Variation Ontology (VariO) and facilitates depiction of types of variations, their functional and structural effects and other consequences in any RNA molecule in any organism. There are more than 150 RNA related VariO terms in seven levels, which can be further combined to generate even more complicated and detailed annotations. The terms are described together with examples, usually for variations and effects in human and in diseases. RNA variation type has two subcategories: variation classification and origin with subterms. Altogether six terms are available for function description. Several terms are available for affected RNA properties. The ontology also contains terms for structural description for affected RNA type, post-transcriptional RNA modifications, secondary and tertiary structure effects and RNA sugar variations. Together with the DNA and protein concepts and annotations, RNA terms allow comprehensive description of variations of genetic and non-genetic origin at all possible levels. The VariO annotations are readable both for humans and computer programs for advanced data integration and mining.
Affiliation(s)
- Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
20
Korona D, Nightingale D, Fabre B, Nelson M, Fischer B, Johnson G, Lees J, Hubbard S, Lilley K, Russell S. Characterisation of protein isoforms encoded by the Drosophila Glycogen Synthase Kinase 3 gene shaggy. PLoS One 2020; 15:e0236679. [PMID: 32760087 PMCID: PMC7410302 DOI: 10.1371/journal.pone.0236679] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/01/2020] [Accepted: 07/09/2020] [Indexed: 12/15/2022] Open
Abstract
The Drosophila shaggy gene (sgg, GSK-3) encodes multiple protein isoforms with serine/threonine kinase activity and is a key player in diverse developmental signalling pathways. Currently it is unclear whether different Sgg proteoforms are similarly involved in signalling or if different proteoforms have distinct functions. We used CRISPR/Cas9 genome engineering to tag eight different Sgg proteoform classes and determined their localization during embryonic development. We performed proteomic analysis of the two major proteoform classes and generated mutant lines for both of these for transcriptomic and phenotypic analysis. We uncovered distinct tissue-specific localization patterns for all of the tagged proteoforms we examined, most of which have not previously been characterised directly at the protein level, including one proteoform initiating with a non-standard codon. Collectively, this suggests complex developmentally regulated splicing of the sgg primary transcript. Further, affinity purification followed by mass spectrometric analyses indicates a different repertoire of interacting proteins for the two major proteoforms we examined, one with ubiquitous expression (Sgg-PB) and one with nervous system specific expression (Sgg-PA). Specific mutation of these proteoforms shows that Sgg-PB performs the well characterised maternal and zygotic segmentation functions of the sgg locus, while Sgg-PA mutants show adult lifespan and locomotor defects consistent with its nervous system localisation. Our findings provide new insights into the role of GSK-3 proteoforms and intriguing links with the GSK-3α and GSK-3β proteins encoded by independent vertebrate genes. Our analysis suggests that different proteoforms generated by alternative splicing are likely to perform distinct functions.
Affiliation(s)
- Dagmara Korona
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Daniel Nightingale
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Cambridge, United Kingdom
- Bertrand Fabre
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Cambridge, United Kingdom
- Michael Nelson
- Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre Manchester, University of Manchester, Manchester, United Kingdom
- Bettina Fischer
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Glynnis Johnson
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Jonathan Lees
- Department of Biological and Medical Sciences, Faculty of Health and Life Sciences, Oxford Brookes University, Oxford, United Kingdom
- Simon Hubbard
- Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre Manchester, University of Manchester, Manchester, United Kingdom
- Kathryn Lilley
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Cambridge, United Kingdom
- Steven Russell
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
21
Functional and structural features of proteins associated with alternative splicing. Int J Biol Macromol 2020; 147:513-520. [PMID: 31931065 DOI: 10.1016/j.ijbiomac.2019.09.241] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/27/2019] [Revised: 09/16/2019] [Accepted: 09/21/2019] [Indexed: 12/16/2022]
Abstract
Alternative splicing is a mechanism that increases the number of expressed proteins and the variety of their functions. We identified the protein domains most frequently absent from or present in the splice variants. Proteins represented by several isoforms participate in processes such as transcription regulation, immune response, etc. Our results showed an association of alternative splicing with branched regulatory pathways. By considering the published data on the proteins encoded by human chromosome 18, we noted that alternative products differ in several functional features, such as phosphorylation, subcellular location, ligand specificity, protein-protein interactions, etc. Alternative variants involving the protein kinase domain were investigated by comparing the alternative sequences with 3D structures. It was shown that fairly large insertions/deletions could be compatible with the kinase fold if they fall between the conserved secondary structures. Using the 3D data on human proteins, we showed that conformational flexibility could accommodate fold alterations in splice variants. Investigations of structural and functional differences in splice isoforms are required to understand how to distinguish the isoforms expressed as functioning proteins from the non-realized transcripts. These studies help fill the gap between genomic and proteomic data.
22
Ait-Hamlat A, Zea DJ, Labeeuw A, Polit L, Richard H, Laine E. Transcripts' Evolutionary History and Structural Dynamics Give Mechanistic Insights into the Functional Diversity of the JNK Family. J Mol Biol 2020; 432:2121-2140. [PMID: 32067951 DOI: 10.1016/j.jmb.2020.01.032] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/26/2019] [Revised: 01/03/2020] [Accepted: 01/28/2020] [Indexed: 12/14/2022]
Abstract
Alternative splicing and alternative initiation/termination transcription sites have the potential to greatly expand the proteome in eukaryotes by producing several transcript isoforms from the same gene. Although these mechanisms are well described at the genomic level, little is known about their contribution to protein evolution and their impact at the protein structure level. Here, we address both issues by reconstructing the evolutionary history of transcripts and by modeling the tertiary structures of the corresponding protein isoforms. We reconstruct phylogenetic forests relating 60 protein-coding transcripts from the c-Jun N-terminal kinase (JNK) family observed in seven species. We identify two alternative splicing events of ancient origin and show that they induce subtle changes in the protein's structural dynamics. We highlight a previously uncharacterized transcript whose predicted structure seems stable in solution. We further demonstrate that orphan transcripts, for which no phylogeny could be reconstructed, display peculiar sequence and structural properties. Our approach is implemented in PhyloSofS (Phylogenies of Splicing Isoforms Structures), a fully automated computational tool freely available at https://github.com/PhyloSofS-Team/PhyloSofS.
Affiliation(s)
- Adel Ait-Hamlat
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Diego Javier Zea
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Antoine Labeeuw
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Lélia Polit
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Hugues Richard
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France.
- Elodie Laine
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France.
23
Hatje K, Mühlhausen S, Simm D, Kollmar M. The Protein-Coding Human Genome: Annotating High-Hanging Fruits. Bioessays 2019; 41:e1900066. [PMID: 31544971 DOI: 10.1002/bies.201900066] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/15/2019] [Revised: 08/07/2019] [Indexed: 12/19/2022]
Abstract
The major transcript variants of human protein-coding genes are annotated to a certain degree of accuracy combining manual curation, transcript data, and proteomics evidence. However, there is considerable disagreement on the annotation of about 2000 genes (they can be protein-coding, noncoding, or pseudogenes) and on the annotation of most of the predicted alternative transcripts. Pure transcriptome mapping approaches seem to be limited in discriminating functional expression from noise. These limitations have partially been overcome by dedicated algorithms to detect alternatively spliced micro-exons and wobble splice variants. Recently, knowledge about splice mechanism and protein structure has been incorporated into an algorithm to predict neighboring homologous exons, often spliced in a mutually exclusive manner. Predicted exons are evaluated by transcript data, structural compatibility, and evolutionary conservation, revealing hundreds of novel coding exons and splice mechanism re-assignments. The emerging human pan-genome is necessitating distinctive annotations incorporating differences between individuals and between populations.
Affiliation(s)
- Klas Hatje
- Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstr. 124, 4070, Basel, Switzerland
- Stefanie Mühlhausen
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany; Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Goldschmidtstr. 7, 37077, Göttingen, Germany
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
24
El-Athman R, Knezevic D, Fuhr L, Relógio A. A Computational Analysis of Alternative Splicing across Mammalian Tissues Reveals Circadian and Ultradian Rhythms in Splicing Events. Int J Mol Sci 2019; 20:E3977. [PMID: 31443305 PMCID: PMC6721216 DOI: 10.3390/ijms20163977] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 07/09/2019] [Revised: 08/03/2019] [Accepted: 08/10/2019] [Indexed: 02/07/2023] Open
Abstract
Mounting evidence points to a role of the circadian clock in the temporal regulation of post-transcriptional processes in mammals, including alternative splicing (AS). In this study, we carried out a computational analysis of circadian and ultradian rhythms on the transcriptome level to characterise the landscape of rhythmic AS events in published datasets covering 76 tissues from mouse and olive baboon. Splicing-related genes with 24-h rhythmic expression patterns showed a bimodal distribution of peak phases across tissues and species, indicating that they might be controlled by the circadian clock. On the output level, we identified putative oscillating AS events in murine microarray data and pairs of differentially rhythmic splice isoforms of the same gene in baboon RNA-seq data that peaked at opposing times of the day and included oncogenes and tumour suppressors. We further explored these findings using a new circadian RNA-seq dataset of human colorectal cancer cell lines. Rhythmic isoform expression patterns differed between the primary tumour and the metastatic cell line and were associated with cancer-related biological processes, indicating a functional role of rhythmic AS that might be implicated in tumour progression. Our data shows that rhythmic AS events are widespread across mammalian tissues and might contribute to a temporal diversification of the proteome.
Affiliation(s)
- Rukeia El-Athman
- Institute for Theoretical Biology (ITB), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany
- Medical Department of Hematology, Oncology and Tumor Immunology, and Molekulares Krebsforschungszentrum (MKFZ), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, 10117 Berlin, Germany
- Dora Knezevic
- Institute for Theoretical Biology (ITB), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany
- Medical Department of Hematology, Oncology and Tumor Immunology, and Molekulares Krebsforschungszentrum (MKFZ), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, 10117 Berlin, Germany
- Luise Fuhr
- Institute for Theoretical Biology (ITB), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany
- Medical Department of Hematology, Oncology and Tumor Immunology, and Molekulares Krebsforschungszentrum (MKFZ), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, 10117 Berlin, Germany
- Angela Relógio
- Institute for Theoretical Biology (ITB), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, 10117 Berlin, Germany.
- Medical Department of Hematology, Oncology and Tumor Immunology, and Molekulares Krebsforschungszentrum (MKFZ), Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, 10117 Berlin, Germany.