1
Maigné É, Noirot C, Henry J, Adu Kesewaah Y, Badin L, Déjean S, Guilmineau C, Krebs A, Mathevet F, Segalini A, Thomassin L, Colongo D, Gaspin C, Liaubet L, Vialaneix N. Asterics: a simple tool for the ExploRation and Integration of omiCS data. BMC Bioinformatics 2023; 24:391. [PMID: 37853347 PMCID: PMC10583411 DOI: 10.1186/s12859-023-05504-9] [Received: 06/30/2023] [Accepted: 09/28/2023] [Indexed: 10/20/2023] [Open access]
Abstract
BACKGROUND The rapid development of omics acquisition techniques has induced the production of a large volume of heterogeneous and multi-level omics datasets, which require specific and sometimes complex analyses to obtain relevant biological information. Here, we present ASTERICS (version 2.5), a publicly available web interface for the analyses of omics datasets. RESULTS ASTERICS is designed to make both standard and complex exploratory and integration analysis workflows easily available to biologists and to provide high quality interactive plots. Special care has been taken to provide a comprehensive documentation of the implemented analyses and to guide users toward sound analysis choices regarding some specific omics data. Data and analyses are organized in a comprehensive graphical workflow within ASTERICS workspace to facilitate the understanding of successive data editions and analyses leading to a given result. CONCLUSION ASTERICS provides an easy to use platform for omics data exploration and integration. The modular organization of its open source code makes it easy to incorporate new workflows and analyses by external contributors. ASTERICS is available at https://asterics.miat.inrae.fr and can also be deployed using provided docker images.
Affiliation(s)
- Élise Maigné
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
- Céline Noirot
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Université Fédérale de Toulouse, INRAE, Bioinfomics, Genotoul Bioinformatics Facility, 31326, Castanet-Tolosan, France
- Julien Henry
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Plateforme Biostatistique, Genotoul, Toulouse, France
- Yaa Adu Kesewaah
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Plateforme Biostatistique, Genotoul, Toulouse, France
- Sébastien Déjean
  - Plateforme Biostatistique, Genotoul, Toulouse, France
  - IMT, UMR 5219, Université de Toulouse, CNRS, UPS, 31062, Toulouse, France
- Camille Guilmineau
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Plateforme Biostatistique, Genotoul, Toulouse, France
- Arielle Krebs
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Université Fédérale de Toulouse, INRAE, Bioinfomics, Genotoul Bioinformatics Facility, 31326, Castanet-Tolosan, France
- Fanny Mathevet
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Plateforme Biostatistique, Genotoul, Toulouse, France
- Christine Gaspin
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Université Fédérale de Toulouse, INRAE, Bioinfomics, Genotoul Bioinformatics Facility, 31326, Castanet-Tolosan, France
- Laurence Liaubet
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet-Tolosan, France
- Nathalie Vialaneix
  - Université de Toulouse, INRAE, UR MIAT, 31326, Castanet-Tolosan, France
  - Plateforme Biostatistique, Genotoul, Toulouse, France
2
Eché C, Iampietro C, Birbes C, Dréau A, Kuchly C, Di Franco A, Klopp C, Faraut T, Djebali S, Castinel A, Zytnicki M, Denis E, Boussaha M, Grohs C, Boichard D, Gaspin C, Milan D, Donnadieu C. A Bos taurus sequencing methods benchmark for assembly, haplotyping, and variant calling. Sci Data 2023; 10:369. [PMID: 37291142 PMCID: PMC10250393 DOI: 10.1038/s41597-023-02249-1] [Received: 11/11/2022] [Accepted: 05/16/2023] [Indexed: 06/10/2023] [Open access]
Abstract
Inspired by the production of reference data sets in the Genome in a Bottle project, we sequenced one Charolais heifer with different technologies: Illumina paired-end, Oxford Nanopore, Pacific Biosciences (HiFi and CLR), 10X Genomics linked-reads, and Hi-C. In order to generate haplotypic assemblies, we also sequenced both parents with short reads. From these data, we built two haplotyped trio high-quality reference genomes and a consensus assembly, using up-to-date software packages. The assemblies obtained using PacBio HiFi reach a size of 3.2 Gb, which is significantly larger than the 2.7 Gb ARS-UCD1.2 reference. The BUSCO score of the consensus assembly reaches a completeness of 95.8% among highly conserved mammal genes. We also identified 35,866 structural variants larger than 50 base pairs. This assembly is a contribution to the bovine pangenome for the "Charolais" breed. These datasets will prove to be useful resources enabling the community to gain additional insight on sequencing technologies for applications such as SNP, indel or structural variant calling, and de novo assembly.
Affiliation(s)
- Camille Eché
  - INRAE, US 1426, GeT-PlaGe, Genotoul, France Genomique, Université Fédérale de Toulouse, Castanet-Tolosan, France
- Carole Iampietro
  - INRAE, US 1426, GeT-PlaGe, Genotoul, France Genomique, Université Fédérale de Toulouse, Castanet-Tolosan, France
- Clément Birbes
  - Université Fédérale de Toulouse, INRAE, BioinfOmics, GenoToul Bioinformatics facility, 31326, Castanet-Tolosan, France
- Andreea Dréau
  - Université Fédérale de Toulouse, INRAE, BioinfOmics, GenoToul Bioinformatics facility, 31326, Castanet-Tolosan, France
- Claire Kuchly
  - INRAE, US 1426, GeT-PlaGe, Genotoul, France Genomique, Université Fédérale de Toulouse, Castanet-Tolosan, France
- Arnaud Di Franco
  - Université Fédérale de Toulouse, INRAE, BioinfOmics, GenoToul Bioinformatics facility, 31326, Castanet-Tolosan, France
- Christophe Klopp
  - Université Fédérale de Toulouse, INRAE, BioinfOmics, GenoToul Bioinformatics facility, 31326, Castanet-Tolosan, France
- Thomas Faraut
  - GenPhySE, Université de Toulouse, INRAE, INPT, ENVT, Castanet-Tolosan, 31326, France
- Sarah Djebali
  - GenPhySE, Université de Toulouse, INRAE, INPT, ENVT, Castanet-Tolosan, 31326, France
  - IRSD, Université de Toulouse, INSERM, INRAE, ENVT, UPS, 31024, Toulouse, France
- Adrien Castinel
  - INRAE, US 1426, GeT-PlaGe, Genotoul, France Genomique, Université Fédérale de Toulouse, Castanet-Tolosan, France
- Matthias Zytnicki
  - Université Fédérale de Toulouse, INRAE, MIAT, 31326, Castanet-Tolosan, France
- Erwan Denis
  - INRAE, US 1426, GeT-PlaGe, Genotoul, France Genomique, Université Fédérale de Toulouse, Castanet-Tolosan, France
- Mekki Boussaha
  - Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
- Cécile Grohs
  - Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
- Didier Boichard
  - Université Paris-Saclay, INRAE, AgroParisTech, GABI, 78350, Jouy-en-Josas, France
- Christine Gaspin
  - Université Fédérale de Toulouse, INRAE, BioinfOmics, GenoToul Bioinformatics facility, 31326, Castanet-Tolosan, France
  - Université Fédérale de Toulouse, INRAE, MIAT, 31326, Castanet-Tolosan, France
- Denis Milan
  - INRAE, US 1426, GeT-PlaGe, Genotoul, France Genomique, Université Fédérale de Toulouse, Castanet-Tolosan, France
  - GenPhySE, Université de Toulouse, INRAE, INPT, ENVT, Castanet-Tolosan, 31326, France
- Cécile Donnadieu
  - INRAE, US 1426, GeT-PlaGe, Genotoul, France Genomique, Université Fédérale de Toulouse, Castanet-Tolosan, France
3
Homberg N, Galvão Ferrarini M, Gaspin C, Sagot MF. MicroRNA Target Identification: Revisiting Accessibility and Seed Anchoring. Genes (Basel) 2023; 14:genes14030664. [PMID: 36980936 PMCID: PMC10048102 DOI: 10.3390/genes14030664] [Received: 11/28/2022] [Revised: 02/23/2023] [Accepted: 03/04/2023] [Indexed: 03/09/2023] [Open access]
Abstract
By pairing to messenger RNAs (mRNAs for short), microRNAs (miRNAs) regulate gene expression in animals and plants. Accurately identifying which mRNAs interact with a given miRNA and the precise location of the interaction sites is crucial to reaching a more complete view of the regulatory network of an organism. Only a few experimental approaches, however, allow the identification of both within a single experiment. Computational predictions of miRNA–mRNA interactions thus remain generally the first step used, despite their drawback of a high rate of false-positive predictions. The major computational approaches available rely on a diversity of features, among which anchoring the miRNA seed and measuring mRNA accessibility are the key ones, with the first being universally used, while the use of the second remains controversial. Revisiting the importance of each is the aim of this paper, which uses Cross-Linking, Ligation, And Sequencing of Hybrids (CLASH) datasets to achieve this goal. Contrary to what might be expected, the results are more ambiguous regarding the use of the seed match as a feature, while accessibility appears to be a feature worth considering, indicating that, at least under some conditions, it may favour anchoring by miRNAs.
Affiliation(s)
- Nicolas Homberg
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, CNRS, UMR5558, 69622 Villeurbanne, France
  - INRIA Lyon Centre, 69100 Villeurbanne, France
  - UR0875 MIAT, INRAE, Université de Toulouse, 31326 Castanet-Tolosan, France
- Mariana Galvão Ferrarini
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, CNRS, UMR5558, 69622 Villeurbanne, France
  - INRIA Lyon Centre, 69100 Villeurbanne, France
- Christine Gaspin
  - UR0875 MIAT, INRAE, Université de Toulouse, 31326 Castanet-Tolosan, France
  - Correspondence: (C.G.); (M.-F.S.)
- Marie-France Sagot
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, CNRS, UMR5558, 69622 Villeurbanne, France
  - INRIA Lyon Centre, 69100 Villeurbanne, France
  - Correspondence: (C.G.); (M.-F.S.)
4
Zytnicki M, Gaspin C. srnaMapper: an optimal mapping tool for sRNA-Seq reads. BMC Bioinformatics 2022; 23:495. [DOI: 10.1186/s12859-022-05048-4] [Received: 03/07/2022] [Accepted: 11/08/2022] [Indexed: 11/19/2022] [Open access]
Abstract
Background
Sequencing is the key method to study the impact of short RNAs, which include micro RNAs, tRNA-derived RNAs, and piwi-interacting RNAs, among others. The first step to make use of these reads is to map them to a genome. Existing mapping tools have been developed with long RNAs in mind, and, so far, no tool has been conceived for short RNAs. However, short RNAs have several distinctive features which make them different from messenger RNAs: they are shorter, they are often redundant, they can be produced by duplicated loci, and they may be edited at their ends.
Results
In this work, we present a new tool, srnaMapper, that exhaustively maps these reads with all these features in mind, and is most efficient when applied to reads no longer than 50 base pairs. We show, on several datasets, that srnaMapper is very efficient considering computation time and edition error handling: it retrieves all the hits, with an arbitrary number of errors, in a time comparable with non-exhaustive tools.
5
Fuchs S, Babin L, Andraos E, Bessiere C, Willier S, Schulte JH, Gaspin C, Meggetto F. Generation of full-length circular RNA libraries for Oxford Nanopore long-read sequencing. PLoS One 2022; 17:e0273253. [PMID: 36070299 PMCID: PMC9451095 DOI: 10.1371/journal.pone.0273253] [Received: 03/08/2022] [Accepted: 07/29/2022] [Indexed: 11/19/2022] [Open access]
Abstract
Circular RNA (circRNA) is a noncoding RNA class with important implications for gene expression regulation, mostly by interaction with other RNA species or RNA-binding proteins. While the commonly applied short-read Illumina RNA-sequencing techniques can be used to detect circRNAs, their full sequence is not revealed. However, the complete sequence information is needed to analyze potential interactions and thus the mechanism of action of circRNAs. Here, we present an improved protocol to enrich and sequence full-length circRNAs by using the Oxford Nanopore long-read sequencing platform. The protocol involves an enrichment of lowly abundant circRNAs by exonuclease treatment and negative selection of linear RNAs. Then, a cDNA library is created and amplified by PCR. This protocol provides enough material for several sequencing runs. The library is used as input for ligation-based sequencing together with native barcoding. Stringent quality control of the libraries is ensured by a combination of Qubit, Fragment Analyzer and qRT-PCR. Multiplexing of up to 4 libraries yields in total more than 1–2 Million reads per library, of which 1–2% are circRNA-specific reads with >99% of them full-length. The protocol works well with human cancer cell lines. We further provide suggestions for the bioinformatic analysis of the created data, as well as the limitations of our approach together with recommendations for troubleshooting and interpretation. Taken together, this protocol enables reliable full-length analysis of circRNAs, a noncoding RNA type involved in a growing number of physiologic and pathologic conditions. Metadata Associated content. https://dx.doi.org/10.17504/protocols.io.rm7vzy8r4lx1/v2.
Affiliation(s)
- Steffen Fuchs
  - Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - German Cancer Consortium (DKTK), Partner Site Berlin, Berlin, Germany
  - German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Berlin Institute of Health at Charité –Universitätsmedizin Berlin, BIH Biomedical Innovation Academy, BIH Charité Clinician Scientist Program, Berlin, Germany
  - CRCT, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Université de Toulouse, Toulouse, France
  - Laboratoire d’Excellence Toulouse Cancer‐TOUCAN, Toulouse, France
  - * E-mail: (SF); (FM)
- Loélia Babin
  - CRCT, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Université de Toulouse, Toulouse, France
  - Laboratoire d’Excellence Toulouse Cancer‐TOUCAN, Toulouse, France
- Elissa Andraos
  - CRCT, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Université de Toulouse, Toulouse, France
  - Laboratoire d’Excellence Toulouse Cancer‐TOUCAN, Toulouse, France
- Chloé Bessiere
  - CRCT, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Université de Toulouse, Toulouse, France
  - Laboratoire d’Excellence Toulouse Cancer‐TOUCAN, Toulouse, France
- Semjon Willier
  - Department of Pediatric Hematology, Oncology and Stem Cell Transplantation, Dr. von Hauner Children’s Hospital, University Hospital, LMU Munich, Munich, Germany
- Johannes H. Schulte
  - Department of Pediatric Oncology and Hematology, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - German Cancer Consortium (DKTK), Partner Site Berlin, Berlin, Germany
  - German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Berlin Institute of Health at Charité –Universitätsmedizin Berlin, BIH Biomedical Innovation Academy, BIH Charité Clinician Scientist Program, Berlin, Germany
- Christine Gaspin
  - INRAE, BioinfOmics, GenoToul Bioinformatics Facility, Université Fédérale de Toulouse, Castanet-Tolosan, France
  - INRAE, MIAT, Université Fédérale de Toulouse, Castanet-Tolosan, France
- Fabienne Meggetto
  - CRCT, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Université de Toulouse, Toulouse, France
  - Laboratoire d’Excellence Toulouse Cancer‐TOUCAN, Toulouse, France
  - * E-mail: (SF); (FM)
6
Dupré G, Hoede C, Figueroa T, Bessière P, Bertagnoli S, Ducatez M, Gaspin C, Volmer R. Phylodynamic Study of the Conserved RNA Structure Encompassing the Hemagglutinin Cleavage Site Encoding Region of H5 and H7 Low Pathogenic Avian Influenza Viruses. Virus Evol 2021; 7:veab093. [PMID: 35299790 PMCID: PMC8923263 DOI: 10.1093/ve/veab093] [Received: 07/20/2021] [Revised: 10/07/2021] [Accepted: 10/29/2021] [Indexed: 11/14/2022] [Open access]
Abstract
Highly Pathogenic Avian Influenza Viruses (HPAIV) evolve from Low Pathogenic Avian Influenza Viruses (LPAIV) of the H5 and H7 subtypes. This evolution is characterized by the acquisition of a multi-basic cleavage site (MBCS) motif in the hemagglutinin (HA) that leads to an extended viral tropism and severe disease in poultry. One key unanswered question is whether the risk of transition to HPAIV is similar for all LPAIV H5 or H7 strains, or whether specific determinants in the HA sequence of some H5 or H7 LPAIV strains correlate with a higher risk of transition to HPAIV. Here we determined if specific features of the conserved RNA stem loop located at the hemagglutinin cleavage site-encoding region could be detected along the LPAIV to HPAIV evolutionary pathway. Analysis of the thermodynamic stability of the predicted RNA structures showed no specific patterns common to HA sequences leading to HPAIV and distinct from those remaining LPAIV. However, RNA structure clustering analysis revealed that most of the American lineage ancestors leading to H7 emergences via recombination shared the same vRNA structure topology at the HA1/HA2 boundary region. Our study thus identified predicted secondary RNA structures present in the HA of H7 viruses, which could promote genetic recombination and acquisition of a MBCS.
Affiliation(s)
- Gabriel Dupré
  - Ecole nationale vétérinaire de Toulouse, Université de Toulouse, ENVT, INRAE, IHAP, UMR 1225, Toulouse, France
- Claire Hoede
  - INRAE, UR875 Mathématiques et Informatique Appliquées Toulouse, Plateforme GenoToul BioInfo, F-31326 Castanet-Tolosan, France
- Thomas Figueroa
  - Ecole nationale vétérinaire de Toulouse, Université de Toulouse, ENVT, INRAE, IHAP, UMR 1225, Toulouse, France
- Pierre Bessière
  - Ecole nationale vétérinaire de Toulouse, Université de Toulouse, ENVT, INRAE, IHAP, UMR 1225, Toulouse, France
- Stéphane Bertagnoli
  - Ecole nationale vétérinaire de Toulouse, Université de Toulouse, ENVT, INRAE, IHAP, UMR 1225, Toulouse, France
- Mariette Ducatez
  - Ecole nationale vétérinaire de Toulouse, Université de Toulouse, ENVT, INRAE, IHAP, UMR 1225, Toulouse, France
- Christine Gaspin
  - INRAE, UR875 Mathématiques et Informatique Appliquées Toulouse, Plateforme GenoToul BioInfo, F-31326 Castanet-Tolosan, France
- Romain Volmer
  - Ecole nationale vétérinaire de Toulouse, Université de Toulouse, ENVT, INRAE, IHAP, UMR 1225, Toulouse, France
7
Azevedo-Favory J, Gaspin C, Ayadi L, Montacié C, Marchand V, Jobet E, Rompais M, Carapito C, Motorin Y, Sáez-Vásquez J. Mapping rRNA 2'-O-methylations and identification of C/D snoRNAs in Arabidopsis thaliana plants. RNA Biol 2021; 18:1760-1777. [PMID: 33596769 PMCID: PMC8583080 DOI: 10.1080/15476286.2020.1869892] [Indexed: 02/05/2023] [Open access]
Abstract
In all eukaryotic cells, the most abundant modification of ribosomal RNA (rRNA) is methylation at the ribose moiety (2ʹ-O-methylation). Ribose methylation at specific rRNA sites is guided by small nucleolar RNAs (snoRNAs) of C/D-box type (C/D snoRNA) and achieved by the methyltransferase Fibrillarin (FIB). Here we used the Illumina-based RiboMethSeq approach for mapping rRNA 2ʹ-O-methylation sites in A. thaliana Col-0 (WT) plants. This analysis detected novel C/D snoRNA-guided rRNA 2ʹ-O-methylation positions and also some orphan sites without a matching C/D snoRNA. Furthermore, immunoprecipitation of Arabidopsis FIB2 identified and demonstrated expression of C/D snoRNAs corresponding to majority of mapped rRNA sites. On the other hand, we show that disruption of Arabidopsis Nucleolin 1 gene (NUC1), encoding a major nucleolar protein, decreases 2ʹ-O-methylation at specific rRNA sites suggesting functional/structural interconnections of 2ʹ-O-methylation with nucleolus organization and plant development. Finally, based on our findings and existent database sets, we introduce a new nomenclature system for C/D snoRNA in Arabidopsis plants.
Affiliation(s)
- J Azevedo-Favory
  - CNRS, Laboratoire Génome et Développement des Plantes (LGDP), UMR 5096, 66860 Perpignan, France
  - Univ. Perpignan Via Domitia, LGDP, UMR5096, 66860 Perpignan, France
- C Gaspin
  - Université Fédérale de Toulouse, INRAE, MIAT, 31326, Castanet-Tolosan, France
  - Université Fédérale de Toulouse, INRAE, BioinfOmics, Genotoul Bioinformatics facility, 31326
- L Ayadi
  - Université de Lorraine, CNRS, INSERM, IBSLor, (UMS2008/US40), Epitranscriptomics and RNA Sequencing (EpiRNA-Seq) Core Facility, F-54000 Nancy, France
  - Université de Lorraine, CNRS, IMoPA (UMR7365), F-54000 Nancy, France
- C Montacié
  - CNRS, Laboratoire Génome et Développement des Plantes (LGDP), UMR 5096, 66860 Perpignan, France
  - Univ. Perpignan Via Domitia, LGDP, UMR5096, 66860 Perpignan, France
- V Marchand
  - Université de Lorraine, CNRS, INSERM, IBSLor, (UMS2008/US40), Epitranscriptomics and RNA Sequencing (EpiRNA-Seq) Core Facility, F-54000 Nancy, France
- E Jobet
  - CNRS, Laboratoire Génome et Développement des Plantes (LGDP), UMR 5096, 66860 Perpignan, France
  - Univ. Perpignan Via Domitia, LGDP, UMR5096, 66860 Perpignan, France
- M Rompais
  - Laboratoire de Spectrométrie de Masse BioOrganique, Institut Pluridisciplinaire Hubert Curien, UMR7178 CNRS/Université de Strasbourg, Strasbourg, France
- C Carapito
  - Laboratoire de Spectrométrie de Masse BioOrganique, Institut Pluridisciplinaire Hubert Curien, UMR7178 CNRS/Université de Strasbourg, Strasbourg, France
- Y Motorin
  - Université de Lorraine, CNRS, INSERM, IBSLor, (UMS2008/US40), Epitranscriptomics and RNA Sequencing (EpiRNA-Seq) Core Facility, F-54000 Nancy, France
  - Université de Lorraine, CNRS, IMoPA (UMR7365), F-54000 Nancy, France
- J Sáez-Vásquez
  - CNRS, Laboratoire Génome et Développement des Plantes (LGDP), UMR 5096, 66860 Perpignan, France
  - Univ. Perpignan Via Domitia, LGDP, UMR5096, 66860 Perpignan, France
8
Abstract
High-throughput sequencing makes it possible to provide the genome-wide distribution of small non-coding RNAs in a single experiment, and has contributed greatly to the identification and understanding of these RNAs in the last decade. Small non-coding RNAs gather a wide collection of classes, such as microRNAs, tRNA-derived fragments, small nucleolar RNAs and small nuclear RNAs, to name a few. As usual in RNA-seq studies, the sequencing step is followed by a feature quantification step: when a genome is available, the reads are aligned to the genome, their genomic positions are compared to the already available annotations, and the corresponding features are quantified. However, a problem arises when many reads map at several positions, and while different strategies exist to circumvent this problem, all of them are biased. In this article, we present a new strategy that compares all the reads that map at several positions, and their annotations when available. In many cases, all the hits co-localize with the same feature annotation (a duplicated miRNA or a duplicated gene, for instance). When different annotations exist for a given read, we propose to merge existing features and provide the counts for the merged features. This new strategy has been implemented in a tool, mmannot, freely available at https://github.com/mzytnicki/mmannot.
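The merging idea in this abstract can be illustrated with a minimal Python sketch. It is not the mmannot implementation: the input structure (`read_hits`), the function name, the `--` separator for merged features, and the choice to ignore unannotated hits are all assumptions made for illustration.

```python
from collections import Counter

def count_merged_features(read_hits):
    """Count reads per feature, merging features shared by a multi-mapping read.

    read_hits (hypothetical structure): dict mapping a read id to the list of
    feature names overlapped by each of its hits; None marks a hit that
    overlaps no annotation.
    """
    counts = Counter()
    for features in read_hits.values():
        # Distinct annotated features reached by this read, in a stable order.
        annotated = sorted({f for f in features if f is not None})
        if not annotated:
            continue  # Assumption: reads with no annotated hit are not counted.
        # One feature: count it directly. Several: count the merged feature.
        counts["--".join(annotated)] += 1
    return counts

# Toy input with invented feature names.
hits = {
    "read1": ["miR-1a", "miR-1a"],  # two hits, same annotation
    "read2": ["miR-1a", "geneX"],   # two hits, different annotations
    "read3": ["geneX"],             # uniquely mapped
    "read4": [None],                # no annotation
}
counts = count_merged_features(hits)
```

In this toy input, `read2` is attributed to a merged `geneX--miR-1a` feature instead of being discarded or split arbitrarily between the two annotations.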
Affiliation(s)
- Matthias Zytnicki
  - Unité de Mathématiques et Informatique Appliquées, Toulouse INRA, Castanet Tolosan, France
  - * E-mail:
- Christine Gaspin
  - Unité de Mathématiques et Informatique Appliquées, Toulouse INRA, Castanet Tolosan, France
9
Ipoutcha T, Tsarmpopoulos I, Talenton V, Gaspin C, Moisan A, Walker CA, Brownlie J, Blanchard A, Thebault P, Sirand-Pugnet P. Multiple Origins and Specific Evolution of CRISPR/Cas9 Systems in Minimal Bacteria (Mollicutes). Front Microbiol 2019; 10:2701. [PMID: 31824468 PMCID: PMC6882279 DOI: 10.3389/fmicb.2019.02701] [Received: 09/26/2019] [Accepted: 11/07/2019] [Indexed: 12/13/2022] [Open access]
Abstract
CRISPR/Cas systems provide adaptive defense mechanisms against invading nucleic acids in prokaryotes. Because of its interest as a genetic tool, the Type II CRISPR/Cas9 system from Streptococcus pyogenes has been extensively studied. It includes the Cas9 endonuclease that is dependent on a dual-guide RNA made of a tracrRNA and a crRNA. Target recognition relies on crRNA annealing and the presence of a protospacer adjacent motif (PAM). Mollicutes are currently the bacteria with the smallest genome in which CRISPR/Cas systems have been reported. Many of them are pathogenic to humans and animals (mycoplasmas and ureaplasmas) or plants (phytoplasmas and some spiroplasmas). A global survey was conducted to identify and compare CRISPR/Cas systems found in the genome of these minimal bacteria. Complete or degraded systems classified as Type II-A and less frequently as Type II-C were found in the genome of 21 out of 52 representative mollicutes species. Phylogenetic reconstructions predicted a common origin of all CRISPR/Cas systems of mycoplasmas and at least two origins were suggested for spiroplasmas systems. Cas9 in mollicutes were structurally related to the S. aureus Cas9 except the PI domain involved in the interaction with the PAM, suggesting various PAM might be recognized by Cas9 of different mollicutes. Structure of the predicted crRNA/tracrRNA hybrids was conserved and showed typical stem-loop structures pairing the Direct Repeat part of crRNAs with the 5' region of tracrRNAs. Most mollicutes crRNA/tracrRNAs showed G + C% significantly higher than the genome, suggesting a selective pressure for maintaining stability of these secondary structures. Examples of CRISPR spacers matching with mollicutes phages were found, including the textbook case of Mycoplasma cynos strain C142 having no prophage sequence but a CRISPR/Cas system with spacers targeting prophage sequences that were found in the genome of another M. cynos strain that is devoid of a CRISPR system. 
Despite their small genome size, mollicutes have maintained protective means against invading DNAs, including restriction/modification and CRISPR/Cas systems. The apparent lack of CRISPR/Cas systems in several groups of species including main pathogens of humans, ruminants, and plants suggests different evolutionary routes or a lower risk of phage infection in specific ecological niches.
Affiliation(s)
- Thomas Ipoutcha
  - INRA, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
  - Université de Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
- Iason Tsarmpopoulos
  - INRA, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
  - Université de Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
- Vincent Talenton
  - INRA, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
  - Université de Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
- Christine Gaspin
  - INRA, Mathématiques et Informatique Appliquées de Toulouse, Université de Toulouse, Toulouse, France
- Annick Moisan
  - INRA, Mathématiques et Informatique Appliquées de Toulouse, Université de Toulouse, Toulouse, France
- Caray A Walker
  - School of Life Sciences, Anglia Ruskin University, Cambridge, United Kingdom
- Joe Brownlie
  - Department of Pathobiology and Population Sciences, Royal Veterinary College, University of London, London, United Kingdom
- Alain Blanchard
  - INRA, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
  - Université de Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
- Pascal Sirand-Pugnet
  - INRA, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
  - Université de Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, Villenave d'Ornon, France
10
Laguerre S, González I, Nouaille S, Moisan A, Villa-Vialaneix N, Gaspin C, Bouvier M, Carpousis AJ, Cocaign-Bousquet M, Girbal L. Large-Scale Measurement of mRNA Degradation in Escherichia coli: To Delay or Not to Delay. Methods Enzymol 2018; 612:47-66. [PMID: 30502954 DOI: 10.1016/bs.mie.2018.07.003] [Indexed: 02/07/2023]
Abstract
In this study, we compared different computational methods used for genome-wide determination of mRNA half-lives in Escherichia coli, with a special focus on the impact of considering a delay before the onset of mRNA decay after transcription arrest. A wide variety of datasets were analyzed, coming from different technical methods for mRNA quantification (microarrays, RNA-seq, and RT-qPCR) and different bacterial growth conditions. The exponential decay of mRNA levels was fitted using both linear and exponential models, with or without a delay. We showed that for all the models, independently of mRNA quantification methods and growth conditions, ignoring the delay resulted in only a modest overestimation of the half-life. For approximately 80% of the mRNAs, differences in mRNA half-life values were less than 34 s. The correlation between half-lives estimated with and without a delay was extremely high. However, the slope of the linear regression between the half-lives with and without a delay tended to decrease with the delay. For the few mRNAs for which taking into account the delay influenced the estimated half-life, the impact was dependent on the model and the growth condition. The smallest impact was obtained for the linear model.
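As an illustration of the linear model mentioned in this abstract, the following Python sketch fits ln(level) against time by ordinary least squares and converts the decay rate k into a half-life ln(2)/k. It is a simplified stand-in, not the authors' code: the function name, the handling of the delay (simply discarding time points before it), and the synthetic data are assumptions.

```python
import math

def half_life_linear(times, levels, delay=0.0):
    """Estimate an mRNA half-life from a log-linear least-squares fit.

    Fits ln(level) = a - k * t on the points with t >= delay and returns
    ln(2) / k. Dropping earlier points is an assumed, simplified way of
    accounting for the delay before decay starts.
    """
    pts = [(t, math.log(c)) for t, c in zip(times, levels) if t >= delay]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    k = -sxy / sxx  # decay rate (per second)
    return math.log(2) / k

# Synthetic data: level stays constant for 30 s, then decays with a 120 s half-life.
times = [0, 30, 60, 120, 180, 240]
levels = [100 * 2 ** (-max(t - 30, 0) / 120) for t in times]

with_delay = half_life_linear(times, levels, delay=30)
without_delay = half_life_linear(times, levels)
```

On this synthetic series, the fit that ignores the delay returns a longer half-life than the fit restricted to the decay phase, which is the direction of bias the abstract reports.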
Affiliation(s)
- Annick Moisan
- MIAT, Université de Toulouse, INRA, Castanet-Tolosan, France
- Marie Bouvier
- LMGM, CBI, Université de Toulouse, CNRS, Toulouse, France
- Laurence Girbal
- LISBP, Université de Toulouse, CNRS, INRA, INSA, Toulouse, France
11
Plomion C, Aury JM, Amselem J, Leroy T, Murat F, Duplessis S, Faye S, Francillonne N, Labadie K, Le Provost G, Lesur I, Bartholomé J, Faivre-Rampant P, Kohler A, Leplé JC, Chantret N, Chen J, Diévart A, Alaeitabar T, Barbe V, Belser C, Bergès H, Bodénès C, Bogeat-Triboulot MB, Bouffaud ML, Brachi B, Chancerel E, Cohen D, Couloux A, Da Silva C, Dossat C, Ehrenmann F, Gaspin C, Grima-Pettenati J, Guichoux E, Hecker A, Herrmann S, Hugueney P, Hummel I, Klopp C, Lalanne C, Lascoux M, Lasserre E, Lemainque A, Desprez-Loustau ML, Luyten I, Madoui MA, Mangenot S, Marchal C, Maumus F, Mercier J, Michotey C, Panaud O, Picault N, Rouhier N, Rué O, Rustenholz C, Salin F, Soler M, Tarkka M, Velt A, Zanne AE, Martin F, Wincker P, Quesneville H, Kremer A, Salse J. Oak genome reveals facets of long lifespan. Nat Plants 2018; 4:440-452. [PMID: 29915331 PMCID: PMC6086335 DOI: 10.1038/s41477-018-0172-3] [Citation(s) in RCA: 183] [Impact Index Per Article: 30.5] [Received: 11/12/2017] [Accepted: 05/08/2018] [Indexed: 05/18/2023]
Abstract
Oaks are an important part of our natural and cultural heritage. Not only are they ubiquitous in our most common landscapes [1] but they have also supplied human societies with invaluable services, including food and shelter, since prehistoric times [2]. With 450 species spread throughout Asia, Europe and America [3], oaks constitute a critical global renewable resource. The longevity of oaks (several hundred years) probably underlies their emblematic cultural and historical importance. Such long-lived sessile organisms must persist in the face of a wide range of abiotic and biotic threats over their lifespans. We investigated the genomic features associated with such a long lifespan by sequencing, assembling and annotating the oak genome. We then used the growing number of whole-genome sequences for plants (including tree and herbaceous species) to investigate the parallel evolution of genomic characteristics potentially underpinning tree longevity. A further consequence of the long lifespan of trees is their accumulation of somatic mutations during mitotic divisions of stem cells present in the shoot apical meristems. Empirical [4] and modelling [5] approaches have shown that intra-organismal genetic heterogeneity can be selected for [6] and provides direct fitness benefits in the arms race with short-lived pests and pathogens through a patchwork of intra-organismal phenotypes [7]. However, there is no clear proof that large-statured trees consist of a genetic mosaic of clonally distinct cell lineages within and between branches. Through this case study of oak, we demonstrate the accumulation and transmission of somatic mutations and the expansion of disease-resistance gene families in trees.
Affiliation(s)
- Jean-Marc Aury
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Sébastien Faye
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Karine Labadie
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Isabelle Lesur
- BIOGECO, INRA, Université de Bordeaux, Cestas, France
- HelixVenture, Mérignac, France
- Nathalie Chantret
- AGAP, Université de Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Jun Chen
- Department of Ecology and Genetics, Evolutionary Biology Centre, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Anne Diévart
- CIRAD, UMR AGAP, Montpellier, France
- Université de Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Valérie Barbe
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Caroline Belser
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Marie-Lara Bouffaud
- Department of Soil Ecology, UFZ-Helmholtz Centre for Environmental Research, Halle/Saale, Germany
- David Cohen
- UMR Silva, INRA, Université de Lorraine, AgroParisTech, Nancy, France
- Arnaud Couloux
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Corinne Da Silva
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Carole Dossat
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Christine Gaspin
- Plateforme bioinformatique Toulouse Midi-Pyrénées, INRA, Auzeville Castanet-Tolosan, France
- Arnaud Hecker
- IAM, INRA, Université de Lorraine, Champenoux, France
- Sylvie Herrmann
- German Centre for Integrative Research (iDiv), Halle-Jena-Leipzig, Leipzig, Germany
- Irène Hummel
- UMR Silva, INRA, Université de Lorraine, AgroParisTech, Nancy, France
- Christophe Klopp
- Plateforme bioinformatique Toulouse Midi-Pyrénées, INRA, Auzeville Castanet-Tolosan, France
- Martin Lascoux
- Department of Ecology and Genetics, Evolutionary Biology Centre, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Eric Lasserre
- Université de Perpignan, UMR 5096, Perpignan, France
- Arnaud Lemainque
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Mohammed-Amin Madoui
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Sophie Mangenot
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Jonathan Mercier
- Commissariat à l'Energie Atomique (CEA), Genoscope, Institut de Biologie François-Jacob, Evry, France
- Olivier Rué
- Plateforme bioinformatique Toulouse Midi-Pyrénées, INRA, Auzeville Castanet-Tolosan, France
- Franck Salin
- BIOGECO, INRA, Université de Bordeaux, Cestas, France
- Marçal Soler
- Université de Toulouse, CNRS, UMR 5546, LRSV, Castanet-Tolosan, France
- Laboratori del Suro, University of Girona, Girona, Spain
- Mika Tarkka
- Department of Soil Ecology, UFZ-Helmholtz Centre for Environmental Research, Halle/Saale, Germany
- Amandine Velt
- SVQV, Université de Strasbourg, INRA, Colmar, France
- Amy E Zanne
- Department of Biological Sciences, George Washington University, Washington, DC, USA
- Patrick Wincker
- Génomique Métabolique, Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Université d'Evry, Université Paris-Saclay, Evry, France
12
Cerutti F, Mallet L, Painset A, Hoede C, Moisan A, Bécavin C, Duval M, Dussurget O, Cossart P, Gaspin C, Chiapello H. Unraveling the evolution and coevolution of small regulatory RNAs and coding genes in Listeria. BMC Genomics 2017; 18:882. [PMID: 29145803 PMCID: PMC5689173 DOI: 10.1186/s12864-017-4242-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 06/26/2017] [Accepted: 10/29/2017] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Small regulatory RNAs (sRNAs) are widely found in bacteria and play key roles in many important physiological and adaptation processes. Studying their evolution and screening for events of coevolution with other genomic features is a powerful way to better understand their origin and assess a common functional or adaptive relationship between them. However, evolution and coevolution of sRNAs with coding genes have been sparsely investigated in bacterial pathogens. RESULTS We designed a robust and generic phylogenomics approach that detects correlated evolution between sRNAs and protein-coding genes using their observed and inferred patterns of presence-absence in a set of annotated genomes. We applied this approach to 79 complete genomes of the Listeria genus and identified 52 accessory sRNAs, most of which were present in the Listeria common ancestor and lost during Listeria evolution. We detected significant coevolution between 23 sRNAs and 52 coding genes and inferred the Listeria sRNA-coding gene coevolution network. We characterized a main hub of 12 sRNAs that coevolved with genes encoding cell wall proteins and virulence factors. Among them, an sRNA specific to the L. monocytogenes species, rli133, coevolved with genes involved either in pathogenicity or in interaction with host cells, possibly acting as a direct negative post-transcriptional regulator. CONCLUSIONS Our approach allowed the identification of candidate sRNAs potentially involved in pathogenicity and host interaction, consistent with recent findings on known pathogenicity actors. We highlight four sRNAs coevolving with seven internalin genes, some of which are important virulence factors in Listeria.
Affiliation(s)
- Franck Cerutti
- Université de Toulouse, INRA, UR 875 Unité Mathématiques et Informatique Appliquées de Toulouse, Auzeville, 31326, Castanet-Tolosan, France
- Ludovic Mallet
- Université de Toulouse, INRA, UR 875 Unité Mathématiques et Informatique Appliquées de Toulouse, Auzeville, 31326, Castanet-Tolosan, France
- Anaïs Painset
- Université de Toulouse, INRA, UR 875 Unité Mathématiques et Informatique Appliquées de Toulouse, Auzeville, 31326, Castanet-Tolosan, France; Present address: Public Health England, 61 Colindale Avenue, London, NW9 5EQ, England
- Claire Hoede
- Université de Toulouse, INRA, UR 875 Unité Mathématiques et Informatique Appliquées de Toulouse, Auzeville, 31326, Castanet-Tolosan, France
- Annick Moisan
- Université de Toulouse, INRA, UR 875 Unité Mathématiques et Informatique Appliquées de Toulouse, Auzeville, 31326, Castanet-Tolosan, France
- Christophe Bécavin
- Département de Biologie Cellulaire et Infection, Institut Pasteur, Unité des Interactions Bactéries-Cellules, F-75015, Paris, France; INSERM, U604, F-75015, Paris, France; INRA, USC2020, F-75015, Paris, France; Institut Pasteur - Bioinformatics and Biostatistics Hub - C3BI, USR 3756 IP CNRS, Paris, France
- Mélodie Duval
- Département de Biologie Cellulaire et Infection, Institut Pasteur, Unité des Interactions Bactéries-Cellules, F-75015, Paris, France; INSERM, U604, F-75015, Paris, France; INRA, USC2020, F-75015, Paris, France
- Olivier Dussurget
- Département de Biologie Cellulaire et Infection, Institut Pasteur, Unité des Interactions Bactéries-Cellules, F-75015, Paris, France; INSERM, U604, F-75015, Paris, France; INRA, USC2020, F-75015, Paris, France; Université Paris Diderot, Sorbonne Paris Cité, F-75013, Paris, France
- Pascale Cossart
- Département de Biologie Cellulaire et Infection, Institut Pasteur, Unité des Interactions Bactéries-Cellules, F-75015, Paris, France; INSERM, U604, F-75015, Paris, France; INRA, USC2020, F-75015, Paris, France
- Christine Gaspin
- Université de Toulouse, INRA, UR 875 Unité Mathématiques et Informatique Appliquées de Toulouse, Auzeville, 31326, Castanet-Tolosan, France
- Hélène Chiapello
- Université de Toulouse, INRA, UR 875 Unité Mathématiques et Informatique Appliquées de Toulouse, Auzeville, 31326, Castanet-Tolosan, France
13
Belkorchia A, Pombert JF, Polonais V, Parisot N, Delbac F, Brugère JF, Peyret P, Gaspin C, Peyretaillade E. Comparative genomics of microsporidian genomes reveals a minimal non-coding RNA set and new insights for transcription in minimal eukaryotic genomes. DNA Res 2017; 24:251-260. [PMID: 28338834 PMCID: PMC5499648 DOI: 10.1093/dnares/dsx002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/23/2016] [Accepted: 01/21/2017] [Indexed: 11/14/2022] Open
Abstract
Microsporidia are ubiquitous intracellular pathogens whose opportunistic nature led to their increased recognition with the rise of the AIDS pandemic. As the RNA world was largely unexplored in this parasitic lineage, we developed a dedicated in silico methodology to carry out exhaustive identification of ncRNAs across the Encephalitozoon and Nosema genera. Thus, the previously missing U1 small nuclear RNA (snRNA) and small nucleolar RNAs (snoRNAs) targeting only the LSU rRNA were highlighted and were further validated using 5' and 3' RACE-PCR experiments. Overall, the 15 ncRNAs found to be shared between Encephalitozoon and Nosema spp. may represent the minimal core set required for parasitic life. Interestingly, the systematic presence of a CCC- or GGG-like motif in 5' of all ncRNA and mRNA gene transcripts, regardless of the RNA polymerase involved, suggests that the RNA polymerase machineries in microsporidia species could use common factors. Our data provide additional insights in accordance with the simplification processes observed in these reduced genomes and underline the usefulness of sequencing closely related species to help identify highly divergent ncRNAs in these parasites.
Affiliation(s)
- Abdel Belkorchia
- Laboratoire "Microorganismes: Génome et Environnement", Université Clermont Auvergne, BP 10448, F-63000 Clermont-Ferrand, France; CNRS, UMR 6023, LMGE, F-63171 Aubière, France
- Valérie Polonais
- Laboratoire "Microorganismes: Génome et Environnement", Université Clermont Auvergne, BP 10448, F-63000 Clermont-Ferrand, France; CNRS, UMR 6023, LMGE, F-63171 Aubière, France
- Nicolas Parisot
- Université Clermont Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France
- Frédéric Delbac
- Laboratoire "Microorganismes: Génome et Environnement", Université Clermont Auvergne, BP 10448, F-63000 Clermont-Ferrand, France; CNRS, UMR 6023, LMGE, F-63171 Aubière, France
- Jean-François Brugère
- Université Clermont Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France
- Pierre Peyret
- Université Clermont Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France
- Eric Peyretaillade
- Université Clermont Auvergne, EA 4678 CIDAM, BP 10448, F-63001 Clermont-Ferrand, France
14
Juanchich A, Bardou P, Rué O, Gabillard JC, Gaspin C, Bobe J, Guiguen Y. Characterization of an extensive rainbow trout miRNA transcriptome by next generation sequencing. BMC Genomics 2016; 17:164. [PMID: 26931235 PMCID: PMC4774146 DOI: 10.1186/s12864-016-2505-9] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Received: 10/08/2015] [Accepted: 02/19/2016] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND MicroRNAs (miRNAs) have emerged as important post-transcriptional regulators of gene expression in a wide variety of physiological processes. They can control both temporal and spatial gene expression and are believed to regulate 30 to 70% of the genes. Data are, however, limited for fish species, with only 9 out of the 30,000 fish species present in miRBase. The aim of the current study was to discover and characterize rainbow trout (Oncorhynchus mykiss) miRNAs in a large number of tissues using next-generation sequencing in order to provide an extensive repertoire of rainbow trout miRNAs. RESULTS A total of 38 different samples corresponding to 16 different tissues or organs were individually sequenced and analyzed independently in order to identify a large number of miRNAs with high confidence. This led to the identification of 2946 miRNA loci in the rainbow trout genome, including 445 already known miRNAs. Differential expression analysis was performed in order to identify miRNAs exhibiting specific or preferential expression among the 16 analyzed tissues. In most cases, miRNAs exhibit a specific pattern of expression in only a few tissues. The expression data from sRNA sequencing were confirmed by RT-qPCR. In addition, novel miRNAs are described in rainbow trout that had not been previously reported in other species. CONCLUSION This study represents the first characterization of the rainbow trout miRNA transcriptome from a wide variety of tissues and sets out an extensive repertoire of rainbow trout miRNAs. It provides a starting point for future studies aimed at understanding the roles of miRNAs in major physiological processes such as growth, reproduction or adaptation to stress. This rainbow trout miRNA repertoire provides a novel resource to advance genomic research in salmonid species. ELECTRONIC SUPPLEMENTARY MATERIAL The online version of this article (doi:10.1186/s12864-016-2505-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Philippe Bardou
- INRA, UMR1388, Plate-forme SIGENAE/GenPhySE, Chemin de Borde Rouge, Auzeville CS 52627, F-31326, Castanet-Tolosan, France.
- Olivier Rué
- INRA, UR875 Plate-forme GenoToul Bioinfo, Chemin de Borde Rouge, Auzeville CS 52627, F-31326, Castanet-Tolosan, France.
- Christine Gaspin
- INRA, UR875 Plate-forme GenoToul Bioinfo, Chemin de Borde Rouge, Auzeville CS 52627, F-31326, Castanet-Tolosan, France.
- Julien Bobe
- INRA, UR1037 LPGP, Campus de Beaulieu, F-35000, Rennes, France.
- Yann Guiguen
- INRA, UR1037 LPGP, Campus de Beaulieu, F-35000, Rennes, France.
15
Mariette J, Escudié F, Bardou P, Nabihoudine I, Noirot C, Trotard MS, Gaspin C, Klopp C. Jflow: a workflow management system for web applications. Bioinformatics 2015; 32:456-8. [PMID: 26454273 PMCID: PMC5859998 DOI: 10.1093/bioinformatics/btv589] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/30/2015] [Accepted: 10/07/2015] [Indexed: 11/14/2022] Open
Abstract
SUMMARY Biologists produce large data sets and need rich yet simple web portals in which they can upload and analyze their files. Providing such tools requires masking the complexity induced by the underlying High Performance Computing (HPC) environment. The connection between interface and computing infrastructure is usually specific to each portal. With Jflow, we introduce a Workflow Management System (WMS) composed of jQuery plug-ins, which can easily be embedded in any web application, and a Python library providing all the features needed to set up, run and monitor workflows. AVAILABILITY AND IMPLEMENTATION Jflow is available under the GNU General Public License (GPL) at http://bioinfo.genotoul.fr/jflow. The package comes with full documentation, a quick start guide and a running test portal. CONTACT Jerome.Mariette@toulouse.inra.fr.
Affiliation(s)
- Jérôme Mariette
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
- Frédéric Escudié
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
- Philippe Bardou
- Plate-forme SIGENAE, INRA, GenPhyse, Castanet-Tolosan Cedex, France
- Ibouniyamine Nabihoudine
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
- Céline Noirot
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
- Marie-Stéphane Trotard
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
- Christine Gaspin
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France
- Christophe Klopp
- Plate-forme Bio-informatique Genotoul, INRA, UR875 Mathématiques et Informatique Appliquées Toulouse, Castanet-Tolosan, France; Plate-forme SIGENAE, INRA, GenPhyse, Castanet-Tolosan Cedex, France
16
Thébault P, Bourqui R, Benchimol W, Gaspin C, Sirand-Pugnet P, Uricaru R, Dutour I. Advantages of mixing bioinformatics and visualization approaches for analyzing sRNA-mediated regulatory bacterial networks. Brief Bioinform 2015; 16:795-805. [PMID: 25477348 PMCID: PMC4570199 DOI: 10.1093/bib/bbu045] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 08/05/2014] [Revised: 11/05/2014] [Indexed: 12/29/2022] Open
Abstract
The revolution in high-throughput sequencing technologies has enabled the acquisition of gigabytes of RNA sequences in many different conditions and has highlighted an unexpected number of small RNAs (sRNAs) in bacteria. Ongoing exploitation of these data enables numerous applications for investigating bacterial trans-acting sRNA-mediated regulation networks. Focusing on sRNAs that regulate mRNA translation in trans, recent works have noted several sRNA-based regulatory pathways that are essential for key cellular processes. Although the number of known bacterial sRNAs is increasing, the experimental validation of their interactions with mRNA targets remains challenging and involves expensive and time-consuming experimental strategies. Hence, bioinformatics is crucial for selecting and prioritizing candidates before designing any experimental work. However, current software for target prediction produces a prohibitive number of candidates because of the lack of biological knowledge regarding the rules governing sRNA-mRNA interactions. Therefore, there is a real need to develop new approaches to help biologists focus on the most promising predicted sRNA-mRNA interactions. From this perspective, this review presents the advantages of mixing bioinformatics and visualization approaches for analyzing predicted sRNA-mediated regulatory bacterial networks.
17
Plomion C, Aury JM, Amselem J, Alaeitabar T, Barbe V, Belser C, Bergès H, Bodénès C, Boudet N, Boury C, Canaguier A, Couloux A, Da Silva C, Duplessis S, Ehrenmann F, Estrada-Mairey B, Fouteau S, Francillonne N, Gaspin C, Guichard C, Klopp C, Labadie K, Lalanne C, Le Clainche I, Leplé JC, Le Provost G, Leroy T, Lesur I, Martin F, Mercier J, Michotey C, Murat F, Salin F, Steinbach D, Faivre-Rampant P, Wincker P, Salse J, Quesneville H, Kremer A. Decoding the oak genome: public release of sequence data, assembly, annotation and publication strategies. Mol Ecol Resour 2015; 16:254-65. [PMID: 25944057 DOI: 10.1111/1755-0998.12425] [Citation(s) in RCA: 81] [Impact Index Per Article: 9.0] [Received: 02/17/2015] [Revised: 04/27/2015] [Accepted: 04/30/2015] [Indexed: 12/31/2022]
Abstract
The 1.5 Gbp/2C genome of pedunculate oak (Quercus robur) has been sequenced. A strategy was established for dealing with the challenges imposed by the sequencing of such a large, complex and highly heterozygous genome by a whole-genome shotgun (WGS) approach, without the use of costly and time-consuming methods, such as fosmid or BAC clone-based hierarchical sequencing methods. The sequencing strategy combined short and long reads. Over 49 million reads provided by Roche 454 GS-FLX technology were assembled into contigs and combined with shorter Illumina sequence reads from paired-end and mate-pair libraries of different insert sizes, to build scaffolds. Errors were corrected and gaps filled with Illumina paired-end reads and contaminants detected, resulting in a total of 17,910 scaffolds (>2 kb) corresponding to 1.34 Gb. Fifty per cent of the assembly was accounted for by 1468 scaffolds (N50 of 260 kb). Initial comparison with the phylogenetically related Prunus persica gene model indicated that genes for 84.6% of the proteins present in peach (mean protein coverage of 90.5%) were present in our assembly. The second and third steps in this project are genome annotation and the assignment of scaffolds to the oak genetic linkage map. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement, the oak genome data have been released into public sequence repositories in advance of publication. In this presubmission paper, the oak genome consortium describes its principal lines of work and future directions for analyses of the nature, function and evolution of the oak genome.
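For readers unfamiliar with the N50 statistic quoted above (half of the assembly contained in 1468 scaffolds, N50 of 260 kb), the following minimal sketch shows one common way to compute it from a list of scaffold lengths. The function name `n50` and the toy lengths are illustrative assumptions, not part of the oak assembly pipeline.

```python
def n50(lengths):
    """Return (N50, count) for a list of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of longest
    scaffolds that together cover at least half of the total assembly size;
    count is the number of scaffolds in that set (often called L50).
    """
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if 2 * running >= total:  # reached at least 50% of the assembly
            return length, count
    raise ValueError("lengths must be non-empty")

# Toy example (assumed values): total size 300, so half is 150.
# Longest first: 100 (cumulative 100), then 80 (cumulative 180 >= 150).
toy_lengths = [20, 100, 40, 80, 60]
print(n50(toy_lengths))  # (80, 2)
```

In the abstract's terms, the first value returned corresponds to the 260 kb figure and the second to the 1468 scaffolds.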
Affiliation(s)
- Christophe Plomion
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Jean-Marc Aury
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Joëlle Amselem
- INRA, Unité de Recherche Génomique Info (URGI), Versailles, F78026, France
- Tina Alaeitabar
- INRA, Unité de Recherche Génomique Info (URGI), Versailles, F78026, France
- Valérie Barbe
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Caroline Belser
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Catherine Bodénès
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Christophe Boury
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Arnaud Couloux
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Corinne Da Silva
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Sébastien Duplessis
- INRA, UMR1136 INRA-Université de Lorraine, Interactions Arbres/Micro-organismes, Laboratoire d'Excellence ARBRE, Champenoux, F-54280, France
- François Ehrenmann
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Barbara Estrada-Mairey
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Stéphanie Fouteau
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Christine Gaspin
- Plateforme bioinformatique Toulouse Midi-Pyrénées, UBIA, INRA, Castanet-Tolosan, F-31326, France
- Christophe Klopp
- Plateforme bioinformatique Toulouse Midi-Pyrénées, UBIA, INRA, Castanet-Tolosan, F-31326, France
- Karine Labadie
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Céline Lalanne
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Jean-Charles Leplé
- INRA, UR0588 Amélioration Génétique et Physiologie Forestières, Orléans, F-45075, France
- Grégoire Le Provost
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Thibault Leroy
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Isabelle Lesur
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Francis Martin
- INRA, UMR1136 INRA-Université de Lorraine, Interactions Arbres/Micro-organismes, Laboratoire d'Excellence ARBRE, Champenoux, F-54280, France
- Jonathan Mercier
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France
- Célia Michotey
- INRA, Unité de Recherche Génomique Info (URGI), Versailles, F78026, France
- Florent Murat
- INRA/UBP UMR 1095, Laboratoire Génétique, Diversité et Ecophysiologie des Céréales, Clermont-Ferrand, F-63039, France
- Franck Salin
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
- Delphine Steinbach
- INRA, Unité de Recherche Génomique Info (URGI), Versailles, F78026, France
- Patrick Wincker
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, Evry, 91057, France; Université d'Evry Val d'Essonne, UMR 8030, Evry, CP5706, France; Centre National de Recherche Scientifique (CNRS), UMR 8030, Evry, CP5706, France
- Jérôme Salse
- INRA/UBP UMR 1095, Laboratoire Génétique, Diversité et Ecophysiologie des Céréales, Clermont-Ferrand, F-63039, France
- Hadi Quesneville
- INRA, Unité de Recherche Génomique Info (URGI), Versailles, F78026, France
- Antoine Kremer
- INRA, UMR1202, BIOGECO, Cestas, F-33610, France; University of Bordeaux, BIOGECO, UMR1202, Talence, F-33170, France
18
Higashi S, Fournier C, Gautier C, Gaspin C, Sagot MF. Mirinho: An efficient and general plant and animal pre-miRNA predictor for genomic and deep sequencing data. BMC Bioinformatics 2015; 16:179. [PMID: 26022464 PMCID: PMC4448272 DOI: 10.1186/s12859-015-0594-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 09/24/2014] [Accepted: 04/23/2015] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Several methods exist for the prediction of precursor miRNAs (pre-miRNAs) in genomic or sRNA-seq (small RNA sequencing) data produced by NGS (Next Generation Sequencing). One key piece of information used for this task is the characteristic hairpin structure adopted by pre-miRNAs, which is in general identified using RNA folders whose complexity is cubic in the size of the input. The vast majority of pre-miRNA predictors then rely on further information learned from previously validated miRNAs from the same or a closely related genome for the final prediction of new miRNAs. With this paper, we wished to address three main issues. The first was methodological and aimed at obtaining a more time-efficient predictor, without losing accuracy, which represented the second issue. We indeed aimed at better predicting miRNAs at a genome scale, but also from sRNA-seq data where in some cases, notably in plants, the current folding methods often infer the wrong structure. The third issue is related to the fact that it is important to rely as little as possible on previously recorded examples of miRNAs. We therefore also sought a method that is less dependent on previous miRNA records. RESULTS Concerning the first and second issues, we present a novel alternative to a classical folder, based on a thermodynamic Nearest-Neighbour (NN) model, for computing the free energy and predicting the classical hairpin structure of a pre-miRNA. We show that the free energies thus computed correlate well with those of RNAfold. This novel method, called Mirinho, has quadratic instead of cubic complexity and is also much more efficient in practice. When applied to sRNA-seq data from plants, it gives in general better results than classical folders. On the third issue, we show that Mirinho, which uses as its only knowledge the length of the loops and stem-arms and the free energy of the pre-miRNA hairpin, compares well with algorithms that require more information. The results, obtained with different datasets, are indeed similar to those of other approaches with which such a comparison was possible. These had to be publicly available software tools that could be run on large inputs. In some cases, Mirinho is even better in terms of sensitivity or precision. CONCLUSION We provide a simpler and much faster method with very reasonable sensitivity and precision, which can be applied without special adaptation to the prediction of both animal and plant pre-miRNAs, using as input either genomic sequences or sRNA-seq data. ELECTRONIC SUPPLEMENTARY MATERIAL The online version of this article (doi:10.1186/s12859-015-0594-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Susan Higashi
- ERABLE team, Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, 38330, France. Université de Lyon, F-69000, Lyon; Université Lyon 1; CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, F-69622, France.
- Cyril Fournier
- Université de Lyon, F-69000, Lyon; Université Lyon 1; CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, F-69622, France.
- Christian Gautier
- ERABLE team, Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, 38330, France. Université de Lyon, F-69000, Lyon; Université Lyon 1; CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, F-69622, France.
- Christine Gaspin
- INRA, UBIA & Plateforme Bioinformatique, 24 Chemin de Borde Rouge, Auzeville, POBOX 5627, Castanet Tolosan, 31326, France.
- Marie-France Sagot
- ERABLE team, Inria Grenoble Rhône-Alpes, Montbonnot Saint-Martin, 38330, France. Université de Lyon, F-69000, Lyon; Université Lyon 1; CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, F-69622, France.
|
19
|
Esquerré T, Moisan A, Chiapello H, Arike L, Vilu R, Gaspin C, Cocaign-Bousquet M, Girbal L. Genome-wide investigation of mRNA lifetime determinants in Escherichia coli cells cultured at different growth rates. BMC Genomics 2015; 16:275. [PMID: 25887031 PMCID: PMC4421995 DOI: 10.1186/s12864-015-1482-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 10/17/2014] [Accepted: 03/24/2015] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND Changes to mRNA lifetime adjust mRNA concentration, facilitating the adaptation of growth rate to changes in growth conditions. However, the mechanisms regulating mRNA lifetime are poorly understood at the genome-wide scale and have not been investigated in bacteria growing at different rates. RESULTS We used linear covariance models and the best model selected according to the Akaike information criterion to identify and rank intrinsic and extrinsic general transcript parameters correlated with mRNA lifetime, using mRNA half-life datasets for E. coli, obtained at four growth rates. The principal parameter correlated with mRNA stability was mRNA concentration, the mRNAs most concentrated in the cells being the least stable. However, sequence-related features (codon adaptation index (CAI), ORF length, GC content, polycistronic mRNA), gene function and essentiality also affected mRNA lifetime at all growth rates. We also identified sequence motifs within the 5'UTRs potentially related to mRNA stability. Growth rate-dependent effects were confined to particular functional categories (e.g. carbohydrate and nucleotide metabolism). Finally, mRNA stability was less strongly correlated with the amount of protein produced than mRNA concentration and CAI. CONCLUSIONS This study provides the most complete genome-wide analysis to date of the general factors correlated with mRNA lifetime in E. coli. We have generalized for the entire population of transcripts or excluded determinants previously defined as regulators of stability for some particular mRNAs and identified new, unexpected general indicators. These results will pave the way for discussions of the underlying mechanisms and their interaction with the growth physiology of bacteria.
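The model-selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it compares only two ordinary least-squares models (intercept only versus one predictor) using the Gaussian form of the Akaike information criterion, AIC = n ln(RSS/n) + 2k, which holds up to an additive constant shared by all models fitted to the same data. The function names and the six data points are hypothetical and were made up for illustration.

```python
import math


def fit_mean(y):
    """Intercept-only model: return the residual sum of squares around the mean."""
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; return (a, b, residual sum of squares)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def gaussian_aic(rss, n, n_params):
    """AIC for a Gaussian linear model, dropping the constant common to all models."""
    return n * math.log(rss / n) + 2 * n_params


# Made-up illustration data (NOT from the study): log mRNA concentration
# and half-life in minutes for six hypothetical transcripts.
log_concentration = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
half_life_min = [5.9, 5.1, 4.2, 2.8, 2.1, 0.9]
n = len(half_life_min)

rss_null = fit_mean(half_life_min)
intercept, slope, rss_conc = fit_line(log_concentration, half_life_min)

# Parameter counts include the error variance: mean + variance = 2,
# intercept + slope + variance = 3.
aic_by_model = {
    "intercept_only": gaussian_aic(rss_null, n, 2),
    "concentration": gaussian_aic(rss_conc, n, 3),
}
best_model = min(aic_by_model, key=aic_by_model.get)
print(best_model, aic_by_model)
```

The model with the lowest AIC is retained; the penalty term 2k is what prevents a model from winning merely by adding parameters.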
Affiliation(s)
- Thomas Esquerré
- Université de Toulouse; ISBP, INSA, UPS, INP; LISBP, 135, avenue de Rangueil, 31077, Toulouse cedex 4, France. INRA, UMR792 Ingénierie des systèmes biologiques et des procédés, 31400, Toulouse, France. CNRS, UMR5504, 31400, Toulouse, France. Laboratoire de Microbiologie et Génétique Moléculaires, UMR5100, Centre National de la Recherche Scientifique et Université Paul Sabatier, 118 Route de Narbonne, 31062, Toulouse, France.
- Liisa Arike
- Competence Center of Food and Fermentation Technologies, Akadeemia tee 15A, 12618, Tallinn, Estonia.
- Raivo Vilu
- Competence Center of Food and Fermentation Technologies, Akadeemia tee 15A, 12618, Tallinn, Estonia. Department of Chemistry, Tallinn University of Technology, Akadeemia tee 15, 12618, Tallinn, Estonia.
- Muriel Cocaign-Bousquet
- Université de Toulouse; ISBP, INSA, UPS, INP; LISBP, 135, avenue de Rangueil, 31077, Toulouse cedex 4, France. INRA, UMR792 Ingénierie des systèmes biologiques et des procédés, 31400, Toulouse, France. CNRS, UMR5504, 31400, Toulouse, France.
- Laurence Girbal
- Université de Toulouse; ISBP, INSA, UPS, INP; LISBP, 135, avenue de Rangueil, 31077, Toulouse cedex 4, France. INRA, UMR792 Ingénierie des systèmes biologiques et des procédés, 31400, Toulouse, France. CNRS, UMR5504, 31400, Toulouse, France.
|
20
|
Choulet F, Alberti A, Theil S, Glover N, Barbe V, Daron J, Pingault L, Sourdille P, Couloux A, Paux E, Leroy P, Mangenot S, Guilhot N, Le Gouis J, Balfourier F, Alaux M, Jamilloux V, Poulain J, Durand C, Bellec A, Gaspin C, Safar J, Dolezel J, Rogers J, Vandepoele K, Aury JM, Mayer K, Berges H, Quesneville H, Wincker P, Feuillet C. Structural and functional partitioning of bread wheat chromosome 3B. Science 2014; 345:1249721. [PMID: 25035497 DOI: 10.1126/science.1249721] [Citation(s) in RCA: 382] [Impact Index Per Article: 38.2] [Indexed: 12/23/2022]
Abstract
We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits.
Affiliation(s)
- Frédéric Choulet
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France.
- Adriana Alberti
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France
- Sébastien Theil
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Natasha Glover
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Valérie Barbe
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France
- Josquin Daron
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Lise Pingault
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Pierre Sourdille
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Arnaud Couloux
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France
- Etienne Paux
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Philippe Leroy
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Sophie Mangenot
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France
- Nicolas Guilhot
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Jacques Le Gouis
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Francois Balfourier
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
- Michael Alaux
- INRA, UR1164 Unité de Recherche Génomique Info Research Unit in Genomics-Info, INRA de Versailles, Route de Saint-Cyr, 78026 Versailles, France
- Véronique Jamilloux
- INRA, UR1164 Unité de Recherche Génomique Info Research Unit in Genomics-Info, INRA de Versailles, Route de Saint-Cyr, 78026 Versailles, France
- Julie Poulain
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France
- Céline Durand
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France
- Arnaud Bellec
- Centre National des Ressources Génomiques Végétales, INRA UPR 1258, 24 Chemin de Borde Rouge, 31326 Castanet-Tolosan, France
- Christine Gaspin
- Biométrie et Intelligence Artificielle, INRA, Chemin de Borde Rouge, BP 27, 31326 Castanet-Tolosan, France
- Jan Safar
- Centre of the Region Haná for Biotechnological and Agricultural Research, Institute of Experimental Botany, Slechtitelu 31, CZ-78371 Olomouc, Czech Republic
- Jaroslav Dolezel
- Centre of the Region Haná for Biotechnological and Agricultural Research, Institute of Experimental Botany, Slechtitelu 31, CZ-78371 Olomouc, Czech Republic
- Jane Rogers
- The Genome Analysis Centre, Norwich, Norwich Research Park, Norwich NR4 7UH, UK
- Klaas Vandepoele
- Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics (Ghent University), Technologiepark 927, 9052 Gent, Belgium
- Jean-Marc Aury
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France
- Klaus Mayer
- Munich Information Center for Protein Sequences, Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum Muenchen, D-85764 Neuherberg, Germany
- Hélène Berges
- Centre National des Ressources Génomiques Végétales, INRA UPR 1258, 24 Chemin de Borde Rouge, 31326 Castanet-Tolosan, France
- Hadi Quesneville
- INRA, UR1164 Unité de Recherche Génomique Info Research Unit in Genomics-Info, INRA de Versailles, Route de Saint-Cyr, 78026 Versailles, France
- Patrick Wincker
- Commissariat à l'Energie Atomique et aux Energies Alternatives, Direction des Sciences du Vivant, Institut de Génomique, Genoscope, 2 Rue Gaston Crémieux, 91000 Evry, France. CNRS UMR 8030, 2 Rue Gaston Crémieux, 91000 Evry, France. Université d'Evry, CP5706 Evry, France
- Catherine Feuillet
- Institut National de la Recherche Agronomique (INRA) UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France. University Blaise Pascal, UMR1095, Genetics, Diversity and Ecophysiology of Cereals, 5 Chemin de Beaulieu, 63039 Clermont-Ferrand, France
|
21
|
Abstract
Pyrococcales are members of the order Thermococcales, a group of hyperthermophilic euryarchaea that are frequently found in deep sea hydrothermal vents. Infectious genetic elements, such as plasmids and viruses, remain a threat even in this remote environment and these microorganisms have developed several ways to fight their genetic invaders. Among these are the recently discovered CRISPR systems. In this review, we have combined and condensed available information on genetic elements infecting the Thermococcales and on the multiple CRISPR systems found in the Pyrococcales to fight them. Their organization and mode of action will be presented with emphasis on the Type III-B system that is the only CRISPR system known to target RNA molecules in a process reminiscent of RNA interference. The intriguing case of Pyrococcus abyssi, which is among the rare strains to present a CRISPR system devoid of the universal cas1 and cas2 genes, is also discussed.
Affiliation(s)
- Cédric Norais
- Laboratoire de Biochimie, UMR CNRS 7654, Département de Biologie, Ecole Polytechnique, Palaiseau, France
|
22
|
Cros MJ, de Monte A, Mariette J, Bardou P, Grenier-Boley B, Gautheret D, Touzet H, Gaspin C. RNAspace.org: An integrated environment for the prediction, annotation, and analysis of ncRNA. RNA 2011; 17:1947-56. [PMID: 21947200 PMCID: PMC3198588 DOI: 10.1261/rna.2844911] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 05/30/2011] [Accepted: 08/07/2011] [Indexed: 05/22/2023]
Abstract
The annotation of noncoding RNA genes remains a major bottleneck in genome sequencing projects. Most genome sequences released today still come with sets of tRNAs and rRNAs as the only annotated RNA elements, ignoring hundreds of other RNA families. We have developed a web environment that is dedicated to noncoding RNA (ncRNA) prediction, annotation, and analysis and allows users to run a variety of tools in an integrated and flexible manner. This environment offers complementary ncRNA gene finders and a set of tools for the comparison, visualization, editing, and export of ncRNA candidates. Predictions can be filtered according to a large set of characteristics. Based on this environment, we created a public website located at http://RNAspace.org. It accepts genomic sequences up to 5 Mb, which permits online annotation of a complete bacterial genome or a small eukaryotic chromosome. The project is hosted as a SourceForge project (http://rnaspace.sourceforge.net/).
Affiliation(s)
- Antoine de Monte
- LIFL, UMR CNRS 8022 Université Lille 1 and INRIA Lille Nord Europe, 59655 Villeneuve d'Ascq cedex, France
- Jérôme Mariette
- INRA, Plateforme Bioinformatique, F-31320, UR 875, Castanet-Tolosan, France
- Benjamin Grenier-Boley
- LIFL, UMR CNRS 8022 Université Lille 1 and INRIA Lille Nord Europe, 59655 Villeneuve d'Ascq cedex, France
- Hélène Touzet
- LIFL, UMR CNRS 8022 Université Lille 1 and INRIA Lille Nord Europe, 59655 Villeneuve d'Ascq cedex, France
- Christine Gaspin
- INRA, UBIA, UR 875, F-31320 Castanet-Tolosan, France
- INRA, Plateforme Bioinformatique, F-31320, UR 875, Castanet-Tolosan, France
|
23
|
Phok K, Moisan A, Rinaldi D, Brucato N, Carpousis AJ, Gaspin C, Clouet-d'Orval B. Identification of CRISPR and riboswitch related RNAs among novel noncoding RNAs of the euryarchaeon Pyrococcus abyssi. BMC Genomics 2011; 12:312. [PMID: 21668986 PMCID: PMC3124441 DOI: 10.1186/1471-2164-12-312] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 03/15/2011] [Accepted: 06/13/2011] [Indexed: 01/28/2023] Open
Abstract
Background Noncoding RNA (ncRNA) has been recognized as an important regulator of gene expression networks in Bacteria and Eucaryota. Little is known about ncRNA in thermococcal archaea except for the eukaryotic-like C/D and H/ACA modification guide RNAs. Results Using a combination of in silico and experimental approaches, we identified and characterized novel P. abyssi ncRNAs transcribed from 12 intergenic regions, ten of which are conserved throughout the Thermococcales. Several of them accumulate in the late-exponential phase of growth. Analysis of the genomic context and sequence conservation amongst related thermococcal species revealed two novel P. abyssi ncRNA families. The CRISPR family is comprised of crRNAs expressed from two of the four P. abyssi CRISPR cassettes. The 5'UTR derived family includes four conserved ncRNAs, two of which have features similar to known bacterial riboswitches. Several of the novel ncRNAs have sequence similarities to orphan OrfB transposase elements. Based on RNA secondary structure predictions and experimental results, we show that three of the twelve ncRNAs include Kink-turn RNA motifs, arguing for a biological role of these ncRNAs in the cell. Furthermore, our results show that several of the ncRNAs are subjected to processing events by enzymes that remain to be identified and characterized. Conclusions This work proposes a revised annotation of CRISPR loci in P. abyssi and expands our knowledge of ncRNAs in the Thermococcales, thus providing a starting point for studies needed to elucidate their biological function.
Affiliation(s)
- Kounthéa Phok
- Laboratoire de Microbiologie et Génétique Moléculaire, Centre National de la Recherche Scientifique et Université de Toulouse III, France
|
24
|
Gaspin C, Rami JF, Lescure B. Distribution of short interstitial telomere motifs in two plant genomes: putative origin and function. BMC Plant Biol 2010; 10:283. [PMID: 21171996 PMCID: PMC3022908 DOI: 10.1186/1471-2229-10-283] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 12/23/2009] [Accepted: 12/20/2010] [Indexed: 05/05/2023]
Abstract
BACKGROUND Short interstitial telomere motifs (telo boxes) are short sequences identical to plant telomere repeat units. They are observed within the 5' region of several genes over-expressed in cycling cells. In synergy with various cis-acting elements, these motifs participate in the activation of expression. Here, we have analysed the distribution of telo boxes within Arabidopsis thaliana and Oryza sativa genomes and their association with genes involved in the biogenesis of the translational apparatus. RESULTS Our analysis showed that the distribution of the telo box (AAACCCTA) in different genomic regions of A. thaliana and O. sativa is not random. As is also the case for plant microsatellites, they are preferentially located in the 5' flanking regions of genes, mainly within the 5' UTR, and distributed as a gradient along the direction of transcription. As previously reported in Arabidopsis, a conserved topological association of telo boxes with site II or TEF cis-acting elements is observed in almost all promoters of genes encoding ribosomal proteins in O. sativa. Such a conserved promoter organization can be found in other genes involved in the biogenesis of the translational machinery including rRNA processing proteins and snoRNAs. Strikingly, the association of telo boxes with site II motifs or TEF boxes is conserved in promoters of genes harbouring snoRNA clusters nested within an intron as well as in the 5' flanking regions of non-intronic snoRNA genes. Thus, the search for associations between telo boxes and site II motifs or TEF box in plant genomes could provide a useful tool for characterizing new cryptic RNA pol II promoters. CONCLUSIONS The data reported in this work support the model previously proposed for the spreading of telo boxes within plant genomes and provide new insights into a putative process for the acquisition of microsatellites in plants. 
The association of telo boxes with site II or TEF cis-acting elements appears to be an essential feature of plant genes involved in the biogenesis of ribosomes and clearly indicates that most plant snoRNAs are RNA pol II products.
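As a rough illustration of the kind of motif scan this abstract describes, the sketch below locates the telo box (AAACCCTA) on both strands of a DNA string. It is a minimal example under stated assumptions, not the authors' method: the function names and the short test sequence are hypothetical, positions are 0-based on the forward strand, and no genome annotation (5' UTR, promoter boundaries) is taken into account.

```python
TELO_BOX = "AAACCCTA"  # motif sequence quoted in the abstract
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif=TELO_BOX):
    """Return sorted (position, strand) pairs for every motif occurrence.

    Positions are 0-based offsets of the leftmost matching base on the
    forward strand; overlapping occurrences are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence carrying one telo box on each strand.
example = "GGAAACCCTAGGTAGGGTTTCC"
print(find_motif(example))
```

Counting such hits per genomic region (for example 5' UTRs versus coding sequence) and comparing the counts with what base composition alone would predict is one simple way to ask whether a distribution is non-random.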
Affiliation(s)
- Christine Gaspin
- INRA Toulouse, UBIA & Plateforme Bioinformatique, UR 875, Chemin de Borde Rouge, Auzeville BP 52627, 31326 Castanet-Tolosan, France
- Jean-François Rami
- Centre de coopération internationale en recherche agronomique pour le développement (CIRAD). UMR Développement et Amélioration des Plantes, TA A96/3, Avenue Agropolis, 34398 Montpellier Cedex 5, France
- Bernard Lescure
- Laboratoire Interactions Plantes-Microorganismes (LIPM), UMR 441-2594 (INRA-CNRS), BP 52627, Chemin de Borde Rouge, Auzeville BP 52627, 31326 Castanet-Tolosan, France
|
25
|
Beaume M, Hernandez D, Farinelli L, Deluen C, Linder P, Gaspin C, Romby P, Schrenzel J, Francois P. Cartography of methicillin-resistant S. aureus transcripts: detection, orientation and temporal expression during growth phase and stress conditions. PLoS One 2010; 5:e10725. [PMID: 20505759 PMCID: PMC2873960 DOI: 10.1371/journal.pone.0010725] [Citation(s) in RCA: 110] [Impact Index Per Article: 7.9] [Received: 03/11/2010] [Accepted: 04/29/2010] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Staphylococcus aureus is a versatile bacterial opportunist responsible for a wide spectrum of infections. The severity of these infections is highly variable and depends on multiple parameters including the genome content of the bacterium as well as the condition of the infected host. Clinically and epidemiologically, S. aureus shows a particular capacity to survive and adapt to drastic environmental changes including the presence of numerous antimicrobial agents. Mechanisms triggering this adaptation remain largely unknown despite important research efforts. Most studies evaluating gene content have so far neglected to analyze the so-called intergenic regions as well as potential antisense RNA molecules. PRINCIPAL FINDINGS Using high-throughput sequencing technology, we performed an inventory of the whole transcriptome of S. aureus strain N315. In addition to the annotated transcription units, we identified more than 195 small transcribed regions, in the chromosome and the plasmid of S. aureus strain N315. The coding strand of each transcript was identified and structural analysis enabled classification of all discovered transcripts. RNA purified at four time-points during the growth phase of the bacterium allowed us to define the temporal expression of such transcripts. A selection of 26 transcripts of interest dispersed along the intergenic regions was assessed for expression changes in the presence of various stress conditions including pH, temperature, oxidative shocks and growth in a stringent medium. Most of these transcripts showed expression patterns specific for the defined stress conditions that we tested. CONCLUSIONS These RNA molecules potentially represent important effectors of S. aureus adaptation and more generally could support some of the epidemiological characteristics of the bacterium.
MESH Headings
- Base Sequence
- Conserved Sequence
- DNA, Complementary/genetics
- Gene Expression Profiling
- Gene Expression Regulation, Bacterial
- Genome, Bacterial/genetics
- High-Throughput Screening Assays
- Methicillin-Resistant Staphylococcus aureus/genetics
- Methicillin-Resistant Staphylococcus aureus/growth & development
- Molecular Sequence Data
- Nucleic Acid Conformation
- RNA, Antisense/genetics
- RNA, Antisense/metabolism
- RNA, Bacterial/chemistry
- RNA, Bacterial/classification
- RNA, Bacterial/genetics
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Reproducibility of Results
- Reverse Transcriptase Polymerase Chain Reaction
- Sequence Analysis, RNA
- Stress, Physiological/genetics
- Time Factors
Affiliation(s)
- Marie Beaume
- Genomic Research Laboratory, Infectious Diseases Service, Geneva University Hospitals, Geneva, Switzerland.
|
26
|
Geissmann T, Chevalier C, Cros MJ, Boisset S, Fechter P, Noirot C, Schrenzel J, François P, Vandenesch F, Gaspin C, Romby P. A search for small noncoding RNAs in Staphylococcus aureus reveals a conserved sequence motif for regulation. Nucleic Acids Res 2010; 37:7239-57. [PMID: 19786493 PMCID: PMC2790875 DOI: 10.1093/nar/gkp668] [Citation(s) in RCA: 169] [Impact Index Per Article: 12.1] [Indexed: 12/20/2022] Open
Abstract
Bioinformatic analysis of the intergenic regions of Staphylococcus aureus predicted multiple regulatory regions. From this analysis, we characterized 11 novel noncoding RNAs (RsaA-K) that are expressed in several S. aureus strains under different experimental conditions. Many of them accumulate in the late-exponential phase of growth. All ncRNAs are stable and their expression is Hfq-independent. The transcription of several of them is regulated by the alternative sigma B factor (RsaA, D and F) while the expression of RsaE is agrA-dependent. Six of these ncRNAs are specific to S. aureus, four are conserved in other Staphylococci, and RsaE is also present in Bacillaceae. Transcriptomic and proteomic analysis indicated that RsaE regulates the synthesis of proteins involved in various metabolic pathways. Phylogenetic analysis combined with RNA structure probing, searches for RsaE-mRNA base pairing, and toeprinting assays indicate that a conserved and unpaired UCCC sequence motif of RsaE binds to target mRNAs and prevents the formation of the ribosomal initiation complex. This study unexpectedly shows that most of the novel ncRNAs carry the conserved C-rich motif, suggesting that they are members of a class of ncRNAs that target mRNAs by a shared mechanism.
Affiliation(s)
- Thomas Geissmann
- Architecture et Réactivité de l'ARN, Université de Strasbourg, CNRS, IBMC, 15 rue René Descartes, F-67084 Strasbourg, France
|
27
|
Abstract
The Weighted Constraint Satisfaction Problem (WCSP) framework allows representing and solving problems involving both hard constraints and cost functions. It has been applied to various problems, including resource allocation, bioinformatics, scheduling, etc. To solve such problems, solvers usually rely on branch-and-bound algorithms equipped with local consistency filtering, mostly soft arc consistency. However, these techniques are not well suited to solve problems with very large domains. Motivated by the resolution of an RNA gene localization problem inside large genomic sequences, and in the spirit of bounds consistency for large domains in crisp CSPs, we introduce soft bounds arc consistency, a new weighted local consistency specifically designed for WCSP with very large domains. Compared to soft arc consistency, BAC provides significantly improved time and space asymptotic complexity. In this paper, we show how the semantics of cost functions can be exploited to further improve the time complexity of BAC. We also compare both in theory and in practice the efficiency of BAC on a WCSP with bounds consistency enforced on a crisp CSP using cost variables. On two different real problems modeled as WCSP, including our RNA gene localization problem, we observe that maintaining bounds arc consistency outperforms arc consistency and also improves over bounds consistency enforced on a constraint model with cost variables.
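To make the notion of bounds consistency on very large domains concrete, here is a minimal sketch for the crisp (non-weighted) case only. It is not the soft bounds arc consistency algorithm of the paper: it merely shows why storing a domain as a (low, high) interval lets a precedence constraint be propagated in constant time, whatever the size of the domain. The constraint form `x + gap <= y`, the function name, and the numbers are assumptions chosen for illustration.

```python
def enforce_precedence(x_bounds, y_bounds, gap):
    """Tighten interval domains so that x + gap <= y remains satisfiable.

    Each domain is an inclusive (low, high) pair of integers, so the cost
    of propagation does not depend on how many values the domain holds.
    Raises ValueError when a domain becomes empty.
    """
    x_low, x_high = x_bounds
    y_low, y_high = y_bounds
    y_low = max(y_low, x_low + gap)      # y cannot start before x_low + gap
    x_high = min(x_high, y_high - gap)   # x cannot end after y_high - gap
    if x_low > x_high or y_low > y_high:
        raise ValueError("constraint x + gap <= y cannot be satisfied")
    return (x_low, x_high), (y_low, y_high)


# Hypothetical example: two positions on a 5 Mb sequence, where the second
# element must start at least 50 bases after the first.
x_dom, y_dom = enforce_precedence((0, 5_000_000), (0, 1_000), 50)
print(x_dom, y_dom)
```

Only the two endpoints of each domain are ever touched, which is the property that makes bounds-based reasoning attractive when variables range over positions in a large genomic sequence.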
|
28
|
Grosjean H, Gaspin C, Marck C, Decatur WA, de Crécy-Lagard V. RNomics and Modomics in the halophilic archaea Haloferax volcanii: identification of RNA modification genes. BMC Genomics 2008; 9:470. [PMID: 18844986 PMCID: PMC2584109 DOI: 10.1186/1471-2164-9-470] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Received: 07/01/2008] [Accepted: 10/09/2008] [Indexed: 12/14/2022] Open
Abstract
Background Naturally occurring RNAs contain numerous enzymatically altered nucleosides. Differences in RNA populations (RNomics) and patterns of RNA modifications (Modomics) depend on the organism analyzed and are two of the criteria that distinguish the three kingdoms of life. While the genomic sequences of the RNA molecules can be derived from whole genome sequence information, the modification profile cannot and requires either direct sequencing of the RNAs or predictive methods based on the presence or absence of the modification genes. Results By employing a comparative genomics approach, we predicted almost all of the genes coding for the t+rRNA modification enzymes in the mesophilic moderate halophile Haloferax volcanii. These encode both guide RNAs and enzymes. Some are orthologous to previously identified genes in Archaea, Bacteria or in Saccharomyces cerevisiae, but several are original predictions. Conclusion The number of modifications in t+rRNAs in the halophilic archaeon is surprisingly low when compared with other Archaea or Bacteria, particularly the hyperthermophilic organisms. This may result from the specific lifestyle of halophiles that require high intracellular salt concentration for survival. This salt content could allow RNA to maintain its functional structural integrity with fewer modifications. We predict that the few modifications present must be particularly important for decoding, accuracy of translation or are modifications that cannot be functionally replaced by the electrostatic interactions provided by the surrounding salt-ions. This analysis also guides future experimental validation work aiming to complete the understanding of the function of RNA modifications in Archaeal translation.
Affiliation(s)
- Henri Grosjean
- Department of Microbiology, University of Florida, Gainesville, FL 32611, USA.
29
Noirot C, Gaspin C, Schiex T, Gouzy J. LeARN: a platform for detecting, clustering and annotating non-coding RNAs. BMC Bioinformatics 2008; 9:21. [PMID: 18194551 PMCID: PMC2241582 DOI: 10.1186/1471-2105-9-21] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 06/12/2007] [Accepted: 01/14/2008] [Indexed: 11/16/2022] Open
Abstract
Background In the last decade, sequencing projects have led to the development of a number of annotation systems dedicated to the structural and functional annotation of protein-coding genes. These annotation systems manage the annotation of the non-protein coding genes (ncRNAs) in a very crude way, allowing neither the editing of secondary structures nor the clustering of ncRNA genes into families, both of which are crucial for appropriate annotation of these molecules. Results LeARN is a flexible software package which handles the complete process of ncRNA annotation by integrating the layers of automatic detection and human curation. Conclusion This software provides the infrastructure to deal properly with ncRNAs in the framework of any annotation project. It fills the gap between existing prediction software, which detects independent ncRNA occurrences, and public ncRNA repositories, which do not offer the flexibility and interactivity required for annotation projects. The software is freely available from the download section of the website
Affiliation(s)
- Céline Noirot
- Laboratoire Interactions Plantes Micro-organismes UMR441/2594, INRA/CNRS, F-31320 Castanet Tolosan, France.
30
Boisset S, Geissmann T, Huntzinger E, Fechter P, Bendridi N, Possedko M, Chevalier C, Helfer AC, Benito Y, Jacquier A, Gaspin C, Vandenesch F, Romby P. Staphylococcus aureus RNAIII coordinately represses the synthesis of virulence factors and the transcription regulator Rot by an antisense mechanism. Genes Dev 2007; 21:1353-66. [PMID: 17545468 PMCID: PMC1877748 DOI: 10.1101/gad.423507] [Citation(s) in RCA: 349] [Impact Index Per Article: 20.5] [Indexed: 12/16/2022]
Abstract
RNAIII is the intracellular effector of the quorum-sensing system in Staphylococcus aureus. It is one of the largest regulatory RNAs (514 nucleotides long) known to control the expression of a large number of virulence genes. Here, we show that the 3' domain of RNAIII coordinately represses, at the post-transcriptional level, the expression of mRNAs that encode a class of virulence factors that act early in the infection process. We demonstrate that the 3' domain acts primarily as an antisense RNA and rapidly anneals to these mRNAs, forming long RNA duplexes. The interaction between RNAIII and the mRNAs results in repression of translation initiation and triggers endoribonuclease III hydrolysis. These processes are followed by rapid depletion of the mRNA pool. In addition, we show that RNAIII and its 3' domain mediate translational repression of rot mRNA through a limited number of base pairings involving two loop-loop interactions. Since Rot is a transcriptional regulatory protein, we propose that RNAIII indirectly acts on many downstream genes, resulting in the activation of the synthesis of several exoproteins. These data emphasize the multitude of regulatory steps affected by RNAIII and its 3' domain in establishing a network of S. aureus virulence factors.
MESH Headings
- 3' Untranslated Regions/genetics
- 3' Untranslated Regions/metabolism
- Bacterial Proteins/genetics
- Bacterial Proteins/metabolism
- Base Sequence
- Gene Expression Regulation, Bacterial
- Hydrolysis
- Molecular Sequence Data
- Nucleic Acid Conformation
- Quorum Sensing
- RNA, Antisense/chemistry
- RNA, Antisense/genetics
- RNA, Antisense/metabolism
- RNA, Antisense/pharmacology
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Bacterial/metabolism
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- Sequence Homology, Nucleic Acid
- Staphylococcus aureus/enzymology
- Staphylococcus aureus/genetics
- Transcription, Genetic
- Virulence Factors/genetics
- Virulence Factors/metabolism
Affiliation(s)
- Sandrine Boisset
- Institut National pour la Recherche Médicale (INSERM) E0230, Université Lyon 1, Centre National de Référence des Staphylocoques, Faculté Laennec, Lyon, F-69008, France
- Thomas Geissmann
- Architecture et Réactivité de l’ARN, Université Louis Pasteur, Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Moléculaire et Cellulaire (IBMC), F-67084 Strasbourg, France
- Eric Huntzinger
- Architecture et Réactivité de l’ARN, Université Louis Pasteur, Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Moléculaire et Cellulaire (IBMC), F-67084 Strasbourg, France
- Pierre Fechter
- Architecture et Réactivité de l’ARN, Université Louis Pasteur, Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Moléculaire et Cellulaire (IBMC), F-67084 Strasbourg, France
- Nadia Bendridi
- Institut National pour la Recherche Médicale (INSERM) E0230, Université Lyon 1, Centre National de Référence des Staphylocoques, Faculté Laennec, Lyon, F-69008, France
- Maria Possedko
- Architecture et Réactivité de l’ARN, Université Louis Pasteur, Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Moléculaire et Cellulaire (IBMC), F-67084 Strasbourg, France
- Clément Chevalier
- Architecture et Réactivité de l’ARN, Université Louis Pasteur, Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Moléculaire et Cellulaire (IBMC), F-67084 Strasbourg, France
- Anne Catherine Helfer
- Architecture et Réactivité de l’ARN, Université Louis Pasteur, Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Moléculaire et Cellulaire (IBMC), F-67084 Strasbourg, France
- Yvonne Benito
- Institut National pour la Recherche Médicale (INSERM) E0230, Université Lyon 1, Centre National de Référence des Staphylocoques, Faculté Laennec, Lyon, F-69008, France
- Alain Jacquier
- Unité de Génétique des Interactions Macromoléculaires, URA 2171-Centre National de la Recherche Scientifique, Institut Pasteur, F-75724 Paris, France
- Christine Gaspin
- Unité de Biométrie et Intelligence Artificielle, Institut National de la Recherche Agronomique (INRA)-UR875 Chemin de Borde-Rouge, F-31326 Castanet-Tolosan, France
- François Vandenesch
- Institut National pour la Recherche Médicale (INSERM) E0230, Université Lyon 1, Centre National de Référence des Staphylocoques, Faculté Laennec, Lyon, F-69008, France
- Pascale Romby
- Architecture et Réactivité de l’ARN, Université Louis Pasteur, Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Moléculaire et Cellulaire (IBMC), F-67084 Strasbourg, France
- Corresponding author. E-MAIL . FAX: 33-388602218
31
Abstract
MOTIVATION Searching for RNA gene occurrences in genomic sequences is a task whose importance has been renewed by the recent discovery of numerous functional RNAs, often interacting with other ligands. Although several programs exist for RNA motif search, none can represent and solve the problem of searching for occurrences of RNA motifs in interaction with other molecules. RESULTS We present a constraint network formulation of this problem. RNAs are represented as structured motifs that can occur on more than one sequence and that are related to one another by possible hybridization. The implemented tool MilPat is used to search for several sRNA families in genomic sequences. Results show that MilPat allows efficient searching for interacting motifs in large genomic sequences and offers a simple and extensible framework to solve such problems. New and known sRNAs are identified as H/ACA candidates in Methanocaldococcus jannaschii. AVAILABILITY http://carlit.toulouse.inra.fr/MilPaT/MilPat.pl.
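The idea of two motifs on different sequences being tied together by a hybridization constraint can be illustrated with a small brute-force sketch. This is not MilPat's constraint network solver; it only enumerates pairs of windows that could base-pair in antiparallel orientation. The function names, the inclusion of G-U wobble pairs, the window length and the toy sequences are all assumptions made for illustration.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_hybridize(x, y):
    """True if x and y (both written 5'->3') can pair in antiparallel orientation."""
    return len(x) == len(y) and all((a, b) in PAIRS for a, b in zip(x, reversed(y)))


def interacting_sites(guide, target, k):
    """Enumerate (i, j) such that guide[i:i+k] can hybridize with target[j:j+k]."""
    hits = []
    for i in range(len(guide) - k + 1):
        for j in range(len(target) - k + 1):
            if can_hybridize(guide[i:i + k], target[j:j + k]):
                hits.append((i, j))
    return hits


# Hypothetical toy sequences: guide[2:8] is complementary to target[2:8].
guide = "AAGUACGUAA"
target = "CCACGUACCC"
print(interacting_sites(guide, target, 6))
```

A constraint-based tool would prune this search using the other constraints of the motif (helices, spacers, conserved boxes) rather than scanning every window pair, which is what makes the approach practical on large genomic sequences.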
Affiliation(s)
- P Thébault
- Unité de Biométrie & Intelligence Artificielle, INRA, Chemin de Borde Rouge Auzeville, BP 52627, 31326 Castanet-Tolosan, France
32
Renalier MH, Joseph N, Gaspin C, Thebault P, Mougin A. The Cm56 tRNA modification in archaea is catalyzed either by a specific 2'-O-methylase, or a C/D sRNP. RNA 2005; 11:1051-63. [PMID: 15987815 PMCID: PMC1370790 DOI: 10.1261/rna.2110805] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 05/03/2023]
Abstract
We identified the first archaeal tRNA ribose 2'-O-methylase, aTrm56, belonging to the Cluster of Orthologous Groups (COG) 1303 that contains archaeal genes only. The corresponding protein exhibits a SPOUT S-adenosylmethionine (AdoMet)-dependent methyltransferase domain found in bacterial and yeast G18 tRNA 2'-O-methylases (SpoU, Trm3). We cloned the Pyrococcus abyssi PAB1040 gene belonging to this COG, expressed and purified the corresponding protein, and showed that in vitro, it specifically catalyzes the AdoMet-dependent 2'-O-ribose methylation of C at position 56 in tRNA transcripts. This tRNA methylation is present only in archaea, and the gene for this enzyme is present in all the archaeal genomes sequenced up to now, except in the crenarchaeon Pyrobaculum aerophilum. In this archaeon, the C56 2'-O-methylation is provided by a C/D sRNP. Our work is the first demonstration that, within the same kingdom, two different mechanisms are used to modify the same nucleoside in tRNAs.
MESH Headings
- Amino Acid Sequence
- Catalysis
- Cloning, Molecular
- Consensus Sequence
- Cytosine/metabolism
- Escherichia coli/genetics
- Genome, Archaeal
- Glutathione Transferase/metabolism
- Kinetics
- Molecular Sequence Data
- Molecular Weight
- Open Reading Frames
- Phylogeny
- Protein Structure, Secondary
- Pyrobaculum/genetics
- Pyrobaculum/metabolism
- Pyrococcus abyssi/enzymology
- Pyrococcus abyssi/genetics
- RNA, Archaeal/chemistry
- RNA, Archaeal/genetics
- RNA, Archaeal/metabolism
- RNA, Small Nucleolar/genetics
- RNA, Small Nucleolar/metabolism
- RNA, Transfer/chemistry
- RNA, Transfer/metabolism
- Recombinant Proteins/chemistry
- Recombinant Proteins/isolation & purification
- Recombinant Proteins/metabolism
- Sequence Homology, Amino Acid
- Substrate Specificity
- Temperature
- tRNA Methyltransferases/chemistry
- tRNA Methyltransferases/classification
- tRNA Methyltransferases/genetics
- tRNA Methyltransferases/metabolism
Affiliation(s)
- Marie-Hélène Renalier
- IEFG 109, Laboratoire de Biologie Moléculaire des Eucaryotes, (LBME) UMR CNRS/UHP 5099 118, route de Narbonne, 31062 Toulouse Cedex 02, France
33
Aubourg S, Brunaud V, Bruyère C, Cock M, Cooke R, Cottet A, Couloux A, Déhais P, Deléage G, Duclert A, Echeverria M, Eschbach A, Falconet D, Filippi G, Gaspin C, Geourjon C, Grienenberger JM, Houlné G, Jamet E, Lechauve F, Leleu O, Leroy P, Mache R, Meyer C, Nedjari H, Negrutiu I, Orsini V, Peyretaillade E, Pommier C, Raes J, Risler JL, Rivière S, Rombauts S, Rouzé P, Schneider M, Schwob P, Small I, Soumayet-Kampetenga G, Stankovski D, Toffano C, Tognolli M, Caboche M, Lecharny A. GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts. Nucleic Acids Res 2005; 33:D641-6. [PMID: 15608279 PMCID: PMC540069 DOI: 10.1093/nar/gki115] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022] Open
Abstract
Genomic projects depend heavily on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach since many features obtained for one gene can be extrapolated to some or all of the other genes of a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Prot.
Affiliation(s)
- Sébastien Aubourg
- Unité de Recherche en Génomique Végétale (INRA/CNRS/UEVE) 2 Rue Gaston Crémieux, CP 5708, 91057 Evry Cedex, France.
34
Clouet-d'Orval B, Gaspin C, Mougin A. Two different mechanisms for tRNA ribose methylation in Archaea: a short survey. Biochimie 2005; 87:889-95. [PMID: 16164996 DOI: 10.1016/j.biochi.2005.02.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 12/31/2004] [Accepted: 02/10/2005] [Indexed: 10/25/2022]
Abstract
The biogenesis of tRNA involves multiple reactions including post-transcriptional modifications and pre-tRNA splicing. Among the three domains of life, only Archaea have two different mechanisms for tRNA ribose methylation: site-specific 2'-O-methyltransferases and C/D guide RNA machinery. Recently, the first archaeal tRNA 2'-O-methyltransferase, aTrm56, has been characterized. This enzyme is found in all archaeal genomes sequenced so far except one and belongs to the SPOUT family (class IV) of RNA methyltransferases. Its substrate is the conserved C56 in the T-loop of archaeal tRNAs. In the crenarchaeon Pyrobaculum aerophilum, in which no homologue of this methyltransferase is found, a box C/D guide sRNP ensures the ribose methylation of C56. Moreover, a new twist on tRNA processing is the finding, in most euryarchaeal tRNATrp genes, of a box C/D guide RNA within their intron specifying methylation at two sites. Modification of tRNA is an integral part of the complex maturation process of primary tRNA transcripts. In addition to their role in modification, both modification enzymes and C/D guide RNPs may have a chaperone function ensuring the precise folding of the mature, functional tRNA.
Affiliation(s)
- Béatrice Clouet-d'Orval
- Laboratoire de Microbiologie Génétique et Moléculaire, UMR5100 Université Paul-Sabatier, 118, route de Narbonne, 31062 Toulouse, France.
35
Salanoubat M, Genin S, Artiguenave F, Gouzy J, Mangenot S, Arlat M, Billault A, Brottier P, Camus JC, Cattolico L, Chandler M, Choisne N, Claudel-Renard C, Cunnac S, Demange N, Gaspin C, Lavie M, Moisan A, Robert C, Saurin W, Schiex T, Siguier P, Thébault P, Whalen M, Wincker P, Levy M, Weissenbach J, Boucher CA. Genome sequence of the plant pathogen Ralstonia solanacearum. Nature 2002; 415:497-502. [PMID: 11823852 DOI: 10.1038/415497a] [Citation(s) in RCA: 608] [Impact Index Per Article: 27.6] [Indexed: 12/12/2022]
Abstract
Ralstonia solanacearum is a devastating, soil-borne plant pathogen with a global distribution and an unusually wide host range. It is a model system for the dissection of molecular determinants governing pathogenicity. We present here the complete genome sequence of strain GMI1000 and its analysis. The 5.8-megabase (Mb) genome is organized into two replicons: a 3.7-Mb chromosome and a 2.1-Mb megaplasmid. Both replicons have a mosaic structure providing evidence for the acquisition of genes through horizontal gene transfer. Regions containing genetically mobile elements associated with the percentage of G+C bias may have an important function in genome evolution. The genome encodes many proteins potentially associated with a role in pathogenicity. In particular, many putative attachment factors were identified. The complete repertoire of type III secreted effector proteins can be studied. Over 40 candidates were identified. Comparison with other genomes suggests that bacterial plant pathogens and animal pathogens harbour distinct arrays of specialized type III-dependent effectors.
Affiliation(s)
- M Salanoubat
- Genoscope and CNRS UMR-8030, 2 rue Gaston Crémieux, CP5706, 91057 Evry Cedex, France
36
Clouet d'Orval B, Bortolin ML, Gaspin C, Bachellerie JP. Box C/D RNA guides for the ribose methylation of archaeal tRNAs. The tRNATrp intron guides the formation of two ribose-methylated nucleosides in the mature tRNATrp. Nucleic Acids Res 2001; 29:4518-29. [PMID: 11713301 PMCID: PMC92551 DOI: 10.1093/nar/29.22.4518] [Citation(s) in RCA: 122] [Impact Index Per Article: 5.3] [Indexed: 11/13/2022] Open
Abstract
Following a search of the Pyrococcus genomes for homologs of eukaryotic methylation guide small nucleolar RNAs, we have experimentally identified in Pyrococcus abyssi four novel box C/D small RNAs predicted to direct 2'-O-ribose methylations onto the first position of the anticodon in tRNALeu(CAA), tRNALeu(UAA), elongator tRNAMet and tRNATrp, respectively. Remarkably, one of them corresponds to the intron of its presumptive target, pre-tRNATrp. This intron is predicted to direct in cis two distinct ribose methylations within the unspliced tRNA precursor, not only onto the first position of the anticodon in the 5' exon but also onto position 39 (universal tRNA numbering) in the 3' exon. The two intramolecular RNA duplexes expected to direct methylation, which both span an exon-intron junction in pre-tRNATrp, are phylogenetically conserved in euryarchaeotes. We have experimentally confirmed the predicted guide function of the box C/D intron in halophile Haloferax volcanii by mutagenesis analysis, using an in vitro splicing/RNA modification assay in which the two cognate ribose methylations of pre-tRNATrp are faithfully reproduced. Euryarchaeal pre-tRNATrp should provide a unique system to further investigate the molecular mechanisms of RNA-guided ribose methylation and gain new insights into the origin and evolution of the complex family of archaeal and eukaryotic box C/D small RNAs.
MESH Headings
- Base Sequence
- DNA, Archaeal/chemistry
- DNA, Archaeal/genetics
- Genome, Archaeal
- Introns/genetics
- Methylation
- Molecular Sequence Data
- Mutation
- Nucleic Acid Conformation
- Nucleosides/genetics
- Nucleosides/metabolism
- Nucleotides/genetics
- Nucleotides/metabolism
- Phylogeny
- Plasmids/genetics
- Pyrococcus/genetics
- Pyrococcus/metabolism
- RNA, Archaeal/chemistry
- RNA, Archaeal/genetics
- RNA, Archaeal/metabolism
- RNA, Small Nucleolar/genetics
- RNA, Small Nucleolar/metabolism
- RNA, Transfer/chemistry
- RNA, Transfer/genetics
- RNA, Transfer/metabolism
- RNA, Transfer, Trp/genetics
- RNA, Transfer, Trp/metabolism
- Ribose/metabolism
- Sequence Alignment
- Sequence Analysis, DNA
- Sequence Homology, Nucleic Acid
Affiliation(s)
- B Clouet d'Orval
- Laboratoire de Biologie Moléculaire Eucaryote, UMR5099 du CNRS, Université Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse, France
37
Barneche F, Gaspin C, Guyot R, Echeverría M. Identification of 66 box C/D snoRNAs in Arabidopsis thaliana: extensive gene duplications generated multiple isoforms predicting new ribosomal RNA 2'-O-methylation sites. J Mol Biol 2001; 311:57-73. [PMID: 11469857 DOI: 10.1006/jmbi.2001.4851] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.5] [Indexed: 11/22/2022]
Abstract
Dozens of box C/D small nucleolar RNAs (snoRNAs) that specifically target ribosomal RNA sites for 2'-O-ribose methylation have recently been found in eukaryotes (vertebrates, yeast), ancient eukaryotes (trypanosomes) and archaea. Although early biochemical data revealed that plant rRNAs are among the most highly ribomethylated in eukaryotes, only a handful of methylation guide snoRNAs have been characterized in this kingdom. We report 66 novel box C/D snoRNAs identified by computational screening of Arabidopsis genomic sequences that are expressed in vivo from either single genes, 17 different clusters or three introns. At the structural level, many box C/D snoRNAs have dual antisense elements often matching rRNA regions close to each other on the rRNA secondary structure, which is reminiscent of their archaeal counterparts. Remarkable specimens are found that display two antisense elements having the potential to form an extended snoRNA-rRNA duplex of 23 to 30 nt, in line with the hypothetical function of box C/D snoRNAs in pre-rRNA folding or chaperoning. In contrast to other species, many Arabidopsis snoRNAs are found in multiple isoforms mainly resulting from two different mechanisms: large chromosomal duplications and small tandem duplications producing polycistronic genes. The discovery of numerous different snoRNAs, some of them arising from common ancestors, provides new insights into snoRNA evolution and the birth of new rRNA methylation sites in plants and other organisms.
MESH Headings
- Arabidopsis/genetics
- Base Sequence
- Chromosomes/genetics
- Computational Biology
- Evolution, Molecular
- Gene Duplication
- Genes, Duplicate/genetics
- Genes, Plant/genetics
- Genetic Variation/genetics
- Methylation
- Molecular Sequence Data
- Nucleic Acid Conformation
- RNA, Antisense/chemistry
- RNA, Antisense/genetics
- RNA, Antisense/metabolism
- RNA, Plant/chemistry
- RNA, Plant/genetics
- RNA, Plant/metabolism
- RNA, Ribosomal/chemistry
- RNA, Ribosomal/genetics
- RNA, Ribosomal/metabolism
- RNA, Small Nucleolar/chemistry
- RNA, Small Nucleolar/classification
- RNA, Small Nucleolar/genetics
- RNA, Small Nucleolar/metabolism
- Reverse Transcriptase Polymerase Chain Reaction
- Ribose/chemistry
- Ribose/metabolism
- Ribosomal Proteins/metabolism
- Tandem Repeat Sequences/genetics
Affiliation(s)
- F Barneche
- Laboratoire Génome et Développement des Plantes, Université de Perpignan, UMR CNRS 5096, 52 Avenue de Villeneuve, Perpignan Cedex, 66860, France
38
Gaspin C, Cavaillé J, Erauso G, Bachellerie JP. Archaeal homologs of eukaryotic methylation guide small nucleolar RNAs: lessons from the Pyrococcus genomes. J Mol Biol 2000; 297:895-906. [PMID: 10736225 DOI: 10.1006/jmbi.2000.3593] [Citation(s) in RCA: 141] [Impact Index Per Article: 5.9] [Indexed: 11/22/2022]
Abstract
Ribose methylation is a prevalent type of nucleotide modification in rRNA. Eukaryotic rRNAs display a complex pattern of ribose methylations, amounting to 55 in yeast Saccharomyces cerevisiae and about 100 in vertebrates. Ribose methylations of eukaryotic rRNAs are each guided by a cognate small RNA, belonging to the family of box C/D antisense snoRNAs, through transient formation of a specific base-pairing at the rRNA modification site. In prokaryotes, the pattern of rRNA ribose methylations has been fully characterized in a single species so far, Escherichia coli, which contains only four ribose methylated rRNA nucleotides. However, the hyperthermophile archaeon Sulfolobus solfataricus contains, like eukaryotes, a large number of (yet unmapped) rRNA ribose methylations and homologs of eukaryotic box C/D small nucleolar ribonuclear proteins have been identified in archaeal genomes. We have therefore searched archaeal genomes for potential homologs of eukaryotic methylation guide small nucleolar RNAs, by combining searches for structured motifs with homology searches. We have identified a family of 46 small RNAs, conserved in the genomes of three hyperthermophile Pyrococcus species, which we have experimentally characterized in Pyrococcus abyssi. The Pyrococcus small RNAs, the first reported homologs of methylation guide small nucleolar RNAs in organisms devoid of a nucleus, appear as a paradigm of minimalist box C/D antisense RNAs. They differ from their eukaryotic homologs by their outstanding structural homogeneity, extended consensus box motifs and the quasi-systematic presence of two (instead of one) rRNA antisense elements. Remarkably, for each small RNA the two antisense elements always match rRNA sequences close to each other in rRNA structure, suggesting an important role in rRNA folding. Only a few of the predicted P. abyssi rRNA ribose methylations have been detected so far. Further analysis of these archaeal small RNAs could provide new insights into the origin and functions of methylation guide small nucleolar RNAs and illuminate the still elusive role of rRNA ribose methylations.
MESH Headings
- Base Sequence
- Consensus Sequence/genetics
- Databases, Factual
- Eukaryotic Cells/metabolism
- Genes, Archaeal/genetics
- Genome, Archaeal
- Methylation
- Molecular Sequence Data
- Nucleic Acid Conformation
- Open Reading Frames/genetics
- Physical Chromosome Mapping
- Pyrococcus/genetics
- RNA, Antisense/genetics
- RNA, Antisense/metabolism
- RNA, Archaeal/chemistry
- RNA, Archaeal/genetics
- RNA, Archaeal/metabolism
- RNA, Ribosomal/chemistry
- RNA, Ribosomal/genetics
- RNA, Ribosomal/metabolism
- RNA, Small Nucleolar/genetics
- RNA, Small Nucleolar/metabolism
- Ribose/metabolism
- Sequence Homology, Nucleic Acid
- Software
Affiliation(s)
- C Gaspin
- Laboratoire de Biométrie et Intelligence Artificielle, INRA, Castanet-Tolosan, 31326, France
39
Laurent V, Wajnberg E, Mangin B, Schiex T, Gaspin C, Vanlerberghe-Masutti F. A composite genetic map of the parasitoid wasp Trichogramma brassicae based on RAPD markers. Genetics 1998; 150:275-82. [PMID: 9725846 PMCID: PMC1460326 DOI: 10.1093/genetics/150.1.275] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
Three linkage maps of the genome of the microhymenopteran Trichogramma brassicae were constructed from the analysis of segregation of random amplified polymorphic DNA markers in three F2 populations. These populations were composed of the haploid male progeny of several virgin F1 females, which resulted from the breeding of four parental lines that were nearly fixed for different random amplified polymorphic DNA markers and that were polymorphic for longevity and fecundity characters. As the order of markers common to the three mapping populations was found to be well conserved, a composite linkage map was constructed. Eighty-four markers were organized into five linkage groups and two pairs. The mean interval between two markers was 17.7 cM, and the map spanned 1330 cM.
Affiliation(s)
- V Laurent
- Laboratoire de Biologie des Invertébrés, Biologie des Populations, INRA, 06606 Antibes, France
40
Chetouani F, Monestié P, Thébault P, Gaspin C, Michot B. ESSA: an integrated and interactive computer tool for analysing RNA secondary structure. Nucleic Acids Res 1997; 25:3514-22. [PMID: 9254713 PMCID: PMC146922 DOI: 10.1093/nar/25.17.3514] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023] Open
Abstract
With ESSA, we propose an approach to RNA secondary structure analysis based on extensive viewing within a friendly graphical interface. This computer program is organized around the display of folding models produced by two complementary methods suitable for drawing long RNA molecules. Any feature of interest can be managed directly on the display and highlighted by a rich combination of colours and symbols, with emphasis given to structural probe accessibilities. ESSA also includes a word searching procedure allowing easy visual identification of structural features, even complex and degenerate ones. Analysis functions make it possible to calculate the thermodynamic stability of any part of a folding using several models and to compare homologous aligned RNAs in both primary and secondary structure. The predictive capacities of ESSA, which brings together the experimental, thermodynamic and comparative methods, are increased by coupling it with a program dedicated to RNA folding prediction based on constraints management and propagation. The potentialities of ESSA are illustrated by the identification of a possible tertiary motif in the LSU rRNA and the visualization of a pseudoknot in S15 mRNA.
Affiliation(s)
- F Chetouani
- Laboratoire de Biologie Moléculaire Eucaryote du C.N.R.S., Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse Cedex, France
41
Schiex T, Gaspin C. CARTHAGENE: constructing and joining maximum likelihood genetic maps. Proc Int Conf Intell Syst Mol Biol 1997; 5:258-267. [PMID: 9322047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2023]
Abstract
Genetic mapping is an important step in the study of any organism. An accurate genetic map is extremely valuable for locating genes or, more generally, either qualitative or quantitative trait loci (QTL). This paper presents a new approach to two important problems in genetic mapping: automatically ordering markers to obtain a multipoint maximum likelihood map and building a multipoint maximum likelihood map using pooled data from several crosses. The approach is embodied in a hybrid algorithm that mixes the statistical optimization algorithm EM with local search techniques which have been developed in the artificial intelligence and operations research communities. An efficient implementation of the EM algorithm provides maximum likelihood recombination fractions, while the local search techniques look for orders that maximize this maximum likelihood. The specificity of the approach lies in the neighborhood structure used in the local search algorithms, which has been inspired by an analogy between the marker ordering problem and the famous traveling salesman problem. The approach has been used to build joined maps for the wasp Trichogramma brassicae and on random pooled data sets. In both cases, it compares quite favorably with existing software as far as maximum likelihood is considered a significant criterion.
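The analogy between marker ordering and the traveling salesman problem can be sketched with a small local search. This is an illustration only, not CARTHAGENE's algorithm: the multipoint EM likelihood is replaced by a simple proxy (the sum of two-point recombination fractions between adjacent markers), and 2-opt segment reversal, a classic traveling-salesman neighborhood, stands in for whatever neighborhood the authors actually use. The function names, the marker positions and the starting order are all hypothetical.

```python
import math


def haldane_rf(d_morgans):
    """Haldane map function: map distance in Morgans -> recombination fraction."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def order_cost(order, rf):
    """Proxy objective: sum of recombination fractions between adjacent markers."""
    return sum(rf[a][b] for a, b in zip(order, order[1:]))


def two_opt(order, rf):
    """Reverse segments of the order for as long as a reversal lowers the cost."""
    best = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            for j in range(i + 1, len(best)):
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if order_cost(cand, rf) < order_cost(best, rf) - 1e-12:
                    best, improved = cand, True
    return best


# Hypothetical marker positions (Morgans) used to simulate two-point estimates.
positions = [0.0, 0.1, 0.25, 0.4, 0.6, 0.9]
rf = [[haldane_rf(abs(p - q)) for q in positions] for p in positions]

start = [3, 0, 5, 1, 4, 2]          # arbitrary scrambled starting order
result = two_opt(start, rf)
print(result, round(order_cost(result, rf), 4))
```

In a likelihood-based mapper, `order_cost` would be replaced by the negative multipoint log-likelihood returned by EM for each candidate order, which is far more expensive to evaluate and is the reason an efficient EM implementation matters.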
Affiliation(s)
- T Schiex
- Institut National de la Recherche Agronomique, Castanet-Tolosan, France.
|
42
|
Abstract
A novel approach aiding in the prediction of RNA secondary structures is presented. Although phylogenetic methods are the most successful at deriving RNA secondary structures, they are not applicable when the number of sequences or the sequence variability is too low. Methods based on energy minimization are therefore of great interest. However, some of the suboptimal RNA secondary structures computed with classic methods are unsaturated structures, i.e. some structures are included in others. Thus, the incorporation of constraints during the process of folding is not possible, while the incorporation of constraints before the process of folding often introduces a bias into the energy function. This paper describes a new procedure which allows for the incorporation of constraints before and during the process of RNA folding. SAPSSARN is an interactive program which offers a framework both to specify a secondary structure through a set of folding constraints and to compute all the suboptimal saturated RNA secondary structures which satisfy all the folding constraints. At the start, it relies on the computation of the probabilities of pairing of each base with all others according to McCaskill's algorithm. The constraint satisfaction formulation of the problem deals dynamically with a chosen set of folding constraints and, finally, a search algorithm computes all the suboptimal saturated secondary structures which satisfy those folding constraints. Within such a framework, it is possible to test new ideas about RNA folding, and secondary structures, including pseudoknots, can be computed. The program is illustrated with RNA sequences on which we obtained results in agreement with known structures by using a protocol which mimics the hierarchical folding of RNA molecules.
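The notion of a saturated structure can be made concrete with a small sketch. It assumes one reading of the abstract: a structure is saturated when no further candidate base pair can be added to it, so it is not included in a larger admissible structure. This is not SAPSSARN's algorithm. The function names are hypothetical, and crossing pairs are treated as incompatible for simplicity, although the program itself is said to handle pseudoknots.

```python
def compatible(p, q):
    """Two base pairs (i, j) with i < j are compatible if they share no base
    and do not cross (simplifying assumption: pseudoknots are excluded)."""
    (i, j), (k, l) = sorted([p, q])
    if len({i, j, k, l}) < 4:
        return False  # the two pairs share a base
    # After sorting, i < k, so the pairs cross exactly when i < k < j < l.
    return not (i < k < j < l)


def is_saturated(structure, candidates):
    """True if no candidate pair outside the structure is compatible
    with every pair already in it."""
    chosen = set(structure)
    return not any(
        c not in chosen and all(compatible(c, p) for p in chosen)
        for c in candidates
    )


# Hypothetical candidate pairs, e.g. those passing a pairing-probability cutoff.
candidates = [(1, 10), (2, 9), (3, 8), (12, 20)]
print(is_saturated([(1, 10), (2, 9)], candidates))
print(is_saturated(candidates, candidates))
```

In this reading, the folding constraints mentioned in the abstract would restrict which candidate pairs are admissible before saturation is tested.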
Affiliation(s)
- C Gaspin
- SBIA/INRA Chemin de Borde Rouge, Castanet Tolosan, France
|
43
|
Abstract
A program for automatically drawing exact and schematic views of nucleic acids is described. The program is written in ANSI C and uses the Silicon Graphics GL and Xirisw libraries within the X11/Motif environment. Through menus, the user can choose, specify, and manipulate in real time the three-dimensional views to be displayed. Drawing options include partitioning of structures into differently colored or shaped fragments, representation of backbones as flat or with conic-section ribbons, display of paired or free bases as rods, display of surfaces as filled or outlined, and stereo or depth-cued views.
Affiliation(s)
- C Massire
- Equipe de Modélisation et de Simulation des Acides Nucléiques, Institut de Biologie Moléculaire et Cellulaire du CNRS, Strasbourg, France
|
44
|
Abstract
A set of programs written in C with the GL library under UNIX has been developed for generating compact, pleasant and non-overlapping displays of secondary structures of ribonucleic acids. The first program, rnasearch, implements a new search procedure that dynamically rearranges overlapping portions of the two-dimensional drawing while preserving clear and readable displays of the two-dimensional structure. The algorithm is fast (the execution time for the command rnasearch is 38.6 s for the 16S rRNA of Escherichia coli with 1542 bases) and accepts outputs from two-dimensional prediction programs, and therefore allows for rapid comparison between the various two-dimensional folds generated. A second program, rnadisplay, allows the graphical display of the computed two-dimensional structures on a graphics workstation. Alternatively, a paper output of the two-dimensional structure can be obtained with the program print2D, which builds a PostScript file. Moreover, the two-dimensional drawing can be labelled to represent data coming from chemical modifications and/or enzymatic cleavages. Applications to a few secondary structures such as RNaseP, 5S rRNA and 16S rRNA are given.
MESH Headings
- Algorithms
- Bacillus subtilis/chemistry
- Bacillus subtilis/genetics
- Base Sequence
- Computer Graphics
- Endoribonucleases/genetics
- Escherichia coli/chemistry
- Escherichia coli/genetics
- Escherichia coli Proteins
- Molecular Sequence Data
- Nucleic Acid Conformation
- RNA/chemistry
- RNA/genetics
- RNA, Bacterial/genetics
- RNA, Catalytic/genetics
- RNA, Ribosomal, 16S/chemistry
- RNA, Ribosomal, 16S/genetics
- RNA, Ribosomal, 5S/chemistry
- RNA, Ribosomal, 5S/genetics
- Ribonuclease P
- Software
Affiliation(s)
- G Muller
- UPR Structure des Macromolécules Biologiques et Mécanismes de Reconnaissance, Institut de Biologie Moléculaire et Cellulaire du CNRS, Strasbourg, France
|