1
Yu D, Pei Y, Cui N, Zhao G, Hou M, Chen Y, Chen J, Li X. Comparative and phylogenetic analysis of complete chloroplast genome sequences of Salvia regarding its worldwide distribution. Sci Rep 2023; 13:14268. [PMID: 37652950 PMCID: PMC10471775 DOI: 10.1038/s41598-023-41198-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 08/23/2023] [Indexed: 09/02/2023] Open
Abstract
Salvia is widely used as a medicinal, food, and ornamental plant all over the world, with three main distribution centers: Central and western Asia/Mediterranean (CAM), East Asia (EA), and Central and South America (CASA). Along with its large number of species and worldwide distribution, Salvia is paraphyletic and highly diverse. Chloroplast genomes (CPs) are useful tools for analyzing the phylogeny of plants at lower taxonomic levels. In this study, we reported the chloroplast genomes of five species of Salvia and performed a phylogenetic analysis with the currently available CPs of Salvia. Repeated sequence analysis and comparative analysis of Salvia CPs were also performed with representative species from the different distribution centers. The results showed that the genetic characters of the CPs are related to the geographic distribution of the plants. Species from CAM diverged first to form a separate group, followed by species from EA, and finally species from CASA. Larger variations of CPs were observed in species from CAM, whereas more deficient sequences and fewer repeated sequences in the CPs were observed in species from CASA. These results provide valuable information on the development and utilization of the worldwide genetic resources of Salvia.
Affiliation(s)
- Dade Yu
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Yifei Pei
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
- Ning Cui
  - Shandong Academy of Chinese Medicine, Jinan, 250014, China
- Guiping Zhao
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
  - College of Traditional Chinese Medicine, Yunnan University of Chinese Medicine, Kunming, 650500, China
- Mengmeng Hou
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
  - College of Pharmacy, Henan University of Chinese Medicine, Zhengzhou, 450046, China
- Yingying Chen
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
  - College of Traditional Chinese Medicine, Yunnan University of Chinese Medicine, Kunming, 650500, China
- Jialei Chen
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China
  - College of Pharmacy, Henan University of Chinese Medicine, Zhengzhou, 450046, China
- Xiwen Li
  - Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China.
  - College of Traditional Chinese Medicine, Yunnan University of Chinese Medicine, Kunming, 650500, China.
  - College of Pharmacy, Henan University of Chinese Medicine, Zhengzhou, 450046, China.
2
Ezoe A, Iuchi S, Sakurai T, Aso Y, Tokunaga H, Vu AT, Utsumi Y, Takahashi S, Tanaka M, Ishida J, Ishitani M, Seki M. Fully sequencing the cassava full-length cDNA library reveals unannotated transcript structures and alternative splicing events in regions with a high density of single nucleotide variations, insertions-deletions, and heterozygous sequences. PLANT MOLECULAR BIOLOGY 2023; 112:33-45. [PMID: 37014509 DOI: 10.1007/s11103-023-01346-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 02/27/2023] [Indexed: 05/09/2023]
Abstract
The primary transcript structure provides critical insights into protein diversity, transcriptional modification, and functions. Cassava transcript structures are highly diverse because of alternative splicing (AS) events and high heterozygosity. To precisely determine and characterize transcript structures, fully sequencing cloned transcripts is the most reliable method. However, cassava annotations were mainly determined according to fragmentation-based sequencing analyses (e.g., EST and short-read RNA-seq). In this study, we sequenced the cassava full-length cDNA library, which included rare transcripts. We obtained 8,628 non-redundant fully sequenced transcripts and detected 615 unannotated AS events and 421 unannotated loci. The different protein sequences resulting from the unannotated AS events tended to have diverse functional domains, implying that unannotated AS contributes to the truncation of functional domains. The unannotated loci tended to be derived from orphan genes, implying that the loci may be associated with cassava-specific traits. Unexpectedly, individual cassava transcripts were more likely to have multiple AS events than Arabidopsis transcripts, suggestive of the regulated interactions between cassava splicing-related complexes. We also observed that the unannotated loci and/or AS events were commonly in regions with abundant single nucleotide variations, insertions-deletions, and heterozygous sequences. These findings reflect the utility of completely sequenced FLcDNA clones for overcoming cassava-specific annotation-related problems to elucidate transcript structures. Our work provides researchers with transcript structural details that are useful for annotating highly diverse and unique transcripts and alternative splicing events.
Affiliation(s)
- Akihiro Ezoe
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
- Satoshi Iuchi
  - Experimental Plant Division, RIKEN BioResource Research Center, Tsukuba, Ibaraki, 305-0074, Japan
- Tetsuya Sakurai
  - Multidisciplinary Science Cluster, Interdisciplinary Science Unit, Kochi University, Nankoku, Kochi, 783-8502, Japan
- Yukie Aso
  - Experimental Plant Division, RIKEN BioResource Research Center, Tsukuba, Ibaraki, 305-0074, Japan
- Hiroki Tokunaga
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
  - Tropical Agriculture Research Front, Japan International Research Center for Agricultural Sciences, Ishigaki, Okinawa, 907-0002, Japan
- Anh Thu Vu
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
- Yoshinori Utsumi
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
- Satoshi Takahashi
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
  - Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
- Maho Tanaka
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
  - Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
- Junko Ishida
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan
  - Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
- Manabu Ishitani
  - International Center for Tropical Agriculture (CIAT), Km 17, Recta Cali-Palmira Apartado Aéreo 6713, Cali, Colombia
- Motoaki Seki
  - Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan.
  - Plant Epigenome Regulation Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan.
  - Kihara Institute for Biological Research, Yokohama City University, 641-12 Maioka-cho, Totsuka-ku, Yokohama, Kanagawa, 244-0813, Japan.
3
Eleftheriou E, Aury JM, Vacherie B, Istace B, Belser C, Noel B, Moret Y, Rigaud T, Berro F, Gasparian S, Labadie-Bretheau K, Lefebvre T, Madoui MA. Chromosome-scale assembly of the yellow mealworm genome. OPEN RESEARCH EUROPE 2022; 1:94. [PMID: 37645128 PMCID: PMC10445852 DOI: 10.12688/openreseurope.13987.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 08/30/2022] [Indexed: 03/12/2024]
Abstract
Background: The yellow mealworm beetle, Tenebrio molitor, is a promising alternative protein source for animal and human nutrition, and its farming involves relatively low environmental costs. For these reasons, its industrial-scale production started this century. However, to optimize and breed sustainable new T. molitor lines, access to its genome remains essential. Methods: By combining Oxford Nanopore and Illumina Hi-C data, we constructed a high-quality chromosome-scale assembly of T. molitor. Then, we combined RNA-seq data and available Coleoptera proteomes for gene prediction with GMOVE. Results: We produced a high-quality genome with an N50 of 21.9 Mb and a completeness of 99.5%, and predicted 21,435 genes with a median size of 1,780 bp. Gene orthology between T. molitor and Tribolium castaneum showed highly conserved synteny between the two Coleoptera, and a paralog search revealed an expansion of histones in the T. molitor genome. Conclusions: The present genome will greatly help fundamental and applied research such as genetic breeding and will contribute to the sustainable production of the yellow mealworm.
Affiliation(s)
- Evangelia Eleftheriou
  - Génomique Métabolique, Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Univ Evry, Université Paris-Saclay, Université Paris-Saclay, Evry, 91057, France
- Jean-Marc Aury
  - Génomique Métabolique, Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Univ Evry, Université Paris-Saclay, Université Paris-Saclay, Evry, 91057, France
- Benoît Vacherie
  - Genoscope, Institut de biologie François Jacob, CEA, Université Paris‐Saclay, Evry, 91057, France
- Benjamin Istace
  - Génomique Métabolique, Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Univ Evry, Université Paris-Saclay, Université Paris-Saclay, Evry, 91057, France
- Caroline Belser
  - Génomique Métabolique, Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Univ Evry, Université Paris-Saclay, Université Paris-Saclay, Evry, 91057, France
- Benjamin Noel
  - Génomique Métabolique, Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Univ Evry, Université Paris-Saclay, Université Paris-Saclay, Evry, 91057, France
- Yannick Moret
  - Équipe Écologie Évolutive, UMR CNRS 6282 BioGéoSciences, Université de Bourgogne Franche-Comté, Dijon, 21000, France
- Thierry Rigaud
  - Équipe Écologie Évolutive, UMR CNRS 6282 BioGéoSciences, Université de Bourgogne Franche-Comté, Dijon, 21000, France
- Karine Labadie-Bretheau
  - Genoscope, Institut de biologie François Jacob, CEA, Université Paris‐Saclay, Evry, 91057, France
- Mohammed-Amin Madoui
  - Génomique Métabolique, Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Univ Evry, Université Paris-Saclay, Université Paris-Saclay, Evry, 91057, France
  - Équipe Écologie Évolutive, UMR CNRS 6282 BioGéoSciences, Université de Bourgogne Franche-Comté, Dijon, 21000, France
  - Service d’Etude des Prions et des Infections Atypiques (SEPIA), Institut François Jacob, Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA), Université Paris Saclay, Fontenay-aux-Roses, France
4
Eleftheriou E, Aury JM, Vacherie B, Istace B, Belser C, Noel B, Moret Y, Rigaud T, Berro F, Gasparian S, Labadie-Bretheau K, Lefebvre T, Madoui MA. Chromosome-scale assembly of the yellow mealworm genome. OPEN RESEARCH EUROPE 2022; 1:94. [PMID: 37645128 PMCID: PMC10445852 DOI: 10.12688/openreseurope.13987.3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Accepted: 08/30/2022] [Indexed: 08/31/2023]
5
Mishra B, Ulaszewski B, Meger J, Aury JM, Bodénès C, Lesur-Kupin I, Pfenninger M, Da Silva C, Gupta DK, Guichoux E, Heer K, Lalanne C, Labadie K, Opgenoorth L, Ploch S, Le Provost G, Salse J, Scotti I, Wötzel S, Plomion C, Burczyk J, Thines M. A Chromosome-Level Genome Assembly of the European Beech (Fagus sylvatica) Reveals Anomalies for Organelle DNA Integration, Repeat Content and Distribution of SNPs. Front Genet 2022; 12:691058. [PMID: 35211148 PMCID: PMC8862710 DOI: 10.3389/fgene.2021.691058] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/05/2021] [Accepted: 12/14/2021] [Indexed: 01/14/2023] Open
Abstract
The European Beech is the dominant climax tree in most regions of Central Europe and is valued for its ecological versatility and hardwood timber. Even though a draft genome has been published recently, higher resolution is required for studying aspects of genome architecture and recombination. Here, we present a chromosome-level assembly of the more than 300-year-old reference individual, Bhaga, from the Kellerwald-Edersee National Park (Germany). Its nuclear genome of 541 Mb was resolved into 12 chromosomes varying in length between 28 and 73 Mb. Multiple nuclear insertions of parts of the chloroplast genome were observed, with one region on chromosome 11 spanning more than 2 Mb in which fragments up to 54,784 bp long and covering the whole chloroplast genome were inserted randomly. Unlike in Arabidopsis thaliana, ribosomal cistrons are present in Fagus sylvatica only in four major regions, in line with FISH studies. On most assembled chromosomes, telomeric repeats were found at both ends, while centromeric repeats were found to be scattered throughout the genome apart from their main occurrence per chromosome. The genome-wide distribution of SNPs was evaluated using a second individual from Jamy Nature Reserve (Poland). SNPs, repeat elements and duplicated genes were unevenly distributed in the genomes, with one major anomaly on chromosome 4. The genome presented here adds to the available highly resolved plant genomes, and we hope it will serve as a valuable basis for future research on genome architecture and for understanding the past and future of European Beech populations in a changing climate.
Affiliation(s)
- Bagdevi Mishra
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Frankfurt am Main, Germany
  - Department for Biological Sciences, Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
- Bartosz Ulaszewski
  - Department of Genetics, ul. Chodkiewicza 30, Kazimierz Wielki University, Bydgoszcz, Poland
- Joanna Meger
  - Department of Genetics, ul. Chodkiewicza 30, Kazimierz Wielki University, Bydgoszcz, Poland
- Jean-Marc Aury
  - Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Isabelle Lesur-Kupin
  - INRAE, Univ. Bordeaux, BIOGECO, Cestas, France
  - HelixVenture, Mérignac, France
  - Faculty of Biology, Plant Ecology and Geobotany, Philipps University Marburg, Marburg, Germany
- Markus Pfenninger
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Frankfurt am Main, Germany
- Corinne Da Silva
  - Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Deepak K Gupta
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Frankfurt am Main, Germany
  - Department for Biological Sciences, Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
- Katrin Heer
  - Faculty of Biology, Plant Ecology and Geobotany, Philipps University Marburg, Marburg, Germany
  - Forest Genetics, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
- Karine Labadie
  - Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Lars Opgenoorth
  - Faculty of Biology, Plant Ecology and Geobotany, Philipps University Marburg, Marburg, Germany
- Sebastian Ploch
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Frankfurt am Main, Germany
- Stefan Wötzel
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Frankfurt am Main, Germany
  - Department for Biological Sciences, Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
- Jaroslaw Burczyk
  - Department of Genetics, ul. Chodkiewicza 30, Kazimierz Wielki University, Bydgoszcz, Poland
- Marco Thines
  - Senckenberg Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Frankfurt am Main, Germany
  - Department for Biological Sciences, Institute of Ecology, Evolution and Diversity, Goethe University, Frankfurt am Main, Germany
  - LOEWE Centre for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
6
Lima L, Marchet C, Caboche S, Da Silva C, Istace B, Aury JM, Touzet H, Chikhi R. Comparative assessment of long-read error correction software applied to Nanopore RNA-sequencing data. Brief Bioinform 2021; 21:1164-1181. [PMID: 31232449 DOI: 10.1093/bib/bbz058] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 12/28/2018] [Revised: 04/05/2019] [Accepted: 04/22/2019] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION: Nanopore long-read sequencing technology offers promising alternatives to high-throughput short-read sequencing, especially in the context of RNA-sequencing. However, this technology is currently hindered by high error rates in the output data that affect analyses such as the identification of isoforms, exon boundaries, open reading frames and the creation of gene catalogues. Due to the novelty of such data, computational methods are still actively being developed, and options for the error correction of Nanopore RNA-sequencing long reads remain limited. RESULTS: In this article, we evaluate the extent to which existing long-read DNA error correction methods are capable of correcting cDNA Nanopore reads. We provide an automatic and extensive benchmark tool that reports not only classical error correction metrics but also the effect of correction on gene families, isoform diversity, bias toward the major isoform and splice site detection. We find that long-read error correction tools that were originally developed for DNA are also suitable for the correction of Nanopore RNA-sequencing data, especially in terms of increasing base pair accuracy. Yet investigators should be warned that the correction process perturbs gene family sizes and isoform diversity. This work provides guidelines on which (or whether) error correction tools should be used, depending on the application type. BENCHMARKING SOFTWARE: https://gitlab.com/leoisl/LR_EC_analyser.
Affiliation(s)
- Leandro Lima
  - Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR Villeurbanne, France
  - EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
  - Università di Roma 'Tor Vergata', Roma, Italy
- Ségolène Caboche
  - Université de Lille, CNRS, Inserm, CHU Lille, Institut Pasteur de Lille, UMR, Center for Infection and Immunity of Lille, Lille, France
- Corinne Da Silva
  - Genoscope, Institut de biologie Francois-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
- Benjamin Istace
  - Genoscope, Institut de biologie Francois-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
- Jean-Marc Aury
  - Genoscope, Institut de biologie Francois-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, Evry, France
- Hélène Touzet
  - CNRS, Université de Lille, CRIStAL UMR, Lille, France
- Rayan Chikhi
  - CNRS, Université de Lille, CRIStAL UMR, Lille, France
  - Institut Pasteur, C3BI - USR 3756, 25-28 rue du Docteur Roux, Paris, France
7
Belser C, Baurens FC, Noel B, Martin G, Cruaud C, Istace B, Yahiaoui N, Labadie K, Hřibová E, Doležel J, Lemainque A, Wincker P, D'Hont A, Aury JM. Telomere-to-telomere gapless chromosomes of banana using nanopore sequencing. Commun Biol 2021; 4:1047. [PMID: 34493830 PMCID: PMC8423783 DOI: 10.1038/s42003-021-02559-3] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Received: 06/11/2021] [Accepted: 08/13/2021] [Indexed: 02/07/2023] Open
Abstract
Long-read technologies hold the promise of producing more complete genome assemblies and of making them easier to obtain. Coupled with long-range technologies, they can reveal the architecture of complex regions, like centromeres or rDNA clusters. These technologies also make it possible to determine the complete organization of chromosomes, which previously remained difficult even when using genetic maps. However, generating a gapless and telomere-to-telomere assembly is still not trivial, and requires a combination of several technologies and the choice of suitable software. Here, we report a chromosome-scale assembly of a banana genome (Musa acuminata) generated using Oxford Nanopore long reads. We generated a genome coverage of 177X from a single PromethION flowcell, with nearly 17X from reads longer than 75 kbp. Of the 11 chromosomes, 5 were entirely reconstructed in a single contig from telomere to telomere, revealing for the first time the content of complex regions like centromeres or clusters of paralogous genes.
Affiliation(s)
- Caroline Belser
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Franc-Christophe Baurens
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Benjamin Noel
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Guillaume Martin
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Corinne Cruaud
  - Commissariat à l'Energie Atomique (CEA), Institut François Jacob, Genoscope, Evry, France
- Benjamin Istace
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Nabila Yahiaoui
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Karine Labadie
  - Commissariat à l'Energie Atomique (CEA), Institut François Jacob, Genoscope, Evry, France
- Eva Hřibová
  - Institute of Experimental Botany of the Czech Academy of Sciences, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Jaroslav Doležel
  - Institute of Experimental Botany of the Czech Academy of Sciences, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Arnaud Lemainque
  - Commissariat à l'Energie Atomique (CEA), Institut François Jacob, Genoscope, Evry, France
- Patrick Wincker
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Angélique D'Hont
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Jean-Marc Aury
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France.
8
Farhat S, Le P, Kayal E, Noel B, Bigeard E, Corre E, Maumus F, Florent I, Alberti A, Aury JM, Barbeyron T, Cai R, Da Silva C, Istace B, Labadie K, Marie D, Mercier J, Rukwavu T, Szymczak J, Tonon T, Alves-de-Souza C, Rouzé P, Van de Peer Y, Wincker P, Rombauts S, Porcel BM, Guillou L. Rapid protein evolution, organellar reductions, and invasive intronic elements in the marine aerobic parasite dinoflagellate Amoebophrya spp. BMC Biol 2021; 19:1. [PMID: 33407428 PMCID: PMC7789003 DOI: 10.1186/s12915-020-00927-9] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/24/2020] [Accepted: 11/12/2020] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND: Dinoflagellates are aquatic protists particularly widespread in the oceans worldwide. Some are responsible for toxic blooms while others live in symbiotic relationships, either as mutualistic symbionts in corals or as parasites infecting other protists and animals. Dinoflagellates harbor atypically large genomes (~3 to 250 Gb), with gene organization and gene expression patterns very different from those of closely related apicomplexan parasites. Here we sequenced and analyzed the genomes of two early-diverging and co-occurring parasitic dinoflagellate Amoebophrya strains, to shed light on the emergence of such atypical genomic features, dinoflagellate evolution, and host specialization. RESULTS: We sequenced, assembled, and annotated high-quality genomes for two Amoebophrya strains (A25 and A120), using a combination of Illumina paired-end short-read and Oxford Nanopore Technology (ONT) MinION long-read sequencing approaches. We found that a small number of transposable elements, along with short introns and intergenic regions and a limited number of gene families, together contribute to the compactness of the Amoebophrya genomes, a feature potentially linked with parasitism. While the majority of Amoebophrya proteins (63.7% of A25 and 59.3% of A120) had no functional assignment, we found many orthologs shared with Dinophyceae. Our analyses revealed a strong tendency for genes to be encoded in unidirectional clusters and high levels of synteny conservation between the two genomes despite low interspecific protein sequence similarity, suggesting rapid protein evolution. Most strikingly, we identified a large portion of non-canonical introns, including repeated introns, displaying a broad variability of associated splicing motifs never observed among eukaryotes. Those introner elements appear to have the capacity to spread over their respective genomes in a manner similar to transposable elements. Finally, we confirmed the reduction of organelles observed in Amoebophrya spp., i.e., loss of the plastid and potential loss of a mitochondrial genome and functions. CONCLUSION: These results expand the range of atypical genome features found in basal dinoflagellates and raise questions regarding speciation and the evolutionary mechanisms at play while parasitism was selected for in this particular unicellular lineage.
Affiliation(s)
- Sarah Farhat
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
  - School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, New York, 11794, USA
- Phuong Le
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Ehsan Kayal
  - Sorbonne Université, CNRS, FR2424, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
- Benjamin Noel
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- Estelle Bigeard
  - Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
- Erwan Corre
  - Sorbonne Université, CNRS, FR2424, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
- Florian Maumus
  - URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
- Isabelle Florent
  - Unité Molécules de Communication et Adaptation des Microorganismes (MCAM, UMR7245), Muséum national d'Histoire naturelle, CNRS, CP 52, 57 rue Cuvier, 75005, Paris, France
- Adriana Alberti
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- Jean-Marc Aury
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- Tristan Barbeyron
  - Sorbonne Université, CNRS, UMR 8227, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
- Ruibo Cai
  - Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
- Corinne Da Silva
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- Benjamin Istace
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- Karine Labadie
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- Dominique Marie
  - Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
| | - Jonathan Mercier
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Tsinda Rukwavu
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Jeremy Szymczak
- Sorbonne Université, CNRS, FR2424, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
- Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
| | - Thierry Tonon
- Centre for Novel Agricultural Products, Department of Biology, University of York, Heslington, York, YO10 5DD, UK
| | - Catharina Alves-de-Souza
- Algal Resources Collection, MARBIONC, Center for Marine Sciences, University of North Carolina Wilmington, 5600 Marvin K. Moss Lane, Wilmington, NC, 28409, USA
| | - Pierre Rouzé
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
- Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Stephane Rombauts
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Betina M Porcel
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France.
| | - Laure Guillou
- Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France.
| |
|
9
|
Rousseau-Gueutin M, Belser C, Da Silva C, Richard G, Istace B, Cruaud C, Falentin C, Boideau F, Boutte J, Delourme R, Deniot G, Engelen S, de Carvalho JF, Lemainque A, Maillet L, Morice J, Wincker P, Denoeud F, Chèvre AM, Aury JM. Long-read assembly of the Brassica napus reference genome Darmor-bzh. Gigascience 2020; 9:giaa137. [PMID: 33319912 PMCID: PMC7736779 DOI: 10.1093/gigascience/giaa137] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 07/22/2020] [Revised: 09/18/2020] [Accepted: 11/09/2020] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND The combination of long reads and long-range information to produce genome assemblies is now accepted as a common standard. This strategy not only allows access to the gene catalogue of a given species but also reveals the architecture and organization of chromosomes, including complex regions such as telomeres and centromeres. The Brassica genus is not exempt, and many assemblies based on long reads are now available. The reference genome for Brassica napus, Darmor-bzh, which was published in 2014, was produced using short reads and its contiguity was extremely low compared with current assemblies of the Brassica genus. FINDINGS Herein, we report the new long-read assembly of Darmor-bzh genome (Brassica napus) generated by combining long-read sequencing data and optical and genetic maps. Using the PromethION device and 6 flowcells, we generated ∼16 million long reads representing 93× coverage and, more importantly, 6× with reads longer than 100 kb. This ultralong-read dataset allows us to generate one of the most contiguous and complete assemblies of a Brassica genome to date (contig N50 > 10 Mb). In addition, we exploited all the advantages of the nanopore technology to detect modified bases and sequence transcriptomic data using direct RNA to annotate the genome and focus on resistance genes. CONCLUSION Using these cutting-edge technologies, and in particular by relying on all the advantages of the nanopore technology, we provide the most contiguous Brassica napus assembly, a resource that will be valuable to the Brassica community for crop improvement and will facilitate the rapid selection of agronomically important traits.
Affiliation(s)
| | - Caroline Belser
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Corinne Da Silva
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Gautier Richard
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Benjamin Istace
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Corinne Cruaud
- Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Cyril Falentin
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Franz Boideau
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Julien Boutte
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Regine Delourme
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Gwenaëlle Deniot
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Stefan Engelen
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Arnaud Lemainque
- Genoscope, Institut François Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Loeiz Maillet
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Jérôme Morice
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - France Denoeud
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Anne-Marie Chèvre
- IGEPP, INRAE, Institut Agro, Université de Rennes, Domaine de la Motte, 35653 Le Rheu, France
| | - Jean-Marc Aury
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 2 rue Gaston Crémieux, 91057 Evry, France
| |
|
10
|
Re-annotation of 191 developmental and epileptic encephalopathy-associated genes unmasks de novo variants in SCN1A. NPJ Genom Med 2019; 4:31. [PMID: 31814998 PMCID: PMC6889285 DOI: 10.1038/s41525-019-0106-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 05/31/2019] [Accepted: 11/01/2019] [Indexed: 12/21/2022] Open
Abstract
The developmental and epileptic encephalopathies (DEE) are a group of rare, severe neurodevelopmental disorders, where even the most thorough sequencing studies leave 60-65% of patients without a molecular diagnosis. Here, we explore the incompleteness of transcript models used for exome and genome analysis as one potential explanation for the current lack of diagnoses. To this end, we have updated the GENCODE gene annotation for 191 epilepsy-associated genes, using human brain-derived transcriptomic libraries and other data to build 3,550 putative transcript models. Our annotations increase the transcriptional 'footprint' of these genes by over 674 kb. Using SCN1A as a case study, due to its close phenotype/genotype correlation with Dravet syndrome, we screened 122 people with Dravet syndrome or a similar phenotype with a panel of exon sequences representing eight established genes and identified two de novo SCN1A variants that, through improved gene annotation, can now be ascribed to our newly annotated exons. These two molecular diagnoses (from 122 screened people, 1.6%) carry significant clinical implications. Furthermore, we identified a previously classified SCN1A intronic Dravet syndrome-associated variant that now lies within a deeply conserved exon. Our findings illustrate the potential gains of thorough gene annotation in improving diagnostic yields for genetic disorders.
|
11
|
A New Chloroplast DNA Extraction Protocol Significantly Improves the Chloroplast Genome Sequence Quality of Foxtail Millet (Setaria italica (L.) P. Beauv.). Sci Rep 2019; 9:16227. [PMID: 31700055 PMCID: PMC6838068 DOI: 10.1038/s41598-019-52786-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/24/2019] [Accepted: 10/23/2019] [Indexed: 12/03/2022] Open
Abstract
The complexity of the leaf constitution of foxtail millet (Setaria italica (L.) P. Beauv.) makes it difficult to obtain high-purity cpDNA. Here, we developed a protocol to isolate high-quality cpDNA from foxtail millet and other crops. The new protocol replaces the previous tissue grinding and homogenization with enzyme digestion of tiny leaf strips to separate protoplasts from leaf tissue, which protects chloroplasts from damage by undue grinding and homogenization and from contamination by cell debris and nuclear DNA. Using the new protocol, we successfully isolated high-quality cpDNAs for whole-genome sequencing from four foxtail millet cultivars, and comparative analysis revealed that they were approximately 27‰ longer than their reference genome. In addition, six cpDNAs of four other species with narrow and thin leaf blades, including wheat (Triticum aestivum L.), maize (Zea mays L.), rice (Oryza sativa L.) and sorghum (Sorghum bicolor (L.) Moench), were also isolated by our new protocol, and they all exhibited high sequence identities to their corresponding reference genomes. A maximum-likelihood tree based on the chloroplast genomes we sequenced here was constructed, and the result was in agreement with previous reports, confirming that these cpDNA sequences were suitable for well-supported phylogenetic analysis and could provide valuable resources for future research.
|
12
|
Talsania K, Mehta M, Raley C, Kriga Y, Gowda S, Grose C, Drew M, Roberts V, Cheng KT, Burkett S, Oeser S, Stephens R, Soppet D, Chen X, Kumar P, German O, Smirnova T, Hautman C, Shetty J, Tran B, Zhao Y, Esposito D. Genome Assembly and Annotation of the Trichoplusia ni Tni-FNL Insect Cell Line Enabled by Long-Read Technologies. Genes (Basel) 2019; 10:genes10020079. [PMID: 30678108 PMCID: PMC6409714 DOI: 10.3390/genes10020079] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/17/2018] [Revised: 01/09/2019] [Accepted: 01/14/2019] [Indexed: 12/22/2022] Open
Abstract
Background: Trichoplusia ni-derived cell lines are commonly used to enable recombinant protein expression via baculovirus infection to generate materials approved for clinical use and in clinical trials. In order to develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL. Methods: By integration of PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL. Results: Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes, and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of high accuracy and completeness of the Tni-FNL genome assembly. Conclusions: This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome that provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.
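The N50 statistic quoted in the abstract above can be illustrated with a minimal Python sketch. It uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length); the abstract does not say how the authors computed it, and the function name `n50` and the toy scaffold lengths are hypothetical, not taken from the Tni-FNL assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the length
    of the sequence that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # total is 150; 50 + 40 = 90 >= 75, so N50 is 40
```

A larger N50 for the same total length indicates that more of the assembly sits in long scaffolds, which is why it is reported alongside scaffold count and total length.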
Affiliation(s)
- Keyur Talsania
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Monika Mehta
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Castle Raley
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Yuliya Kriga
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Sujatha Gowda
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Carissa Grose
- NCI RAS Initiative, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Matthew Drew
- NCI RAS Initiative, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Veronica Roberts
- NCI RAS Initiative, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Kwong Tai Cheng
- NCI RAS Initiative, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Sandra Burkett
- Comparative Molecular Cytogenetics Core Facility, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Robert Stephens
- NCI RAS Initiative, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Daniel Soppet
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Xiongfeng Chen
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Parimal Kumar
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Oksana German
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Tatyana Smirnova
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Christopher Hautman
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Jyoti Shetty
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Bao Tran
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Yongmei Zhao
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| | - Dominic Esposito
- NCI RAS Initiative, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, MD 21701, USA.
| |
|
13
|
Marchet C, Lecompte L, Silva CD, Cruaud C, Aury JM, Nicolas J, Peterlongo P. De novo clustering of long reads by gene from transcriptomics data. Nucleic Acids Res 2019; 47:e2. [PMID: 30260405 PMCID: PMC6326815 DOI: 10.1093/nar/gky834] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/06/2018] [Revised: 09/04/2018] [Accepted: 09/10/2018] [Indexed: 02/07/2023] Open
Abstract
Long-read sequencing currently provides sequences of several thousand base pairs. It is therefore possible to obtain complete transcripts, offering an unprecedented view of the cellular transcriptome. However, the literature lacks tools for de novo clustering of such data, in particular for Oxford Nanopore Technologies reads, because of their inherently high error rate compared to short reads. Our goal is to process reads from whole transcriptome sequencing data accurately and without a reference genome in order to reliably group reads coming from the same gene. This de novo approach is therefore particularly suitable for non-model species, but can also serve as a useful pre-processing step to improve read mapping. Our contribution is both a new algorithm adapted to clustering reads by gene and a practical, freely accessible tool that scales to the complete processing of eukaryotic transcriptomes. We sequenced a mouse RNA sample using the MinION device. This dataset is used to compare our solution to other algorithms used in the context of biological clustering. We demonstrate that it is the best approach for transcriptomics long reads. When a reference is available to enable mapping, we show that it stands as an alternative method that predicts complementary clusters.
Affiliation(s)
| | - Corinne Da Silva
- Commissariat à l’Énergie Atomique (CEA), Institut de Biologie François Jacob, Genoscope, 91000 Evry, France
| | - Corinne Cruaud
- Commissariat à l’Énergie Atomique (CEA), Institut de Biologie François Jacob, Genoscope, 91000 Evry, France
| | - Jean-Marc Aury
- Commissariat à l’Énergie Atomique (CEA), Institut de Biologie François Jacob, Genoscope, 91000 Evry, France
| |
|
14
|
Dutreux F, Da Silva C, d'Agata L, Couloux A, Gay EJ, Istace B, Lapalu N, Lemainque A, Linglin J, Noel B, Wincker P, Cruaud C, Rouxel T, Balesdent MH, Aury JM. De novo assembly and annotation of three Leptosphaeria genomes using Oxford Nanopore MinION sequencing. Sci Data 2018; 5:180235. [PMID: 30398473 PMCID: PMC6219404 DOI: 10.1038/sdata.2018.235] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 04/05/2018] [Accepted: 07/13/2018] [Indexed: 01/11/2023] Open
Abstract
Leptosphaeria maculans and Leptosphaeria biglobosa are ascomycete phytopathogens of Brassica napus (oilseed rape, canola). Here we report the complete sequence of three Leptosphaeria genomes (L. maculans JN3, L. maculans Nz-T4 and L. biglobosa G12-14). Nz-T4 and G12-14 genome assemblies were generated de novo and the reference JN3 genome assembly was improved using Oxford Nanopore MinION reads. The new assembly of L. biglobosa showed the existence of AT-rich regions and pointed to a genome compartmentalization that had not been suspected from Illumina sequencing. Moreover, nanopore sequencing allowed us to generate a chromosome-level assembly for the L. maculans reference isolate, JN3. The genome annotation was supported by integrating conserved proteins and RNA sequencing from Leptosphaeria-infected samples. The newly produced high-quality assemblies and annotations of those three Leptosphaeria genomes will allow further studies, notably focused on the tripartite interaction between L. maculans, L. biglobosa and oilseed rape. The discovery of as yet unknown effectors will notably allow progress in B. napus breeding towards L. maculans resistance.
Affiliation(s)
- Fabien Dutreux
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France; UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay, Avenue Lucien Brétignières, BP 01, F-78850 Thiverval-Grignon, France
| | - Corinne Da Silva
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| | - Léo d'Agata
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| | - Arnaud Couloux
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| | - Elise J Gay
- UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay, Avenue Lucien Brétignières, BP 01, F-78850 Thiverval-Grignon, France
| | - Benjamin Istace
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| | - Nicolas Lapalu
- UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay, Avenue Lucien Brétignières, BP 01, F-78850 Thiverval-Grignon, France
| | - Arnaud Lemainque
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| | - Juliette Linglin
- UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay, Avenue Lucien Brétignières, BP 01, F-78850 Thiverval-Grignon, France
| | - Benjamin Noel
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
| | - Corinne Cruaud
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| | - Thierry Rouxel
- UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay, Avenue Lucien Brétignières, BP 01, F-78850 Thiverval-Grignon, France
| | - Marie-Hélène Balesdent
- UMR BIOGER, INRA, AgroParisTech, Université Paris-Saclay, Avenue Lucien Brétignières, BP 01, F-78850 Thiverval-Grignon, France
| | - Jean-Marc Aury
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique (CEA), Université Paris-Saclay, F-91057 Evry, France
| |
|
15
|
Li Y, Zhang J, Li L, Gao L, Xu J, Yang M. Structural and Comparative Analysis of the Complete Chloroplast Genome of Pyrus hopeiensis-"Wild Plants with a Tiny Population"-and Three Other Pyrus Species. Int J Mol Sci 2018; 19:ijms19103262. [PMID: 30347837 PMCID: PMC6214102 DOI: 10.3390/ijms19103262] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 10/02/2018] [Revised: 10/16/2018] [Accepted: 10/16/2018] [Indexed: 11/16/2022] Open
Abstract
Pyrus hopeiensis is a valuable wild resource of Pyrus in the Rosaceae. Due to its limited distribution and population decline, it has been listed as one of the “wild plants with a tiny population” in China. To date, few studies have been conducted on P. hopeiensis. This paper offers a systematic review of P. hopeiensis, providing a basis for the conservation and restoration of P. hopeiensis resources. In this study, the chloroplast genomes of two different genotypes of P. hopeiensis, P. ussuriensis Maxim. cv. Jingbaili, P. communis L. cv. Early Red Comice, and P. betulifolia were sequenced, compared and analyzed. The two P. hopeiensis genotypes showed a typical tetrad chloroplast genome, including a pair of inverted repeats encoding the same sequences in opposite directions, a large single copy (LSC) region, and a small single copy (SSC) region. The length of the chloroplast genome of P. hopeiensis HB-1 was 159,935 bp, 46 bp longer than that of the chloroplast genome of P. hopeiensis HB-2. The lengths of the SSC and IR regions of the two Pyrus genotypes were identical, with the only difference present in the LSC region. The GC content was only 0.02% higher in P. hopeiensis HB-1. The structure and size of the chloroplast genome, the gene types, gene number, and GC content of P. hopeiensis were similar to those of the other three Pyrus species. The IR boundary of the two genotypes of P. hopeiensis showed a similar degree of expansion. To determine the evolutionary history of P. hopeiensis within the genus Pyrus and the Rosaceae, 57 common protein-coding genes from 36 Rosaceae species were analyzed. The phylogenetic tree showed a close relationship between the genera Pyrus and Malus, and the relationship between P. hopeiensis HB-1 and P. hopeiensis HB-2 was the closest.
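The two summary figures compared in the abstract above, genome length and GC content, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gc_content` and the toy sequence are hypothetical, ambiguous bases such as N are assumed to be excluded from the denominator, and the abstract does not describe the authors' own tooling. The only values taken from the abstract are the HB-1 length (159,935 bp) and the 46 bp difference.

```python
def gc_content(sequence):
    """Return the GC content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; other symbols
    such as N are ignored (an assumption made for this sketch).
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical toy sequence, not real chloroplast data.
print(round(gc_content("ATGCGC"), 2))  # 4 of 6 bases are G or C

# Lengths stated in the abstract: HB-1 is 159,935 bp and 46 bp longer than HB-2.
hb1_length = 159_935
hb2_length = hb1_length - 46
print(hb2_length)  # implied HB-2 length in bp
```

Applied to two full genome sequences, the difference between the two `gc_content` values would correspond to the 0.02% gap the abstract reports.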
Affiliation(s)
- Yongtan Li
- Institute of Forest Biotechnology, Forestry College, Agricultural University of Hebei, Baoding 071000, China.
- Hebei Key Laboratory for Tree Genetic Resources and Forest Protection, Baoding 071000, China.
| | - Jun Zhang
- Institute of Forest Biotechnology, Forestry College, Agricultural University of Hebei, Baoding 071000, China.
- Hebei Key Laboratory for Tree Genetic Resources and Forest Protection, Baoding 071000, China.
| | - Longfei Li
- Changli Institute for Pomology, Hebei Academy of Agricultural and Forestry Science, Changli 066600, China.
| | - Lijuan Gao
- Changli Institute for Pomology, Hebei Academy of Agricultural and Forestry Science, Changli 066600, China.
| | - Jintao Xu
- Changli Institute for Pomology, Hebei Academy of Agricultural and Forestry Science, Changli 066600, China.
| | - Minsheng Yang
- Institute of Forest Biotechnology, Forestry College, Agricultural University of Hebei, Baoding 071000, China.
- Hebei Key Laboratory for Tree Genetic Resources and Forest Protection, Baoding 071000, China.
| |
|
16
|
Herraiz FJ, Blanca J, Ziarsolo P, Gramazio P, Plazas M, Anderson GJ, Prohens J, Vilanova S. The first de novo transcriptome of pepino (Solanum muricatum): assembly, comprehensive analysis and comparison with the closely related species S. caripense, potato and tomato. BMC Genomics 2016; 17:321. [PMID: 27142449 PMCID: PMC4855764 DOI: 10.1186/s12864-016-2656-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/23/2015] [Accepted: 04/25/2016] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND Solanum sect. Basarthrum is phylogenetically very close to potatoes (Solanum sect. Petota) and tomatoes (Solanum sect. Lycopersicon), two groups with great economic importance, and for which Solanum sect. Basarthrum represents a tertiary gene pool for breeding. This section includes the important regional cultigen, the pepino (Solanum muricatum), and several wild species. Among the wild species, S. caripense is prominent due to its major involvement in the origin of pepino and its wide geographical distribution. Despite the value of the pepino as an emerging crop, and the potential for gene transfer from both the pepino and S. caripense to potatoes and tomatoes, there has been virtually no genomic study of these species. RESULTS Using Illumina HiSeq 2000, RNA-Seq was performed with a pool of three tissues (young leaf, flowers in pre-anthesis and mature fruits) from S. muricatum and S. caripense, generating almost 111,000,000 reads across the two species. A high-quality de novo transcriptome was assembled from S. muricatum clean reads, resulting in 75,832 unigenes with an average length of 704 bp. These unigenes were functionally annotated based on similarity to sequences in public databases. We used Blast2GO to conduct an exhaustive study of the gene ontology, including GO terms, EC numbers and KEGG pathways. Pepino unigenes were compared to both potato and tomato genomes in order to determine their estimated relative position, and to infer gene prediction models. Candidate genes related to traits of interest in other Solanaceae were evaluated by presence or absence and compared with S. caripense transcripts. In addition, by studying five genes, the phylogeny of pepino and five other members of the family Solanaceae was studied. The comparison of S. caripense reads against S. muricatum assembled transcripts resulted in thousands of intra- and interspecific nucleotide-level variants. In addition, more than 1000 SSRs were identified in the pepino transcriptome. 
CONCLUSIONS This study represents the first genomic resource for the pepino. We suggest that the data will be useful not only for improvement of the pepino, but also for potato and tomato breeding and gene transfer. The high quality of the transcriptome presented here also facilitates comparative studies in the genus Solanum. The accurate transcript annotation will enable us to figure out the gene function of particular traits of interest. The high number of markers (SSR and nucleotide-level variants) obtained will be useful for breeding programs, as well as studies of synteny, diversity evolution, and phylogeny.
Affiliation(s)
- Francisco J. Herraiz
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia Spain
| | - José Blanca
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia Spain
| | - Pello Ziarsolo
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia Spain
| | - Pietro Gramazio
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia Spain
| | - Mariola Plazas
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia Spain
| | - Gregory J. Anderson
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06268-3043 USA
| | - Jaime Prohens
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia Spain
| | - Santiago Vilanova
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia Spain
| |
|
17
|
Gramazio P, Blanca J, Ziarsolo P, Herraiz FJ, Plazas M, Prohens J, Vilanova S. Transcriptome analysis and molecular marker discovery in Solanum incanum and S. aethiopicum, two close relatives of the common eggplant (Solanum melongena) with interest for breeding. BMC Genomics 2016; 17:300. [PMID: 27108408 PMCID: PMC4841963 DOI: 10.1186/s12864-016-2631-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 09/22/2015] [Accepted: 04/19/2016] [Indexed: 11/28/2022] Open
Abstract
Background Solanum incanum is a close wild relative of S. melongena with high contents of bioactive phenolics and drought tolerance. S. aethiopicum is a cultivated African eggplant cross-compatible with S. melongena. Despite their great interest for S. melongena breeding programs, the genomic resources for these species are scarce. Results RNA-Seq was performed with NGS from pooled RNA of young leaf, floral bud and young fruit tissues, generating more than one hundred million raw reads per species. The transcriptomes were assembled into 83,905 unigenes for S. incanum and 87,084 unigenes for S. aethiopicum, with average lengths of 696 and 722 bp, respectively. The unigenes were structurally and functionally annotated by comparison with public databases using bioinformatic tools. The single nucleotide variant calling analysis (SNPs and INDELs) was performed by mapping our S. incanum and S. aethiopicum reads, as well as reads from S. melongena and S. torvum available in the NCBI database (National Center for Biotechnology Information), against the eggplant genome. Both intraspecific and interspecific polymorphisms were identified, and subsets of molecular markers were created for all species combinations. Thirty-six SNVs were selected for validation in the S. incanum and S. aethiopicum accessions, and 96% were correctly amplified, confirming the polymorphisms. In addition, 976 and 1,278 SSRs were identified in the S. incanum and S. aethiopicum transcriptomes, respectively, and a set of them was validated. Conclusions This work provides a broad insight into gene sequences and allelic variation in S. incanum and S. aethiopicum. This work is a first step toward a better understanding of target genes involved in metabolic pathways relevant for eggplant breeding. The molecular markers detected in this study could be used across the whole eggplant genepool, which is of interest for breeding programs as well as for marker-trait association and QTL analysis studies.
Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2631-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- P Gramazio
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022, Valencia, Spain
| | - J Blanca
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022, Valencia, Spain
| | - P Ziarsolo
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022, Valencia, Spain
| | - F J Herraiz
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022, Valencia, Spain
| | - M Plazas
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022, Valencia, Spain
| | - J Prohens
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022, Valencia, Spain
| | - S Vilanova
- Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Camino de Vera 14, 46022, Valencia, Spain
| |
|
18
|
Identification of an Alternative Splicing Product of the Otx2 Gene Expressed in the Neural Retina and Retinal Pigmented Epithelial Cells. PLoS One 2016; 11:e0150758. [PMID: 26985665 PMCID: PMC4795653 DOI: 10.1371/journal.pone.0150758] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/04/2016] [Accepted: 01/20/2016] [Indexed: 12/16/2022] Open
Abstract
To investigate the complexity of alternative splicing in the retina, we sequenced and analyzed a total of 115,706 clones from normalized cDNA libraries from mouse neural retina (66,217) and rat retinal pigmented epithelium (49,489). Based upon clustering the cDNAs and mapping them to their respective genomes, the estimated numbers of genes were 9,134 for the mouse neural retina and 12,050 for the rat retinal pigmented epithelium libraries. This unique collection of retinal messenger RNAs is maintained and accessible through a web-based server to the whole community of retinal biologists for further functional characterization. The analysis revealed 3,248 and 3,202 alternative splice events for mouse neural retina and rat retinal pigmented epithelium, respectively. We focused on transcription factors involved in vision. Among the six candidates suitable for functional analysis, we selected Otx2S, a novel variant of the Otx2 gene with a deletion within the homeodomain sequence. Otx2S is expressed in both the neural retina and retinal pigmented epithelium, and encodes a protein that is targeted to the nucleus. OTX2S exerts transdominant activity on the tyrosinase promoter when tested in the physiological environment of primary RPE cells. By overexpressing OTX2S in primary RPE cells using an adeno-associated viral vector, we identified 10 genes whose expression is positively regulated by OTX2S. We find that OTX2S is able to bind to the chromatin at the promoter of the retinal dehydrogenase 10 (RDH10) gene.
|
19
|
Bermudez-Santana CI. APLICACIONES DE LA BIOINFORMÁTICA EN LA MEDICINA: EL GENOMA HUMANO. ¿CÓMO PODEMOS VER TANTO DETALLE? ACTA BIOLÓGICA COLOMBIANA 2016. [DOI: 10.15446/abc.v21n1supl.51233] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022] Open
Abstract
Bioinformatics is a new field that supports part of the biological research aimed at identifying gene variants that can be discovered through whole-genome studies. With this motivation, we present an overview of the main contributions of bioinformatics to the sequencing of the first human genome. We also summarize the main computational advances developed to meet the demands of re-sequencing a human genome with next-generation sequencing technologies. Finally, we introduce some of the new challenges that must be faced in order to apply personalized genomics to the development of medicine.
|
20
|
Sato K, Tanaka T, Shigenobu S, Motoi Y, Wu J, Itoh T. Improvement of barley genome annotations by deciphering the Haruna Nijo genome. DNA Res 2015; 23:21-8. [PMID: 26622062 PMCID: PMC4755524 DOI: 10.1093/dnares/dsv033] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/03/2015] [Accepted: 10/26/2015] [Indexed: 11/21/2022] Open
Abstract
Full-length (FL) cDNA sequences provide the most reliable evidence for the presence of genes in genomes. In this report, to obtain detailed gene structures of barley, whole genome shotgun (WGS) and additional transcript data of the cultivar Haruna Nijo were quality controlled and compared with the published Morex genome information. Haruna Nijo scaffolds have a longer total sequence length, a much higher N50 and fewer sequences than the Morex WGS contigs. The longer Haruna Nijo scaffolds provided efficient FLcDNA mapping, resulting in high coverage and detection of the transcription start sites. By combining FLcDNAs and RNA-Seq data from four different tissue samples of Haruna Nijo, we identified 51,249 gene models on 30,606 loci. Overall sequence similarity between the Haruna Nijo and Morex genomes was 95.99%, while that of exon regions was higher (99.71%). These sequence and annotation data of Haruna Nijo are combined with Morex genome data and released through a genome browser. The genome sequence of Haruna Nijo may provide detailed gene structures in addition to the current Morex barley genome information.
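The abstract above compares the two assemblies by N50 without defining the statistic. As a minimal sketch, and not the authors' pipeline, N50 is conventionally taken as the largest length L such that sequences of length at least L together contain at least half of all assembled bases. The function name `n50` and the toy lengths below are illustrative assumptions, not values from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("lengths must be non-empty")


# Toy example (made-up lengths): the total is 54, so the half-way point
# of 27 bases is reached after the sequences of length 10, 9 and 8.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

Because N50 depends on how an assembly is fragmented, scaffolds and contigs of the same genome can give very different values, so the statistic is usually read alongside total length and sequence count, as the abstract does.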
Affiliation(s)
- Kazuhiro Sato
- Institute of Plant Science and Resources, Okayama University, Kurashiki 710-0046, Japan
| | - Tsuyoshi Tanaka
- National Institute of Agrobiological Sciences, Tsukuba 305-8602, Japan
| | - Shuji Shigenobu
- National Institute for Basic Biology, Okazaki 444-8585, Japan
| | - Yuka Motoi
- Institute of Plant Science and Resources, Okayama University, Kurashiki 710-0046, Japan
| | - Jianzhong Wu
- National Institute of Agrobiological Sciences, Tsukuba 305-8602, Japan
| | - Takeshi Itoh
- National Institute of Agrobiological Sciences, Tsukuba 305-8602, Japan
| |
|
21
|
Fraser HI, Howlett S, Clark J, Rainbow DB, Stanford SM, Wu DJ, Hsieh YW, Maine CJ, Christensen M, Kuchroo V, Sherman LA, Podolin PL, Todd JA, Steward CA, Peterson LB, Bottini N, Wicker LS. Ptpn22 and Cd2 Variations Are Associated with Altered Protein Expression and Susceptibility to Type 1 Diabetes in Nonobese Diabetic Mice. THE JOURNAL OF IMMUNOLOGY 2015; 195:4841-52. [PMID: 26438525 PMCID: PMC4635565 DOI: 10.4049/jimmunol.1402654] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 10/20/2014] [Accepted: 09/04/2015] [Indexed: 01/08/2023]
Abstract
By congenic strain mapping using autoimmune NOD.C57BL/6J congenic mice, we demonstrated previously that the type 1 diabetes (T1D) protection associated with the insulin-dependent diabetes (Idd)10 locus on chromosome 3, originally identified by linkage analysis, was in fact due to three closely linked Idd loci: Idd10, Idd18.1, and Idd18.3. In this study, we define two additional Idd loci—Idd18.2 and Idd18.4—within the boundaries of this cluster of disease-associated genes. Idd18.2 is 1.31 Mb and contains 18 genes, including Ptpn22, which encodes a phosphatase that negatively regulates T and B cell signaling. The human ortholog of Ptpn22, PTPN22, is associated with numerous autoimmune diseases, including T1D. We, therefore, assessed Ptpn22 as a candidate for Idd18.2; resequencing of the NOD Ptpn22 allele revealed 183 single nucleotide polymorphisms with the C57BL/6J (B6) allele—6 exonic and 177 intronic. Functional studies showed higher expression of full-length Ptpn22 RNA and protein, and decreased TCR signaling in congenic strains with B6-derived Idd18.2 susceptibility alleles. The 953-kb Idd18.4 locus contains eight genes, including the candidate Cd2. The CD2 pathway is associated with the human autoimmune disease, multiple sclerosis, and mice with NOD-derived susceptibility alleles at Idd18.4 have lower CD2 expression on B cells. Furthermore, we observed that susceptibility alleles at Idd18.2 can mask the protection provided by Idd10/Cd101 or Idd18.1/Vav3 and Idd18.3. In summary, we describe two new T1D loci, Idd18.2 and Idd18.4, candidate genes within each region, and demonstrate the complex nature of genetic interactions underlying the development of T1D in the NOD mouse model.
Affiliation(s)
- Heather I Fraser
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| | - Sarah Howlett
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| | - Jan Clark
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| | - Daniel B Rainbow
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| | - Stephanie M Stanford
- Division of Cell Biology, La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037; La Jolla Institute for Allergy and Immunology, Type 1 Diabetes Research Center, La Jolla, CA 92037
| | - Dennis J Wu
- Division of Cell Biology, La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037; La Jolla Institute for Allergy and Immunology, Type 1 Diabetes Research Center, La Jolla, CA 92037
| | - Yi-Wen Hsieh
- Division of Cell Biology, La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037
| | - Christian J Maine
- Department of Immunology and Microbial Sciences, The Scripps Research Institute, La Jolla, CA 92037
| | - Mikkel Christensen
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| | - Vijay Kuchroo
- Center for Neurologic Diseases, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
| | - Linda A Sherman
- Department of Immunology and Microbial Sciences, The Scripps Research Institute, La Jolla, CA 92037
| | - Patricia L Podolin
- Department of Pharmacology, Merck Research Laboratories, Rahway, NJ 07065
| | - John A Todd
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| | - Charles A Steward
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1HH, United Kingdom
| | - Laurence B Peterson
- Department of Pharmacology, Merck Research Laboratories, Rahway, NJ 07065
| | - Nunzio Bottini
- Division of Cell Biology, La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037; La Jolla Institute for Allergy and Immunology, Type 1 Diabetes Research Center, La Jolla, CA 92037
| | - Linda S Wicker
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, United Kingdom
| |
|
22
|
Epithelial Cadherin Determines Resistance to Infectious Pancreatic Necrosis Virus in Atlantic Salmon. Genetics 2015; 200:1313-26. [PMID: 26041276 DOI: 10.1534/genetics.115.175406] [Citation(s) in RCA: 123] [Impact Index Per Article: 13.7] [Received: 02/23/2015] [Accepted: 05/15/2015] [Indexed: 01/12/2023] Open
Abstract
Infectious pancreatic necrosis virus (IPNV) is the cause of one of the most prevalent diseases in farmed Atlantic salmon (Salmo salar). A quantitative trait locus (QTL) has been found to be responsible for most of the genetic variation in resistance to the virus. Here we describe how a linkage disequilibrium-based test for deducing the QTL allele was developed, and how it was used to produce IPN-resistant salmon, leading to a 75% decrease in the number of IPN outbreaks in the salmon farming industry. Furthermore, we describe how whole-genome sequencing of individuals with deduced QTL genotypes was used to map the QTL down to a region containing an epithelial cadherin (cdh1) gene. In a coimmunoprecipitation assay, the Cdh1 protein was found to bind to IPNV virions, strongly indicating that the protein is part of the machinery used by the virus for internalization. Immunofluorescence revealed that the Cdh1 protein colocalizes with IPNV in the endosomes of homozygous susceptible individuals but not in the endosomes of homozygous resistant individuals. A putative causal single nucleotide polymorphism was found within the full-length cdh1 gene, in phase with the QTL in all observed haplotypes except one; the absence of a single, all-explaining DNA polymorphism indicates that an additional causative polymorphism may contribute to the observed QTL genotype patterns. Cdh1 has earlier been shown to be necessary for the internalization of certain bacteria and fungi, but this is the first time the protein has been implicated in the internalization of a virus.
|
23
|
Lesur I, Le Provost G, Bento P, Da Silva C, Leplé JC, Murat F, Ueno S, Bartholomé J, Lalanne C, Ehrenmann F, Noirot C, Burban C, Léger V, Amselem J, Belser C, Quesneville H, Stierschneider M, Fluch S, Feldhahn L, Tarkka M, Herrmann S, Buscot F, Klopp C, Kremer A, Salse J, Aury JM, Plomion C. The oak gene expression atlas: insights into Fagaceae genome evolution and the discovery of genes regulated during bud dormancy release. BMC Genomics 2015; 16:112. [PMID: 25765701 PMCID: PMC4350297 DOI: 10.1186/s12864-015-1331-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 09/13/2014] [Accepted: 02/09/2015] [Indexed: 11/03/2022] Open
Abstract
BACKGROUND Many northern-hemisphere forests are dominated by oaks. These species extend over diverse environmental conditions and are thus interesting models for studies of plant adaptation and speciation. The genomic toolbox is an important asset for exploring the functional variation associated with natural selection. RESULTS The assembly of previously available and newly developed long and short sequence reads for two sympatric oak species, Quercus robur and Quercus petraea, generated a comprehensive catalog of transcripts for oak. The functional annotation of 91 k contigs demonstrated the presence of a large proportion of plant genes in this unigene set. Comparisons with SwissProt accessions and five plant gene models revealed orthologous relationships, making it possible to decipher the evolution of the oak genome. In particular, it was possible to align 9.5 thousand oak coding sequences with the equivalent sequences on peach chromosomes. Finally, RNA-seq data shed new light on the gene networks underlying vegetative bud dormancy release, a key stage in development allowing plants to adapt their phenology to the environment. CONCLUSION In addition to providing a vast array of expressed genes, this study generated essential information about oak genome evolution and the regulation of genes associated with vegetative bud phenology, an important adaptive trait in trees. This resource contributes to the annotation of the oak genome sequence and will provide support for forward genetics approaches aiming to link genotypes with adaptive phenotypes.
Affiliation(s)
- Isabelle Lesur
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- HelixVenture, F-33700, Mérignac, France.
| | - Grégoire Le Provost
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- University Bordeaux, BIOGECO, UMR1202, F-33170, Talence, France.
| | - Pascal Bento
- CEA-Institut de Génomique, GENOSCOPE, Centre National de Séquençage, 2 rue Gaston Crémieux, CP5706, F-91057, Evry Cedex, France.
| | - Corinne Da Silva
- CEA-Institut de Génomique, GENOSCOPE, Centre National de Séquençage, 2 rue Gaston Crémieux, CP5706, F-91057, Evry Cedex, France.
| | - Jean-Charles Leplé
- INRA, UR0588 Amélioration Génétique et Physiologie Forestières, F-45075, Orléans, France.
| | - Florent Murat
- INRA/UBP UMR 1095, Laboratoire Génétique, Diversité et Ecophysiologie des Céréales, F-63039, Clermont-Ferrand, France.
| | - Saneyoshi Ueno
- Forestry and Forest Products Research Institute, Department of Forest Genetics, Tree Genetics Laboratory, 1 Matsunosato, Tsukuba, Ibaraki, 305-8687, Japan.
| | - Jerôme Bartholomé
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- CIRAD, UMR AGAP, F-34398, Montpellier, France.
| | - Céline Lalanne
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- University Bordeaux, BIOGECO, UMR1202, F-33170, Talence, France.
| | - François Ehrenmann
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- University Bordeaux, BIOGECO, UMR1202, F-33170, Talence, France.
| | - Céline Noirot
- Plateforme bioinformatique Toulouse Midi-Pyrénées, UBIA, INRA, F-31326, Auzeville Castanet-Tolosan, France.
| | - Christian Burban
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- University Bordeaux, BIOGECO, UMR1202, F-33170, Talence, France.
| | - Valérie Léger
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- University Bordeaux, BIOGECO, UMR1202, F-33170, Talence, France.
| | - Joelle Amselem
- INRA, Unité de Recherche Génomique Info (URGI), F78026, Versailles, France.
| | - Caroline Belser
- CEA-Institut de Génomique, GENOSCOPE, Centre National de Séquençage, 2 rue Gaston Crémieux, CP5706, F-91057, Evry Cedex, France.
| | - Hadi Quesneville
- INRA, Unité de Recherche Génomique Info (URGI), F78026, Versailles, France.
| | | | - Silvia Fluch
- AIT Austrian Institute of Technology GmbH, Konrad-Lorenz Str 24, 3430, Tulln, Austria.
| | - Lasse Feldhahn
- Department of Soil Ecology, UFZ - Helmholtz Centre for Environmental Research, DE-06120, Halle/Saale, Germany.
| | - Mika Tarkka
- Department of Soil Ecology, UFZ - Helmholtz Centre for Environmental Research, DE-06120, Halle/Saale, Germany.
- iDiv - German Centre for Integrative Biodiversity Research, Halle Jena Leipzig, DE-04103, Leipzig, Germany.
| | - Sylvie Herrmann
- iDiv - German Centre for Integrative Biodiversity Research, Halle Jena Leipzig, DE-04103, Leipzig, Germany.
- Department of Community Ecology, UFZ - Helmholtz Centre for Environmental Research, 06120, Halle/Saale, Germany.
| | - François Buscot
- Department of Soil Ecology, UFZ - Helmholtz Centre for Environmental Research, DE-06120, Halle/Saale, Germany.
- iDiv - German Centre for Integrative Biodiversity Research, Halle Jena Leipzig, DE-04103, Leipzig, Germany.
| | - Christophe Klopp
- Plateforme bioinformatique Toulouse Midi-Pyrénées, UBIA, INRA, F-31326, Auzeville Castanet-Tolosan, France.
| | - Antoine Kremer
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- University Bordeaux, BIOGECO, UMR1202, F-33170, Talence, France.
| | - Jérôme Salse
- INRA/UBP UMR 1095, Laboratoire Génétique, Diversité et Ecophysiologie des Céréales, F-63039, Clermont-Ferrand, France.
| | - Jean-Marc Aury
- CEA-Institut de Génomique, GENOSCOPE, Centre National de Séquençage, 2 rue Gaston Crémieux, CP5706, F-91057, Evry Cedex, France.
| | - Christophe Plomion
- INRA, UMR1202, BIOGECO, F-33610, Cestas, France.
- University Bordeaux, BIOGECO, UMR1202, F-33170, Talence, France.
| |
|
24
|
Denoeud F, Carretero-Paulet L, Dereeper A, Droc G, Guyot R, Pietrella M, Zheng C, Alberti A, Anthony F, Aprea G, Aury JM, Bento P, Bernard M, Bocs S, Campa C, Cenci A, Combes MC, Crouzillat D, Da Silva C, Daddiego L, De Bellis F, Dussert S, Garsmeur O, Gayraud T, Guignon V, Jahn K, Jamilloux V, Joët T, Labadie K, Lan T, Leclercq J, Lepelley M, Leroy T, Li LT, Librado P, Lopez L, Muñoz A, Noel B, Pallavicini A, Perrotta G, Poncet V, Pot D, Priyono, Rigoreau M, Rouard M, Rozas J, Tranchant-Dubreuil C, VanBuren R, Zhang Q, Andrade AC, Argout X, Bertrand B, de Kochko A, Graziosi G, Henry RJ, Jayarama, Ming R, Nagai C, Rounsley S, Sankoff D, Giuliano G, Albert VA, Wincker P, Lashermes P. The coffee genome provides insight into the convergent evolution of caffeine biosynthesis. Science 2014; 345:1181-4. [PMID: 25190796 DOI: 10.1126/science.1255274] [Citation(s) in RCA: 345] [Impact Index Per Article: 34.5] [Indexed: 12/25/2022]
Abstract
Coffee is a valuable beverage crop due to its characteristic flavor, aroma, and the stimulating effects of caffeine. We generated a high-quality draft genome of the species Coffea canephora, which displays a conserved chromosomal gene order among asterid angiosperms. Although it shows no sign of the whole-genome triplication identified in Solanaceae species such as tomato, the genome includes several species-specific gene family expansions, among them N-methyltransferases (NMTs) involved in caffeine production, defense-related genes, and alkaloid and flavonoid enzymes involved in secondary compound synthesis. Comparative analyses of caffeine NMTs demonstrate that these genes expanded through sequential tandem duplications independently of genes from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.
Affiliation(s)
- France Denoeud
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France. CNRS, UMR 8030, CP5706, Evry, France. Université d'Evry, UMR 8030, CP5706, Evry, France
| | - Lorenzo Carretero-Paulet
- Department of Biological Sciences, 109 Cooke Hall, University at Buffalo (State University of New York), Buffalo, NY 14260, USA
| | - Alexis Dereeper
- Institut de Recherche pour le Développement (IRD), UMR Résistance des Plantes aux Bioagresseurs (RPB) [Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), IRD, UM2)], BP 64501, 34394 Montpellier Cedex 5, France
| | - Gaëtan Droc
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Romain Guyot
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Marco Pietrella
- Italian National Agency for New Technologies, Energy and Sustainable Development (ENEA) Casaccia Research Center, Via Anguillarese 301, 00123 Roma, Italy
| | - Chunfang Zheng
- Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada
| | - Adriana Alberti
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France
| | - François Anthony
- Institut de Recherche pour le Développement (IRD), UMR Résistance des Plantes aux Bioagresseurs (RPB) [Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), IRD, UM2)], BP 64501, 34394 Montpellier Cedex 5, France
| | - Giuseppe Aprea
- Italian National Agency for New Technologies, Energy and Sustainable Development (ENEA) Casaccia Research Center, Via Anguillarese 301, 00123 Roma, Italy
| | - Jean-Marc Aury
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France
| | - Pascal Bento
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France
| | - Maria Bernard
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France
| | - Stéphanie Bocs
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Claudine Campa
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Alberto Cenci
- Institut de Recherche pour le Développement (IRD), UMR Résistance des Plantes aux Bioagresseurs (RPB) [Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), IRD, UM2)], BP 64501, 34394 Montpellier Cedex 5, France. Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
| | - Marie-Christine Combes
- Institut de Recherche pour le Développement (IRD), UMR Résistance des Plantes aux Bioagresseurs (RPB) [Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), IRD, UM2)], BP 64501, 34394 Montpellier Cedex 5, France
| | - Dominique Crouzillat
- Nestlé Research and Development Centre, 101 Avenue Gustave Eiffel, Notre-Dame-d'Oé, BP 49716, 37097 Tours Cedex 2, France
| | - Corinne Da Silva
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France
| | | | - Fabien De Bellis
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Stéphane Dussert
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Olivier Garsmeur
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Thomas Gayraud
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Valentin Guignon
- Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
| | - Katharina Jahn
- Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada. Center for Biotechnology, Universität Bielefeld, Universitätsstraße 27, D-33615 Bielefeld, Germany. AG Genominformatik, Technische Fakultät, Universität Bielefeld, 33594 Bielefeld, Germany
| | - Véronique Jamilloux
- Institut National de la Recherche Agronomique (INRA), Unité de Recherches en Génomique-Info (UR INRA 1164), Centre de Recherche de Versailles, 78026 Versailles Cedex, France
| | - Thierry Joët
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Karine Labadie
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France
| | - Tianying Lan
- Department of Biological Sciences, 109 Cooke Hall, University at Buffalo (State University of New York), Buffalo, NY 14260, USA. Department of Biology, Chongqing University of Science and Technology, 4000042 Chongqing, China
| | - Julie Leclercq
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Maud Lepelley
- Nestlé Research and Development Centre, 101 Avenue Gustave Eiffel, Notre-Dame-d'Oé, BP 49716, 37097 Tours Cedex 2, France
| | - Thierry Leroy
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Lei-Ting Li
- Department of Plant Biology, 148 Edward R. Madigan Laboratory, MC-051, 1201 West Gregory Drive, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Pablo Librado
- Departament de Genètica and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Diagonal 643, Barcelona 08028, Spain
| | | | - Adriana Muñoz
- Department of Mathematics, University of Maryland, Mathematics Building 084, University of Maryland, College Park, MD 20742, USA. School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada
| | - Benjamin Noel
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France
| | - Alberto Pallavicini
- Department of Life Sciences, University of Trieste, Via Licio Giorgieri 5, 34127 Trieste, Italy
| | | | - Valérie Poncet
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - David Pot
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Priyono
- Indonesian Coffee and Cocoa Institute, Jember, East Java, Indonesia
| | - Michel Rigoreau
- Nestlé Research and Development Centre, 101 Avenue Gustave Eiffel, Notre-Dame-d'Oé, BP 49716, 37097 Tours Cedex 2, France
| | - Mathieu Rouard
- Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
| | - Julio Rozas
- Departament de Genètica and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Diagonal 643, Barcelona 08028, Spain
| | - Christine Tranchant-Dubreuil
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Robert VanBuren
- Department of Plant Biology, 148 Edward R. Madigan Laboratory, MC-051, 1201 West Gregory Drive, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Qiong Zhang
- Department of Plant Biology, 148 Edward R. Madigan Laboratory, MC-051, 1201 West Gregory Drive, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Alan C Andrade
- Laboratório de Genética Molecular, Núcleo de Biotecnologia (NTBio), Embrapa Recursos Genéticos e Biotecnologia, Final Av. W/5 Norte, Parque Estação Biológica, Brasília-DF 70770-917, Brazil
| | - Xavier Argout
- CIRAD, UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), F-34398 Montpellier, France
| | - Benoît Bertrand
- CIRAD, UMR RPB (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Alexandre de Kochko
- IRD, UMR Diversité Adaptation et Développement des Plantes (CIRAD, IRD, UM2), BP 64501, 34394 Montpellier Cedex 5, France
| | - Giorgio Graziosi
- Department of Life Sciences, University of Trieste, Via Licio Giorgieri 5, 34127 Trieste, Italy. DNA Analytica Srl, Via Licio Giorgieri 5, 34127 Trieste, Italy
| | - Robert J Henry
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia 4072, Australia
| | - Jayarama
- Central Coffee Research Institute, Coffee Board, Coffee Research Station (Post) - 577 117 Chikmagalur District, Karnataka State, India
| | - Ray Ming
- Department of Plant Biology, 148 Edward R. Madigan Laboratory, MC-051, 1201 West Gregory Drive, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Chifumi Nagai
- Hawaii Agriculture Research Center, Post Office Box 100, Kunia, HI 96759-0100, USA
| | - Steve Rounsley
- BIO5 Institute, University of Arizona, 1657 Helen Street, Tucson, AZ 85721, USA
| | - David Sankoff
- Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Avenue, Ottawa, Ontario K1N 6N5, Canada
| | - Giovanni Giuliano
- Italian National Agency for New Technologies, Energy and Sustainable Development (ENEA) Casaccia Research Center, Via Anguillarese 301, 00123 Roma, Italy
| | - Victor A Albert
- Department of Biological Sciences, 109 Cooke Hall, University at Buffalo (State University of New York), Buffalo, NY 14260, USA.
| | - Patrick Wincker
- Commissariat à l'Energie Atomique, Genoscope, Institut de Génomique, BP5706, 91057 Evry, France. CNRS, UMR 8030, CP5706, Evry, France. Université d'Evry, UMR 8030, CP5706, Evry, France.
| | - Philippe Lashermes
- Institut de Recherche pour le Développement (IRD), UMR Résistance des Plantes aux Bioagresseurs (RPB) [Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), IRD, UM2)], BP 64501, 34394 Montpellier Cedex 5, France.
| |
|
25
|
Chalhoub B, Denoeud F, Liu S, Parkin IAP, Tang H, Wang X, Chiquet J, Belcram H, Tong C, Samans B, Corréa M, Da Silva C, Just J, Falentin C, Koh CS, Le Clainche I, Bernard M, Bento P, Noel B, Labadie K, Alberti A, Charles M, Arnaud D, Guo H, Daviaud C, Alamery S, Jabbari K, Zhao M, Edger PP, Chelaifa H, Tack D, Lassalle G, Mestiri I, Schnel N, Le Paslier MC, Fan G, Renault V, Bayer PE, Golicz AA, Manoli S, Lee TH, Thi VHD, Chalabi S, Hu Q, Fan C, Tollenaere R, Lu Y, Battail C, Shen J, Sidebottom CHD, Wang X, Canaguier A, Chauveau A, Bérard A, Deniot G, Guan M, Liu Z, Sun F, Lim YP, Lyons E, Town CD, Bancroft I, Wang X, Meng J, Ma J, Pires JC, King GJ, Brunel D, Delourme R, Renard M, Aury JM, Adams KL, Batley J, Snowdon RJ, Tost J, Edwards D, Zhou Y, Hua W, Sharpe AG, Paterson AH, Guan C, Wincker P. Plant genetics. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 2014; 345:950-3. [PMID: 25146293 DOI: 10.1126/science.1253435] [Citation(s) in RCA: 1408] [Impact Index Per Article: 140.8] [Indexed: 12/14/2022]
Abstract
Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B. napus genome and the consequences of its recent duplication. The constituent An and Cn subgenomes are engaged in subtle structural, functional, and epigenetic cross-talk, with abundant homeologous exchanges. Incipient gene loss and expression divergence have begun. Selection in B. napus oilseed types has accelerated the loss of glucosinolate genes, while preserving expansion of oil biosynthesis genes. These processes provide insights into allopolyploid evolution and its relationship with crop domestication and improvement.
Affiliation(s)
- Boulos Chalhoub
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France.
| | - France Denoeud
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France. Université d'Evry Val d'Essone, UMR 8030, CP5706, Evry, France. Centre National de Recherche Scientifique (CNRS), UMR 8030, CP5706, Evry, France
| | - Shengyi Liu
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China
| | - Isobel A P Parkin
- Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, SK S7N 0X2, Canada.
| | - Haibao Tang
- J. Craig Venter Institute, Rockville, MD 20850, USA. Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian Province, China
| | - Xiyin Wang
- Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602, USA. Center of Genomics and Computational Biology, School of Life Sciences, Hebei United University, Tangshan, Hebei 063000, China
| | - Julien Chiquet
- Laboratoire de Mathématiques et Modélisation d'Evry-UMR 8071 CNRS/Université d'Evry val d'Essonne-USC INRA, Evry, France
| | - Harry Belcram
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Chaobo Tong
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China
| | - Birgit Samans
- Department of Plant Breeding, Research Center for Biosystems, Land Use and Nutrition, Justus Liebig University, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany
| | - Margot Corréa
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Corinne Da Silva
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Jérémy Just
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Cyril Falentin
- INRA, Institut de Génétique, Environnement et Protection des Plantes (IGEPP) UMR1349, BP35327, 35653 Le Rheu Cedex, France
| | - Chu Shin Koh
- National Research Council Canada, 110 Gymnasium Place, Saskatoon, SK S7N 0W9, Canada
| | - Isabelle Le Clainche
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Maria Bernard
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Pascal Bento
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Benjamin Noel
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Karine Labadie
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Adriana Alberti
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Mathieu Charles
- INRA, Etude du Polymorphisme des Génomes Végétaux, US1279, Centre National de Génotypage, CEA-IG, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Dominique Arnaud
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Hui Guo
- Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602, USA
| | - Christian Daviaud
- Laboratory for Epigenetics and Environment, Centre National de Génotypage, CEA-IG, 2 rue Gaston Crémieux, 91000 Evry, France
| | - Salman Alamery
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia
| | - Kamel Jabbari
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France. Cologne Center for Genomics, University of Cologne, Weyertal 115b, 50931 Köln, Germany
| | - Meixia Zhao
- Department of Agronomy, Purdue University, WSLR Building B018, West Lafayette, IN 47907, USA
| | - Patrick P Edger
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
| | - Houda Chelaifa
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - David Tack
- Department of Botany, University of British Columbia, Vancouver, BC, Canada
| | - Gilles Lassalle
- INRA, Institut de Génétique, Environnement et Protection des Plantes (IGEPP) UMR1349, BP35327, 35653 Le Rheu Cedex, France
| | - Imen Mestiri
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Nicolas Schnel
- INRA, Institut de Génétique, Environnement et Protection des Plantes (IGEPP) UMR1349, BP35327, 35653 Le Rheu Cedex, France
| | - Marie-Christine Le Paslier
- INRA, Etude du Polymorphisme des Génomes Végétaux, US1279, Centre National de Génotypage, CEA-IG, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Guangyi Fan
- Beijing Genome Institute-Shenzhen, Shenzhen 518083, China
| | - Victor Renault
- Fondation Jean Dausset-Centre d'Étude du Polymorphisme Humain, 27 rue Juliette Dodu, 75010 Paris, France
| | - Philippe E Bayer
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia
| | - Agnieszka A Golicz
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia
| | - Sahana Manoli
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia
| | - Tae-Ho Lee
- Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602, USA
| | - Vinh Ha Dinh Thi
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Smahane Chalabi
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Qiong Hu
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China
| | - Chuchuan Fan
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
| | - Reece Tollenaere
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia
| | - Yunhai Lu
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Christophe Battail
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Jinxiong Shen
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
| | | | - Xinfa Wang
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China
| | - Aurélie Canaguier
- Institut National de Recherche Agronomique (INRA)/Université d'Evry Val d'Essone, Unité de Recherche en Génomique Végétale, UMR1165, Organization and Evolution of Plant Genomes, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Aurélie Chauveau
- INRA, Etude du Polymorphisme des Génomes Végétaux, US1279, Centre National de Génotypage, CEA-IG, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Aurélie Bérard
- INRA, Etude du Polymorphisme des Génomes Végétaux, US1279, Centre National de Génotypage, CEA-IG, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Gwenaëlle Deniot
- INRA, Institut de Génétique, Environnement et Protection des Plantes (IGEPP) UMR1349, BP35327, 35653 Le Rheu Cedex, France
| | - Mei Guan
- College of Agronomy, Hunan Agricultural University, Changsha 410128, China
| | - Zhongsong Liu
- College of Agronomy, Hunan Agricultural University, Changsha 410128, China
| | - Fengming Sun
- Beijing Genome Institute-Shenzhen, Shenzhen 518083, China
| | - Yong Pyo Lim
- Molecular Genetics and Genomics Laboratory, Department of Horticulture, Chungnam National University, Daejeon-305764, South Korea
| | - Eric Lyons
- School of Plant Sciences, iPlant Collaborative, University of Arizona, Tucson, AZ, USA
| | | | - Ian Bancroft
- Department of Biology, University of York, Wentworth Way, Heslington, York YO10 5DD, UK
| | - Xiaowu Wang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Jinling Meng
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
| | - Jianxin Ma
- Department of Agronomy, Purdue University, WSLR Building B018, West Lafayette, IN 47907, USA
| | - J Chris Pires
- Division of Biological Sciences, University of Missouri, Columbia, MO 65211, USA
| | - Graham J King
- Southern Cross Plant Science, Southern Cross University, Lismore, NSW 2480, Australia
| | - Dominique Brunel
- INRA, Etude du Polymorphisme des Génomes Végétaux, US1279, Centre National de Génotypage, CEA-IG, 2 rue Gaston Crémieux, 91057 Evry, France
| | - Régine Delourme
- INRA, Institut de Génétique, Environnement et Protection des Plantes (IGEPP) UMR1349, BP35327, 35653 Le Rheu Cedex, France
| | - Michel Renard
- INRA, Institut de Génétique, Environnement et Protection des Plantes (IGEPP) UMR1349, BP35327, 35653 Le Rheu Cedex, France
| | - Jean-Marc Aury
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France
| | - Keith L Adams
- Department of Botany, University of British Columbia, Vancouver, BC, Canada
| | - Jacqueline Batley
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia. School of Plant Biology, University of Western Australia, WA 6009, Australia
| | - Rod J Snowdon
- Department of Plant Breeding, Research Center for Biosystems, Land Use and Nutrition, Justus Liebig University, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany
| | - Jorg Tost
- Laboratory for Epigenetics and Environment, Centre National de Génotypage, CEA-IG, 2 rue Gaston Crémieux, 91000 Evry, France
| | - David Edwards
- Australian Centre for Plant Functional Genomics, School of Agriculture and Food Sciences, University of Queensland, St. Lucia, QLD 4072, Australia. School of Plant Biology, University of Western Australia, WA 6009, Australia.
| | - Yongming Zhou
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China.
| | - Wei Hua
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China.
| | - Andrew G Sharpe
- National Research Council Canada, 110 Gymnasium Place, Saskatoon, SK S7N 0W9, Canada.
| | - Andrew H Paterson
- Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602, USA.
| | - Chunyun Guan
- College of Agronomy, Hunan Agricultural University, Changsha 410128, China.
| | - Patrick Wincker
- Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, BP5706, 91057 Evry, France. Université d'Evry Val d'Essone, UMR 8030, CP5706, Evry, France. Centre National de Recherche Scientifique (CNRS), UMR 8030, CP5706, Evry, France.
| |
|
26
|
Wright IA, Travers SA. RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA. Nucleic Acids Res 2014; 42:e106. [PMID: 24861618 PMCID: PMC4117746 DOI: 10.1093/nar/gku473] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022] Open
Abstract
The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance.
Affiliation(s)
- Imogen A Wright
- South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville 7535, South Africa
| | - Simon A Travers
- South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville 7535, South Africa
| |
|
27
|
The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates. Nat Commun 2014; 5:3657. [PMID: 24755649 PMCID: PMC4071752 DOI: 10.1038/ncomms4657] [Citation(s) in RCA: 598] [Impact Index Per Article: 59.8] [Received: 01/09/2014] [Accepted: 03/14/2014] [Indexed: 02/07/2023] Open
Abstract
Vertebrate evolution has been shaped by several rounds of whole-genome duplications (WGDs) that are often suggested to be associated with adaptive radiations and evolutionary innovations. Due to an additional round of WGD, the rainbow trout genome offers a unique opportunity to investigate the early evolutionary fate of a duplicated vertebrate genome. Here we show that after 100 million years of evolution the two ancestral subgenomes have remained extremely collinear, despite the loss of half of the duplicated protein-coding genes, mostly through pseudogenization. In striking contrast is the fate of miRNA genes that have almost all been retained as duplicated copies. The slow and stepwise rediploidization process characterized here challenges the current hypothesis that WGD is followed by massive and rapid genomic reorganizations and gene deletions.
Although whole-genome duplications (WGDs) are rare events, they have an important role in shaping vertebrate evolution. Here, the authors sequence the rainbow trout genome and show that rediploidization after WGD occurs in a slow and stepwise manner.
|
28
|
Sakai H, Kanamori H, Arai-Kichise Y, Shibata-Hatta M, Ebana K, Oono Y, Kurita K, Fujisawa H, Katagiri S, Mukai Y, Hamada M, Itoh T, Matsumoto T, Katayose Y, Wakasa K, Yano M, Wu J. Construction of pseudomolecule sequences of the aus rice cultivar Kasalath for comparative genomics of Asian cultivated rice. DNA Res 2014; 21:397-405. [PMID: 24578372 PMCID: PMC4131834 DOI: 10.1093/dnares/dsu006] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Indexed: 01/30/2023] Open
Abstract
Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone.
Affiliation(s)
- Hiroaki Sakai
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Hiroyuki Kanamori
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Yuko Arai-Kichise
- Genome Research Center, NODAI Research Institute, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya, Tokyo 156-8502, Japan
| | - Mari Shibata-Hatta
- Genome Research Center, NODAI Research Institute, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya, Tokyo 156-8502, Japan
| | - Kaworu Ebana
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Youko Oono
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Kanako Kurita
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Hiroko Fujisawa
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Satoshi Katagiri
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Yoshiyuki Mukai
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Masao Hamada
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Takeshi Itoh
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Takashi Matsumoto
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Yuichi Katayose
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Kyo Wakasa
- Genome Research Center, NODAI Research Institute, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya, Tokyo 156-8502, Japan. Department of Bioscience, Faculty of Applied Bioscience, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya, Tokyo 156-8502, Japan
| | - Masahiro Yano
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Jianzhong Wu
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| |
|
29
|
Siciliano P, Scolari F, Gomulski LM, Falchetto M, Manni M, Gabrieli P, Field LM, Zhou JJ, Gasperi G, Malacrida AR. Sniffing out chemosensory genes from the Mediterranean fruit fly, Ceratitis capitata. PLoS One 2014; 9:e85523. [PMID: 24416419 PMCID: PMC3885724 DOI: 10.1371/journal.pone.0085523] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 09/05/2013] [Accepted: 11/27/2013] [Indexed: 11/18/2022] Open
Abstract
The Mediterranean fruit fly, Ceratitis capitata (medfly), is an extremely invasive agricultural pest due to its extremely wide host range and its ability to adapt to a broad range of climatic conditions and habitats. Chemosensory behaviour plays an important role in many crucial stages in the life of this insect, such as the detection of pheromone cues during mate pursuit and odorants during host plant localisation. Thus, the analysis of the chemosensory gene repertoire is an important step for the interpretation of the biology of this species and consequently its invasive potential. Moreover, these genes may represent ideal targets for the development of novel, effective control methods and pest population monitoring systems. Expressed sequence tag libraries from C. capitata adult heads, embryos, male accessory glands and testes were screened for sequences encoding putative odorant binding proteins (OBPs). A total of seventeen putative OBP transcripts were identified, corresponding to 13 Classic, three Minus-C and one Plus-C subfamily OBPs. The tissue distributions of the OBP transcripts were assessed by RT-PCR and a subset of five genes with predicted proteins sharing high sequence similarities and close phylogenetic affinities to Drosophila melanogaster pheromone binding protein related proteins (PBPRPs) were characterised in greater detail. Real Time quantitative PCR was used to assess the effects of maturation, mating and time of day on the transcript abundances of the putative PBPRP genes in the principal olfactory organs, the antennae, in males and females. The results of the present study have facilitated the annotation of OBP genes in the recently released medfly genome sequence and represent a significant contribution to the characterisation of the medfly chemosensory repertoire. 
The identification of these medfly OBPs/PBPRPs permitted evolutionary and functional comparisons with homologous sequences from other tephritids of the genera Bactrocera and Rhagoletis.
Affiliation(s)
- Paolo Siciliano
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Francesca Scolari
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Ludvik M. Gomulski
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Marco Falchetto
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Mosè Manni
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Paolo Gabrieli
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Linda M. Field
- Department of Biological Chemistry and Crop Protection, Rothamsted Research, Harpenden, United Kingdom
| | - Jing-Jiang Zhou
- Department of Biological Chemistry and Crop Protection, Rothamsted Research, Harpenden, United Kingdom
| | - Giuliano Gasperi
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Anna R. Malacrida
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
|
30
|
Tanaka T, Sakai H, Fujii N, Kobayashi F, Nakamura S, Itoh T, Matsumoto T, Wu J. bex-db: Bioinformatics workbench for comprehensive analysis of barley-expressed genes. BREEDING SCIENCE 2013; 63:430-434. [PMID: 24399916 PMCID: PMC3859355 DOI: 10.1270/jsbbs.63.430] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/03/2013] [Accepted: 10/14/2013] [Indexed: 06/03/2023]
Abstract
Barley (Hordeum vulgare) is one of the world's most important cereal crops. Although its large and complex genome has held back barley genomics for quite a while, the whole genome sequence was released in 2012 by the International Barley Genome Sequencing Consortium (IBSC). Moreover, more than 30,000 barley full-length cDNAs (FLcDNAs) are now available in the public domain. Here we present the Barley Gene Expression Database (bex-db: http://barleyflc.dna.affrc.go.jp/bexdb/index.html) as a repository of transcriptome data including the sequences and the expression profiles of barley genes resulting from microarray analysis. In addition to FLcDNA sequences, bex-db also contains partial sequences of more than 309,000 novel expressed sequence tags (ESTs). Users can browse the data via keyword, sequence homology and expression profile search options. A genome browser was also developed to display the chromosomal locations of barley FLcDNAs and wheat (Triticum aestivum) transcripts as well as Aegilops tauschii gene models on the IBSC genome sequence for future comparative analysis of orthologs among Triticeae species. The bex-db should provide a useful resource for further genomics studies and development of genome-based tools to enhance the progress of the genetic improvement of cereal crops.
Affiliation(s)
- Tsuyoshi Tanaka
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Hiroaki Sakai
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Nobuyuki Fujii
- Bioinformatics Solution Group, Hitachi Government & Public Corporation System Engineering, Ltd., 2-4-18 Toyo, Koto, Tokyo 135-8633, Japan
| | - Fuminori Kobayashi
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Shingo Nakamura
- Wheat and Barley Research Division, National Institute of Crop Science, 2-1-18 Kannondai, Tsukuba, Ibaraki 305-8518, Japan
| | - Takeshi Itoh
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Takashi Matsumoto
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Jianzhong Wu
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
|
31
|
Large scale full-length cDNA sequencing reveals a unique genomic landscape in a lepidopteran model insect, Bombyx mori. G3-GENES GENOMES GENETICS 2013; 3:1481-92. [PMID: 23821615 PMCID: PMC3755909 DOI: 10.1534/g3.113.006239] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 11/23/2022]
Abstract
The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. A more complete annotation of the genome will benefit functional and comparative studies and accelerate extensive industrial applications for this insect. To realize these goals, we embarked upon a large-scale full-length cDNA collection from 21 full-length cDNA libraries derived from 14 tissues of the domesticated silkworm and performed full sequencing by primer walking for 11,104 full-length cDNAs. The large average intron size was 1904 bp, resulting from a high accumulation of transposons. Using gene models predicted by GLEAN and published mRNAs, we identified 16,823 gene loci on the silkworm genome assembly. Orthology analysis of 153 species, including 11 insects, revealed that among three Lepidoptera including Monarch and Heliconius butterflies, the 403 largest silkworm-specific genes were composed mainly of protective immunity, hormone-related, and characteristic structural proteins. Analysis of testis-/ovary-specific genes revealed distinctive features of sexual dimorphism, including depletion of ovary-specific genes on the Z chromosome in contrast to an enrichment of testis-specific genes. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes.
|
32
|
Shangguan L, Han J, Kayesh E, Sun X, Zhang C, Pervaiz T, Wen X, Fang J. Evaluation of genome sequencing quality in selected plant species using expressed sequence tags. PLoS One 2013; 8:e69890. [PMID: 23922843 PMCID: PMC3726750 DOI: 10.1371/journal.pone.0069890] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 01/10/2013] [Accepted: 06/14/2013] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing, enabling more and more plants to be sequenced. Despite this, the genome sequence quality of many plants has not been evaluated. METHODOLOGY/PRINCIPAL FINDING Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence is expressed as the ratio of chromosome size to genome size (or of scaffold size to genome size), which ranged from 55.31% to nearly 100%. The accuracy of a genome sequence is expressed as the ratio of matched ESTs to selected ESTs: 52.93% to 98.28% and 89.02% to 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analyses of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max; then Sorghum bicolor, Solanum lycopersicum and Fragaria vesca; and finally Lotus japonicus, Medicago truncatula and Malus × domestica. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influence genome sequence assembly. CONCLUSION The quality of plant genome sequences was found to be lower than envisaged, and thus the rapid development of genome sequencing projects, as well as research on bioinformatics tools and genome sequence assembly algorithms, should provide increased processing and correction of genome sequences that have already been published.
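The two quality measures described in this abstract are simple ratios, so they can be illustrated with a minimal sketch. This is not the authors' code: the function names and the input figures below are hypothetical and chosen only to show the arithmetic.

```python
def integrity(assembled_bp: int, genome_bp: int) -> float:
    """Integrity (%): assembled chromosome (or scaffold) size over estimated genome size."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return 100.0 * assembled_bp / genome_bp


def accuracy(matched_ests: int, selected_ests: int) -> float:
    """Accuracy (%): ESTs that map to the assembly over all randomly selected ESTs."""
    if selected_ests <= 0:
        raise ValueError("selected_ests must be positive")
    return 100.0 * matched_ests / selected_ests


# Hypothetical inputs, not values taken from the paper.
example_integrity = integrity(assembled_bp=450_000_000, genome_bp=500_000_000)
example_accuracy = accuracy(matched_ests=9_500, selected_ests=10_000)
print(f"integrity = {example_integrity:.2f}%, accuracy = {example_accuracy:.2f}%")
```

Both functions return percentages, which matches how the abstract reports its ranges; how ESTs are counted as "matched" depends on mapping thresholds that the abstract does not state.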
Affiliation(s)
- Lingfei Shangguan
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
| | - Jian Han
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
| | - Emrul Kayesh
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
| | - Xin Sun
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
| | - Changqing Zhang
- College of Horticulture, Jinling Institute of Technology, Nanjing City, Jiangsu Province, China
| | - Tariq Pervaiz
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
| | - Xicheng Wen
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
| | - Jinggui Fang
- College of Horticulture, Nanjing Agricultural University, Nanjing City, Jiangsu Province, China
|
33
|
Droc G, Larivière D, Guignon V, Yahiaoui N, This D, Garsmeur O, Dereeper A, Hamelin C, Argout X, Dufayard JF, Lengelle J, Baurens FC, Cenci A, Pitollat B, D'Hont A, Ruiz M, Rouard M, Bocs S. The banana genome hub. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2013; 2013:bat035. [PMID: 23707967 PMCID: PMC3662865 DOI: 10.1093/database/bat035] [Citation(s) in RCA: 100] [Impact Index Per Article: 9.1] [Indexed: 12/02/2022]
Abstract
Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value for providing fresh insights into the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow its sequence to be harnessed. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder); we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks); we made other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) interoperable with the banana hub; and finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of interoperability for data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/
Affiliation(s)
- Gaëtan Droc
- CIRAD, UMR AGAP, Montpellier F-34398, France.
|
34
|
Wilming LG, Hart EA, Coggill PC, Horton R, Gilbert JGR, Clee C, Jones M, Lloyd C, Palmer S, Sims S, Whitehead S, Wiley D, Beck S, Harrow JL. Sequencing and comparative analysis of the gorilla MHC genomic sequence. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2013; 2013:bat011. [PMID: 23589541 PMCID: PMC3626023 DOI: 10.1093/database/bat011] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 01/04/2023]
Abstract
Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.
Affiliation(s)
- Laurens G Wilming
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, UK
|
35
|
Yorukoglu D, Hach F, Swanson L, Collins CC, Birol I, Sahinalp SC. Dissect: detection and characterization of novel structural alterations in transcribed sequences. Bioinformatics 2013; 28:i179-87. [PMID: 22689759 PMCID: PMC3371846 DOI: 10.1093/bioinformatics/bts214] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 02/02/2023] Open
Abstract
Motivation: Computational identification of genomic structural variants via high-throughput sequencing is an important problem for which a number of highly sophisticated solutions have been recently developed. With the advent of high-throughput transcriptome sequencing (RNA-Seq), the problem of identifying structural alterations in the transcriptome is now attracting significant attention. In this article, we introduce two novel algorithmic formulations for identifying transcriptomic structural variants through aligning transcripts to the reference genome under the consideration of such variation. The first formulation is based on a nucleotide-level alignment model; a second, potentially faster formulation is based on chaining fragments shared between each transcript and the reference genome. Based on these formulations, we introduce a novel transcriptome-to-genome alignment tool, Dissect (DIScovery of Structural Alteration Event Containing Transcripts), which can identify and characterize transcriptomic events such as duplications, inversions, rearrangements and fusions. Dissect is suitable for whole transcriptome structural variation discovery problems involving sufficiently long reads or accurately assembled contigs. Results: We tested Dissect on simulated transcripts altered via structural events, as well as assembled RNA-Seq contigs from human prostate cancer cell line C4-2. Our results indicate that Dissect has high sensitivity and specificity in identifying structural alteration events in simulated transcripts as well as uncovering novel structural alterations in cancer transcriptomes. Availability: Dissect is available for public use at: http://dissect-trans.sourceforge.net. Contact: denizy@mit.edu; fhach@cs.sfu.ca; cenk@cs.sfu.ca
Affiliation(s)
- Deniz Yorukoglu
- School of Computing Science, Simon Fraser University, Burnaby, V5A 1S6 BC, Canada.
|
36
|
Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, Aken BL, Barrell D, Zadissa A, Searle S, Barnes I, Bignell A, Boychenko V, Hunt T, Kay M, Mukherjee G, Rajan J, Despacio-Reyes G, Saunders G, Steward C, Harte R, Lin M, Howald C, Tanzer A, Derrien T, Chrast J, Walters N, Balasubramanian S, Pei B, Tress M, Rodriguez JM, Ezkurdia I, van Baren J, Brent M, Haussler D, Kellis M, Valencia A, Reymond A, Gerstein M, Guigó R, Hubbard TJ. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res 2013; 22:1760-74. [PMID: 22955987 PMCID: PMC3431492 DOI: 10.1101/gr.135350.111] [Citation(s) in RCA: 3273] [Impact Index Per Article: 297.5] [Indexed: 02/07/2023]
Abstract
The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.
Affiliation(s)
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Campus, Hinxton, Cambridge CB10 1SA, United Kingdom.
|
37
|
CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genomics 2012; 13:715. [PMID: 23256920 PMCID: PMC3543216 DOI: 10.1186/1471-2164-13-715] [Citation(s) in RCA: 467] [Impact Index Per Article: 38.9] [Received: 06/29/2012] [Accepted: 12/15/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. RESULTS We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch, respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tool. The edited annotations can then be uploaded to CPGAVAS repeatedly for updating and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. CONCLUSIONS CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas.
|
38
|
Danks G, Campsteijn C, Parida M, Butcher S, Doddapaneni H, Fu B, Petrin R, Metpally R, Lenhard B, Wincker P, Chourrout D, Thompson EM, Manak JR. OikoBase: a genomics and developmental transcriptomics resource for the urochordate Oikopleura dioica. Nucleic Acids Res 2012. [PMID: 23185044 PMCID: PMC3531137 DOI: 10.1093/nar/gks1159] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 01/09/2023] Open
Abstract
We report the development of OikoBase (http://oikoarrays.biology.uiowa.edu/Oiko/), a tiling array-based genome browser resource for Oikopleura dioica, a metazoan belonging to the urochordates, the closest extant group to vertebrates. OikoBase facilitates retrieval and mining of a variety of useful genomics information. First, it includes a genome browser which interrogates 1260 genomic sequence scaffolds and features gene, transcript and CDS annotation tracks. Second, we annotated gene models with gene ontology (GO) terms and InterPro domains which are directly accessible in the browser with links to their entries in the GO (http://www.geneontology.org/) and InterPro (http://www.ebi.ac.uk/interpro/) databases, and we provide transcript and peptide links for sequence downloads. Third, we introduce the transcriptomics of a comprehensive set of developmental stages of O. dioica at high resolution and provide downloadable gene expression data for all developmental stages. Fourth, we incorporate a BLAST tool to identify homologs of genes and proteins. Finally, we include a tutorial that describes how to use OikoBase as well as a link to detailed methods, explaining the data generation and analysis pipeline. OikoBase will provide a valuable resource for research in chordate development, genome evolution and plasticity and the molecular ecology of this important marine planktonic organism.
Affiliation(s)
- Gemma Danks
- Computational Biology Unit, University of Bergen, Bergen, N-5008, Norway
|
39
|
Cornillot E, Hadj-Kaddour K, Dassouli A, Noel B, Ranwez V, Vacherie B, Augagneur Y, Brès V, Duclos A, Randazzo S, Carcy B, Debierre-Grockiego F, Delbecq S, Moubri-Ménage K, Shams-Eldin H, Usmani-Brown S, Bringaud F, Wincker P, Vivarès CP, Schwarz RT, Schetters TP, Krause PJ, Gorenflot A, Berry V, Barbe V, Ben Mamoun C. Sequencing of the smallest Apicomplexan genome from the human pathogen Babesia microti. Nucleic Acids Res 2012; 40:9102-14. [PMID: 22833609 PMCID: PMC3467087 DOI: 10.1093/nar/gks700] [Citation(s) in RCA: 156] [Impact Index Per Article: 13.0] [Received: 03/26/2012] [Revised: 06/22/2012] [Accepted: 06/25/2012] [Indexed: 11/22/2022] Open
Abstract
We have sequenced the genome of the emerging human pathogen Babesia microti and compared it with that of other protozoa. B. microti has the smallest nuclear genome among all Apicomplexan parasites sequenced to date with three chromosomes encoding ∼3500 polypeptides, several of which are species specific. Genome-wide phylogenetic analyses indicate that B. microti is significantly distant from all species of Babesidae and Theileridae and defines a new clade in the phylum Apicomplexa. Furthermore, unlike all other Apicomplexa, its mitochondrial genome is circular. Genome-scale reconstruction of functional networks revealed that B. microti has the minimal metabolic requirement for intraerythrocytic protozoan parasitism. B. microti multigene families differ from those of other protozoa in both the copy number and organization. Two lateral transfer events with significant metabolic implications occurred during the evolution of this parasite. The genomic sequencing of B. microti identified several targets suitable for the development of diagnostic assays and novel therapies for human babesiosis.
Affiliation(s)
- Emmanuel Cornillot
- Laboratoire de Biologie Cellulaire et Moléculaire (LBCM-EA4558), UFR Pharmacie, Université Montpellier 1, 15, av. Charles Flahault, 34093 Montpellier cedex 5, Genoscope (CEA) and CNRS UMR 8030, Université d'Evry, 2 rue Gaston Crémieux, 91057 Evry, Institut des Sciences de l'Evolution (ISEM, UMR 5554 CNRS), Université Montpellier II, Place E. Bataillon—34095 Montpellier cedex 5, and Montpellier SupAgro, UMR AGAP, av. Agropolis—TA A96/03 - 34398 Montpellier cedex 5, France, Department of Internal Medicine, Section of Infectious Diseases, Yale School of Medicine, 15 York St., New Haven, CT 06520, USA, UMR1282 Infectiologie et Santé Publique, Université de Tours, F-37000 Tours, France and INRA, F-37380 Nouzilly, France, Institut für Virologie, Zentrum für Hygiene und Infektionsbiologie, Philipps-Universität Marburg, Hans-Meerwein-Strasse, 35043 Marburg, Germany, Centre de Résonance Magnétique des Systèmes Biologiques (RMSB, UMR 5536), Université Bordeaux Segalen, CNRS, 146 rue Léo Saignat, 33076 Bordeaux, Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, BP10448, F-63000 Clermont-Ferrand, France, Microbiology R&D Department, Intervet/Schering-Plough Animal Health, 5830 AA Boxmeer, The Netherlands, Yale School of Public Health and Yale School of Medicine, 60 College St., New Haven, CT 06520, USA and Equipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM (UMR 5506 CNRS), Université Montpellier II, Place E Bataillon—34095 Montpellier, France
| | - Kamel Hadj-Kaddour
- Laboratoire de Biologie Cellulaire et Moléculaire (LBCM-EA4558), UFR Pharmacie, Université Montpellier 1, 15, av. Charles Flahault, 34093 Montpellier cedex 5, Genoscope (CEA) and CNRS UMR 8030, Université d'Evry, 2 rue Gaston Crémieux, 91057 Evry, Institut des Sciences de l'Evolution (ISEM, UMR 5554 CNRS), Université Montpellier II, Place E. Bataillon—34095 Montpellier cedex 5, and Montpellier SupAgro, UMR AGAP, av. Agropolis—TA A96/03 - 34398 Montpellier cedex 5, France, Department of Internal Medicine, Section of Infectious Diseases, Yale School of Medicine, 15 York St., New Haven, CT 06520, USA, UMR1282 Infectiologie et Santé Publique, Université de Tours, F-37000 Tours, France and INRA, F-37380 Nouzilly, France, Institut für Virologie, Zentrum für Hygiene und Infektionsbiologie, Philipps-Universität Marburg, Hans-Meerwein-Strasse, 35043 Marburg, Germany, Centre de Résonance Magnétique des Systèmes Biologiques (RMSB, UMR 5536), Université Bordeaux Segalen, CNRS, 146 rue Léo Saignat, 33076 Bordeaux, Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, BP10448, F-63000 Clermont-Ferrand, France, Microbiology R&D Department, Intervet/Schering-Plough Animal Health, 5830 AA Boxmeer, The Netherlands, Yale School of Public Health and Yale School of Medicine, 60 College St., New Haven, CT 06520, USA and Equipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM (UMR 5506 CNRS), Université Montpellier II, Place E Bataillon—34095 Montpellier, France
| | - Amina Dassouli
- Laboratoire de Biologie Cellulaire et Moléculaire (LBCM-EA4558), UFR Pharmacie, Université Montpellier 1, 15, av. Charles Flahault, 34093 Montpellier cedex 5, Genoscope (CEA) and CNRS UMR 8030, Université d'Evry, 2 rue Gaston Crémieux, 91057 Evry, Institut des Sciences de l'Evolution (ISEM, UMR 5554 CNRS), Université Montpellier II, Place E. Bataillon—34095 Montpellier cedex 5, and Montpellier SupAgro, UMR AGAP, av. Agropolis—TA A96/03 - 34398 Montpellier cedex 5, France, Department of Internal Medicine, Section of Infectious Diseases, Yale School of Medicine, 15 York St., New Haven, CT 06520, USA, UMR1282 Infectiologie et Santé Publique, Université de Tours, F-37000 Tours, France and INRA, F-37380 Nouzilly, France, Institut für Virologie, Zentrum für Hygiene und Infektionsbiologie, Philipps-Universität Marburg, Hans-Meerwein-Strasse, 35043 Marburg, Germany, Centre de Résonance Magnétique des Systèmes Biologiques (RMSB, UMR 5536), Université Bordeaux Segalen, CNRS, 146 rue Léo Saignat, 33076 Bordeaux, Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, BP10448, F-63000 Clermont-Ferrand, France, Microbiology R&D Department, Intervet/Schering-Plough Animal Health, 5830 AA Boxmeer, The Netherlands, Yale School of Public Health and Yale School of Medicine, 60 College St., New Haven, CT 06520, USA and Equipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM (UMR 5506 CNRS), Université Montpellier II, Place E Bataillon—34095 Montpellier, France
| | - Benjamin Noel
- Laboratoire de Biologie Cellulaire et Moléculaire (LBCM-EA4558), UFR Pharmacie, Université Montpellier 1, 15, av. Charles Flahault, 34093 Montpellier cedex 5, Genoscope (CEA) and CNRS UMR 8030, Université d'Evry, 2 rue Gaston Crémieux, 91057 Evry, Institut des Sciences de l'Evolution (ISEM, UMR 5554 CNRS), Université Montpellier II, Place E. Bataillon—34095 Montpellier cedex 5, and Montpellier SupAgro, UMR AGAP, av. Agropolis—TA A96/03 - 34398 Montpellier cedex 5, France, Department of Internal Medicine, Section of Infectious Diseases, Yale School of Medicine, 15 York St., New Haven, CT 06520, USA, UMR1282 Infectiologie et Santé Publique, Université de Tours, F-37000 Tours, France and INRA, F-37380 Nouzilly, France, Institut für Virologie, Zentrum für Hygiene und Infektionsbiologie, Philipps-Universität Marburg, Hans-Meerwein-Strasse, 35043 Marburg, Germany, Centre de Résonance Magnétique des Systèmes Biologiques (RMSB, UMR 5536), Université Bordeaux Segalen, CNRS, 146 rue Léo Saignat, 33076 Bordeaux, Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, BP10448, F-63000 Clermont-Ferrand, France, Microbiology R&D Department, Intervet/Schering-Plough Animal Health, 5830 AA Boxmeer, The Netherlands, Yale School of Public Health and Yale School of Medicine, 60 College St., New Haven, CT 06520, USA and Equipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM (UMR 5506 CNRS), Université Montpellier II, Place E Bataillon—34095 Montpellier, France
| | - Vincent Ranwez
- Laboratoire de Biologie Cellulaire et Moléculaire (LBCM-EA4558), UFR Pharmacie, Université Montpellier 1, 15, av. Charles Flahault, 34093 Montpellier cedex 5, Genoscope (CEA) and CNRS UMR 8030, Université d'Evry, 2 rue Gaston Crémieux, 91057 Evry, Institut des Sciences de l'Evolution (ISEM, UMR 5554 CNRS), Université Montpellier II, Place E. Bataillon—34095 Montpellier cedex 5, and Montpellier SupAgro, UMR AGAP, av. Agropolis—TA A96/03 - 34398 Montpellier cedex 5, France, Department of Internal Medicine, Section of Infectious Diseases, Yale School of Medicine, 15 York St., New Haven, CT 06520, USA, UMR1282 Infectiologie et Santé Publique, Université de Tours, F-37000 Tours, France and INRA, F-37380 Nouzilly, France, Institut für Virologie, Zentrum für Hygiene und Infektionsbiologie, Philipps-Universität Marburg, Hans-Meerwein-Strasse, 35043 Marburg, Germany, Centre de Résonance Magnétique des Systèmes Biologiques (RMSB, UMR 5536), Université Bordeaux Segalen, CNRS, 146 rue Léo Saignat, 33076 Bordeaux, Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes: Génome et Environnement, BP10448, F-63000 Clermont-Ferrand, France, Microbiology R&D Department, Intervet/Schering-Plough Animal Health, 5830 AA Boxmeer, The Netherlands, Yale School of Public Health and Yale School of Medicine, 60 College St., New Haven, CT 06520, USA and Equipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM (UMR 5506 CNRS), Université Montpellier II, Place E Bataillon—34095 Montpellier, France
| | - Benoît Vacherie
| | - Yoann Augagneur
| | - Virginie Brès
| | - Aurelie Duclos
| | - Sylvie Randazzo
| | - Bernard Carcy
| | - Françoise Debierre-Grockiego
| | - Stéphane Delbecq
| | - Karina Moubri-Ménage
| | - Hosam Shams-Eldin
| | - Sahar Usmani-Brown
| | - Frédéric Bringaud
| | - Patrick Wincker
| | - Christian P. Vivarès
| | - Ralph T. Schwarz
| | - Theo P. Schetters
| | - Peter J. Krause
| | - André Gorenflot
| | - Vincent Berry
| | - Valérie Barbe
| | - Choukri Ben Mamoun
|
40
|
D'Hont A, Denoeud F, Aury JM, Baurens FC, Carreel F, Garsmeur O, Noel B, Bocs S, Droc G, Rouard M, Da Silva C, Jabbari K, Cardi C, Poulain J, Souquet M, Labadie K, Jourda C, Lengellé J, Rodier-Goud M, Alberti A, Bernard M, Correa M, Ayyampalayam S, Mckain MR, Leebens-Mack J, Burgess D, Freeling M, Mbéguié-A-Mbéguié D, Chabannes M, Wicker T, Panaud O, Barbosa J, Hribova E, Heslop-Harrison P, Habas R, Rivallan R, Francois P, Poiron C, Kilian A, Burthia D, Jenny C, Bakry F, Brown S, Guignon V, Kema G, Dita M, Waalwijk C, Joseph S, Dievart A, Jaillon O, Leclercq J, Argout X, Lyons E, Almeida A, Jeridi M, Dolezel J, Roux N, Risterucci AM, Weissenbach J, Ruiz M, Glaszmann JC, Quétier F, Yahiaoui N, Wincker P. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 2012; 488:213-7. [PMID: 22801500 DOI: 10.1038/nature11241] [Citation(s) in RCA: 615] [Impact Index Per Article: 51.3] [Received: 02/10/2012] [Accepted: 05/18/2012] [Indexed: 01/17/2023]
Abstract
Bananas (Musa spp.), including dessert and cooking types, are giant perennial monocotyledonous herbs of the order Zingiberales, a sister group to the well-studied Poales, which include cereals. Bananas are vital for food security in many tropical and subtropical countries and the most popular fruit in industrialized countries. The Musa domestication process started some 7,000 years ago in Southeast Asia. It involved hybridizations between diverse species and subspecies, fostered by human migrations, and selection of diploid and triploid seedless, parthenocarpic hybrids thereafter widely dispersed by vegetative propagation. Half of the current production relies on somaclones derived from a single triploid genotype (Cavendish). Pests and diseases have gradually become adapted, representing an imminent danger for global banana production. Here we describe the draft sequence of the 523-megabase genome of a Musa acuminata doubled-haploid genotype, providing a crucial stepping-stone for genetic improvement of banana. We detected three rounds of whole-genome duplications in the Musa lineage, independently of those previously described in the Poales lineage and the one we detected in the Arecales lineage. This first monocotyledon high-continuity whole-genome sequence reported outside Poales represents an essential bridge for comparative genome analysis in plants. As such, it clarifies commelinid-monocotyledon phylogenetic relationships, reveals Poaceae-specific features and has led to the discovery of conserved non-coding sequences predating monocotyledon-eudicotyledon divergence.
Affiliation(s)
- Angélique D'Hont
- Centre de coopération Internationale en Recherche Agronomique pour le Développement, UMR AGAP, F-34398 Montpellier, France. angelique.d’
|
41
|
Comparative genome analysis of three eukaryotic parasites with differing abilities to transform leukocytes reveals key mediators of Theileria-induced leukocyte transformation. mBio 2012; 3:e00204-12. [PMID: 22951932 PMCID: PMC3445966 DOI: 10.1128/mbio.00204-12] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 11/20/2022] Open
Abstract
We sequenced the genome of Theileria orientalis, a tick-borne apicomplexan protozoan parasite of cattle. The focus of this study was a comparative genome analysis of T. orientalis relative to other highly pathogenic Theileria species, T. parva and T. annulata. T. parva and T. annulata induce transformation of infected cells of lymphocyte or macrophage/monocyte lineages; in contrast, T. orientalis does not induce uncontrolled proliferation of infected leukocytes and multiplies predominantly within infected erythrocytes. While synteny across homologous chromosomes of the three Theileria species was found to be well conserved overall, subtelomeric structures were found to differ substantially, as T. orientalis lacks the large tandemly arrayed subtelomere-encoded variable secreted protein-encoding gene family. Moreover, expansion of particular gene families by gene duplication was found in the genomes of the two transforming Theileria species, most notably, the TashAT/TpHN and Tar/Tpr gene families. Gene families that are present only in T. parva and T. annulata and not in T. orientalis, Babesia bovis, or Plasmodium were also identified. Identification of differences between the genome sequences of Theileria species with different abilities to transform and immortalize bovine leukocytes will provide insight into proteins and mechanisms that have evolved to induce and regulate this process. The T. orientalis genome database is available at http://totdb.czc.hokudai.ac.jp/.
|
42
|
Iwata H, Gotoh O. Benchmarking spliced alignment programs including Spaln2, an extended version of Spaln that incorporates additional species-specific features. Nucleic Acids Res 2012; 40:e161. [PMID: 22848105 PMCID: PMC3488211 DOI: 10.1093/nar/gks708] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Indexed: 11/12/2022] Open
Abstract
Spliced alignment plays a central role in the precise identification of eukaryotic gene structures. Even though many spliced alignment programs have been developed, recent rapid progress in DNA sequencing technologies demands further improvements in software tools. Benchmarking algorithms under various conditions is an indispensable task for the development of better software; however, there is a dire lack of appropriate datasets usable for benchmarking spliced alignment programs. In this study, we have constructed two types of datasets: simulated sequence datasets and actual cross-species datasets. The datasets are designed to correspond to various real situations, i.e. divergent eukaryotic species, different types of reference sequences, and the wide divergence between query and target sequences. In addition, we have developed an extended version of our program Spaln, which incorporates two additional features into the scoring scheme of the original version, and examined this extended version, Spaln2, together with the original Spaln and other representative aligners based on our benchmark datasets. Although the effects of the modifications are not individually striking, Spaln2 is consistently the most accurate and reasonably fast in most practical cases, especially for plants and fungi and for increasingly divergent pairs of target and query sequences.
Affiliation(s)
- Hiroaki Iwata
- Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida Honmachi, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan.
|
43
|
Frankish A, Mudge JM, Thomas M, Harrow J. The importance of identifying alternative splicing in vertebrate genome annotation. Database (Oxford) 2012; 2012:bas014. [PMID: 22434846 PMCID: PMC3308168 DOI: 10.1093/database/bas014] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 12/17/2022]
Abstract
While alternative splicing (AS) can potentially expand the functional repertoire of vertebrate genomes, relatively few AS transcripts have been experimentally characterized. We describe our detailed manual annotation of vertebrate genomes, which is generating a publicly available geneset rich in AS. In order to achieve this we have adopted a highly sensitive approach to annotating gene models supported by correctly mapped, canonically spliced transcriptional evidence combined with a highly cautious approach to adding unsupported extensions to models and making decisions on their functional potential. We use information about the predicted functional potential and structural properties of every AS transcript annotated at a protein-coding or non-coding locus to place them into one of eleven subclasses. We describe the incorporation of new sequencing and proteomics technologies into our annotation pipelines, which are used to identify and validate AS. Combining all data sources has led to the production of a rich geneset containing an average of 6.3 AS transcripts for every human multi-exon protein-coding gene. The datasets produced have proved very useful in providing context to studies investigating the functional potential of genes and the effect that variation may have on gene structure and function. Database URL: http://www.ensembl.org/index.html, http://vega.sanger.ac.uk/index.html
Affiliation(s)
- Adam Frankish
- Human and Vertebrate Analysis and Annotation Team, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
|
44
|
Haas BJ, Zeng Q, Pearson MD, Cuomo CA, Wortman JR. Approaches to Fungal Genome Annotation. Mycology 2011; 2:118-141. [PMID: 22059117 PMCID: PMC3207268 DOI: 10.1080/21501203.2011.606851] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023] Open
Abstract
Fungal genome annotation is the starting point for analysis of genome content. This generally involves the application of diverse methods to identify features on a genome assembly such as protein-coding and non-coding genes, repeats and transposable elements, and pseudogenes. Here we describe tools and methods leveraged for eukaryotic genome annotation with a focus on the annotation of fungal nuclear and mitochondrial genomes. We highlight the application of the latest technologies and tools to improve the quality of predicted gene sets. The Broad Institute eukaryotic genome annotation pipeline is described as one example of how such methods and tools are integrated into a sequencing center's production genome annotation environment.
Affiliation(s)
- Brian J Haas
- Genome Sequencing and Analysis Program, Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, U.S.A
|
45
|
Dessimoz C, Zoller S, Manousaki T, Qiu H, Meyer A, Kuraku S. Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes). Brief Bioinform 2011; 12:474-84. [PMID: 21712341 PMCID: PMC3178057 DOI: 10.1093/bib/bbr038] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 02/03/2023] Open
Abstract
Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as a test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references.
Affiliation(s)
- Christophe Dessimoz
- ETH Zurich, Computer Science, Universitätstrasse 6, 8092 Zurich, Switzerland.
|
46
|
Denoeud F, Roussel M, Noel B, Wawrzyniak I, Da Silva C, Diogon M, Viscogliosi E, Brochier-Armanet C, Couloux A, Poulain J, Segurens B, Anthouard V, Texier C, Blot N, Poirier P, Ng GC, Tan KSW, Artiguenave F, Jaillon O, Aury JM, Delbac F, Wincker P, Vivarès CP, El Alaoui H. Genome sequence of the stramenopile Blastocystis, a human anaerobic parasite. Genome Biol 2011; 12:R29. [PMID: 21439036 PMCID: PMC3129679 DOI: 10.1186/gb-2011-12-3-r29] [Citation(s) in RCA: 126] [Impact Index Per Article: 9.7] [Received: 10/25/2010] [Revised: 01/04/2011] [Accepted: 03/25/2011] [Indexed: 01/28/2023] Open
Abstract
Background: Blastocystis is a highly prevalent anaerobic eukaryotic parasite of humans and animals that is associated with various gastrointestinal and extraintestinal disorders. Epidemiological studies have identified different subtypes but no one subtype has been definitively correlated with disease. Results: Here we report the 18.8 Mb genome sequence of a Blastocystis subtype 7 isolate, which is the smallest stramenopile genome sequenced to date. The genome is highly compact and contains intriguing rearrangements. Comparisons with other available stramenopile genomes (plant pathogenic oomycete and diatom genomes) revealed effector proteins potentially involved in the adaptation to the intestinal environment, which were likely acquired via horizontal gene transfer. Moreover, Blastocystis living in anaerobic conditions harbors mitochondria-like organelles. An incomplete oxidative phosphorylation chain, a partial Krebs cycle, amino acid and fatty acid metabolisms and an iron-sulfur cluster assembly are all predicted to occur in these organelles. Predicted secretory proteins possess putative activities that may alter host physiology, such as proteases, protease-inhibitors, immunophilins and glycosyltransferases. This parasite also possesses the enzymatic machinery to tolerate oxidative bursts resulting from its own metabolism or induced by the host immune system. Conclusions: This study provides insights into the genome architecture of this unusual stramenopile. It also proposes candidate genes with which to study the physiopathology of this parasite and thus may lead to further investigations into Blastocystis-host interactions.
Affiliation(s)
- France Denoeud
- Genoscope (CEA) and CNRS UMR 8030, Université d'Evry, 2 rue Gaston Crémieux, 91057 Evry, France
|
47
|
Choi KM, Kim JY, Moon SU, Lee HW, Sattabongkot J, Na BK, Kim DW, Suh EJ, Kim YJ, Cho SH, Lee HS, Rhie HG, Kim TS. Molecular cloning of Plasmodium vivax calcium-dependent protein kinase 4. Korean J Parasitol 2010; 48:319-24. [PMID: 21234235 PMCID: PMC3018582 DOI: 10.3347/kjp.2010.48.4.319] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/04/2010] [Revised: 10/30/2010] [Accepted: 11/01/2010] [Indexed: 11/23/2022]
Abstract
Calcium-dependent protein kinases (CDPKs) are a unique family of enzymes that play crucial roles in intracellular calcium signaling in plants, algae, and protozoa. CDPKs of malaria parasites are known to be key regulators of stage-specific cellular responses to calcium, a widespread secondary messenger that controls the progression of the parasite. In our study, we identified a gene encoding Plasmodium vivax CDPK4 (PvCDPK4) and characterized its molecular properties and cellular localization. PvCDPK4 was a typical CDPK with a well-conserved N-terminal kinase domain and a C-terminal calmodulin-like structure containing 4 EF hand motifs for calcium binding. The recombinant EF hand domain of PvCDPK4 was expressed in E. coli, and a 34 kDa product was obtained. Immunofluorescence assay by confocal laser microscopy revealed that the protein was expressed at the mature schizont stage of P. vivax. The expression of PvCDPK4-EF in schizonts suggests that it may participate in the proliferation or egress process in the life cycle of this parasite.
Affiliation(s)
- Kyung-Mi Choi
- National Institute of Health, Korea Center for Disease Control and Prevention, Seoul 122-701, Korea
|
48
|
Amano N, Tanaka T, Numa H, Sakai H, Itoh T. Efficient plant gene identification based on interspecies mapping of full-length cDNAs. DNA Res 2010; 17:271-9. [PMID: 20668003 PMCID: PMC2955710 DOI: 10.1093/dnares/dsq017] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/04/2022] Open
Abstract
We present an annotation pipeline that accurately predicts exon–intron structures and protein-coding sequences (CDSs) on the basis of full-length cDNAs (FLcDNAs). This annotation pipeline was used to identify genes in 10 plant genomes. In particular, we show that interspecies mapping of FLcDNAs to genomes is of great value in fully utilizing FLcDNA resources whose availability is limited to several species. Because low sequence conservation at 5′- and 3′-ends of FLcDNAs between different species tends to result in truncated CDSs, we developed an improved algorithm to identify complete CDSs by the extension of both ends of truncated CDSs. Interspecies mapping of 71 801 monocot FLcDNAs to the Oryza sativa genome led to the detection of 22 142 protein-coding regions. Moreover, in comparing two mapping programs and three ab initio prediction programs, we found that our pipeline was more capable of identifying complete CDSs. As demonstrated by monocot interspecies mapping, in which nucleotide identity between FLcDNAs and the genome was ∼80%, the resultant inferred CDSs were sufficiently accurate. Finally, we applied both inter- and intraspecies mapping to 10 monocot and dicot genomes and identified genes in 210 551 loci. Interspecies mapping of FLcDNAs is expected to effectively predict genes and CDSs in newly sequenced genomes.
Affiliation(s)
- Naoki Amano
- Bioinformatics Research Unit, Division of Genome and Biodiversity Research, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
|
49
|
Fraser HI, Dendrou CA, Healy B, Rainbow DB, Howlett S, Smink LJ, Gregory S, Steward CA, Todd JA, Peterson LB, Wicker LS. Nonobese diabetic congenic strain analysis of autoimmune diabetes reveals genetic complexity of the Idd18 locus and identifies Vav3 as a candidate gene. J Immunol 2010; 184:5075-84. [PMID: 20363978 DOI: 10.4049/jimmunol.0903734] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 12/15/2022]
Abstract
We have used the public sequencing and annotation of the mouse genome to delimit the previously resolved type 1 diabetes (T1D) insulin-dependent diabetes (Idd)18 interval to a region on chromosome 3 that includes the immunologically relevant candidate gene, Vav3. To test the candidacy of Vav3, we developed a novel congenic strain that enabled the resolution of Idd18 to a 604-kb interval, designated Idd18.1, which contains only two annotated genes: the complete sequence of Vav3 and the last exon of the gene encoding NETRIN G1, Ntng1. Targeted sequencing of Idd18.1 in the NOD mouse strain revealed that allelic variation between NOD and C57BL/6J (B6) occurs in noncoding regions with 138 single nucleotide polymorphisms concentrated in the introns between exons 20 and 27 and immediately after the 3' untranslated region. We observed differential expression of VAV3 RNA transcripts in thymocytes when comparing congenic mouse strains with B6 or NOD alleles at Idd18.1. The T1D protection associated with B6 alleles of Idd18.1/Vav3 requires the presence of B6 protective alleles at Idd3, which are correlated with increased IL-2 production and regulatory T cell function. In the absence of B6 protective alleles at Idd3, we detected a second T1D protective B6 locus, Idd18.3, which is closely linked to, but distinct from, Idd18.1. Therefore, genetic mapping, sequencing, and gene expression evidence indicate that alteration of VAV3 expression is an etiological factor in the development of autoimmune beta-cell destruction in NOD mice. This study also demonstrates that a congenic strain mapping approach can isolate closely linked susceptibility genes.
Affiliation(s)
- Heather I Fraser
- Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge
|
50
|
Aoki K, Yano K, Suzuki A, Kawamura S, Sakurai N, Suda K, Kurabayashi A, Suzuki T, Tsugane T, Watanabe M, Ooga K, Torii M, Narita T, Shin-I T, Kohara Y, Yamamoto N, Takahashi H, Watanabe Y, Egusa M, Kodama M, Ichinose Y, Kikuchi M, Fukushima S, Okabe A, Arie T, Sato Y, Yazawa K, Satoh S, Omura T, Ezura H, Shibata D. Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics. BMC Genomics 2010; 11:210. [PMID: 20350329 DOI: 10.1186/1471-2164-11-210] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom background, such as ESTs and mutagenized lines, have been established by an international alliance. RESULTS: To accelerate the progress in tomato genomics, we developed a collection of 13,227 fully sequenced Micro-Tom full-length cDNAs. By checking redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants except rice. Classification of functions of proteins predicted from the coding sequences demonstrated that nrFLcDNAs covered a broad range of functions. A comparison of nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%. CONCLUSION: The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. The nrFLcDNA sequences will help annotation of the tomato whole-genome sequence and aid in tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom http://www.pgb.kazusa.or.jp/kaftom/ via the website of the National Bioresource Project Tomato http://tomato.nbrp.jp.
Affiliation(s)
- Koh Aoki
- Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Japan.
|