1
Hale I, Ma X, Melo ATO, Padi FK, Hendre PS, Kingan SB, Sullivan ST, Chen S, Boffa JM, Muchugi A, Danquah A, Barnor MT, Jamnadass R, Van de Peer Y, Van Deynze A. Genomic Resources to Guide Improvement of the Shea Tree. Front Plant Sci 2021; 12:720670. [PMID: 34567033 PMCID: PMC8459026 DOI: 10.3389/fpls.2021.720670] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/04/2021] [Accepted: 08/04/2021] [Indexed: 05/25/2023]
Abstract
A defining component of agroforestry parklands across Sahelo-Sudanian Africa (SSA), the shea tree (Vitellaria paradoxa) is central to sustaining local livelihoods and the farming environments of rural communities. Despite its economic and cultural value, however, not to mention the ecological roles it plays as a dominant parkland species, shea remains semi-domesticated with virtually no history of systematic genetic improvement. In truth, shea's extended juvenile period makes traditional breeding approaches untenable; but the opportunity for genome-assisted breeding is immense, provided the foundational resources are available. Here we report the development and public release of such resources. Using the FALCON-Phase workflow, 162.6 Gb of long-read PacBio sequence data were assembled into a 658.7 Mbp, chromosome-scale reference genome annotated with 38,505 coding genes. Whole genome duplication (WGD) analysis based on this gene space revealed clear signatures of two ancient WGD events in shea's evolutionary past, one prior to the Asterid-Rosid divergence (116-126 Mya) and the other at the root of the order Ericales (65-90 Mya). In a first genome-wide look at the suite of fatty acid (FA) biosynthesis genes that likely govern stearin content, the primary determinant of shea butter quality, relatively high copy numbers of six key enzymes were found (KASI, KASIII, FATB, FAD2, FAD3, and FAX2), some likely originating in shea's more recent WGD event. To help translate these findings into practical tools for characterization, selection, and genome-wide association studies (GWAS), resequencing data from a shea diversity panel were used to develop a database of more than 3.5 million functionally annotated, physically anchored SNPs. Two smaller, more curated sets of suggested SNPs, one for GWAS (104,211 SNPs) and the other targeting FA biosynthesis genes (90 SNPs), are also presented. With these resources, the hope is to support national programs across the shea belt in the strategic, genome-enabled conservation and long-term improvement of the shea tree for SSA.
Affiliation(s)
- Iago Hale
  - Department of Agriculture, Nutrition, and Food Systems, University of New Hampshire, Durham, NH, United States
- Xiao Ma
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, Ghent, Belgium
- Arthur T. O. Melo
  - Department of Agriculture, Nutrition, and Food Systems, University of New Hampshire, Durham, NH, United States
- Francis Kwame Padi
  - Plant Breeding Division, Cocoa Research Institute of Ghana, Ghana Cocoa Board, New Tafo, Ghana
- Prasad S. Hendre
  - AOCC Genomics Laboratory and Tree Genebank Research Unit, World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
- Shiyu Chen
  - Seed Biotechnology Center, University of California, Davis, Davis, CA, United States
- Jean-Marc Boffa
  - AOCC Genomics Laboratory and Tree Genebank Research Unit, World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
- Alice Muchugi
  - AOCC Genomics Laboratory and Tree Genebank Research Unit, World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
  - The Forage Genebank, Feed and Forage Development Program, International Livestock Research Institute, Addis Ababa, Ethiopia
- Agyemang Danquah
  - West Africa Centre for Crop Improvement, College of Basic and Applied Sciences, University of Ghana, Accra, Ghana
- Michael Teye Barnor
  - Plant Breeding Division, Cocoa Research Institute of Ghana, Ghana Cocoa Board, New Tafo, Ghana
- Ramni Jamnadass
  - AOCC Genomics Laboratory and Tree Genebank Research Unit, World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Center for Plant Systems Biology, VIB, Ghent, Belgium
  - College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
  - Centre for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- Allen Van Deynze
  - AOCC Genomics Laboratory and Tree Genebank Research Unit, World Agroforestry (CIFOR-ICRAF), Nairobi, Kenya
  - Seed Biotechnology Center, University of California, Davis, Davis, CA, United States
2
Yu H, Wang X, Lu Z, Xu Y, Deng X, Xu Q. Endogenous pararetrovirus sequences are widely present in Citrinae genomes. Virus Res 2018; 262:48-53. [PMID: 29792903 DOI: 10.1016/j.virusres.2018.05.018] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/12/2017] [Revised: 05/20/2018] [Accepted: 05/20/2018] [Indexed: 01/04/2023]
Abstract
Endogenous pararetroviruses (EPRVs) have been characterized in several plant genomes and their biological effects have been reported. In this study, hundreds of EPRV segments were identified in each of six Citrinae genomes: 1034 in sweet orange, 2036 in pummelo, 598 in clementine mandarin, 752 in Ichang papeda, 2060 in citron and 245 in atalantia. Genomic analysis indicated that EPRV segments tend to cluster as hot spots in the genomes, particularly on chromosomes 2 and 5. Large numbers of simple repeats and transposable elements were identified in the 2-kb flanking regions of the EPRV segments. Comparative genomic analysis and PCR experiments showed that the Citrinae genomes contain both highly conserved and species-specific EPRV segments. Phylogenetic analysis suggested that EPRV integration events could have begun in a common progenitor of Citrinae species and occurred repeatedly during Citrinae divergence.
Affiliation(s)
- Huiwen Yu
  - Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, Wuhan 430070, China
- Xia Wang
  - Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, Wuhan 430070, China
- Zhihao Lu
  - Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, Wuhan 430070, China
- Yuantao Xu
  - Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, Wuhan 430070, China
- Xiuxin Deng
  - Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, Wuhan 430070, China
- Qiang Xu
  - Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, Wuhan 430070, China
3
Lang D, Ullrich KK, Murat F, Fuchs J, Jenkins J, Haas FB, Piednoel M, Gundlach H, Van Bel M, Meyberg R, Vives C, Morata J, Symeonidi A, Hiss M, Muchero W, Kamisugi Y, Saleh O, Blanc G, Decker EL, van Gessel N, Grimwood J, Hayes RD, Graham SW, Gunter LE, McDaniel SF, Hoernstein SNW, Larsson A, Li FW, Perroud PF, Phillips J, Ranjan P, Rokshar DS, Rothfels CJ, Schneider L, Shu S, Stevenson DW, Thümmler F, Tillich M, Villarreal Aguilar JC, Widiez T, Wong GKS, Wymore A, Zhang Y, Zimmer AD, Quatrano RS, Mayer KFX, Goodstein D, Casacuberta JM, Vandepoele K, Reski R, Cuming AC, Tuskan GA, Maumus F, Salse J, Schmutz J, Rensing SA. The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution. Plant J 2018; 93:515-533. [PMID: 29237241 DOI: 10.1111/tpj.13801] [Citation(s) in RCA: 247] [Impact Index Per Article: 41.2] [Received: 10/11/2017] [Revised: 11/20/2017] [Accepted: 11/24/2017] [Indexed: 05/18/2023]
Abstract
The draft genome of the moss model, Physcomitrella patens, comprised approximately 2000 unordered scaffolds. In order to enable analyses of genome structure and evolution we generated a chromosome-scale genome assembly using genetic linkage as well as (end) sequencing of long DNA fragments. We find that 57% of the genome comprises transposable elements (TEs), some of which may be actively transposing during the life cycle. Unlike in flowering plant genomes, gene- and TE-rich regions show an overall even distribution along the chromosomes. However, the chromosomes are mono-centric with peaks of a class of Copia elements potentially coinciding with centromeres. Gene body methylation is evident in 5.7% of the protein-coding genes, typically coinciding with low GC and low expression. Some giant virus insertions are transcriptionally active and might protect gametes from viral infection via siRNA mediated silencing. Structure-based detection methods show that the genome evolved via two rounds of whole genome duplications (WGDs), apparently common in mosses but not in liverworts and hornworts. Several hundred genes are present in colinear regions conserved since the last common ancestor of plants. These syntenic regions are enriched for functions related to plant-specific cell growth and tissue organization. The P. patens genome lacks the TE-rich pericentromeric and gene-rich distal regions typical for most flowering plant genomes. More non-seed plant genomes are needed to unravel how plant genomes evolve, and to understand whether the P. patens genome structure is typical for mosses or bryophytes.
Affiliation(s)
- Daniel Lang
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany
  - Plant Genome and Systems Biology, Helmholtz Center Munich, 85764, Neuherberg, Germany
- Kristian K Ullrich
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
- Florent Murat
  - INRA, UMR 1095 Genetics, Diversity and Ecophysiology of Cereals (GDEC), 5 Chemin de Beaulieu, 63100, Clermont-Ferrand, France
- Jörg Fuchs
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstrasse 3, OT Gatersleben, D-06466, Stadt Seeland, Germany
- Jerry Jenkins
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Fabian B Haas
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
- Mathieu Piednoel
  - Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné Weg 10, D-50829, Cologne, Germany
- Heidrun Gundlach
  - Plant Genome and Systems Biology, Helmholtz Center Munich, 85764, Neuherberg, Germany
- Michiel Van Bel
  - VIB Center for Plant Systems Biology, Technologiepark 927, 9052, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052, Gent, Belgium
- Rabea Meyberg
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
- Cristina Vives
  - Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Bellaterra, Cerdanyola del Vallès, 08193, Barcelona, Spain
- Jordi Morata
  - Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Bellaterra, Cerdanyola del Vallès, 08193, Barcelona, Spain
- Manuel Hiss
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
- Wellington Muchero
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Yasuko Kamisugi
  - Centre for Plant Sciences, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK
- Omar Saleh
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany
- Guillaume Blanc
  - Structural and Genomic Information Laboratory (IGS), Aix-Marseille Université, CNRS, UMR 7256 (IMM FR 3479), Marseille, France
- Eva L Decker
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany
- Nico van Gessel
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany
- Jane Grimwood
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
  - DOE Joint Genome Institute, Walnut Creek, CA, 94598, USA
- Sean W Graham
  - Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Lee E Gunter
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Stuart F McDaniel
  - Department of Biology, University of Florida, Gainesville, FL, 32611, USA
- Sebastian N W Hoernstein
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany
- Anders Larsson
  - Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Fay-Wei Li
  - Boyce Thompson Institute, Ithaca, NY, 14853, USA
- Priya Ranjan
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Daniel S Rokshar
  - DOE Joint Genome Institute, Walnut Creek, CA, 94598, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- Carl J Rothfels
  - University Herbarium and Department of Integrative Biology, University of California, Berkeley, CA, 94720-2465, USA
- Lucas Schneider
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
- Shengqiang Shu
  - DOE Joint Genome Institute, Walnut Creek, CA, 94598, USA
- Fritz Thümmler
  - Vertis Biotechnologie AG, Lise-Meitner-Str. 30, 85354, Freising, Germany
- Michael Tillich
  - Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, 14476, Potsdam-Golm, Germany
- Thomas Widiez
  - Department of Plant Biology, University of Geneva, Sciences III, Geneva 4, CH-1211, Switzerland
  - Department of Plant Biology & Pathology Rutgers, The State University of New Jersey, New Brunswick, NJ, 08901, USA
- Gane Ka-Shu Wong
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
  - Department of Medicine, University of Alberta, Edmonton, AB, T6G 2E1, Canada
  - BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
- Ann Wymore
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Yong Zhang
  - Shenzhen Huahan Gene Life Technology Co. Ltd, Shenzhen, China
- Andreas D Zimmer
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany
- Ralph S Quatrano
  - Department of Biology, Washington University, St. Louis, MO, USA
- Klaus F X Mayer
  - Plant Genome and Systems Biology, Helmholtz Center Munich, 85764, Neuherberg, Germany
  - WZW, Technical University Munich, Munich, Germany
- Josep M Casacuberta
  - Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Bellaterra, Cerdanyola del Vallès, 08193, Barcelona, Spain
- Klaas Vandepoele
  - VIB Center for Plant Systems Biology, Technologiepark 927, 9052, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052, Gent, Belgium
- Ralf Reski
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Schaenzlestr. 1, 79104, Freiburg, Germany
  - BIOSS Centre for Biological Signalling Studies, University of Freiburg, Schaenzlestr. 18, 79104, Freiburg, Germany
- Andrew C Cuming
  - Centre for Plant Sciences, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK
- Gerald A Tuskan
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
- Florian Maumus
  - URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
- Jérome Salse
  - INRA, UMR 1095 Genetics, Diversity and Ecophysiology of Cereals (GDEC), 5 Chemin de Beaulieu, 63100, Clermont-Ferrand, France
- Jeremy Schmutz
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
  - DOE Joint Genome Institute, Walnut Creek, CA, 94598, USA
- Stefan A Rensing
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
  - BIOSS Centre for Biological Signalling Studies, University of Freiburg, Schaenzlestr. 18, 79104, Freiburg, Germany
4
Wang X, Xu Y, Zhang S, Cao L, Huang Y, Cheng J, Wu G, Tian S, Chen C, Liu Y, Yu H, Yang X, Lan H, Wang N, Wang L, Xu J, Jiang X, Xie Z, Tan M, Larkin RM, Chen L, Ma B, Ruan Y, Deng X, Xu Q. Genomic analyses of primitive, wild and cultivated citrus provide insights into asexual reproduction. Nat Genet 2017; 49:765-72. [DOI: 10.1038/ng.3839] [Citation(s) in RCA: 216] [Impact Index Per Article: 30.9] [Received: 10/18/2016] [Accepted: 03/17/2017] [Indexed: 12/17/2022]
5
6
Baek JH, Kim J, Kim CK, Sohn SH, Choi D, Ratnaparkhe MB, Kim DW, Lee TH. MultiSyn: A Webtool for Multiple Synteny Detection and Visualization of User's Sequence of Interest Compared to Public Plant Species. Evol Bioinform Online 2016; 12:193-9. [PMID: 27594782 PMCID: PMC5003123 DOI: 10.4137/ebo.s40009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/22/2016] [Revised: 07/12/2016] [Accepted: 07/14/2016] [Indexed: 12/19/2022]
Abstract
Information on multiple synteny between plants and/or within a plant is key to understanding genome evolution, and visualizing multiple synteny helps in interpreting that evolution. Several web applications have been developed to determine and visualize multiple homologous regions at once. However, these applications are not fully convenient for biologists: some do not determine synteny themselves and only plot synteny data that users upload, while others determine synteny from BLAST similarity information alone, using algorithms not designed for synteny determination. Here, we introduce a web application that determines and visualizes multiple synteny from two types of input files, a simplified browser extensible data (BED) file and a protein sequence file, using the MCScanX algorithm, which has been used in many synteny studies.
Affiliation(s)
- Jeong-Ho Baek
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, Korea
- Junah Kim
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, Korea
- Chang-Kug Kim
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, Korea
- Seong-Han Sohn
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, Korea
- Dongsu Choi
  - Department of Biology, Kunsan National University, Gunsan-si, Jeollabuk-do, Korea
- Milind B Ratnaparkhe
  - Directorate of Soybean Research, Indian Council of Agriculture Research (ICAR), Indore, Madhya Pradesh, India
- Do-Wan Kim
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, Korea
- Tae-Ho Lee
  - Genomics Division, National Institute of Agricultural Sciences, Jeonju, Korea
7
Singh PP, Arora J, Isambert H. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes. PLoS Comput Biol 2015; 11:e1004394. [PMID: 26181593 PMCID: PMC4504502 DOI: 10.1371/journal.pcbi.1004394] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Received: 01/29/2015] [Accepted: 06/09/2015] [Indexed: 11/18/2022]
Abstract
Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases.
Affiliation(s)
- Param Priya Singh
  - CNRS UMR168, UPMC, Institut Curie, Research Center, Paris, France
- Jatin Arora
  - CNRS UMR168, UPMC, Institut Curie, Research Center, Paris, France
- Hervé Isambert
  - CNRS UMR168, UPMC, Institut Curie, Research Center, Paris, France
8
Gehrmann T, Reinders MJT. Proteny: discovering and visualizing statistically significant syntenic clusters at the proteome level. Bioinformatics 2015; 31:3437-44. [PMID: 26116928 PMCID: PMC4612220 DOI: 10.1093/bioinformatics/btv389] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/02/2014] [Accepted: 06/23/2015] [Indexed: 01/12/2023]
Abstract
Background: With more and more genomes being sequenced, detecting synteny between genomes becomes more and more important. However, for microorganisms the genomic divergence quickly becomes large, resulting in different codon usage and shuffling of gene order and gene elements such as exons. Results: We present Proteny, a methodology to detect synteny between diverged genomes. It operates on the amino acid sequence level to be insensitive to codon usage adaptations and clusters groups of exons disregarding order to handle diversity in genomic ordering between genomes. Furthermore, Proteny assigns significance levels to the syntenic clusters such that they can be selected on statistical grounds. Finally, Proteny provides novel ways to visualize results at different scales, facilitating the exploration and interpretation of syntenic regions. We test the performance of Proteny on a standard ground truth dataset, and we illustrate the use of Proteny on two closely related genomes (two different strains of Aspergillus niger) and on two distant genomes (two species of Basidiomycota). In comparison to other tools, we find that Proteny finds clusters with more true homologies in fewer clusters that contain more genes, i.e. Proteny is able to identify a more consistent synteny. Further, we show how genome rearrangements, assembly errors, gene duplications and the conservation of specific genes can be easily studied with Proteny. Availability and implementation: Proteny is freely available at the Delft Bioinformatics Lab website http://bioinformatics.tudelft.nl/dbl/software. Contact: t.gehrmann@tudelft.nl. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Thies Gehrmann
  - Delft Bioinformatics Lab, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
- Marcel J T Reinders
  - Delft Bioinformatics Lab, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
9
Abstract
BACKGROUND: The rapid accumulation of whole-genome data has renewed interest in using gene-order data for phylogenetic analyses and ancestral reconstruction. Current software and web servers typically do not support duplication and loss events along with rearrangements. RESULTS: MLGO (Maximum Likelihood for Gene-Order Analysis) is a web tool for the reconstruction of phylogeny and/or ancestral genomes from gene-order data. MLGO is based on likelihood computation and shows advantages over existing methods in terms of accuracy, scalability and flexibility. CONCLUSIONS: To the best of our knowledge, it is the first web tool for analysis of large-scale genomic changes including not only rearrangements but also gene insertions, deletions and duplications. The web tool is available from http://www.geneorder.org/server.php.
Affiliation(s)
- Fei Hu
  - Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin, 300072, China
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, 29208, SC, USA
- Yu Lin
  - Department of Computer Science and Engineering, University of California, San Diego, 92093 La Jolla, CA, USA
- Jijun Tang
  - Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin, 300072, China
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, 29208, SC, USA
10
Jung S, Main D. Genomics and bioinformatics resources for translational science in Rosaceae. Plant Biotechnol Rep 2014; 8:49-64. [PMID: 24634697 PMCID: PMC3951882 DOI: 10.1007/s11816-013-0282-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 02/11/2013] [Accepted: 04/22/2013] [Indexed: 05/22/2023]
Abstract
Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.
Affiliation(s)
- Sook Jung
  - Department of Horticulture, Washington State University, Pullman, WA 99164 USA
- Dorrie Main
  - Department of Horticulture, Washington State University, Pullman, WA 99164 USA
11
Yin X, Wang J, Cheng H, Wang X, Yu D. Detection and evolutionary analysis of soybean miRNAs responsive to soybean mosaic virus. Planta 2013; 237:1213-25. [PMID: 23328897 DOI: 10.1007/s00425-012-1835-3] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 09/18/2012] [Accepted: 12/26/2012] [Indexed: 05/22/2023]
Abstract
MicroRNAs (miRNAs) are a class of non-coding RNAs that have important gene regulatory roles in various organisms. However, the miRNAs involved in soybean's response to soybean mosaic virus (SMV) are unknown. To identify novel miRNAs and biotic-stress regulated small RNAs that are involved in soybean's response to SMV, two small RNA libraries were constructed from mock-inoculated and SMV-infected soybean leaves and sequenced. This led to the discovery of 179 miRNAs, representing 52 families, among which five miRNAs belonging to three families were novel miRNAs in soybean. A large proportion (71.5%) of miRNAs arose from segmental duplication, similar to the process that drives the evolution of protein-coding genes. In addition, we predicted 346 potential targets of these identified miRNAs, and verified 12 targets by modified 5'-RACE analysis. Finally, three miRNAs (miR160, miR393 and miR1510) that are involved in plant resistance were observed to respond to SMV infection. The interaction between miRNAs and resistance-related genes provides a novel mechanism for pathogens to evade host recognition.
Affiliation(s)
- Xianchao Yin
  - National Center for Soybean Improvement, National Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, China
12
Roulin A, Auer PL, Libault M, Schlueter J, Farmer A, May G, Stacey G, Doerge RW, Jackson SA. The fate of duplicated genes in a polyploid plant genome. Plant J 2013; 73:143-53. [PMID: 22974547 DOI: 10.1111/tpj.12026] [Citation(s) in RCA: 190] [Impact Index Per Article: 17.3] [Received: 05/30/2012] [Revised: 08/09/2012] [Accepted: 09/10/2012] [Indexed: 05/18/2023]
Abstract
Polyploidy is generally not tolerated in animals, but is widespread in plant genomes and may result in extensive genetic redundancy. The fate of duplicated genes is poorly understood, both functionally and evolutionarily. Soybean (Glycine max L.) has undergone two separate polyploidy events (13 and 59 million years ago) that have resulted in 75% of its genes being present in multiple copies. It therefore constitutes a good model to study the impact of whole-genome duplication on gene expression. Using RNA-seq, we tested the functional fate of a set of approximately 18 000 duplicated genes. Across seven tissues tested, approximately 50% of paralogs were differentially expressed and thus had undergone expression sub-functionalization. Based on gene ontology and expression data, our analysis also revealed that only a small proportion of the duplicated genes have been neo-functionalized or non-functionalized. In addition, duplicated genes were often found in collinear blocks, and several blocks of duplicated genes were co-regulated, suggesting some type of epigenetic or positional regulation. We also found that transcription factors and ribosomal protein genes were differentially expressed in many tissues, suggesting that the main consequence of polyploidy in soybean may be at the regulatory level.
Affiliation(s)
- Anne Roulin
- Institute for Plant Breeding, Genetics and Genomics, University of Georgia, 111 Riverbend Road, Athens, GA, 30602, USA
- Zoologisches Institut, Universität Basel, Vesalgasse 1, CH-4051, Basel, Switzerland
- Paul L Auer
- Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Marc Libault
- Divisions of Plant Science and Biochemistry, University of Missouri, Columbia, MO, 65211, USA
- Department of Botany and Microbiology, University of Oklahoma, Norman, OK, 73019, USA
- Jessica Schlueter
- Institute for Plant Breeding, Genetics and Genomics, University of Georgia, 111 Riverbend Road, Athens, GA, 30602, USA
- College of Computing and Informatics, University of North Carolina Charlotte, Charlotte, NC, 28223, USA
- Andrew Farmer
- National Center for Genome Resources, Santa Fe, NM, USA
- Greg May
- National Center for Genome Resources, Santa Fe, NM, USA
- Gary Stacey
- Divisions of Plant Science and Biochemistry, University of Missouri, Columbia, MO, 65211, USA
- Rebecca W Doerge
- Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA
- Scott A Jackson
- Institute for Plant Breeding, Genetics and Genomics, University of Georgia, 111 Riverbend Road, Athens, GA, 30602, USA
13
Yang R, Jarvis DE, Chen H, Beilstein MA, Grimwood J, Jenkins J, Shu S, Prochnik S, Xin M, Ma C, Schmutz J, Wing RA, Mitchell-Olds T, Schumaker KS, Wang X. The Reference Genome of the Halophytic Plant Eutrema salsugineum. Front Plant Sci 2013; 4:46. [PMID: 23518688 PMCID: PMC3604812 DOI: 10.3389/fpls.2013.00046] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 01/22/2013] [Accepted: 02/24/2013] [Indexed: 05/02/2023]
Abstract
Halophytes are plants that can naturally tolerate high concentrations of salt in the soil, and their tolerance to salt stress may occur through various evolutionary and molecular mechanisms. Eutrema salsugineum is a halophytic species in the Brassicaceae that can naturally tolerate multiple types of abiotic stresses that typically limit crop productivity, including extreme salinity and cold. It has been widely used as a laboratorial model for stress biology research in plants. Here, we present the reference genome sequence (241 Mb) of E. salsugineum at 8× coverage sequenced using the traditional Sanger sequencing-based approach with comparison to its close relative Arabidopsis thaliana. The E. salsugineum genome contains 26,531 protein-coding genes and 51.4% of its genome is composed of repetitive sequences that mostly reside in pericentromeric regions. Comparative analyses of the genome structures, protein-coding genes, microRNAs, stress-related pathways, and estimated translation efficiency of proteins between E. salsugineum and A. thaliana suggest that halophyte adaptation to environmental stresses may occur via a global network adjustment of multiple regulatory mechanisms. The E. salsugineum genome provides a resource to identify naturally occurring genetic alterations contributing to the adaptation of halophytic plants to salinity and that might be bioengineered in related crop species.
Affiliation(s)
- Ruolin Yang
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- David E. Jarvis
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Hao Chen
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Jane Grimwood
- Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- HudsonAlpha Institute of Biotechnology, Huntsville, AL, USA
- Jerry Jenkins
- Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- HudsonAlpha Institute of Biotechnology, Huntsville, AL, USA
- ShengQiang Shu
- Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- Simon Prochnik
- Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- Mingming Xin
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Chuang Ma
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Jeremy Schmutz
- Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- HudsonAlpha Institute of Biotechnology, Huntsville, AL, USA
- Rod A. Wing
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- Karen S. Schumaker
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- *Correspondence: Karen S. Schumaker and Xiangfeng Wang, School of Plant Sciences, University of Arizona, 303 Forbes Hall, 1140 E. South Campus Drive, Tucson, AZ 85721-0036, USA.
- Xiangfeng Wang
- School of Plant Sciences, University of Arizona, Tucson, AZ, USA
- *Correspondence: Karen S. Schumaker and Xiangfeng Wang, School of Plant Sciences, University of Arizona, 303 Forbes Hall, 1140 E. South Campus Drive, Tucson, AZ 85721-0036, USA.
14
Malacarne G, Perazzolli M, Cestaro A, Sterck L, Fontana P, Van de Peer Y, Viola R, Velasco R, Salamini F. Deconstruction of the (paleo)polyploid grapevine genome based on the analysis of transposition events involving NBS resistance genes. PLoS One 2012; 7:e29762. [PMID: 22253773 DOI: 10.1371/journal.pone.0029762] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 06/26/2011] [Accepted: 12/05/2011] [Indexed: 01/09/2023]
Abstract
Plants have followed a reticulate type of evolution and taxa have frequently merged via allopolyploidization. A polyploid structure of sequenced genomes has often been proposed, but the chromosomes belonging to putative component genomes are difficult to identify. The 19 grapevine chromosomes are evolutionary stable structures: their homologous triplets have strongly conserved gene order, interrupted by rare translocations. The aim of this study is to examine how the grapevine nucleotide-binding site (NBS)-encoding resistance (NBS-R) genes have evolved in the genomic context and to understand mechanisms for the genome evolution. We show that, in grapevine, i) helitrons have significantly contributed to transposition of NBS-R genes, and ii) NBS-R gene cluster similarity indicates the existence of two groups of chromosomes (named as Va and Vc) that may have evolved independently. Chromosome triplets consist of two Va and one Vc chromosomes, as expected from the tetraploid and diploid conditions of the two component genomes. The hexaploid state could have been derived from either allopolyploidy or the separation of the Va and Vc component genomes in the same nucleus before fusion, as known for Rosaceae species. Time estimation indicates that grapevine component genomes may have fused about 60 mya, having had at least 40–60 mya to evolve independently. Chromosome number variation in the Vitaceae and related families, and the gap between the time of eudicot radiation and the age of Vitaceae fossils, are accounted for by our hypothesis.
15
Abstract
Identification of intragenomic conservation of gene compositions in multiple chromosomal segments led to evidence of whole genome duplications (WGDs). The process by which WGDs have been maintained and decayed provides us with clues for understanding how the genome evolves. In this chapter, we summarize current understanding of phylogenetic distribution and evolutionary impact of WGDs, introduce basic procedures to detect conserved synteny, and discuss typical pitfalls, as well as biological insights.
Affiliation(s)
- Shigehiro Kuraku
- Genome Resource and Analysis Unit, RIKEN Center for Developmental Biology, Chuo-ku, Kobe, Japan.
16
Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y, Vandepoele K. i-ADHoRe 3.0--fast and sensitive detection of genomic homology in extremely large data sets. Nucleic Acids Res 2011; 40:e11. [PMID: 22102584 PMCID: PMC3258164 DOI: 10.1093/nar/gkr955] [Citation(s) in RCA: 138] [Impact Index Per Article: 10.6] [Indexed: 11/14/2022]
Abstract
Comparative genomics is a powerful means to gain insight into the evolutionary processes that shape the genomes of related species. As the number of sequenced genomes increases, the development of software to perform accurate cross-species analyses becomes indispensable. However, many implementations that have the ability to compare multiple genomes exhibit unfavorable computational and memory requirements, limiting the number of genomes that can be analyzed in one run. Here, we present a software package to unveil genomic homology based on the identification of conservation of gene content and gene order (collinearity), i-ADHoRe 3.0, and its application to eukaryotic genomes. The use of efficient algorithms and support for parallel computing enable the analysis of large-scale data sets. Unlike other tools, i-ADHoRe can process the Ensembl data set, containing 49 species, in 1 h. Furthermore, the profile search is more sensitive to detect degenerate genomic homology than chaining pairwise collinearity information based on transitive homology. From ultra-conserved collinear regions between mammals and birds, by integrating coexpression information and protein–protein interactions, we identified more than 400 regions in the human genome showing significant functional coherence. The different algorithmical improvements ensure that i-ADHoRe 3.0 will remain a powerful tool to study genome evolution.
17
Varshney RK, Chen W, Li Y, Bharti AK, Saxena RK, Schlueter JA, Donoghue MTA, Azam S, Fan G, Whaley AM, Farmer AD, Sheridan J, Iwata A, Tuteja R, Penmetsa RV, Wu W, Upadhyaya HD, Yang SP, Shah T, Saxena KB, Michael T, McCombie WR, Yang B, Zhang G, Yang H, Wang J, Spillane C, Cook DR, May GD, Xu X, Jackson SA. Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers. Nat Biotechnol 2011; 30:83-9. [PMID: 22057054 DOI: 10.1038/nbt.2022] [Citation(s) in RCA: 421] [Impact Index Per Article: 32.4] [Received: 07/19/2011] [Accepted: 10/03/2011] [Indexed: 11/08/2022]
Abstract
Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which along with Sanger-based bacterial artificial chromosome end sequences and a genetic map, we assembled into scaffolds representing 72.7% (605.78 Mb) of the 833.07 Mb pigeonpea genome. Genome analysis predicted 48,680 genes for pigeonpea and also showed the potential role that certain gene families, for example, drought tolerance-related genes, have played throughout the domestication of pigeonpea and the evolution of its ancestors. Although we found a few segmental duplication events, we did not observe the recent genome-wide duplication events observed in soybean. This reference genome sequence will facilitate the identification of the genetic basis of agronomically important traits, and accelerate the development of improved pigeonpea varieties that could improve food security in many developing countries.
Affiliation(s)
- Rajeev K Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India.
18
Fawcett JA, Rouzé P, Van de Peer Y. Higher intron loss rate in Arabidopsis thaliana than A. lyrata is consistent with stronger selection for a smaller genome. Mol Biol Evol 2011; 29:849-59. [PMID: 21998273 DOI: 10.1093/molbev/msr254] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023]
Abstract
The number of introns varies considerably among different organisms. This can be explained by the differences in the rates of intron gain and loss. Two factors that are likely to influence these rates are selection for or against introns and the mutation rate that generates the novel intron or the intronless copy. Although it has been speculated that stronger selection for a compact genome might result in a higher rate of intron loss and a lower rate of intron gain, clear evidence is lacking, and the role of selection in determining these rates has not been established. Here, we studied the gain and loss of introns in the two closely related species Arabidopsis thaliana and A. lyrata as it was recently shown that A. thaliana has been undergoing a faster genome reduction driven by selection. We found that A. thaliana has lost six times more introns than A. lyrata since the divergence of the two species but gained very few introns. We suggest that stronger selection for genome reduction probably resulted in the much higher intron loss rate in A. thaliana, although further analysis is required as we could not find evidence that the loss rate increased in A. thaliana as opposed to having decreased in A. lyrata compared with the rate in the common ancestor. We also examined the pattern of the intron gains and losses to better understand the mechanisms by which they occur. Microsimilarity was detected between the splice sites of several gained and lost introns, suggesting that nonhomologous end joining repair of double-strand breaks might be a common pathway not only for intron gain but also for intron loss.
19
Xu X, Pan S, Cheng S, Zhang B, Mu D, Ni P, Zhang G, Yang S, Li R, Wang J, Orjeda G, Guzman F, Torres M, Lozano R, Ponce O, Martinez D, De la Cruz G, Chakrabarti SK, Patil VU, Skryabin KG, Kuznetsov BB, Ravin NV, Kolganova TV, Beletsky AV, Mardanov AV, Di Genova A, Bolser DM, Martin DM, Li G, Yang Y, Kuang H, Hu Q, Xiong X, Bishop GJ, Sagredo B, Mejía N, Zagorski W, Gromadka R, Gawor J, Szczesny P, Huang S, Zhang Z, Liang C, He J, Li Y, He Y, Xu J, Zhang Y, Xie B, Du Y, Qu D, Bonierbale M, Ghislain M, Herrera Mdel R, Giuliano G, Pietrella M, Perrotta G, Facella P, O'Brien K, Feingold SE, Barreiro LE, Massa GA, Diambra L, Whitty BR, Vaillancourt B, Lin H, Massa AN, Geoffroy M, Lundback S, DellaPenna D, Buell CR, Sharma SK, Marshall DF, Waugh R, Bryan GJ, Destefanis M, Nagy I, Milbourne D, Thomson SJ, Fiers M, Jacobs JM, Nielsen KL, Sønderkær M, Iovene M, Torres GA, Jiang J, Veilleux RE, Bachem CW, de Boer J, Borm T, Kloosterman B, van Eck H, Datema E, Hekkert Bt, Goverse A, van Ham RC, Visser RG; Potato Genome Sequencing Consortium. Genome sequence and analysis of the tuber crop potato. Nature 2011; 475:189-95. [PMID: 21743474 DOI: 10.1038/nature10158] [Citation(s) in RCA: 1168] [Impact Index Per Article: 89.8] [Received: 01/11/2011] [Accepted: 05/03/2011] [Indexed: 02/03/2023]
Abstract
Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.
20
Wang J, Tan S, Zhang L, Li P, Tian D. Co-variation among major classes of LRR-encoding genes in two pairs of plant species. J Mol Evol 2011; 72:498-509. [PMID: 21626302 DOI: 10.1007/s00239-011-9448-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 02/10/2011] [Accepted: 05/10/2011] [Indexed: 10/18/2022]
Abstract
NBS-LRR (nucleotide-binding site-leucine-rich repeat), LRR-RLK (LRR-receptor-like kinase), and LRR-only are the three major LRR-encoding genes. Owing to the crucial role played by them in plant resistance, development, and growth, extensive studies have been performed on the NBS-LRR and LRR-RLK genes. However, few studies have focused on these genes collectively; they may co-vary as all of them contain LRR motifs. To investigate their common evolutionary patterns, all major classes of LRR-encoding genes were identified in 12 plant species, and particularly compared in two pairs of close relatives, Arabidopsis thaliana-A. lyrata (At-Al) and Zea mays-Sorghum bicolor. Our results showed that these genes co-vary significantly in terms of their numbers between species and that the genes with certain evolutionary parameters are most likely to have similar functions. The development-related genes have clear orthologous relationships between closely related species, as well as lower nucleotide divergence, and Ka/Ks ratio. In contrast, resistance-related genes have exactly opposite characteristics and favor 11-15 LRRs per gene. This association could be very useful in predicting the function of LRR-encoding genes. The presence of co-variation suggests that LRRs, combined with other domains, can work better in some common functions. In order to cooperate efficiently, there should be balanced gene numbers among the different gene classes.
Affiliation(s)
- Jiao Wang
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, China
21
Abstract
BACKGROUND: The automatic identification of syntenies across multiple species is a key step in comparative genomics that helps biologists shed light both on evolutionary and functional problems. RESULTS: In this paper, we present a versatile tool to extract all syntenies from multiple bacterial species based on a clear-cut and very flexible definition of the synteny blocks that allows for gene quorum, partial gene correspondence, gaps, and a partial or total conservation of the gene order. CONCLUSIONS: We apply this tool to two different kinds of studies. The first one is a search for functional gene associations. In this context, we compare our tool to a widely used heuristic--I-ADHORE--and show that at least up to ten genomes, the problem remains tractable with our exact definition and algorithm. The second application is linked to evolutionary studies: we verify in a multiple alignment setting that pairs of orthologs in synteny are more conserved than pairs outside, thus extending a previous pairwise study. We then show that this observation is in fact a function of the size of the synteny: the larger the block of synteny is, the more conserved the genes are.
Affiliation(s)
- Yves-Pol Deniélou
- INRIA Grenoble-Rhône-Alpes, Team BAMBOO, 655 Avenue de l'Europe, 38334 Montbonnot Cedex, France.
22
Tang H, Lyons E, Pedersen B, Schnable JC, Paterson AH, Freeling M. Screening synteny blocks in pairwise genome comparisons through integer programming. BMC Bioinformatics 2011; 12:102. [PMID: 21501495 PMCID: PMC3088904 DOI: 10.1186/1471-2105-12-102] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.2] [Received: 12/08/2010] [Accepted: 04/18/2011] [Indexed: 12/01/2022]
Abstract
Background: It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total scores are maximized while respecting the duplication history of the genomes in comparison. We call this "quota-based" screening of synteny blocks in order to appropriately fill a quota of syntenic relationships within one genome or between two genomes having WGD events. Results: We have formulated the synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history in respective genomes). Such a procedure is useful for any pairwise synteny alignments, but is most useful in lineages affected by multiple WGDs, like plants or fish lineages. For example, there should be a 1:2 ploidy relationship between genome A and B if genome B had an independent WGD subsequent to the divergence of the two genomes.
We show through simulations and real examples using plant genomes in the rosid superorder that the quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, like the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons). Conclusions: The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user-specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. There are two major contributions of QUOTA-ALIGN: 1) reducing the block screening task to a BIP problem, which is novel; 2) providing an efficient software pipeline starting from all-against-all BLAST to the screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component in SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering easier access to thousands of genomes for non-programmers.
Affiliation(s)
- Haibao Tang
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.
23
Abstract
SyMAP (Synteny Mapping and Analysis Program) was originally developed to compute synteny blocks between a sequenced genome and a FPC map, and has been extended to support pairs of sequenced genomes. SyMAP uses MUMmer to compute the raw hits between the two genomes, which are then clustered and filtered using the optional gene annotation. The filtered hits are input to the synteny algorithm, which was designed to discover duplicated regions and form larger-scale synteny blocks, where intervening micro-rearrangements are allowed. SyMAP provides extensive interactive Java displays at all levels of resolution along with simultaneous displays of multiple aligned pairs. The synteny blocks from multiple chromosomes may be displayed in a high-level dot plot or three-dimensional view, and the user may then drill down to see the details of a region, including the alignments of the hits to the gene annotation. These capabilities are illustrated by showing their application to the study of genome duplication, differential gene loss and transitive homology between sorghum, maize and rice. The software may be used from a website or standalone for the best performance. A project manager is provided to organize and automate the analysis of multi-genome groups. The software is freely distributed at http://www.agcol.arizona.edu/software/symap.
Affiliation(s)
- Carol Soderlund
- BIO5 Institute, 1657 Helen Street, University of Arizona, Tucson, AZ 85721, USA.
24
Fostier J, Proost S, Dhoedt B, Saeys Y, Demeester P, Van de Peer Y, Vandepoele K. A greedy, graph-based algorithm for the alignment of multiple homologous gene lists. Bioinformatics 2011; 27:749-56. [PMID: 21216775 DOI: 10.1093/bioinformatics/btr008] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]
Abstract
MOTIVATION: Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such cases, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. RESULTS: Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozen eukaryotic genomes. AVAILABILITY: http://bioinformatics.psb.ugent.be/software. The algorithm is implemented as a part of the i-ADHoRe 3.0 package.
Affiliation(s)
- Jan Fostier
- Department of Information Technology, Ghent University, Ghent, Belgium
25
Abstract
Background: Whole genome duplication (WGD) is a special case of gene duplication, observed rarely in animals, whereby all genes duplicate simultaneously through polyploidisation. Two rounds of WGD (2R-WGD) occurred at the base of vertebrates, giving rise to an enormous wave of genetic novelty, but a systematic analysis of functional consequences of this event has not yet been performed. Results: We show that 2R-WGD affected an overwhelming majority (74%) of signalling genes, in particular developmental pathways involving receptor tyrosine kinases, Wnt and transforming growth factor-β ligands, G protein-coupled receptors and the apoptosis pathway. 2R-retained genes, in contrast to tandem duplicates, were enriched in protein interaction domains and multifunctional signalling modules of Ras and mitogen-activated protein kinase cascades. 2R-WGD had a fundamental impact on the cell-cycle machinery, redefined molecular building blocks of the neuronal synapse, and was formative for vertebrate brains. We investigated 2R-associated nodes in the context of the human signalling network, as well as in an inferred ancestral pre-2R (AP2R) network, and found that hubs (particularly involving negative regulation) were preferentially retained, with high connectivity driving retention. Finally, microarrays and proteomics demonstrated a trend for gradual paralog expression divergence independent of the duplication mechanism, but inferred ancestral expression states suggested preferential subfunctionalisation among 2R-ohnologs (2ROs). Conclusions: The 2R event left an indelible imprint on vertebrate signalling and the cell cycle. We show that 2R-WGD preferentially retained genes are associated with higher organismal complexity (for example, locomotion, nervous system, morphogenesis), while genes associated with basic cellular functions (for example, translation, replication, splicing, recombination; with the notable exception of cell cycle) tended to be excluded.
2R-WGD set the stage for the emergence of key vertebrate functional novelties (such as complex brains, circulatory system, heart, bone, cartilage, musculature and adipose tissue). A full explanation of the impact of 2R on evolution, function and the flow of information in vertebrate signalling networks is likely to have practical consequences for regenerative medicine, stem cell therapies and cancer treatment.
Affiliation(s)
- Lukasz Huminiecki
- Ludwig Institute for Cancer Research, Uppsala University, Box 595, SE-751 24 Uppsala, Sweden.
26
Chauve C, Gavranovic H, Ouangraoua A, Tannier E. Yeast Ancestral Genome Reconstructions: The Possibilities of Computational Methods II. J Comput Biol 2010; 17:1097-112. [DOI: 10.1089/cmb.2010.0092] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 01/09/2023]
Affiliation(s)
- Cedric Chauve
- Department of Mathematics, Simon Fraser University, Burnaby, BC, Canada
- Haris Gavranovic
- Faculty of Natural Sciences, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- Aida Ouangraoua
- INRIA Lille-Nord-Europe, Université Lille 1, LIFL, UMR CNRS 8022, Villeneuve d'Ascq, France
- Eric Tannier
- INRIA Rhône-Alpes, Université de Lyon, Lyon, and Université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne, France
27
Velasco R, Zharkikh A, Affourtit J, Dhingra A, Cestaro A, Kalyanaraman A, Fontana P, Bhatnagar SK, Troggio M, Pruss D, Salvi S, Pindo M, Baldi P, Castelletti S, Cavaiuolo M, Coppola G, Costa F, Cova V, Dal Ri A, Goremykin V, Komjanc M, Longhi S, Magnago P, Malacarne G, Malnoy M, Micheletti D, Moretto M, Perazzolli M, Si-Ammour A, Vezzulli S, Zini E, Eldredge G, Fitzgerald LM, Gutin N, Lanchbury J, Macalma T, Mitchell JT, Reid J, Wardell B, Kodira C, Chen Z, Desany B, Niazi F, Palmer M, Koepke T, Jiwan D, Schaeffer S, Krishnan V, Wu C, Chu VT, King ST, Vick J, Tao Q, Mraz A, Stormo A, Stormo K, Bogden R, Ederle D, Stella A, Vecchietti A, Kater MM, Masiero S, Lasserre P, Lespinasse Y, Allan AC, Bus V, Chagné D, Crowhurst RN, Gleave AP, Lavezzo E, Fawcett JA, Proost S, Rouzé P, Sterck L, Toppo S, Lazzari B, Hellens RP, Durel CE, Gutin A, Bumgarner RE, Gardiner SE, Skolnick M, Egholm M, Van de Peer Y, Salamini F, Viola R. The genome of the domesticated apple (Malus × domestica Borkh.). Nat Genet 2010; 42:833-9. [DOI: 10.1038/ng.654] [Citation(s) in RCA: 1538] [Impact Index Per Article: 109.9] [Received: 11/19/2009] [Accepted: 08/03/2010] [Indexed: 11/09/2022]
|
28
|
Abstract
MOTIVATION: The rapidly increasing set of sequenced genomes highlights the importance of identifying the synteny blocks in multiple and/or highly duplicated genomes. Most synteny block reconstruction algorithms use genes shared over all genomes to construct the synteny blocks for multiple genomes. However, the number of genes shared among all genomes quickly decreases with the increase in the number of genomes. RESULTS: We propose the Duplications and Rearrangements In Multiple Mammals (DRIMM)-Synteny algorithm to address this bottleneck and apply it to analyzing genomic architectures of yeast, plant and mammalian genomes. We further combine synteny block generation with rearrangement analysis to reconstruct the ancestral preduplicated yeast genome. CONTACT: kspham@cs.ucsd.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Son K Pham
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, USA.
|
29
|
Martens C, Van de Peer Y. The hidden duplication past of the plant pathogen Phytophthora and its consequences for infection. BMC Genomics 2010; 11:353. [PMID: 20525264 PMCID: PMC2996974 DOI: 10.1186/1471-2164-11-353] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 02/04/2010] [Accepted: 06/03/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Oomycetes of the genus Phytophthora are pathogens that infect a wide range of plant species. For dicot hosts such as tomato, potato and soybean, Phytophthora is even the most important pathogen. Previous analyses of Phytophthora genomes uncovered many genes, large gene families and large genome sizes that can partially be explained by significant repeat expansion patterns. RESULTS: Analysis of the complete genomes of three different Phytophthora species, using a newly developed approach, unveiled a large number of small duplicated blocks, mainly consisting of two or three consecutive genes. Further analysis of these duplicated genes and comparison with the known gene and genome duplication history of ten other eukaryotes including parasites, algae, plants, fungi, vertebrates and invertebrates, suggests that the ancestor of P. infestans, P. sojae and P. ramorum most likely underwent a whole genome duplication (WGD). Genes that have survived in duplicate are mainly genes that are known to be preferentially retained following WGDs, but also genes important for pathogenicity and infection of the different hosts seem to have been retained in excess. As a result, the WGD might have contributed to the evolutionary and pathogenic success of Phytophthora. CONCLUSIONS: The fact that we find many small blocks of duplicated genes indicates that the genomes of Phytophthora species have been heavily rearranged following the WGD. Most likely, the high repeat content in these genomes has played an important role in this rearrangement process. As a consequence, the paucity of retained larger duplicated blocks has greatly complicated previous attempts to detect remnants of a large-scale duplication event in Phytophthora. However, as we show here, our newly developed strategy to identify very small duplicated blocks might be a useful approach to uncover ancient polyploidy events, in particular for heavily rearranged genomes.
Affiliation(s)
- Cindy Martens
- Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052 Ghent, Belgium
- Bioinformatics and Evolutionary Genomics, Department of Molecular Genetics, Technologiepark 927, Ghent University, B-9052 Ghent, Belgium
- Yves Van de Peer
- Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052 Ghent, Belgium
- Bioinformatics and Evolutionary Genomics, Department of Molecular Genetics, Technologiepark 927, Ghent University, B-9052 Ghent, Belgium
|
30
|
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature 2010; 463:178-83. [PMID: 20075913 DOI: 10.1038/nature08670] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Abstract
Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
|
31
|
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA. Genome sequence of the palaeopolyploid soybean. Nature 2010; 463:178-83. [PMID: 20075913 DOI: 10.1038/nature08670] [Citation(s) in RCA: 2569] [Impact Index Per Article: 183.5] [Received: 08/19/2009] [Accepted: 11/12/2009] [Indexed: 12/27/2022]
Affiliation(s)
- Jeremy Schmutz
- HudsonAlpha Genome Sequencing Center, 601 Genome Way, Huntsville, Alabama 35806, USA
|
32
|
Proost S, Van Bel M, Sterck L, Billiau K, Van Parys T, Van de Peer Y, Vandepoele K. PLAZA: a comparative genomics resource to study gene and genome evolution in plants. Plant Cell 2009; 21:3718-31. [PMID: 20040540 PMCID: PMC2814516 DOI: 10.1105/tpc.109.071506] [Citation(s) in RCA: 193] [Impact Index Per Article: 12.9] [Received: 09/22/2009] [Revised: 12/04/2009] [Accepted: 12/10/2009] [Indexed: 05/17/2023]
Abstract
The number of sequenced genomes of representatives within the green lineage is rapidly increasing. Consequently, comparative sequence analysis has significantly altered our view on the complexity of genome organization, gene function, and regulatory pathways. To explore all this genome information, a centralized infrastructure is required where all data generated by different sequencing initiatives is integrated and combined with advanced methods for data mining. Here, we describe PLAZA, an online platform for plant comparative genomics (http://bioinformatics.psb.ugent.be/plaza/). This resource integrates structural and functional annotation of published plant genomes together with a large set of interactive tools to study gene function and gene and genome evolution. Precomputed data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, intraspecies whole-genome dot plots, and genomic colinearity between species. Through the integration of high confidence Gene Ontology annotations and tree-based orthology between related species, thousands of genes lacking any functional description are functionally annotated. Advanced query systems, as well as multiple interactive visualization tools, are available through a user-friendly and intuitive Web interface. In addition, detailed documentation and tutorials introduce the different tools, while the workbench provides an efficient means to analyze user-defined gene sets through PLAZA's interface. In conclusion, PLAZA provides a comprehensible and up-to-date research environment to aid researchers in the exploration of genome information within the green plant lineage.
Affiliation(s)
- Sebastian Proost
- Department of Plant Systems Biology, Flanders Institute for Biotechnology, B-9052 Ghent, Belgium.
|
33
|
Van de Peer Y, Fawcett JA, Proost S, Sterck L, Vandepoele K. The flowering world: a tale of duplications. Trends Plant Sci 2009; 14:680-8. [PMID: 19818673 DOI: 10.1016/j.tplants.2009.09.001] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.6] [Received: 09/29/2008] [Revised: 08/31/2009] [Accepted: 09/07/2009] [Indexed: 05/02/2023]
Abstract
Flowering plants contain many genes, most of which were created during the past 200 or so million years through small- and large-scale duplications. Paleo-polyploidy events, in particular, have been the subject of much recent research. There is a growing consensus that one or more genome doubling or merging events occurred early during the evolution of the flowering plants, and that many lineages have since undergone additional, independent and more recent duplication events. Here, we review the difficulties in determining the number of genome duplications and discuss how the completion of some additional genome sequences of species occupying key phylogenetic positions has led to a better understanding of the timing of certain duplication events. This is important if we want to demonstrate the significance of genome duplications for the evolution and radiation of (different groups of) flowering plants.
Affiliation(s)
- Yves Van de Peer
- Department of Plant Systems Biology, Flanders Institute for Biotechnology (VIB), 9052 Gent, Belgium.
|
34
|
Peng Q, Alekseyev MA, Tesler G, Pevzner PA. Decoding Synteny Blocks and Large-Scale Duplications in Mammalian and Plant Genomes. Lecture Notes in Computer Science 2009. [DOI: 10.1007/978-3-642-04241-6_19] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/26/2023]
|