1
Navabi ZK, Huebert T, Sharpe AG, O’Neill CM, Bancroft I, Parkin IAP. Conserved microstructure of the Brassica B Genome of Brassica nigra in relation to homologous regions of Arabidopsis thaliana, B. rapa and B. oleracea. BMC Genomics 2013; 14:250. [PMID: 23586706 PMCID: PMC3765694 DOI: 10.1186/1471-2164-14-250] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 01/09/2013] [Accepted: 04/04/2013] [Indexed: 11/17/2022] [Open Access]
Abstract
BACKGROUND: The Brassica B genome is known to carry several important traits, yet there have been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes.

RESULTS: Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide.

CONCLUSIONS: Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome, suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage.
Affiliation(s)
- Zahra-Katy Navabi
- Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, SK S7N 0X2, Canada
- Terry Huebert
- Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, SK S7N 0X2, Canada
- Andrew G Sharpe
- DNA Technologies Laboratory, 110 Gymnasium Place, Saskatoon, SK S7N 0W9, Canada
- Carmel M O’Neill
- John Innes Centre, Norwich Research Park, Colney, Norwich NR4 7UH, UK
- Ian Bancroft
- John Innes Centre, Norwich Research Park, Colney, Norwich NR4 7UH, UK
- Isobel AP Parkin
- Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, SK S7N 0X2, Canada
2
Medium- and short-chain dehydrogenase/reductase gene and protein families: the SDR superfamily: functional and structural diversity within a family of metabolic and regulatory enzymes. Cell Mol Life Sci 2009; 65:3895-906. [PMID: 19011750 PMCID: PMC2792337 DOI: 10.1007/s00018-008-8588-y] [Citation(s) in RCA: 622] [Impact Index Per Article: 41.5] [Indexed: 12/21/2022]
Abstract
Short-chain dehydrogenases/reductases (SDRs) constitute a large family of NAD(P)(H)-dependent oxidoreductases, sharing sequence motifs and displaying similar mechanisms. SDR enzymes have critical roles in lipid, amino acid, carbohydrate, cofactor, hormone and xenobiotic metabolism as well as in redox sensor mechanisms. Sequence identities are low, and the most conserved feature is an α/β folding pattern with a central β-sheet flanked by 2–3 α-helices from each side, thus a classical Rossmann-fold motif for nucleotide binding. The conservation of this element and an active site, often with an Asn-Ser-Tyr-Lys tetrad, provides a platform for enzymatic activities encompassing several EC classes, including oxidoreductases, epimerases and lyases. The common mechanism is an underlying hydride and proton transfer involving the nicotinamide and typically an active site tyrosine residue, whereas substrate specificity is determined by a variable C-terminal segment. Relationships exist with bacterial haloalcohol dehalogenases, which lack cofactor binding but have the active site architecture, emphasizing the versatility of the basic fold in also generating hydride transfer-independent lyases. The conserved fold and nucleotide binding emphasize the role of SDRs as scaffolds for an NAD(P)(H) redox sensor system, of importance to control metabolic routes, transcription and signalling.
3
Town CD, Cheung F, Maiti R, Crabtree J, Haas BJ, Wortman JR, Hine EE, Althoff R, Arbogast TS, Tallon LJ, Vigouroux M, Trick M, Bancroft I. Comparative genomics of Brassica oleracea and Arabidopsis thaliana reveal gene loss, fragmentation, and dispersal after polyploidy. Plant Cell 2006; 18:1348-59. [PMID: 16632643 PMCID: PMC1475499 DOI: 10.1105/tpc.106.041665] [Citation(s) in RCA: 280] [Impact Index Per Article: 15.6] [Received: 02/02/2006] [Revised: 03/21/2006] [Accepted: 03/28/2006] [Indexed: 05/08/2023]
Abstract
We sequenced 2.2 Mb representing triplicated genome segments of Brassica oleracea, which are each paralogous with one another and homologous with a segmentally duplicated region of the Arabidopsis thaliana genome. Sequence annotation identified 177 conserved collinear genes in the B. oleracea genome segments. Analysis of synonymous base substitution rates indicated that the triplicated Brassica genome segments diverged from a common ancestor soon after divergence of the Arabidopsis and Brassica lineages. This conclusion was corroborated by phylogenetic analysis of protein families. Using A. thaliana as an outgroup, 35% of the genes inferred to be present when genome triplication occurred in the Brassica lineage have been lost, most likely via a deletion mechanism, in an interspersed pattern. Genes encoding proteins involved in signal transduction or transcription were not found to be significantly more extensively retained than those encoding proteins classified with other functions, but putative proteins predicted in the A. thaliana genome were underrepresented in B. oleracea. We identified one example of gene loss from the Arabidopsis lineage. We found evidence for the frequent insertion of gene fragments of nuclear genomic origin and identified four apparently intact genes in noncollinear positions in the B. oleracea and A. thaliana genomes.
4
Rautengarten C, Steinhauser D, Büssis D, Stintzi A, Schaller A, Kopka J, Altmann T. Inferring hypotheses on functional relationships of genes: Analysis of the Arabidopsis thaliana subtilase gene family. PLoS Comput Biol 2005; 1:e40. [PMID: 16193095 PMCID: PMC1236819 DOI: 10.1371/journal.pcbi.0010040] [Citation(s) in RCA: 118] [Impact Index Per Article: 6.2] [Received: 04/13/2005] [Accepted: 08/16/2005] [Indexed: 11/18/2022] [Open Access]
Abstract
The gene family of subtilisin-like serine proteases (subtilases) in Arabidopsis thaliana comprises 56 members, divided into six distinct subfamilies. Whereas the members of five subfamilies are similar to pyrolysins, two genes share stronger similarity to animal kexins. Mutant screens confirmed 144 T-DNA insertion lines with knockouts for 55 out of the 56 subtilases. Apart from SDD1, none of the confirmed homozygous mutants revealed any obvious visible phenotypic alteration during growth under standard conditions. Apart from this specific case, forward genetics gave us no hints about the function of the individual 54 non-characterized subtilase genes. Therefore, the main objective of our work was to overcome the shortcomings of the forward genetic approach and to infer alternative experimental approaches by using an integrative bioinformatics and biological approach. Computational analyses based on transcriptional co-expression and co-response pattern revealed at least two expression networks, suggesting that functional redundancy may exist among subtilases with limited similarity. Furthermore, two hubs were identified, which may be involved in signalling or may represent higher-order regulatory factors involved in responses to environmental cues. A particular enrichment of co-regulated genes with metabolic functions was observed for four subtilases possibly representing late responsive elements of environmental stress. The kexin homologs show stronger associations with genes of transcriptional regulation context. Based on the analyses presented here and in accordance with previously characterized subtilases, we propose three main functions of subtilases: involvement in (i) control of development, (ii) protein turnover, and (iii) action as downstream components of signalling cascades. Supplemental material is available in the Plant Subtilase Database (PSDB) (http://csbdb.mpimp-golm.mpg.de/psdb.html), as well as from the CSB.DB (http://csbdb.mpimp-golm.mpg.de).

The first complete plant genome sequence was available for Arabidopsis thaliana, a common weed. The number of genes in the Arabidopsis genome is estimated to be around 25,000. The functions of most of these genes are, however, still unknown. Many genes are grouped into gene families due to conserved sequences and predicted protein structures. In this article, the large subtilisin-like serine protease (subtilase) family of Arabidopsis is analysed. Although 56 subtilase genes have been identified in Arabidopsis, the function of only two subtilases is known. Analysis of mutants has revealed no further hints about the function of the other 54 subtilases. Here the authors present a novel approach to infer hypotheses about functions of the subtilase genes using computational analysis. Based on the analyses presented here and in accordance with previously characterized subtilases, they propose three main functions of subtilases: involvement in (i) control of development, (ii) protein degradation, and (iii) signalling. The results presented can be used to direct further analysis to elucidate functions of subtilases in plants.
Affiliation(s)
- Carsten Rautengarten
- Institut für Biochemie und Biologie, Genetik, Universität Potsdam, Golm, Germany.
5
Muller C, Denis M, Gentzbittel L, Faraut T. The Iccare web server: an attempt to merge sequence and mapping information for plant and animal species. Nucleic Acids Res 2004; 32:W429-34. [PMID: 15215424 PMCID: PMC441598 DOI: 10.1093/nar/gkh460] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022] [Open Access]
Abstract
The Iccare web server, http://genopole.toulouse.inra.fr/bioinfo/Iccare, provides a simple yet efficient tool for crude EST (expressed sequence tag) annotation specifically dedicated to comparative mapping approaches. Iccare uses all the EST and mRNA sequences from public databases for an organism of interest (query species) and compares them to all the transcripts of one reference organism (Homo sapiens or Arabidopsis thaliana). The results are displayed according to the location of the genes on the chromosomes of the reference organism. Gene structure information and sequence similarities are combined in a graphical representation in order to pinpoint the nature of the transcript query sequence. The user can subsequently design primers or probes for the purpose of physical or genetic mapping. In addition to the query organisms already available in Iccare, users can perform a tailor-made search with their own sequences against the animal or plant reference organism genes.
Affiliation(s)
- Cédric Muller
- INP-ENSAT, Laboratoire de biotechnologies et d'amélioration des plantes, Castanet Tolosan 31326, France
6
Busov VB, Johannes E, Whetten RW, Sederoff RR, Spiker SL, Lanz-Garcia C, Goldfarb B. An auxin-inducible gene from loblolly pine (Pinus taeda L.) is differentially expressed in mature and juvenile-phase shoots and encodes a putative transmembrane protein. Planta 2004; 218:916-27. [PMID: 14722770 DOI: 10.1007/s00425-003-1175-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Received: 08/07/2003] [Accepted: 11/20/2003] [Indexed: 05/18/2023]
Abstract
We have isolated a gene from loblolly pine, 5NG4, that is highly and specifically induced by auxin in juvenile loblolly pine shoots prior to adventitious root formation, but substantially down-regulated in physiologically mature shoots that are adventitious rooting incompetent. 5NG4 was highly auxin-induced in roots, stems and hypocotyls, organs that can form either lateral or adventitious roots following an auxin treatment, but was not induced to the same level in needles and cotyledons, organs that do not form roots. The deduced amino acid sequence shows homology to the MtN21 nodulin gene from Medicago truncatula. The expression pattern of 5NG4 and its homology to a protein from Medicago involved in a root-related process suggest a possible role for this gene in adventitious root formation. Homology searches also identified similar proteins in Arabidopsis thaliana and Oryza sativa. High conservation across these evolutionarily distant species suggests essential functions in plant growth and development. A 38-member family of genes homologous to 5NG4 was identified in the A. thaliana genome. The physiological significance of this redundancy is most likely associated with functional divergence and/or expression specificity of the different family members. The exact biochemical function of the gene is still unknown, but sequence and structure predictions and 5NG4::GFP fusion protein localizations indicate it is a transmembrane protein with a possible transport function.
Affiliation(s)
- Victor B Busov
- Department of Forestry, North Carolina State University, Raleigh, NC 27695, USA
7
Champion A, Kreis M, Mockaitis K, Picaud A, Henry Y. Arabidopsis kinome: after the casting. Funct Integr Genomics 2004; 4:163-87. [PMID: 14740254 DOI: 10.1007/s10142-003-0096-4] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.0] [Received: 05/14/2003] [Revised: 09/22/2003] [Accepted: 11/04/2003] [Indexed: 11/25/2022]
Abstract
Arabidopsis thaliana is a favourite experimental organism for many aspects of plant biology. We capitalized on the recently available Arabidopsis genome sequence and predicted proteome to draw up a genome-scale protein serine/threonine kinase (PSTK) inventory. The PSTKs represent about 4% of the A. thaliana proteome. In this study, we provide a description of the content and diversity of the non-receptor PSTKs. These kinases have crucial functions in sensing, mediating and coordinating cellular responses to an extensive range of stimuli. A total of 369 predicted non-receptor PSTKs were detailed: the Raf superfamily, the CMGC, CaMK, AGC and STE families, as well as a few small clades and orphan sequences. An extensive relationship analysis of these kinases allows us to classify the proteins in superfamilies, families, sub-families and groups. The classification provides a better knowledge of the characteristics shared by the different clades. We focused on the MAP kinase module elements, with particular attention to their docking sites for protein-protein interaction and their biological function. The large number of A. thaliana genes encoding kinases might have been achieved through successive rounds of gene and genome duplications. The evolution towards an increasing gene number suggests that functional redundancy plays an important role in plant genetic robustness.
Affiliation(s)
- A Champion
- Institut de Biotechnologie des Plantes, Laboratoire de Biologie du Développement des Plantes, Bâtiment 630, UMR CNRS/UPS 8618, Université de Paris-Sud, 91405, Orsay Cedex, France
8
Abstract
Plant MADS-box genes encode transcriptional regulators that are critical for a number of developmental processes. In the angiosperms (the flowering plants), these include the specification of floral organ identities, flowering time and fruit development. It appears that the MADS-box gene family has undergone considerable gene duplication and sequence divergence within the angiosperms. Here I discuss the possibility that these events have allowed the recruitment of these genes to new developmental pathways in particular angiosperm lineages. Recent analyses of sequence changes, expression patterns and, in a few cases, gene function are beginning to provide tantalizing evidence for deciphering when and how such genetic diversification has led to particular morphological innovations. In the future, comparative studies of large numbers of species will be required to assess the extent of such variation as well as to fully understand the mechanisms by which evolution of these developmental regulators has played a role in shaping new morphologies.
Affiliation(s)
- Vivian F Irish
- Departments of Molecular, Cellular and Developmental Biology and of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA.
9
Jia L, Clegg MT, Jiang T. Excess non-synonymous substitutions suggest that positive selection episodes occurred during the evolution of DNA-binding domains in the Arabidopsis R2R3-MYB gene family. Plant Mol Biol 2003; 52:627-42. [PMID: 12956532 DOI: 10.1023/a:1024875232511] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 05/21/2023]
Abstract
It has been suggested that evolutionary changes in regulatory genes may be the predominant molecular mechanism governing both physiological and morphological evolution. R2R3-AtMYB is one of the largest transcription factor gene families in Arabidopsis. Using inferred ancestral sequences we show that several lineages in the R2R3-AtMYB phylogeny experienced excess non-synonymous nucleotide substitution upon gene duplication, indicating episodes of positive selection driving adaptive shifts early in the evolution of this gene family. A noise reduction technique was then used to determine individual sites in DNA-binding domains (R2 domain and R3 domain) of R2R3-AtMYB protein sequence that were favored by frequent non-synonymous substitutions. The analyses reveal that the first helix (helix1) and the second helix (helix2) in both R2 and R3 domains are characterized by more frequent non-synonymous substitutions, and thus experienced significantly higher positive selection pressure than the third helix (helix3) in both domains. Previous MYB protein structure studies have suggested that helix1 and helix2 in both R2 and R3 domains are involved in the characteristic packing of R2R3-AtMYB DNA-binding domains. This suggests that excess non-synonymous substitutions in these helices could have resulted in MYB recognition of novel gene target sites.
Affiliation(s)
- Li Jia
- Department of Computer Science, University of California, Riverside, CA 92521, USA.
10
Abstract
Comparison of partially sequenced cereal genomes suggests a mosaic structure consisting of recombinationally active gene-rich islands that are separated by blocks of high-copy DNA. Annotation of the whole rice genome suggests that most, but not all, cereal genes are present within the rice genome and that the high number of reported genes in this genome is probably due to duplications. Within the cereals, macrocolinearity is conserved but, at the level of individual genes, microcolinearity is frequently disrupted. Preliminary evidence from limited comparative analysis of sequenced orthologous genomic segments suggests that local gene amplification and translocation within a plant genome may be linked in some cases.
Affiliation(s)
- Doreen Ware
- Cold Spring Harbor Laboratory, Bungtown Road, Cold Spring Harbor, New York 11724, USA.
11
Persson B, Kallberg Y, Oppermann U, Jörnvall H. Coenzyme-based functional assignments of short-chain dehydrogenases/reductases (SDRs). Chem Biol Interact 2003; 143-144:271-8. [PMID: 12604213 DOI: 10.1016/s0009-2797(02)00223-5] [Citation(s) in RCA: 163] [Impact Index Per Article: 7.8] [Indexed: 10/27/2022]
Abstract
Short-chain dehydrogenases/reductases (SDRs) are enzymes of great functional diversity. In spite of a residue identity of only 15-30%, the folds are conserved to a large extent, with specific sequence motifs detectable. We have developed an assignment scheme based on these motifs and detect five families. Only two of these were known before, called 'Classical' and 'Extended', but are now distinguished at a further level based on patterns of charged residues in the coenzyme-binding region, giving seven subfamilies of classical SDRs and three subfamilies of extended SDRs. Three further families are novel entities, denoted 'Intermediate', 'Divergent' and 'Complex', encompassing short-chain alcohol dehydrogenases, enoyl reductases and multifunctional enzymes, respectively. The assignment scheme was applied to the genomes of human, mouse, D. melanogaster, C. elegans, A. thaliana and S. cerevisiae. In the animal genomes, genes corresponding to the extended SDRs amount to around one quarter or less of the total number of SDR genes, while in those of A. thaliana and S. cerevisiae, the extended members constitute about 40% of the SDR forms. The NAD(H)-dependent SDRs are about as numerous as the NADP(H)-dependent ones in human, mouse and plant, while the proportions of NAD(H)-dependent enzymes are much lower in fruit fly, worm and yeast. We also find that NADP(H) is the preferred coenzyme among most classical SDRs, while NAD(H) is that preferred among most extended SDRs.
Affiliation(s)
- Bengt Persson
- IFM Bioinformatics, Linköping University, S-581 83, Linköping, Sweden.
12
Haake V, Cook D, Riechmann JL, Pineda O, Thomashow MF, Zhang JZ. Transcription factor CBF4 is a regulator of drought adaptation in Arabidopsis. Plant Physiol 2002; 130:639-48. [PMID: 12376631 PMCID: PMC166593 DOI: 10.1104/pp.006478] [Citation(s) in RCA: 415] [Impact Index Per Article: 18.9] [Received: 03/29/2002] [Revised: 05/03/2002] [Accepted: 06/03/2002] [Indexed: 05/17/2023]
Abstract
In plants, low temperature and dehydration activate a set of genes containing C-repeat/dehydration-responsive elements in their promoters. It has been shown previously that the Arabidopsis CBF/DREB1 transcription activators are critical regulators of gene expression in the signal transduction of cold acclimation. Here, we report the isolation of an apparent homolog of the CBF/DREB1 proteins (CBF4) that plays the equivalent role during drought adaptation. In contrast to the three already identified CBF/DREB1 homologs, which are induced under cold stress, CBF4 gene expression is up-regulated by drought stress, but not by low temperature. Overexpression of CBF4 in transgenic Arabidopsis plants results in the activation of C-repeat/dehydration-responsive element-containing downstream genes that are involved in cold acclimation and drought adaptation. As a result, the transgenic plants are more tolerant to freezing and drought stress. Because of the physiological similarity between freezing and drought stress, and the sequence and structural similarity of the CBF/DREB1 and the CBF4 proteins, we propose that the plant's response to cold and drought evolved from a common CBF-like transcription factor, first through gene duplication and then through promoter evolution.
Affiliation(s)
- Volker Haake
- Mendel Biotechnology, 21375 Cabot Boulevard, Hayward, CA 94545, USA
13
Abstract
Although comparative genetic mapping studies show extensive genome conservation among grasses, recent data provide many exceptions to gene collinearity at the DNA sequence level. Rice, sorghum, and maize are closely related grass species, once sharing a common ancestor. Because they diverged at different times during evolution, they provide an excellent model to investigate sequence divergence. We isolated, sequenced, and compared orthologous regions from two rice subspecies, sorghum, and maize to investigate the nature of their sequence differences. This study represents the most extensive sequence comparison among grasses, including the largest contiguous genomic sequences from sorghum (425 kb) and maize (435 kb) to date. Our results reveal a mosaic organization of the orthologous regions, with conserved sequences interspersed with nonconserved sequences. Gene amplification, gene movement, and retrotransposition account for the majority of the nonconserved sequences. Our analysis also shows that gene amplification is frequently linked with gene movement. Analyzing an additional 2.9 Mb of genomic sequence from rice not only corroborates our observations, but also suggests that a significant portion of grass genomes may consist of paralogous sequences derived from gene amplification. We propose that sequence divergence started from hotspots along chromosomes and expanded by accumulating small-scale genomic changes during evolution.
Affiliation(s)
- Rentao Song
- Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854-8020, USA
14
Kallberg Y, Oppermann U, Jörnvall H, Persson B. Short-chain dehydrogenases/reductases (SDRs). Eur J Biochem 2002; 269:4409-17. [PMID: 12230552 DOI: 10.1046/j.1432-1033.2002.03130.x] [Citation(s) in RCA: 312] [Impact Index Per Article: 14.2] [Indexed: 11/20/2022]
Abstract
Short-chain dehydrogenases/reductases (SDRs) are enzymes of great functional diversity. Even at sequence identities of typically only 15-30%, specific sequence motifs are detectable, reflecting common folding patterns. We have developed a functional assignment scheme based on these motifs and we find five families. Two of these families were known previously and are called 'classical' and 'extended' families, but they are now distinguished at a further level based on coenzyme specificities. This analysis gives seven subfamilies of classical SDRs and three subfamilies of extended SDRs. We find that NADP(H) is the preferred coenzyme among most classical SDRs, while NAD(H) is that preferred among most extended SDRs. Three families are novel entities, denoted 'intermediate', 'divergent' and 'complex', encompassing short-chain alcohol dehydrogenases, enoyl reductases and multifunctional enzymes, respectively. The assignment scheme was applied to the genomes of human, mouse, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana and Saccharomyces cerevisiae. In the animal genomes, the extended SDRs amount to around one quarter or less of the total number of SDRs, while in the A. thaliana and S. cerevisiae genomes, the extended members constitute about 40% of the SDR forms. The numbers of NAD(H)-dependent and NADP(H)-dependent SDRs are similar in human, mouse and plant, while the proportions of NAD(H)-dependent enzymes are much lower in fruit fly, worm and yeast. We show that, in spite of the great diversity of the SDR superfamily, the primary structure alone can be used for functional assignments and for predictions of coenzyme preference.
Affiliation(s)
- Yvonne Kallberg
- Department of Medical Biochemistry and Biophysics and Stockholm Bioinformatics Centre, Karolinska Institutet, Sweden
15
Hall AE, Fiebig A, Preuss D. Beyond the Arabidopsis genome: opportunities for comparative genomics. Plant Physiol 2002; 129:1439-47. [PMID: 12177458 PMCID: PMC1540248 DOI: 10.1104/pp.004051] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.6] [Indexed: 05/20/2023]
Affiliation(s)
- Anne E Hall
- Howard Hughes Medical Institute, The University of Chicago, 1103 East 57th Street, Chicago, Illinois 60637, USA
16
Kallberg Y, Oppermann U, Jörnvall H, Persson B. Short-chain dehydrogenase/reductase (SDR) relationships: a large family with eight clusters common to human, animal, and plant genomes. Protein Sci 2002; 11:636-41. [PMID: 11847285 PMCID: PMC2373483 DOI: 10.1110/ps.26902] [Citation(s) in RCA: 177] [Impact Index Per Article: 8.0] [Indexed: 10/17/2022]
Abstract
The progress in genome characterizations has opened new routes for studying enzyme families. The availability of the human genome enabled us to delineate the large family of short-chain dehydrogenase/reductase (SDR) members. Although the human genome releases are not yet final, we have already found 63 members. We have also compared these SDR forms with those of three model organisms: Caenorhabditis elegans, Drosophila melanogaster, and Arabidopsis thaliana. We detect eight SDR ortholog clusters in a cross-genome comparison. Four of these clusters represent extended SDR forms, a subgroup found in all life forms. The other four are classical SDRs with activities involved in cellular differentiation and signalling. We also find 18 SDR genes that are present only in the human genome of the four genomes studied, reflecting enzyme forms specific to mammals. Close to half of these gene products represent steroid dehydrogenases, emphasizing the regulatory importance of these enzymes.
Affiliation(s)
- Yvonne Kallberg
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, S-171 77 Stockholm, Sweden
17
Schmidt R. Plant genome evolution: lessons from comparative genomics at the DNA level. Plant Mol Biol 2002; 48:21-37. [PMID: 11860210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2023]
Abstract
Angiosperm genomes show tremendous variability in genome size and chromosome number. Nevertheless, comparative genetic mapping has revealed genome collinearity of closely related species. Sequence-based comparisons were used to assess the conservation of gene arrangements. Numerous small rearrangements, insertions/deletions, duplications, inversions and translocations have been detected. Importantly, comparative sequence analyses have unambiguously shown micro-collinearity of distantly related plant species. Duplications and subsequent gene loss have been identified as a particularly important factor in the evolution of plant genomes.
Affiliation(s)
- Renate Schmidt
- Max-Delbrück-Laboratorium in der Max-Planck-Gesellschaft, Cologne, Germany.
18
Ryder CD, Smith LB, Teakle GR, King GJ. Contrasting genome organisation: two regions of the Brassica oleracea genome compared with collinear regions of the Arabidopsis thaliana genome. Genome 2001. [DOI: 10.1139/g01-075] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022]
Abstract
Brassica crop species are of worldwide importance and are closely related to the model plant Arabidopsis thaliana for which the complete genome sequence has recently been established. We investigated collinearity of marker order by comparing two contrasting regions of the Brassica oleracea genome with homologous regions of A. thaliana. Although there is widespread replication of marker loci in both A. thaliana and B. oleracea, we found that a combination of genetic markers mapped in B. oleracea, including RFLPs, CAPS, and SSRs allowed comparison and interpretation of medium-scale chromosomal organisation and rearrangements. The interpretation of data was facilitated by hybridising probes onto the whole A. thaliana genome, as represented by BAC contigs. Twenty marker loci were sampled from the whole length of the shortest B. oleracea linkage group, O6, and 21 from a 30.4-cM section of the longest linkage group, O3. There is evidence of locus duplication on linkage group O6. Locus order is well conserved between a putative duplicated region of 10.5 cM and a discrete region comprising 25 cM of A. thaliana chromosome I. This was supported by evidence from seven paralogous loci, three of which were duplicated in a 30.6-cM region of linkage group O6. The pattern of locus order for the remainder of linkage group O6 and the sampled section of linkage group O3 was more complex when compared with the A. thaliana genome. Although there was some conservation of locus order between markers on linkage group O3 and approximately 9 cM of A. thaliana chromosome I, this was superimposed upon a complex pattern of additional loci that were replicated in both A. thaliana and B. oleracea. The results are discussed in the context of the ability to use collinear information to assist map-based cloning. Key words: comparative mapping, BAC, physical contig, MADS box.
|
19
|
Mayer K, Murphy G, Tarchini R, Wambutt R, Volckaert G, Pohl T, Düsterhöft A, Stiekema W, Entian KD, Terryn N, Lemcke K, Haase D, Hall CR, van Dodeweerd AM, Tingey SV, Mewes HW, Bevan MW, Bancroft I. Conservation of Microstructure between a Sequenced Region of the Genome of Rice and Multiple Segments of the Genome of Arabidopsis thaliana. Genome Res 2001. [DOI: 10.1101/gr.161701] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/24/2022]
Abstract
The nucleotide sequence was determined for a 340-kb segment of rice chromosome 2, revealing 56 putative protein-coding genes. This represents a density of one gene per 6.1 kb, which is higher than was reported for a previously sequenced segment of the rice genome. Sixteen of the putative genes were supported by matches to ESTs. The predicted products of 29 of the putative genes showed similarity to known proteins, and a further 17 genes showed similarity only to predicted or hypothetical proteins identified in genome sequence data. The region contains a few transposable elements: one retrotransposon, and one transposon. The segment of the rice genome studied had previously been identified as representing a part of rice chromosome 2 that may be homologous to a segment of Arabidopsis chromosome 4. We confirmed the conservation of gene content and order between the two genome segments. In addition, we identified a further four segments of the Arabidopsis genome that contain conserved gene content and order. In total, 22 of the 56 genes identified in the rice genome segment were represented in this set of Arabidopsis genome segments, with at least five genes present, in conserved order, in each segment. These data are consistent with the hypothesis that the Arabidopsis genome has undergone multiple duplication events. Our results demonstrate that conservation of the genome microstructure can be identified even between monocot and dicot species. However, the frequent occurrence of duplication, and subsequent microstructure divergence, within plant genomes may necessitate the integration of subsets of genes present in multiple redundant segments to deduce evolutionary relationships and identify orthologous genes.
|
20
|
Mayer K, Murphy G, Tarchini R, Wambutt R, Volckaert G, Pohl T, Düsterhöft A, Stiekema W, Entian KD, Terryn N, Lemcke K, Haase D, Hall CR, van Dodeweerd AM, Tingey SV, Mewes HW, Bevan MW, Bancroft I. Conservation of microstructure between a sequenced region of the genome of rice and multiple segments of the genome of Arabidopsis thaliana. Genome Res 2001; 11:1167-74. [PMID: 11435398 PMCID: PMC311122 DOI: 10.1101/gr.gr-1617r] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.7] [Indexed: 11/24/2022]
Abstract
The nucleotide sequence was determined for a 340-kb segment of rice chromosome 2, revealing 56 putative protein-coding genes. This represents a density of one gene per 6.1 kb, which is higher than was reported for a previously sequenced segment of the rice genome. Sixteen of the putative genes were supported by matches to ESTs. The predicted products of 29 of the putative genes showed similarity to known proteins, and a further 17 genes showed similarity only to predicted or hypothetical proteins identified in genome sequence data. The region contains a few transposable elements: one retrotransposon, and one transposon. The segment of the rice genome studied had previously been identified as representing a part of rice chromosome 2 that may be homologous to a segment of Arabidopsis chromosome 4. We confirmed the conservation of gene content and order between the two genome segments. In addition, we identified a further four segments of the Arabidopsis genome that contain conserved gene content and order. In total, 22 of the 56 genes identified in the rice genome segment were represented in this set of Arabidopsis genome segments, with at least five genes present, in conserved order, in each segment. These data are consistent with the hypothesis that the Arabidopsis genome has undergone multiple duplication events. Our results demonstrate that conservation of the genome microstructure can be identified even between monocot and dicot species. However, the frequent occurrence of duplication, and subsequent microstructure divergence, within plant genomes may necessitate the integration of subsets of genes present in multiple redundant segments to deduce evolutionary relationships and identify orthologous genes.
Affiliation(s)
- K Mayer
- National Research Center for Environment and Health, Institute for Bioinformatics, Munich Information Centre for Protein Sequences, 85764 Neuherberg, Germany
|
21
|
Grelon M, Vezon D, Gendrot G, Pelletier G. AtSPO11-1 is necessary for efficient meiotic recombination in plants. EMBO J 2001; 20:589-600. [PMID: 11157765 PMCID: PMC133473 DOI: 10.1093/emboj/20.3.589] [Citation(s) in RCA: 361] [Impact Index Per Article: 15.7] [Indexed: 11/13/2022] Open
Abstract
The Saccharomyces cerevisiae Spo11 protein catalyses DNA double-strand breaks (DSBs) that initiate meiotic recombination. The model plant Arabidopsis thaliana possesses at least three SPO11 homologues. T-DNA and ethyl-methane sulfonate mutagenesis allowed us to show that meiotic progression is altered in plants in which the AtSPO11-1 gene is disrupted. Both male and female meiocytes formed very few bivalents. Furthermore, no fully synapsed chromosomes were observed during prophase I. Later, in meiosis I, we observed that chromosomes segregated randomly, leading to the production of a large proportion of non-functional gametes. These meiotic aberrations were associated with a drastic reduction in meiotic recombination. Thus, our data show that initiation of meiotic recombination by SPO11-induced DSBs is a mechanism conserved in plants. Furthermore, unlike Drosophila and Caenorhabditis elegans, but like fungi, SPO11 is necessary for normal synapsis in plants.
Affiliation(s)
- M Grelon
- Station de Génétique et d'Amélioration des Plantes, INRA de Versailles, Route de St-Cyr, 78026 Versailles Cedex, France.
|
22
|
Abstract
The use of positional approaches for the isolation of genes from most crop species is difficult due to the large size of their genomes. If the order of genes in segments of the genomes is similar in different plants, it might be feasible to use smaller genomes as templates upon which to base strategies for the positional cloning of genes from other species. Comparative genetic mapping, using markers such as restriction-fragment length polymorphisms, has revealed extensive conservation of long-range genome organization (macrostructure) between related species. But is the organization of the tens or hundreds of genes between the genetic markers also conserved? Recent results suggest that the fine-scale structure (microstructure) of plant genomes is more dynamic than previously assumed from investigations of the macrostructure.
Affiliation(s)
- I Bancroft
- Dept. of Brassica and Oilseeds Research, John Innes Centre, Norwich Research Park, Colney, NR4 7UH, Norwich, UK.
|
23
|
Abstract
Large segmental duplications cover much of the Arabidopsis thaliana genome. Little is known about their origins. We show that they are primarily due to at least four different large-scale duplication events that occurred 100 to 200 million years ago, a formative period in the diversification of the angiosperms. A better understanding of the complex structural history of angiosperm genomes is necessary to make full use of Arabidopsis as a genetic model for other plant species.
Affiliation(s)
- T J Vision
- USDA-ARS Center for Agricultural Bioinformatics, 604 Rhodes Hall, Cornell University, Ithaca, NY 14853, USA.
|