1
Frederick J, Hennessy F, Horn U, de la Torre Cortés P, van den Broek M, Strych U, Willson R, Hefer CA, Daran JMG, Sewell T, Otten LG, Brady D. The complete genome sequence of the nitrile biocatalyst Rhodocccus rhodochrous ATCC BAA-870. BMC Genomics 2020; 21:3. [PMID: 31898479 PMCID: PMC6941271 DOI: 10.1186/s12864-019-6405-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/16/2019] [Accepted: 12/16/2019] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Rhodococci are industrially important soil-dwelling Gram-positive bacteria that are well known for both nitrile hydrolysis and oxidative metabolism of aromatics. Rhodococcus rhodochrous ATCC BAA-870 is capable of metabolising a wide range of aliphatic and aromatic nitriles and amides. The genome of the organism was sequenced and analysed in order to better understand this whole cell biocatalyst. RESULTS The genome of R. rhodochrous ATCC BAA-870 is the first Rhodococcus genome fully sequenced using Nanopore sequencing. The circular genome contains 5.9 megabase pairs (Mbp) and includes a 0.53 Mbp linear plasmid, that together encode 7548 predicted protein sequences according to BASys annotation, and 5535 predicted protein sequences according to RAST annotation. The genome contains numerous oxidoreductases, 15 identified antibiotic and secondary metabolite gene clusters, several terpene and nonribosomal peptide synthetase clusters, as well as 6 putative clusters of unknown type. The 0.53 Mbp plasmid encodes 677 predicted genes and contains the nitrile converting gene cluster, including a nitrilase, a low molecular weight nitrile hydratase, and an enantioselective amidase. Although there are fewer biotechnologically relevant enzymes compared to those found in rhodococci with larger genomes, such as the well-known Rhodococcus jostii RHA1, the abundance of transporters in combination with the myriad of enzymes found in strain BAA-870 might make it more suitable for use in industrially relevant processes than other rhodococci. CONCLUSIONS The sequence and comprehensive description of the R. rhodochrous ATCC BAA-870 genome will facilitate the additional exploitation of rhodococci for biotechnological applications, as well as enable further characterisation of this model organism. 
The genome encodes a wide range of enzymes, many with unknown substrate specificities supporting potential applications in biotechnology, including nitrilases, nitrile hydratase, monooxygenases, cytochrome P450s, reductases, proteases, lipases, and transaminases.
Affiliation(s)
- Joni Frederick
  - Protein Technologies, CSIR Biosciences, Meiring Naude Road, Brummeria, Pretoria, South Africa
  - Electron Microscope Unit, University of Cape Town, Rondebosch, 7701 South Africa
  - Present Address: LadHyx, UMR CNRS 7646, École Polytechnique, 91128 Palaiseau, France
- Fritha Hennessy
  - Protein Technologies, CSIR Biosciences, Meiring Naude Road, Brummeria, Pretoria, South Africa
- Uli Horn
  - Meraka, CSIR, Meiring Naude Road, Brummeria, 0091 South Africa
- Pilar de la Torre Cortés
  - Industrial Microbiology, Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
- Marcel van den Broek
  - Industrial Microbiology, Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
- Ulrich Strych
  - Biology and Biochemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204 USA
  - Present Address: Department of Pediatrics, Section of Tropical Medicine, Baylor College of Medicine, 1102 Bates Avenue, Houston, TX 77030 USA
- Richard Willson
  - Biology and Biochemistry, University of Houston, 4800 Calhoun Road, Houston, TX 77204 USA
  - Chemical and Biomolecular Engineering, University of Houston, 4800 Calhoun Road, Houston, TX 77204 USA
- Charles A. Hefer
  - Bioinformatics and Computational Biology Unit, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, 0002 South Africa
  - Present Address: AgResearch Limited, Lincoln Research Centre, Private Bag 4749, Christchurch, 8140 New Zealand
- Jean-Marc G. Daran
  - Industrial Microbiology, Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
- Trevor Sewell
  - Electron Microscope Unit, University of Cape Town, Rondebosch, 7701 South Africa
- Linda G. Otten
  - Biocatalysis, Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
- Dean Brady
  - Protein Technologies, CSIR Biosciences, Meiring Naude Road, Brummeria, Pretoria, South Africa
  - Molecular Sciences Institute, School of Chemistry, University of the Witwatersrand, PO, Wits, 2050 South Africa
2
Uddin A, Mazumder TH, Chakraborty S. Understanding molecular biology of codon usage in mitochondrial complex IV genes of electron transport system: Relevance to mitochondrial diseases. J Cell Physiol 2018; 234:6397-6413. [DOI: 10.1002/jcp.27375] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 06/14/2018] [Accepted: 08/17/2018] [Indexed: 12/17/2022]
Affiliation(s)
- Arif Uddin
  - Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Hailakandi, Assam, India
3
Pellissier L, Niculita-Hirzel H, Dubuis A, Pagni M, Guex N, Ndiribe C, Salamin N, Xenarios I, Goudet J, Sanders IR, Guisan A. Soil fungal communities of grasslands are environmentally structured at a regional scale in the Alps. Mol Ecol 2014; 23:4274-90. [PMID: 25041483 DOI: 10.1111/mec.12854] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Received: 10/28/2013] [Revised: 06/18/2014] [Accepted: 07/05/2014] [Indexed: 01/20/2023]
Abstract
Studying patterns of species distributions along elevation gradients is frequently used to identify the primary factors that determine the distribution, diversity and assembly of species. However, despite their crucial role in ecosystem functioning, our understanding of the distribution of below-ground fungi is still limited, calling for more comprehensive studies of fungal biogeography along environmental gradients at various scales (from regional to global). Here, we investigated the richness of taxa of soil fungi and their phylogenetic diversity across a wide range of grassland types along a 2800 m elevation gradient at a large number of sites (213), stratified across a region of the Western Swiss Alps (700 km²). We used 454 pyrosequencing to obtain fungal sequences that were clustered into operational taxonomic units (OTUs). The OTU diversity-area relationship revealed uneven distribution of fungal taxa across the study area (i.e. not all taxa are everywhere) and fine-scale spatial clustering. Fungal richness and phylogenetic diversity were found to be higher in lower temperatures and higher moisture conditions. Climatic and soil characteristics as well as plant community composition were related to OTU alpha, beta and phylogenetic diversity, with distinct fungal lineages suggesting distinct ecological tolerances. Soil fungi, thus, show lineage-specific biogeographic patterns, even at a regional scale, and follow environmental determinism, mediated by interactions with plants.
Affiliation(s)
- L Pellissier
  - Department of Ecology and Evolution, University of Lausanne, Biophore Building, 1015, Lausanne, Switzerland
4
Signal correlations in ecological niches can shape the organization and evolution of bacterial gene regulatory networks. Adv Microb Physiol 2013; 61:1-36. [PMID: 23046950 DOI: 10.1016/b978-0-12-394423-8.00001-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
Transcriptional regulation plays a significant role in the biological response of bacteria to changing environmental conditions. Therefore, mapping transcriptional regulatory networks is an important step not only in understanding how bacteria sense and interpret their environment but also in identifying the functions involved in biological responses to specific conditions. Recent experimental and computational developments have facilitated the characterization of regulatory networks on a genome-wide scale in model organisms. In addition, the multiplication of complete genome sequences has encouraged comparative analyses to detect conserved regulatory elements and infer regulatory networks in other less well-studied organisms. However, transcription regulation appears to evolve rapidly, thus creating challenges for the transfer of knowledge to nonmodel organisms. Nevertheless, the mechanisms and constraints driving the evolution of regulatory networks have been the subjects of numerous analyses, and several models have been proposed. Overall, the contributions of mutations, recombination, and horizontal gene transfer are complex. Finally, the rapid evolution of regulatory networks plays a significant role in the remarkable capacity of bacteria to adapt to new or changing environments. Conversely, the characteristics of environmental niches determine the selective pressures and can shape the structure of regulatory networks accordingly.
5
Davies N, Meyer C, Gilbert JA, Amaral-Zettler L, Deck J, Bicak M, Rocca-Serra P, Assunta-Sansone S, Willis K, Field D. A call for an international network of genomic observatories (GOs). Gigascience 2012; 1:5. [PMID: 23587188 PMCID: PMC3617453 DOI: 10.1186/2047-217x-1-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 03/28/2012] [Accepted: 07/12/2012] [Indexed: 11/18/2022] Open
Abstract
We are entering a new era in genomics–that of large-scale, place-based, highly contextualized genomic research. Here we review this emerging paradigm shift and suggest that sites of utmost scientific importance be expanded into ‘Genomic Observatories’ (GOs). Investment in GOs should focus on the digital characterization of whole ecosystems, from all-taxa biotic inventories to time-series ’omics studies. The foundational layer of biodiversity–genetic variation–would thus be mainstreamed into Earth Observation systems enabling predictive modelling of biodiversity dynamics and resultant impacts on ecosystem services.
Affiliation(s)
- Neil Davies
  - Biodiversity Institute, Department of Zoology, University of Oxford, The Tinbergen Building, South Parks Road, Oxford, OX1 3PS, UK.
6
Ikuma K, Gunsch CK. Functionality of the TOL plasmid under varying environmental conditions following conjugal transfer. Appl Microbiol Biotechnol 2012; 97:395-408. [PMID: 22367613 DOI: 10.1007/s00253-012-3949-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 11/14/2011] [Revised: 01/19/2012] [Accepted: 02/06/2012] [Indexed: 10/28/2022]
Abstract
Conjugation of catabolic plasmids in contaminated environments is a naturally occurring horizontal gene transfer phenomenon, which could be utilized in genetic bioaugmentation. The potentially important parameters for genetic bioaugmentation include gene regulation of transferred catabolic plasmids that may be controlled by the genetic characteristics of transconjugants as well as environmental conditions that may alter the expression of the contaminant-degrading phenotype. This study showed that both genomic guanine-cytosine contents and phylogenetic characteristics of transconjugants were important in controlling the phenotype functionality of the TOL plasmid. These genetic characteristics had no apparent impact on the stability of the TOL plasmid, which was observed to be highly variable among strains. Within the environmental conditions tested, the addition of glucose resulted in the largest enhancement of the activities of enzymes encoded by the TOL plasmid in all transconjugant strains. Glucose (1 g/L) enhanced the phenotype functionality by up to 16.4 (±2.22), 30.8 (±7.03), and 90.8 (±4.56)-fold in toluene degradation rates, catechol 2,3-dioxygenase enzymatic activities, and xylE gene expression, respectively. These results suggest that genetic limitations of the expression of horizontally acquired genes may be overcome by the presence of alternate carbon substrates. Such observations may be utilized in improving the effectiveness of genetic bioaugmentation.
Affiliation(s)
- Kaoru Ikuma
  - Department of Civil and Environmental Engineering, Duke University, 121 Hudson Hall, Box 90287, Durham, NC 27708-0287, USA
7
Tarrío R, Ayala FJ, Rodríguez-Trelles F. The Vein Patterning 1 (VEP1) gene family laterally spread through an ecological network. PLoS One 2011; 6:e22279. [PMID: 21818306 PMCID: PMC3144213 DOI: 10.1371/journal.pone.0022279] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 03/04/2011] [Accepted: 06/18/2011] [Indexed: 11/23/2022] Open
Abstract
Lateral gene transfer (LGT) is a major evolutionary mechanism in prokaryotes. Knowledge about LGT in eukaryotes, particularly multicellular ones, has only recently started to accumulate. A widespread assumption sees the gene as the unit of LGT, largely because little is yet known about how LGT chances are affected by structural/functional features at the subgenic level. Here we trace the evolutionary trajectory of VEin Patterning 1, a novel gene family known to be essential for plant development and defense. At the subgenic level VEP1 encodes a dinucleotide-binding Rossmann-fold domain, in common with members of the short-chain dehydrogenase/reductase (SDR) protein family. We found: i) VEP1 likely originated in an aerobic, mesophilic and chemoorganotrophic α-proteobacterium, and was laterally propagated through nets of ecological interactions, including multiple LGTs between phylogenetically distant green plant/fungi-associated bacteria, and five independent LGTs to eukaryotes. Of these latest five transfers, three are ancient LGTs, implicating an ancestral fungus, the last common ancestor of land plants and an ancestral trebouxiophyte green alga, and two are recent LGTs to modern embryophytes. ii) VEP1's rampant LGT behavior was enabled by the robustness and broad utility of the dinucleotide-binding Rossmann-fold, which provided a platform for the evolution of two unprecedented departures from the canonical SDR catalytic triad. iii) The fate of VEP1 in eukaryotes has been different in different lineages, being ubiquitous and highly conserved in land plants, whereas fungi underwent multiple losses. And iv) VEP1-harboring bacteria include non-phytopathogenic and phytopathogenic symbionts which are non-randomly distributed with respect to the type of harbored VEP1 gene.
Our findings suggest that VEP1 may have been instrumental for the evolutionary transition of green plants to land, and point to an LGT-mediated ‘Trojan Horse’ mechanism for the evolution of bacterial pathogenesis against plants. VEP1 may serve as a tool for revealing microbial interactions in plant/fungi-associated environments.
Affiliation(s)
- Rosa Tarrío
  - Universidad de Santiago de Compostela, CIBERER, Genome Medicine Group, Santiago de Compostela, Spain
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
- Francisco J. Ayala
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
- Francisco Rodríguez-Trelles
  - Grup de Biologia Evolutiva, Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Barcelona, Spain
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
  - * E-mail:
8
Duhaime MB, Kottmann R, Field D, Glöckner FO. Enriching public descriptions of marine phages using the Genomic Standards Consortium MIGS standard. Stand Genomic Sci 2011; 4:271-85. [PMID: 21677864 PMCID: PMC3111985 DOI: 10.4056/sigs.621069] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022] Open
Abstract
In any sequencing project, the possible depth of comparative analysis is determined largely by the amount and quality of the accompanying contextual data. The structure, content, and storage of this contextual data should be standardized to ensure consistent coverage of all sequenced entities and facilitate comparisons. The Genomic Standards Consortium (GSC) has developed the “Minimum Information about Genome/Metagenome Sequences (MIGS/MIMS)” checklist for the description of genomes, and here we annotate all 30 publicly available marine bacteriophage sequences to the MIGS standard. These annotations build on existing International Nucleotide Sequence Database Collaboration (INSDC) records and confirm, as expected, that current submissions lack most MIGS fields. MIGS fields were manually curated from the literature and placed in XML format as specified by the Genomic Contextual Data Markup Language (GCDML). These “machine-readable” reports were then analyzed to highlight patterns describing this collection of genomes. Completed reports are provided in GCDML. This work represents one step towards the annotation of our complete collection of genome sequences and shows the utility of capturing richer metadata along with raw sequences.
9
Hirschman L, Clark C, Cohen KB, Mardis S, Luciano J, Kottmann R, Cole J, Markowitz V, Kyrpides N, Morrison N, Schriml LM, Field D. Habitat-Lite: A GSC Case Study Based on Free Text Terms for Environmental Metadata. OMICS 2008; 12:129-36. [PMID: 18416669 DOI: 10.1089/omi.2008.0016] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]
Affiliation(s)
- Lynette Hirschman
  - Information Technology Center, The MITRE Corporation, Bedford, Massachusetts
- Cheryl Clark
  - Information Technology Center, The MITRE Corporation, Bedford, Massachusetts
- K. Bretonnel Cohen
  - Information Technology Center, The MITRE Corporation, Bedford, Massachusetts
- Scott Mardis
  - Information Technology Center, The MITRE Corporation, Bedford, Massachusetts
- Joanne Luciano
  - Information Technology Center, The MITRE Corporation, Bedford, Massachusetts
- Renzo Kottmann
  - Microbial Genomics Group, Max Planck Institute for Marine Microbiology and Jacobs University Bremen, 28359 Bremen, Germany
- James Cole
  - Center For Microbial Ecology, Michigan State University, East Lansing, Michigan
- Victor Markowitz
  - Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California
- Nikos Kyrpides
  - Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California
- Norman Morrison
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, United Kingdom
- Lynn M. Schriml
  - Institute for Genome Sciences and Department of Epidemiology and Preventive Medicine, University of Maryland School of Medicine, HSFI, 685 West Baltimore Street, Baltimore, Maryland
- Dawn Field
  - NERC Centre for Ecology and Hydrology, Mansfield Road, Oxford, Oxfordshire, United Kingdom
10
Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, Ashburner M, Axelrod N, Baldauf S, Ballard S, Boore J, Cochrane G, Cole J, Dawyndt P, De Vos P, DePamphilis C, Edwards R, Faruque N, Feldman R, Gilbert J, Gilna P, Glöckner FO, Goldstein P, Guralnick R, Haft D, Hancock D, Hermjakob H, Hertz-Fowler C, Hugenholtz P, Joint I, Kagan L, Kane M, Kennedy J, Kowalchuk G, Kottmann R, Kolker E, Kravitz S, Kyrpides N, Leebens-Mack J, Lewis SE, Li K, Lister AL, Lord P, Maltsev N, Markowitz V, Martiny J, Methe B, Mizrachi I, Moxon R, Nelson K, Parkhill J, Proctor L, White O, Sansone SA, Spiers A, Stevens R, Swift P, Taylor C, Tateno Y, Tett A, Turner S, Ussery D, Vaughan B, Ward N, Whetzel T, San Gil I, Wilson G, Wipat A. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008; 26:541-7. [PMID: 18464787 PMCID: PMC2409278 DOI: 10.1038/nbt1360] [Citation(s) in RCA: 982] [Impact Index Per Article: 57.8] [Indexed: 01/01/2023]
Abstract
With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.
Affiliation(s)
- Dawn Field
  - Natural Environmental Research Council Centre for Ecology and Hydrology, Oxford OX1 3SR, UK.
11
Morrison N, Cochrane G, Faruque N, Tatusova T, Tateno Y, Hancock D, Field D. Concept of sample in OMICS technology. OMICS 2007; 10:127-37. [PMID: 16901217 DOI: 10.1089/omi.2006.10.127] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 10/24/2022]
Abstract
Fundamental biological processes can now be studied by applying the full range of OMICS technologies (genomics, transcriptomics, proteomics, metabolomics, and beyond) to the same biological sample. Clearly, it would be desirable if the concept of sample were shared among these technologies, especially as up until the time a biological sample is prepared for use in a specific OMICS assay, its description is inherently technology independent. Sharing a common informatic representation would encourage data sharing (rather than data replication), thereby reducing redundant data capture and the potential for error. This would result in a significant degree of harmonization across different OMICS data standardization activities, a task that is critical if we are to integrate data from these different data sources. Here, we review the current concept of sample in OMICS technologies as it is being dealt with by different OMICS standardization initiatives and discuss the special role that the newly formed Genomic Standards Consortium (GSC) might have to play in this domain.
Affiliation(s)
- Norman Morrison
  - School of Computer Science, University of Manchester, United Kingdom
12
Field D, Morrison N, Selengut J, Sterk P. Meeting report: eGenomics: Cataloguing our Complete Genome Collection II. OMICS 2007; 10:100-4. [PMID: 16901213 DOI: 10.1089/omi.2006.10.100] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
This article summarizes the proceedings of the "eGenomics: Cataloguing our Complete Genome Collection II" workshop held November 10-11, 2005, at the European Bioinformatics Institute. This exploratory workshop, organized by members of the Genomic Standards Consortium (GSC), brought together researchers from the genomic, functional OMICS, and computational biology communities to discuss standardization activities across a range of projects. The workshop proceedings and outcomes are set to help guide the development of the GSC's Minimal Information about a Genome Sequence (MIGS) specification.
Affiliation(s)
- Dawn Field
  - Molecular Evolution & Bioinformatics Section, Oxford Centre for Ecology and Hydrology, Oxford, United Kingdom.
13
Suen G, Goldman BS, Welch RD. Predicting prokaryotic ecological niches using genome sequence analysis. PLoS One 2007; 2:e743. [PMID: 17710143 PMCID: PMC1937020 DOI: 10.1371/journal.pone.0000743] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 06/15/2007] [Accepted: 07/13/2007] [Indexed: 11/18/2022] Open
Abstract
Automated DNA sequencing technology is so rapid that analysis has become the rate-limiting step. Hundreds of prokaryotic genome sequences are publicly available, with new genomes uploaded at the rate of approximately 20 per month. As a result, this growing body of genome sequences will include microorganisms not previously identified, isolated, or observed. We hypothesize that evolutionary pressure exerted by an ecological niche selects for a similar genetic repertoire in those prokaryotes that occupy the same niche, and that this is due to both vertical and horizontal transmission. To test this, we have developed a novel method to classify prokaryotes, by calculating their Pfam protein domain distributions and clustering them with all other sequenced prokaryotic species. Clusters of organisms are visualized in two dimensions as 'mountains' on a topological map. When compared to a phylogenetic map constructed using 16S rRNA, this map more accurately clusters prokaryotes according to functional and environmental attributes. We demonstrate the ability of this map, which we term a "niche map", to cluster according to ecological niche both quantitatively and qualitatively, and propose that this method be used to associate uncharacterized prokaryotes with their ecological niche as a means of predicting their functional role directly from their genome sequence.
Affiliation(s)
- Garret Suen
  - Department of Biology, Syracuse University, Syracuse, New York, United States of America
- Roy D. Welch
  - Department of Biology, Syracuse University, Syracuse, New York, United States of America
  - * To whom correspondence should be addressed. E-mail:
14
Tett A, Spiers AJ, Crossman LC, Ager D, Ciric L, Dow JM, Fry JC, Harris D, Lilley A, Oliver A, Parkhill J, Quail MA, Rainey PB, Saunders NJ, Seeger K, Snyder LAS, Squares R, Thomas CM, Turner SL, Zhang XX, Field D, Bailey MJ. Sequence-based analysis of pQBR103; a representative of a unique, transfer-proficient mega plasmid resident in the microbial community of sugar beet. ISME J 2007; 1:331-40. [PMID: 18043644 PMCID: PMC2656933 DOI: 10.1038/ismej.2007.47] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Indexed: 11/08/2022]
Abstract
The plasmid pQBR103 was found within Pseudomonas populations colonizing the leaf and root surfaces of sugar beet plants growing at Wytham, Oxfordshire, UK. At 425 kb it is the largest self-transmissible plasmid yet sequenced from the phytosphere. It is known to enhance the competitive fitness of its host, and parts of the plasmid are known to be actively transcribed in the plant environment. Analysis of the complete sequence of this plasmid predicts a coding sequence (CDS)-rich genome containing 478 CDSs and an exceptional degree of genetic novelty; 80% of predicted coding sequences cannot be ascribed a function and 60% are orphans. Of those to which function could be assigned, 40% bore greatest similarity to sequences from Pseudomonas spp., and the majority of the remainder showed similarity to other gamma-proteobacterial genera and plasmids. pQBR103 has identifiable regions presumed responsible for replication and partitioning, but despite being tra+ lacks the full complement of any previously described conjugal transfer functions. The DNA sequence provided few insights into the functional significance of plant-induced transcriptional regions, but suggests that 14% of CDSs may be expressed (11 CDSs with functional annotation and 54 without), further highlighting the ecological importance of these novel CDSs. Comparative analysis indicates that pQBR103 shares significant regions of sequence with other plasmids isolated from sugar beet plants grown at the same geographic location. These plasmid sequences indicate there is more novelty in the mobile DNA pool accessible to phytosphere Pseudomonas than is currently appreciated or understood.
Affiliation(s)
- Adrian Tett
  - Centre for Ecology and Hydrology-Oxford, Oxford, UK
15
Field D, Kyrpides N. The positive role of the ecological community in the genomic revolution. Microb Ecol 2007; 53:507-11. [PMID: 17436031 DOI: 10.1007/s00248-007-9206-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/29/2006] [Revised: 12/29/2006] [Accepted: 01/02/2007] [Indexed: 05/14/2023]
Abstract
The exponential increase of genomic and metagenomic data, fueled in part by recent advancements in sequencing technology, is greatly expanding our understanding of the phylogenetic diversity and metabolic capacity present in the environment. Two of the central challenges that bioinformaticians and ecologists alike must face are the design of bioinformatic resources that facilitate the analysis of genomic and metagenomic data in a comparative context and the efficient capture and organization of the plethora of descriptive information required to usefully describe these data sets. In this commentary, we review three initiatives presented in the "new frontiers" session of the second SCOPE meeting on Microbial Environmental Genomics (MicroEnGen-II, Shanghai, June 12-15, 2006). These are (1) the Integrated Microbial Genomes Resources (IMG), (2) the Genomic Standards Consortium (GSC), and (3) the Natural Environment Research Council (NERC) Environmental Bioinformatics Centre (NEBC). These integrative bioinformatics and data management initiatives underscore the increasingly important role ecologists have to play in the genomic (metagenomic) revolution.
Affiliation(s)
- Dawn Field
- Molecular Evolution and Bioinformatics Section, Oxford Centre for Ecology and Hydrology, Oxford, UK.
|
16
|
Wilson GA, Feil EJ, Lilley AK, Field D. Large-scale comparative genomic ranking of taxonomically restricted genes (TRGs) in bacterial and archaeal genomes. PLoS One 2007; 2:e324. [PMID: 17389915 PMCID: PMC1824705 DOI: 10.1371/journal.pone.0000324] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 01/26/2007] [Accepted: 02/18/2007] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Lineage-specific, or taxonomically restricted, genes (TRGs), especially those that are species- and strain-specific, are of special interest because they are expected to play a role in defining exclusive ecological adaptations to particular niches. Despite this, they are relatively poorly studied and little understood, in large part because many are still orphans or only have homologues in very closely related isolates. This lack of homology confounds attempts to establish the likelihood that a hypothetical gene is expressed and, if so, to determine the putative function of the protein. METHODOLOGY/PRINCIPAL FINDINGS We have developed "QIPP" ("Quality Index for Predicted Proteins"), an index that scores the "quality" of a protein based on non-homology-based criteria. QIPP can be used to assign a value between zero and one to any protein based on comparing its features to other proteins in a given genome. We have used QIPP to rank the predicted proteins in the proteomes of Bacteria and Archaea. This ranking reveals that there is a large amount of variation in QIPP scores, and identifies many high-scoring orphans as potentially "authentic" (expressed) orphans. There are significant differences in the distributions of QIPP scores between orphan and non-orphan genes for many genomes and a trend for less well-conserved genes to have lower QIPP scores. CONCLUSIONS The implication of this work is that QIPP scores can be used to further annotate predicted proteins with information that is independent of homology. Such information can be used to prioritize candidates for further analysis. Data generated for this study can be found in the OrphanMine at http://www.genomics.ceh.ac.uk/orphan_mine.
Affiliation(s)
- Gareth A Wilson
- Centre for Ecology and Hydrology (CEH) Oxford, Oxford, United Kingdom.
|
17
|
Abstract
This meeting report summarizes the proceedings of the “eGenomics: Cataloguing our Complete Genome Collection III” workshop held September 11–13, 2006, at the National Institute for Environmental eScience (NIEeS), Cambridge, United Kingdom. This 3rd workshop of the Genomic Standards Consortium was divided into two parts. The first half of the three-day workshop was dedicated to reviewing the genomic diversity of our current and future genome and metagenome collection, and exploring linkages to a series of existing projects through formal presentations. The second half was dedicated to strategic discussions. Outcomes of the workshop include a revised “Minimum Information about a Genome Sequence” (MIGS) specification (v1.1), consensus on a variety of features to be added to the Genome Catalogue (GCat), agreement by several researchers to adopt MIGS for imminent genome publications, and an agreement by the EBI and NCBI to input their genome collections into GCat for the purpose of quantifying the amount of optional data already available (e.g., for geographic location coordinates) and working towards a single, global list of all public genomes and metagenomes.
|
18
|
Field D, Wilson G, van der Gast C. How do we compare hundreds of bacterial genomes? Curr Opin Microbiol 2006; 9:499-504. [PMID: 16942900 DOI: 10.1016/j.mib.2006.08.008] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 07/11/2006] [Accepted: 08/16/2006] [Indexed: 11/26/2022]
Abstract
The genomic revolution is fully upon us in 2006 and the pace of discovery is set to accelerate with the emergence of ultra-high-throughput sequencing technologies. Our complete genome collection of bacteria and archaea continues to grow in number and diversity, as genome sequencing is applied to an array of new problems, from the characterization of the pan-genome to the detection of mutation after experimentation and the exploration of microbial communities in unprecedented detail. The benefits of large-scale comparative genomic analyses are driving the community to think about how to manage our public collections of genomes in novel ways.
Affiliation(s)
- Dawn Field
- Oxford Centre for Ecology and Hydrology, Oxford OX1 3SR, UK.
|
19
|
Field D, Sansone SA. A Special Issue on Data Standards. OMICS: A Journal of Integrative Biology 2006. [DOI: 10.1089/omi.2006.10.84] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]
Affiliation(s)
- Dawn Field
- Molecular Evolution & Bioinformatics Section, Oxford Centre for Ecology and Hydrology, Oxford, United Kingdom
- Susanna-Assunta Sansone
- EMBL, EBI (European Bioinformatics Institute), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
|
20
|
Wang HC, Susko E, Roger AJ. On the correlation between genomic G+C content and optimal growth temperature in prokaryotes: data quality and confounding factors. Biochem Biophys Res Commun 2006; 342:681-4. [PMID: 16499870 DOI: 10.1016/j.bbrc.2006.02.037] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.2] [Received: 02/01/2006] [Accepted: 02/08/2006] [Indexed: 11/30/2022]
Abstract
The correlation between genomic G+C content and optimal growth temperature in prokaryotes has gained renewed interest after Musto et al. [H. Musto, H. Naya, A. Zavala, H. Romero, F. Alvarez-Valin, G. Bernardi, Correlations between genomic GC levels and optimal growth temperatures in prokaryotes, FEBS Lett. 573 (2004) 73-77] reported that positive correlations exist in 15 families studied. We have reanalyzed their data and found that when genome size and data quality were adjusted for, there was no significant evidence of a relationship between optimal temperature and GC content for two of the families that had previously shown strongly significant correlations. Using updated temperature optima for Halobacteriaceae species, we found the correlation is insignificant in this family. For the family Enterobacteriaceae, when genome size and optimal temperature are included in a multiple linear regression, only genome size is significant as a predictor of GC content. We showed that more profound statistical methods than simple two-factor correlation analysis should be used for analyzing complex intrinsic and extrinsic factors that affect genomic GC content. We further found that a positive correlation between temperature and genomic GC is only evident in free-living species of low optimal growth temperatures.
Affiliation(s)
- Huai-Chun Wang
- Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada B3H 3J5.
|