1
|
Nguyen TM, Pombubpa N, Huntemann M, Clum A, Foster B, Foster B, Roux S, Palaniappan K, Varghese N, Mukherjee S, Reddy TBK, Daum C, Copeland A, Chen IMA, Ivanova NN, Kyrpides NC, Harmon-Smith M, Eloe-Fadrosh EA, Pietrasiak N, Stajich JE, Hom EFY. Whole community shotgun metagenomes of two biological soil crust types from the Mojave Desert. Microbiol Resour Announc 2024; 13:e0098023. [PMID: 38329355 DOI: 10.1128/mra.00980-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Accepted: 01/23/2024] [Indexed: 02/09/2024] Open
Abstract
We present six whole community shotgun metagenomic sequencing data sets of two types of biological soil crusts sampled at the ecotone of the Mojave Desert and Colorado Desert in California. These data will help us understand the diversity and function of biocrust microbial communities, which are essential for desert ecosystems.
Affiliation(s)
- Thuy M Nguyen
- Department of Biology and Center for Biodiversity and Conservation Research, University of Mississippi, University, Mississippi, USA
| | - Nuttapon Pombubpa
- Department of Microbiology and Plant Pathology, University of California, Riverside, California, USA
| | - Marcel Huntemann
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Alicia Clum
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Brian Foster
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Bryce Foster
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Simon Roux
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Krishnaveni Palaniappan
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Neha Varghese
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Supratim Mukherjee
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - T B K Reddy
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Chris Daum
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Alex Copeland
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - I-Min A Chen
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Natalia N Ivanova
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Nikos C Kyrpides
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Miranda Harmon-Smith
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Emiley A Eloe-Fadrosh
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Nicole Pietrasiak
- School of Life Sciences, University of Nevada-Las Vegas, Las Vegas, Nevada, USA
| | - Jason E Stajich
- Department of Microbiology and Plant Pathology, University of California, Riverside, California, USA
| | - Erik F Y Hom
- Department of Biology and Center for Biodiversity and Conservation Research, University of Mississippi, University, Mississippi, USA
| |
|
2
|
Nguyen TM, Pombubpa N, Huntemann M, Clum A, Foster B, Foster B, Roux S, Palaniappan K, Varghese N, Mukherjee S, Reddy TBK, Daum C, Copeland A, Chen IMA, Ivanova NN, Kyrpides NC, Harmon-Smith M, Eloe-Fadrosh EA, Pietrasiak N, Stajich JE, Hom EFY. Metatranscriptomes of two biological soil crust types from the Mojave desert in response to wetting. Microbiol Resour Announc 2024; 13:e0108023. [PMID: 38189307 PMCID: PMC10868201 DOI: 10.1128/mra.01080-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 12/13/2023] [Indexed: 01/09/2024] Open
Abstract
We present eight metatranscriptomic datasets of light algal and cyanolichen biological soil crusts from the Mojave Desert in response to wetting. These data will help us understand gene expression patterns in desert biocrust microbial communities after they have been reactivated by the addition of water.
Affiliation(s)
- Thuy M. Nguyen
- Department of Biology and Center for Biodiversity and Conservation Research, University of Mississippi, University, Mississippi, USA
| | - Nuttapon Pombubpa
- Department of Microbiology and Plant Pathology, University of California, Riverside, California, USA
| | - Marcel Huntemann
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Alicia Clum
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Brian Foster
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Bryce Foster
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Simon Roux
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Krishnaveni Palaniappan
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Neha Varghese
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Supratim Mukherjee
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - T. B. K. Reddy
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Chris Daum
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Alex Copeland
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - I-Min A. Chen
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Natalia N. Ivanova
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Nikos C. Kyrpides
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Miranda Harmon-Smith
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Emiley A. Eloe-Fadrosh
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Nicole Pietrasiak
- School of Life Sciences, University of Nevada-Las Vegas, Las Vegas, Nevada, USA
| | - Jason E. Stajich
- Department of Microbiology and Plant Pathology, University of California, Riverside, California, USA
| | - Erik F. Y. Hom
- Department of Biology and Center for Biodiversity and Conservation Research, University of Mississippi, University, Mississippi, USA
| |
|
3
|
Mukherjee S, Ovchinnikova G, Stamatis D, Li CT, Chen IMA, Kyrpides NC, Reddy TBK. Standardized naming of microbiome samples in Genomes OnLine Database. Database (Oxford) 2023; 2023:7042581. [PMID: 36794865 PMCID: PMC9933444 DOI: 10.1093/database/baad001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 01/24/2023] [Indexed: 02/17/2023]
Abstract
The power of next-generation sequencing has resulted in an explosive growth in the number of projects aiming to understand the metagenomic diversity of complex microbial environments. The interdisciplinary nature of this microbiome research community, along with the absence of reporting standards for microbiome data and samples, poses a significant challenge for follow-up studies. Commonly used names of metagenomes and metatranscriptomes in public databases currently lack the essential information necessary to accurately describe and classify the underlying samples, which makes a comparative analysis difficult to conduct and often results in misclassified sequences in data repositories. The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute has been at the forefront of addressing this challenge by developing a standardized nomenclature system for naming microbiome samples. GOLD, currently in its twenty-fifth anniversary, continues to enrich the research community with hundreds of thousands of metagenomes and metatranscriptomes with well-curated and easy-to-understand names. Through this manuscript, we describe the overall naming process that can be easily adopted by researchers worldwide. Additionally, we propose the use of this naming system as a best practice for the scientific community to facilitate better interoperability and reusability of microbiome data.
Affiliation(s)
- Supratim Mukherjee
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Galina Ovchinnikova
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Dimitri Stamatis
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Cindy Tianqing Li
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - I-Min A Chen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - T B K Reddy
- *Corresponding author: Tel: +1 408 505 8273;
| |
|
4
|
Camargo AP, Nayfach S, Chen IMA, Palaniappan K, Ratner A, Chu K, Ritter S, Reddy TBK, Mukherjee S, Schulz F, Call L, Neches R, Woyke T, Ivanova N, Eloe-Fadrosh E, Kyrpides N, Roux S. IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata. Nucleic Acids Res 2023; 51:D733-D743. [PMID: 36399502 PMCID: PMC9825611 DOI: 10.1093/nar/gkac1037] [Citation(s) in RCA: 55] [Impact Index Per Article: 55.0] [Received: 09/21/2022] [Revised: 10/15/2022] [Accepted: 10/25/2022] [Indexed: 11/19/2022] Open
Abstract
Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad of ways by which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotations are complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.
Affiliation(s)
- Antonio Pedro Camargo
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Stephen Nayfach
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - I-Min A Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | | | - Anna Ratner
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ken Chu
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Stephan J Ritter
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - T B K Reddy
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Supratim Mukherjee
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Frederik Schulz
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Lee Call
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Russell Y Neches
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Tanja Woyke
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Natalia N Ivanova
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Emiley A Eloe-Fadrosh
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Simon Roux
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| |
|
5
|
Chen IMA, Chu K, Palaniappan K, Ratner A, Huang J, Huntemann M, Hajek P, Ritter S, Webb C, Wu D, Varghese N, Reddy TBK, Mukherjee S, Ovchinnikova G, Nolan M, Seshadri R, Roux S, Visel A, Woyke T, Eloe-Fadrosh E, Kyrpides N, Ivanova N. The IMG/M data management and analysis system v.7: content updates and new features. Nucleic Acids Res 2022; 51:D723-D732. [PMID: 36382399 PMCID: PMC9825475 DOI: 10.1093/nar/gkac976] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 09/12/2022] [Revised: 10/05/2022] [Accepted: 10/17/2022] [Indexed: 11/17/2022] Open
Abstract
The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) at the Department of Energy (DOE) Joint Genome Institute (JGI) continues to provide support for users to perform comparative analysis of isolate and single cell genomes, metagenomes, and metatranscriptomes. In addition to datasets produced by the JGI, IMG v.7 also includes datasets imported from public sources such as NCBI Genbank, SRA, and the DOE National Microbiome Data Collaborative (NMDC), or submitted by external users. In the past couple years, we have continued our effort to help the user community by improving the annotation pipeline, upgrading the contents with new reference database versions, and adding new analysis functionalities such as advanced scaffold search, Average Nucleotide Identity (ANI) for high-quality metagenome bins, new cassette search, improved gene neighborhood display, and improvements to metatranscriptome data display and analysis. We also extended the collaboration and integration efforts with other DOE-funded projects such as NMDC and DOE Biology Knowledgebase (KBase).
Affiliation(s)
- I-Min A Chen
- To whom correspondence should be addressed. Tel: +1 510 495 8437;
| | - Ken Chu
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Krishnaveni Palaniappan
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Anna Ratner
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Jinghua Huang
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Marcel Huntemann
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Patrick Hajek
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Stephan J Ritter
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Cody Webb
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Dongying Wu
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Neha J Varghese
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - T B K Reddy
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Supratim Mukherjee
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Galina Ovchinnikova
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Matt Nolan
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Rekha Seshadri
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Simon Roux
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Axel Visel
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Tanja Woyke
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Emiley A Eloe-Fadrosh
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Natalia N Ivanova
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| |
|
6
|
Seshadri R, Roux S, Huber KJ, Wu D, Yu S, Udwary D, Call L, Nayfach S, Hahnke RL, Pukall R, White JR, Varghese NJ, Webb C, Palaniappan K, Reimer LC, Sardà J, Bertsch J, Mukherjee S, Reddy T, Hajek PP, Huntemann M, Chen IMA, Spunde A, Clum A, Shapiro N, Wu ZY, Zhao Z, Zhou Y, Evtushenko L, Thijs S, Stevens V, Eloe-Fadrosh EA, Mouncey NJ, Yoshikuni Y, Whitman WB, Klenk HP, Woyke T, Göker M, Kyrpides NC, Ivanova NN. Expanding the genomic encyclopedia of Actinobacteria with 824 isolate reference genomes. Cell Genom 2022; 2:100213. [PMID: 36778052 PMCID: PMC9903846 DOI: 10.1016/j.xgen.2022.100213] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/14/2022] [Revised: 07/19/2022] [Accepted: 10/16/2022] [Indexed: 11/13/2022]
Abstract
The phylum Actinobacteria includes important human pathogens like Mycobacterium tuberculosis and Corynebacterium diphtheriae and renowned producers of secondary metabolites of commercial interest, yet only a small part of its diversity is represented by sequenced genomes. Here, we present 824 actinobacterial isolate genomes in the context of a phylum-wide analysis of 6,700 genomes including public isolates and metagenome-assembled genomes (MAGs). We estimate that only 30%-50% of projected actinobacterial phylogenetic diversity possesses genomic representation via isolates and MAGs. A comparison of gene functions reveals novel determinants of host-microbe interaction as well as environment-specific adaptations such as potential antimicrobial peptides. We identify plasmids and prophages across isolates and uncover extensive prophage diversity structured mainly by host taxonomy. Analysis of >80,000 biosynthetic gene clusters reveals that horizontal gene transfer and gene loss shape secondary metabolite repertoire across taxa. Our observations illustrate the essential role of and need for high-quality isolate genome sequences.
Affiliation(s)
- Rekha Seshadri
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Corresponding author
| | - Simon Roux
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Katharina J. Huber
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Dongying Wu
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Sora Yu
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Dan Udwary
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Lee Call
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Stephen Nayfach
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Richard L. Hahnke
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Rüdiger Pukall
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | | | - Neha J. Varghese
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Cody Webb
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | | | - Lorenz C. Reimer
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Joaquim Sardà
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Jonathon Bertsch
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | | | - T.B.K. Reddy
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Patrick P. Hajek
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Marcel Huntemann
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - I-Min A. Chen
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Alex Spunde
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Alicia Clum
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Nicole Shapiro
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Zong-Yen Wu
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Zhiying Zhao
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Yuguang Zhou
- China General Microbiological Culture Collection Center, Beijing, China
| | - Lyudmila Evtushenko
- Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences, All-Russian Collection of Microorganisms (VKM), Pushchino, Russia
| | - Sofie Thijs
- Center for Environmental Sciences, Environmental Biology, Hasselt University, Diepenbeek, Belgium
| | - Vincent Stevens
- Center for Environmental Sciences, Environmental Biology, Hasselt University, Diepenbeek, Belgium
| | - Emiley A. Eloe-Fadrosh
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Nigel J. Mouncey
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Yasuo Yoshikuni
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Center for Advanced Bioenergy and Bioproducts Innovation, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Global Institution for Collaborative Research and Education, Hokkaido University, Hokkaido 060-8589, Japan
| | | | - Hans-Peter Klenk
- School of Biology, Newcastle University, Newcastle upon Tyne, UK
| | - Tanja Woyke
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Markus Göker
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany; Corresponding author
| | - Nikos C. Kyrpides
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Natalia N. Ivanova
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; Corresponding author
| |
|
7
|
Roux S, Páez-Espino D, Chen IMA, Palaniappan K, Ratner A, Chu K, Reddy TBK, Nayfach S, Schulz F, Call L, Neches RY, Woyke T, Ivanova NN, Eloe-Fadrosh EA, Kyrpides NC. IMG/VR v3: an integrated ecological and evolutionary framework for interrogating genomes of uncultivated viruses. Nucleic Acids Res 2021; 49:D764-D775. [PMID: 33137183 PMCID: PMC7778971 DOI: 10.1093/nar/gkaa946] [Citation(s) in RCA: 179] [Impact Index Per Article: 59.7] [Received: 09/14/2020] [Revised: 10/02/2020] [Accepted: 10/09/2020] [Indexed: 12/28/2022] Open
Abstract
Viruses are integral components of all ecosystems and microbiomes on Earth. Through pervasive infections of their cellular hosts, viruses can reshape microbial community structure and drive global nutrient cycling. Over the past decade, viral sequences identified from genomes and metagenomes have provided an unprecedented view of viral genome diversity in nature. Since 2016, the IMG/VR database has provided access to the largest collection of viral sequences obtained from (meta)genomes. Here, we present the third version of IMG/VR, composed of 18 373 cultivated and 2 314 329 uncultivated viral genomes (UViGs), nearly tripling the total number of sequences compared to the previous version. These clustered into 935 362 viral Operational Taxonomic Units (vOTUs), including 188 930 with two or more members. UViGs in IMG/VR are now reported as single viral contigs, integrated proviruses or genome bins, and are annotated with a new standardized pipeline including genome quality estimation using CheckV, taxonomic classification reflecting the latest ICTV update, and expanded host taxonomy prediction. The new IMG/VR interface enables users to efficiently browse, search, and select UViGs based on genome features and/or sequence similarity. IMG/VR v3 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.
Affiliation(s)
- Simon Roux
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - David Páez-Espino
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - I-Min A Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Krishna Palaniappan
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Anna Ratner
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ken Chu
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - T B K Reddy
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Stephen Nayfach
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Frederik Schulz
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Lee Call
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Russell Y Neches
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Tanja Woyke
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Natalia N Ivanova
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Emiley A Eloe-Fadrosh
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| |
|
8
|
Chen IMA, Chu K, Palaniappan K, Ratner A, Huang J, Huntemann M, Hajek P, Ritter S, Varghese N, Seshadri R, Roux S, Woyke T, Eloe-Fadrosh EA, Ivanova NN, Kyrpides N. The IMG/M data management and analysis system v.6.0: new tools and advanced capabilities. Nucleic Acids Res 2021; 49:D751-D763. [PMID: 33119741 PMCID: PMC7778900 DOI: 10.1093/nar/gkaa939] [Citation(s) in RCA: 250] [Impact Index Per Article: 83.3] [Received: 09/09/2020] [Revised: 10/04/2020] [Accepted: 10/07/2020] [Indexed: 12/22/2022] Open
Abstract
The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) contains annotated isolate genome and metagenome datasets sequenced at the DOE's Joint Genome Institute (JGI), submitted by external users, or imported from public sources such as NCBI. IMG v 6.0 includes advanced search functions and a new tool for statistical analysis of mixed sets of genomes and metagenome bins. The new IMG web user interface also has a new Help page with additional documentation and webinar tutorials to help users better understand how to use various IMG functions and tools for their research. New datasets have been processed with the prokaryotic annotation pipeline v.5, which includes extended protein family assignments.
Affiliation(s)
- I-Min A Chen
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Ken Chu
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Krishnaveni Palaniappan
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Anna Ratner
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Jinghua Huang
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Marcel Huntemann
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Patrick Hajek
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Stephan Ritter
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Neha Varghese
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Rekha Seshadri
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Simon Roux
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Tanja Woyke
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Emiley A Eloe-Fadrosh
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Natalia N Ivanova
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| |
|
9
|
Mukherjee S, Stamatis D, Bertsch J, Ovchinnikova G, Sundaramurthi J, Lee J, Kandimalla M, Chen IMA, Kyrpides NC, Reddy TBK. Genomes OnLine Database (GOLD) v.8: overview and updates. Nucleic Acids Res 2021; 49:D723-D733. [PMID: 33152092 PMCID: PMC7778979 DOI: 10.1093/nar/gkaa983] [Citation(s) in RCA: 104] [Impact Index Per Article: 34.7] [Received: 09/14/2020] [Revised: 10/08/2020] [Accepted: 10/19/2020] [Indexed: 12/28/2022] Open
Abstract
The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) is a manually curated, daily updated collection of genome projects and their metadata accumulated from around the world. The current version of the database includes over 1.17 million entries organized broadly into Studies (45 770), Organisms (387 382) or Biosamples (101 207), Sequencing Projects (355 364) and Analysis Projects (283 481). These four levels contain over 600 metadata fields, which include 76 controlled vocabulary (CV) tables containing 3873 terms. GOLD provides an interactive web user interface for browsing and searching by a wide range of project and metadata fields. Users can enter details about their own projects in GOLD, which acts as a gatekeeper to ensure that metadata is accurately documented before submitting sequence information to the Integrated Microbial Genomes (IMG) system for analysis. In order to maintain a reference dataset for use by members of the scientific community, GOLD also imports projects from public repositories such as GenBank and SRA. The current status of the database, along with recent updates and improvements, is described in this manuscript.
Affiliation(s)
- Supratim Mukherjee
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Dimitri Stamatis
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jon Bertsch
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Galina Ovchinnikova
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | | | - Janey Lee
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Mahathi Kandimalla
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - I-Min A Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - T B K Reddy
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| |
|
10
|
Panwar P, Allen MA, Williams TJ, Hancock AM, Brazendale S, Bevington J, Roux S, Páez-Espino D, Nayfach S, Berg M, Schulz F, Chen IMA, Huntemann M, Shapiro N, Kyrpides NC, Woyke T, Eloe-Fadrosh EA, Cavicchioli R. Influence of the polar light cycle on seasonal dynamics of an Antarctic lake microbial community. Microbiome 2020; 8:116. [PMID: 32772914 PMCID: PMC7416419 DOI: 10.1186/s40168-020-00889-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/07/2020] [Accepted: 06/30/2020] [Indexed: 05/10/2023]
Abstract
BACKGROUND: Cold environments dominate the Earth's biosphere and microbial activity drives ecosystem processes thereby contributing greatly to global biogeochemical cycles. Polar environments differ from all other cold environments by experiencing 24-h sunlight in summer and no sunlight in winter. The Vestfold Hills in East Antarctica contains hundreds of lakes that have evolved from a marine origin only 3000-7000 years ago. Ace Lake is a meromictic (stratified) lake from this region that has been intensively studied since the 1970s. Here, a total of 120 metagenomes representing a seasonal cycle and four summers spanning a 10-year period were analyzed to determine the effects of the polar light cycle on microbial-driven nutrient cycles.
RESULTS: The lake system is characterized by complex sulfur and hydrogen cycling, especially in the anoxic layers, with multiple mechanisms for the breakdown of biopolymers present throughout the water column. The two most abundant taxa are phototrophs (green sulfur bacteria and cyanobacteria) that are highly influenced by the seasonal availability of sunlight. The extent of the Chlorobium biomass thriving at the interface in summer was captured in underwater video footage. The Chlorobium abundance dropped from up to 83% in summer to 6% in winter and 1% in spring, before rebounding to high levels. Predicted Chlorobium viruses and cyanophage were also abundant, but their levels did not negatively correlate with their hosts.
CONCLUSION: Over-wintering expeditions in Antarctica are logistically challenging, meaning insight into winter processes has been inferred from limited data. Here, we found that in contrast to chemolithoautotrophic carbon fixation potential of Southern Ocean Thaumarchaeota, this marine-derived lake evolved a reliance on photosynthesis. While viruses associated with phototrophs also have high seasonal abundance, the negative impact of viral infection on host growth appeared to be limited. The microbial community as a whole appears to have developed a capacity to generate biomass and remineralize nutrients, sufficient to sustain itself between two rounds of sunlight-driven summer-activity. In addition, this unique metagenome dataset provides considerable opportunity for future interrogation of eukaryotes and their viruses, abundant uncharacterized taxa (i.e. dark matter), and for testing hypotheses about endemic species in polar aquatic ecosystems.
Affiliation(s)
- Pratibha Panwar
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, 2052, Australia
| | - Michelle A Allen
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, 2052, Australia
| | - Timothy J Williams
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, 2052, Australia
| | - Alyce M Hancock
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, 2052, Australia
- Institute for Marine and Antarctic Studies, University of Tasmania, 20 Castray Esplanade, Battery Point, Tasmania, Australia
| | - Sarah Brazendale
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, 2052, Australia
- 476 Lancaster Rd, Pegarah, Australia
| | - James Bevington
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, 2052, Australia
| | - Simon Roux
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - David Páez-Espino
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Mammoth BioSciences, 279 East Grand Ave, South San Francisco, CA, USA
| | - Stephen Nayfach
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Maureen Berg
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - Frederik Schulz
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | - I-Min A Chen
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | | | - Nicole Shapiro
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | | | - Tanja Woyke
- Department of Energy Joint Genome Institute, Berkeley, CA, USA
| | | | - Ricardo Cavicchioli
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, 2052, Australia
| |
|
11
|
Palaniappan K, Chen IMA, Chu K, Ratner A, Seshadri R, Kyrpides NC, Ivanova NN, Mouncey NJ. IMG-ABC v.5.0: an update to the IMG/Atlas of Biosynthetic Gene Clusters Knowledgebase. Nucleic Acids Res 2020; 48:D422-D430. [PMID: 31665416 DOI: 10.1093/nar/gkz932] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 09/13/2019] [Revised: 10/02/2019] [Accepted: 10/09/2019] [Indexed: 01/14/2023] Open
Abstract
Microbial secondary metabolism is a reservoir of bioactive compounds of immense biotechnological and biomedical potential. The biosynthetic machinery responsible for the production of these secondary metabolites (SMs) (also called natural products) is often encoded by collocated groups of genes called biosynthetic gene clusters (BGCs). High-throughput genome sequencing of both isolates and metagenomic samples combined with the development of specialized computational workflows is enabling systematic identification of BGCs and the discovery of novel SMs. In order to advance exploration of microbial secondary metabolism and its diversity, we developed the largest publicly available database of predicted BGCs combined with experimentally verified BGCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc-public). Here we describe the first major content update of the IMG-ABC knowledgebase, since its initial release in 2015, refreshing the BGC prediction pipeline with the latest version of antiSMASH (v5) as well as presenting the data in the context of underlying environmental metadata sourced from GOLD (https://gold.jgi.doe.gov/). This update has greatly improved the quality and expanded the types of predicted BGCs compared to the previous version.
Affiliation(s)
- Krishnaveni Palaniappan
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - I-Min A Chen
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ken Chu
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Anna Ratner
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Rekha Seshadri
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Natalia N Ivanova
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nigel J Mouncey
- Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| |
|
12
|
Mukherjee S, Stamatis D, Bertsch J, Ovchinnikova G, Katta HY, Mojica A, Chen IMA, Kyrpides NC, Reddy T. Genomes OnLine database (GOLD) v.7: updates and new features. Nucleic Acids Res 2020; 47:D649-D659. [PMID: 30357420 PMCID: PMC6323969 DOI: 10.1093/nar/gky977] [Citation(s) in RCA: 125] [Impact Index Per Article: 31.3] [Received: 09/14/2018] [Accepted: 10/08/2018] [Indexed: 12/11/2022] Open
Abstract
The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is an open online resource, which maintains an up-to-date catalog of genome and metagenome projects in the context of a comprehensive list of associated metadata. Information in GOLD is organized into four levels: Study, Biosample/Organism, Sequencing Project and Analysis Project. Currently GOLD hosts information on 33 415 Studies, 49 826 Biosamples, 313 324 Organisms, 215 881 Sequencing Projects and 174 454 Analysis Projects with a total of 541 metadata fields, of which 80 are based on controlled vocabulary (CV) terms. GOLD provides a user-friendly web interface to browse sequencing projects and launch advanced search tools across four classification levels. Users submit metadata on a wide range of Sequencing and Analysis Projects in GOLD before depositing sequence data to the Integrated Microbial Genomes (IMG) system for analysis. GOLD conforms with and supports the rules set by the Genomic Standards Consortium (GSC) Minimum Information standards. The current version of GOLD (v.7) has seen the number of projects and associated metadata increase exponentially over the years. This paper provides an update on the current status of GOLD and highlights the new features added over the last two years.
Affiliation(s)
- Supratim Mukherjee
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Dimitri Stamatis
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Jon Bertsch
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Galina Ovchinnikova
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Hema Y Katta
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Alejandro Mojica
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - I-Min A Chen
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Nikos C Kyrpides
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - T B K Reddy
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, CA 94598, USA
| |
|
13
|
Paez-Espino D, Roux S, Chen IMA, Palaniappan K, Ratner A, Chu K, Huntemann M, Reddy TBK, Pons JC, Llabrés M, Eloe-Fadrosh EA, Ivanova NN, Kyrpides NC. IMG/VR v.2.0: an integrated data management and analysis system for cultivated and environmental viral genomes. Nucleic Acids Res 2020; 47:D678-D686. [PMID: 30407573 PMCID: PMC6323928 DOI: 10.1093/nar/gky1127] [Citation(s) in RCA: 111] [Impact Index Per Article: 27.8] [Received: 09/15/2018] [Accepted: 10/31/2018] [Indexed: 01/06/2023] Open
Abstract
The Integrated Microbial Genome/Virus (IMG/VR) system v.2.0 (https://img.jgi.doe.gov/vr/) is the largest publicly available data management and analysis platform dedicated to viral genomics. Since the last report published in the 2016 NAR Database Issue, the data has tripled in size and currently contains genomes of 8389 cultivated reference viruses, 12 498 previously published curated prophages derived from cultivated microbial isolates, and 735 112 viral genomic fragments computationally predicted from assembled shotgun metagenomes. Nearly 60% of the viral genomes and genome fragments are clustered into 110 384 viral Operational Taxonomic Units (vOTUs) with two or more members. To improve data quality and predictions of host specificity, IMG/VR v.2.0 now separates prokaryotic and eukaryotic viruses, utilizes known prophage sequences to improve taxonomic assignments, and provides viral genome quality scores based on the estimated genome completeness. New features also include enhanced BLAST search capabilities for external queries. Finally, geographic map visualization to locate user-selected viral genomes or genome fragments has been implemented and download options have been extended. All of these features make IMG/VR v.2.0 a key resource for the study of viruses.
Affiliation(s)
| | - Simon Roux
- Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
| | - I-Min A Chen
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
| | - Krishna Palaniappan
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
| | - Anna Ratner
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
| | - Ken Chu
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, USA
| | - Marcel Huntemann
- Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
| | - T B K Reddy
- Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
| | - Joan Carles Pons
- Department of Mathematics and Computer Science, University of the Balearic Islands, Spain
| | - Mercè Llabrés
- Department of Mathematics and Computer Science, University of the Balearic Islands, Spain
| | | | | | - Nikos C Kyrpides
- Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
| |
|
14
|
Chen IMA, Chu K, Palaniappan K, Pillay M, Ratner A, Huang J, Huntemann M, Varghese N, White JR, Seshadri R, Smirnova T, Kirton E, Jungbluth SP, Woyke T, Eloe-Fadrosh EA, Ivanova NN, Kyrpides NC. IMG/M v.5.0: an integrated data management and comparative analysis system for microbial genomes and microbiomes. Nucleic Acids Res 2020; 47:D666-D677. [PMID: 30289528 PMCID: PMC6323987 DOI: 10.1093/nar/gky901] [Citation(s) in RCA: 536] [Impact Index Per Article: 134.0] [Received: 09/05/2018] [Accepted: 09/24/2018] [Indexed: 11/12/2022] Open
Abstract
The Integrated Microbial Genomes & Microbiomes system v.5.0 (IMG/M: https://img.jgi.doe.gov/m/) contains annotated datasets categorized into: archaea, bacteria, eukarya, plasmids, viruses, genome fragments, metagenomes, cell enrichments, single particle sorts, and metatranscriptomes. Source datasets include those generated by the DOE's Joint Genome Institute (JGI), submitted by external scientists, or collected from public sequence data archives such as NCBI. All submissions are typically processed through the IMG annotation pipeline and then loaded into the IMG data warehouse. IMG's web user interface provides a variety of analytical and visualization tools for comparative analysis of isolate genomes and metagenomes in IMG. IMG/M allows open access to all public genomes in the IMG data warehouse, while its expert review (ER) system (IMG/MER: https://img.jgi.doe.gov/mer/) allows registered users to access their private genomes and to store their private datasets in workspace for sharing and for further analysis. IMG/M data content has grown by 60% since the last report published in the 2017 NAR Database Issue. IMG/M v.5.0 has a new and more powerful genome search feature, new statistical tools, and supports metagenome binning.
Affiliation(s)
- I-Min A Chen
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Ken Chu
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Krishna Palaniappan
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Manoj Pillay
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Anna Ratner
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Jinghua Huang
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Marcel Huntemann
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Neha Varghese
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | | | - Rekha Seshadri
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Tatyana Smirnova
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Edward Kirton
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Sean P Jungbluth
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Tanja Woyke
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Emiley A Eloe-Fadrosh
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Natalia N Ivanova
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Nikos C Kyrpides
- Department of Energy, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| |
|
15
|
Högfors-Rönnholm E, Lopez-Fernandez M, Christel S, Brambilla D, Huntemann M, Clum A, Foster B, Foster B, Roux S, Palaniappan K, Varghese N, Mukherjee S, Reddy TBK, Daum C, Copeland A, Chen IMA, Ivanova NN, Kyrpides NC, Harmon-Smith M, Eloe-Fadrosh EA, Lundin D, Engblom S, Dopson M. Metagenomes and metatranscriptomes from boreal potential and actual acid sulfate soil materials. Sci Data 2019; 6:207. [PMID: 31619684 PMCID: PMC6795848 DOI: 10.1038/s41597-019-0222-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/15/2019] [Accepted: 08/27/2019] [Indexed: 11/09/2022] Open
Abstract
Natural sulfide-rich deposits are common in coastal areas worldwide, including along the Baltic Sea coast. When artificial drainage exposes these deposits to atmospheric oxygen, iron sulfide minerals in the soils are rapidly oxidized. This process turns the potential acid sulfate soils into actual acid sulfate soils and mobilizes large quantities of acidity and leachable toxic metals that cause severe environmental problems. It is known that acidophilic microorganisms living in acid sulfate soils catalyze iron sulfide mineral oxidation. However, only a few studies regarding these communities have been published. In this study, we sampled the oxidized actual acid sulfate soil, the transition zone where oxidation is actively taking place, and the deepest un-oxidized potential acid sulfate soil. Nucleic acids were extracted, and 16S rRNA gene amplicons, metagenomes, and metatranscriptomes were generated to gain a detailed insight into the communities and their activities. The project will be of great use to microbiologists, environmental biologists, geochemists, and geologists as there is hydrological and geochemical monitoring from the site stretching back for many years.
Affiliation(s)
- Eva Högfors-Rönnholm
- Research and Development, Novia University of Applied Sciences, Vaasa, 65200, Finland
| | - Margarita Lopez-Fernandez
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Linnaeus University, Kalmar, 59231, Sweden
| | - Stephan Christel
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Linnaeus University, Kalmar, 59231, Sweden
| | - Diego Brambilla
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Linnaeus University, Kalmar, 59231, Sweden
| | - Marcel Huntemann
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Alicia Clum
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Brian Foster
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Bryce Foster
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Simon Roux
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | | | - Neha Varghese
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Supratim Mukherjee
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - T B K Reddy
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Chris Daum
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Alex Copeland
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - I-Min A Chen
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Natalia N Ivanova
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA
| | | | | | - Daniel Lundin
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Linnaeus University, Kalmar, 59231, Sweden
| | - Sten Engblom
- Research and Development, Novia University of Applied Sciences, Vaasa, 65200, Finland
| | - Mark Dopson
- Centre for Ecology and Evolution in Microbial Model Systems (EEMiS), Linnaeus University, Kalmar, 59231, Sweden
| |
|
16
|
Camargo AP, de Souza RSC, de Britto Costa P, Gerhardt IR, Dante RA, Teodoro GS, Abrahão A, Lambers H, Carazzolle MF, Huntemann M, Clum A, Foster B, Foster B, Roux S, Palaniappan K, Varghese N, Mukherjee S, Reddy TBK, Daum C, Copeland A, Chen IMA, Ivanova NN, Kyrpides NC, Pennacchio C, Eloe-Fadrosh EA, Arruda P, Oliveira RS. Microbiomes of Velloziaceae from phosphorus-impoverished soils of the campos rupestres, a biodiversity hotspot. Sci Data 2019; 6:140. [PMID: 31366912 PMCID: PMC6668480 DOI: 10.1038/s41597-019-0141-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/20/2019] [Accepted: 06/25/2019] [Indexed: 12/22/2022] Open
Abstract
The rocky, seasonally dry and nutrient-impoverished soils of the Brazilian campos rupestres impose severe growth-limiting conditions on plants. Species of a dominant plant family, Velloziaceae, are highly specialized to the low-nutrient conditions and seasonal water availability of this environment, where phosphorus (P) is the key limiting nutrient. Despite plant-microbe associations playing critical roles in stressful ecosystems, the contribution of these interactions in the campos rupestres remains poorly studied. Here we present the first microbiome data of Velloziaceae spp. thriving in contrasting substrates of campos rupestres. We assessed the microbiomes of Vellozia epidendroides, which occupies shallow patches of soil, and Barbacenia macrantha, growing on exposed rocks. The prokaryotic and fungal profiles were assessed by rRNA barcode sequencing of epiphytic and endophytic compartments of roots, stems, leaves and surrounding soil/rocks. We also generated root and substrate (rock/soil)-associated metagenomes of each plant species. We foresee that these data will help decipher how the microbiome contributes to plant functioning in the campos rupestres, and unravel new strategies for improved crop productivity in stressful environments.
Affiliation(s)
- Antonio Pedro Camargo
- Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Departamento de Genética e Evolução, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Genomics for Climate Change Research Center, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
| | - Rafael Soares Correa de Souza
- Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Departamento de Genética e Evolução, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Genomics for Climate Change Research Center, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
| | - Patrícia de Britto Costa
- Departamento de Biologia Vegetal, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), 13083-862, Campinas, SP, Brazil
- School of Biological Sciences, University of Western Australia (UWA), Perth, WA, 6009, Australia
| | - Isabel Rodrigues Gerhardt
- Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Genomics for Climate Change Research Center, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Embrapa Informática Agropecuária, 13083-886, Campinas, SP, Brazil
| | - Ricardo Augusto Dante
- Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Genomics for Climate Change Research Center, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Embrapa Informática Agropecuária, 13083-886, Campinas, SP, Brazil
| | - Grazielle Sales Teodoro
- Instituto de Ciências Biológicas, Universidade Federal do Pará (UFPA), 66075-750, Belém, PA, Brazil
| | - Anna Abrahão
- Departamento de Biologia Vegetal, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), 13083-862, Campinas, SP, Brazil
- School of Biological Sciences, University of Western Australia (UWA), Perth, WA, 6009, Australia
| | - Hans Lambers
- School of Biological Sciences, University of Western Australia (UWA), Perth, WA, 6009, Australia
| | - Marcelo Falsarella Carazzolle
- Departamento de Genética e Evolução, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
| | - Marcel Huntemann
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Alicia Clum
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Brian Foster
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Bryce Foster
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Simon Roux
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | | | - Neha Varghese
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Supratim Mukherjee
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - T B K Reddy
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Chris Daum
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Alex Copeland
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - I-Min A Chen
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Natalia N Ivanova
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | - Christa Pennacchio
- Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
| | | | - Paulo Arruda
- Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Departamento de Genética e Evolução, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
- Genomics for Climate Change Research Center, Universidade Estadual de Campinas (UNICAMP), 13083-875, Campinas, SP, Brazil
| | - Rafael Silva Oliveira
- Departamento de Biologia Vegetal, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), 13083-862, Campinas, SP, Brazil
- School of Biological Sciences, University of Western Australia (UWA), Perth, WA, 6009, Australia
|
17
|
Hadjithomas M, Chen IMA, Chu K, Huang J, Ratner A, Palaniappan K, Andersen E, Markowitz V, Kyrpides NC, Ivanova NN. IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes. Nucleic Acids Res 2016; 45:D560-D565. [PMID: 27903896 PMCID: PMC5210574 DOI: 10.1093/nar/gkw1103] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 10/07/2016] [Accepted: 11/08/2016] [Indexed: 01/25/2023] Open
Abstract
Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery.
Affiliation(s)
- Michalis Hadjithomas
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - I-Min A Chen
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ken Chu
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jinghua Huang
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Anna Ratner
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Krishna Palaniappan
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Evan Andersen
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Victor Markowitz
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Nikos C Kyrpides
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Natalia N Ivanova
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
|
18
|
Paez-Espino D, Chen IMA, Palaniappan K, Ratner A, Chu K, Szeto E, Pillay M, Huang J, Markowitz VM, Nielsen T, Huntemann M, K Reddy TB, Pavlopoulos GA, Sullivan MB, Campbell BJ, Chen F, McMahon K, Hallam SJ, Denef V, Cavicchioli R, Caffrey SM, Streit WR, Webster J, Handley KM, Salekdeh GH, Tsesmetzis N, Setubal JC, Pope PB, Liu WT, Rivers AR, Ivanova NN, Kyrpides NC. IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses. Nucleic Acids Res 2016; 45:D457-D465. [PMID: 27799466 PMCID: PMC5210529 DOI: 10.1093/nar/gkw1030] [Citation(s) in RCA: 106] [Impact Index Per Article: 13.3] [Received: 09/16/2016] [Revised: 10/15/2016] [Accepted: 10/27/2016] [Indexed: 12/19/2022] Open
Abstract
Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community.
Affiliation(s)
- David Paez-Espino
- Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - I-Min A Chen
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Krishna Palaniappan
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Anna Ratner
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Ken Chu
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Ernest Szeto
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Manoj Pillay
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Jinghua Huang
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Victor M Markowitz
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Torben Nielsen
- Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Marcel Huntemann
- Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - T B K Reddy
- Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
| | | | - Matthew B Sullivan
- Departments of Microbiology and Civil, Environmental and Geodetic Engineering, The Ohio State University, Columbus, OH 43210, USA
| | - Barbara J Campbell
- Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
| | - Feng Chen
- Institute of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Baltimore, MD 21202, USA
| | - Katherine McMahon
- Department of Civil and Environmental Engineering, Department of Bacteriology, University of Wisconsin, Madison, WI 53706, USA
| | - Steve J Hallam
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada; Genome Science, Technology, and Program in Bioinformatics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Peter Wall Institute for Advanced Studies, University of British Columbia, Vancouver, BC V6T 1Z2, Canada; ECOSCOPE Training Program, University of British Columbia, Vancouver, BC V6T 0A1, Canada
| | - Vincent Denef
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109-1048, USA
| | - Ricardo Cavicchioli
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Sean M Caffrey
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 4V8, Canada
| | - Wolfgang R Streit
- Biocenter Klein Flottbek, Department of Microbiology and Biotechnology, University of Hamburg, Hamburg 22609, Germany
| | - John Webster
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Kim M Handley
- School of Biological Sciences, University of Auckland, Auckland 1010, New Zealand
| | - Ghasem H Salekdeh
- Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran, Agricultural Research, Education, and Extension Organization, Karaj 31535-1897, Iran
| | - Nicolas Tsesmetzis
- Shell International Exploration and Production Inc., Houston, TX 77082, USA
| | - Joao C Setubal
- Department of Biochemistry, Institute of Chemistry, Universidade de Sao Paulo, SP 05508-000, Brazil
| | - Phillip B Pope
- Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås 1432, Norway
| | - Wen-Tso Liu
- Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Adam R Rivers
- Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Natalia N Ivanova
- Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Nikos C Kyrpides
- Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
|
19
|
Chen IMA, Markowitz VM, Chu K, Palaniappan K, Szeto E, Pillay M, Ratner A, Huang J, Andersen E, Huntemann M, Varghese N, Hadjithomas M, Tennessen K, Nielsen T, Ivanova NN, Kyrpides NC. IMG/M: integrated genome and metagenome comparative data analysis system. Nucleic Acids Res 2016; 45:D507-D516. [PMID: 27738135 PMCID: PMC5210632 DOI: 10.1093/nar/gkw929] [Citation(s) in RCA: 310] [Impact Index Per Article: 38.8] [Received: 09/16/2016] [Accepted: 10/05/2016] [Indexed: 12/16/2022] Open
Abstract
The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.
Affiliation(s)
- I-Min A Chen
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Victor M Markowitz
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Ken Chu
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Krishna Palaniappan
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Ernest Szeto
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Manoj Pillay
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Anna Ratner
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Jinghua Huang
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Evan Andersen
- Biosciences Computing Group, Computational Science Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
| | - Marcel Huntemann
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Neha Varghese
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Michalis Hadjithomas
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Kristin Tennessen
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Torben Nielsen
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Natalia N Ivanova
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Nikos C Kyrpides
- Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
|
20
|
Chen IMA, Markowitz VM, Palaniappan K, Szeto E, Chu K, Huang J, Ratner A, Pillay M, Hadjithomas M, Huntemann M, Mikhailova N, Ovchinnikova G, Ivanova NN, Kyrpides NC. Supporting community annotation and user collaboration in the integrated microbial genomes (IMG) system. BMC Genomics 2016; 17:307. [PMID: 27118214 PMCID: PMC4847265 DOI: 10.1186/s12864-016-2629-y] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 02/02/2016] [Accepted: 04/16/2016] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND: The exponential growth of genomic data from next generation technologies renders traditional manual expert curation effort unsustainable. Many genomic systems have included community annotation tools to address the problem. Most of these systems adopted a "Wiki-based" approach to take advantage of existing wiki technologies, but encountered obstacles in issues such as usability, authorship recognition, information reliability and incentive for community participation. RESULTS: Here, we present a different approach, relying on a tightly integrated method rather than a "Wiki-based" method, to support community annotation and user collaboration in the Integrated Microbial Genomes (IMG) system. The IMG approach allows users to use the existing IMG data warehouse and analysis tools to add gene, pathway and biosynthetic cluster annotations, to analyze/reorganize contigs, genes and functions using workspace datasets, and to share private user annotations and workspace datasets with collaborators. We show that the annotation effort using IMG can be part of the research process to overcome the user incentive and authorship recognition problems, thus fostering collaboration among domain experts. The usability and reliability issues are addressed by the integration of curated information and analysis tools in IMG, together with DOE Joint Genome Institute (JGI) expert review. CONCLUSION: By incorporating annotation operations into IMG, we provide an integrated environment for users to perform deeper and extended data analysis and annotation in a single system that can lead to publications and community knowledge sharing as shown in the case studies.
Affiliation(s)
- I-Min A Chen
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Victor M Markowitz
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Krishna Palaniappan
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Ernest Szeto
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Ken Chu
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Jinghua Huang
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Anna Ratner
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Manoj Pillay
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Michalis Hadjithomas
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA
| | - Marcel Huntemann
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA
| | - Natalia Mikhailova
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA
| | - Galina Ovchinnikova
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA
| | - Natalia N Ivanova
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA
| | - Nikos C Kyrpides
- Prokaryotic Super Program, DOE Joint Genome Institute, Walnut Creek, California, USA
|
21
|
Huntemann M, Ivanova NN, Mavromatis K, Tripp HJ, Paez-Espino D, Palaniappan K, Szeto E, Pillay M, Chen IMA, Pati A, Nielsen T, Markowitz VM, Kyrpides NC. Erratum to: The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). Stand Genomic Sci 2016; 11:27. [PMID: 27004084 PMCID: PMC4800771 DOI: 10.1186/s40793-016-0148-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/18/2016] [Accepted: 03/11/2016] [Indexed: 11/21/2022] Open
Affiliation(s)
- Marcel Huntemann
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Natalia N Ivanova
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Konstantinos Mavromatis
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA; Present Address: Computational Biology Group, Celgene Corporation, Summit, USA
| | - H James Tripp
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - David Paez-Espino
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Krishnaveni Palaniappan
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Ernest Szeto
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Manoj Pillay
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - I-Min A Chen
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Amrita Pati
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Torben Nielsen
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Victor M Markowitz
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
|
22
|
Huntemann M, Ivanova NN, Mavromatis K, Tripp HJ, Paez-Espino D, Tennessen K, Palaniappan K, Szeto E, Pillay M, Chen IMA, Pati A, Nielsen T, Markowitz VM, Kyrpides NC. The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4). Stand Genomic Sci 2016; 11:17. [PMID: 26918089 PMCID: PMC4766715 DOI: 10.1186/s40793-016-0138-x] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 06/26/2015] [Accepted: 02/17/2016] [Indexed: 11/10/2022] Open
Abstract
The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.
Affiliation(s)
- Marcel Huntemann
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Natalia N Ivanova
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Konstantinos Mavromatis
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - H James Tripp
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - David Paez-Espino
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Kristin Tennessen
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Krishnaveni Palaniappan
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Ernest Szeto
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Manoj Pillay
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - I-Min A Chen
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Amrita Pati
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Torben Nielsen
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Victor M Markowitz
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Nikos C Kyrpides
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
|
23
|
Markowitz VM, Chen IMA, Chu K, Pati A, Ivanova NN, Kyrpides NC. Ten Years of Maintaining and Expanding a Microbial Genome and Metagenome Analysis System. Trends Microbiol 2015; 23:730-741. [DOI: 10.1016/j.tim.2015.07.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/24/2015] [Revised: 07/15/2015] [Accepted: 07/31/2015] [Indexed: 10/22/2022]
|
24
|
Huntemann M, Ivanova NN, Mavromatis K, Tripp HJ, Paez-Espino D, Palaniappan K, Szeto E, Pillay M, Chen IMA, Pati A, Nielsen T, Markowitz VM, Kyrpides NC. The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). Stand Genomic Sci 2015; 10:86. [PMID: 26512311 PMCID: PMC4623924 DOI: 10.1186/s40793-015-0077-y] [Citation(s) in RCA: 195] [Impact Index Per Article: 21.7] [Received: 04/18/2015] [Accepted: 10/13/2015] [Indexed: 11/10/2022] Open
Abstract
The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.
Affiliation(s)
- Marcel Huntemann
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Natalia N Ivanova
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Konstantinos Mavromatis
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA; Present Address: Computational Biology Group, Celgene Corporation, Summit, USA
| | - H James Tripp
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - David Paez-Espino
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Krishnaveni Palaniappan
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Ernest Szeto
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Manoj Pillay
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - I-Min A Chen
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Amrita Pati
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Torben Nielsen
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
| | - Victor M Markowitz
- Biosciences Computing, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
| | - Nikos C Kyrpides
- Genome Biology Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, USA
|
25
|
Markowitz VM, Chen IMA, Palaniappan K, Chu K, Szeto E, Pillay M, Ratner A, Huang J, Woyke T, Huntemann M, Anderson I, Billis K, Varghese N, Mavromatis K, Pati A, Ivanova NN, Kyrpides NC. IMG 4 version of the integrated microbial genomes comparative analysis system. Nucleic Acids Res 2013; 42:D560-7. [PMID: 24165883 PMCID: PMC3965111 DOI: 10.1093/nar/gkt963] [Citation(s) in RCA: 459] [Impact Index Per Article: 41.7] [Indexed: 11/13/2022]
Abstract
The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).
Affiliation(s)
- Victor M Markowitz
- Biological Data Management and Technology Center, Computational Research Division Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, 94720 USA and Department of Energy, Microbial Genome and Metagenome Program, Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, 94598 USA
|
26
|
Markowitz VM, Chen IMA, Chu K, Szeto E, Palaniappan K, Pillay M, Ratner A, Huang J, Pagani I, Tringe S, Huntemann M, Billis K, Varghese N, Tennessen K, Mavromatis K, Pati A, Ivanova NN, Kyrpides NC. IMG/M 4 version of the integrated metagenome comparative analysis system. Nucleic Acids Res 2013; 42:D568-73. [PMID: 24136997 PMCID: PMC3964948 DOI: 10.1093/nar/gkt919] [Citation(s) in RCA: 196] [Impact Index Per Article: 17.8] [Indexed: 01/22/2023]
Abstract
IMG/M (http://img.jgi.doe.gov/m) provides support for comparative analysis of microbial community aggregate genomes (metagenomes) in the context of a comprehensive set of reference genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG/M’s data content and analytical tools have expanded continuously since its first version was released in 2007. Since the last report published in the 2012 NAR Database Issue, IMG/M’s database architecture, annotation and data integration pipelines and analysis tools have been extended to cope with the rapid growth in the number and size of metagenome data sets handled by the system. IMG/M data marts provide support for the analysis of publicly available genomes, expert review of metagenome annotations (IMG/M ER: http://img.jgi.doe.gov/mer) and Human Microbiome Project (HMP)-specific metagenome samples (IMG/M HMP: http://img.jgi.doe.gov/imgm_hmp).
Affiliation(s)
- Victor M Markowitz
- Biological Data Management and Technology Center, Computational Research Division Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, 94720 USA and Microbial Genome and Metagenome Program, Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598 USA
|
27
|
Chen IMA, Markowitz VM, Chu K, Anderson I, Mavromatis K, Kyrpides NC, Ivanova NN. Improving microbial genome annotations in an integrated database context. PLoS One 2013; 8:e54859. [PMID: 23424620 PMCID: PMC3570495 DOI: 10.1371/journal.pone.0054859] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 08/21/2012] [Accepted: 12/17/2012] [Indexed: 12/26/2022]
Abstract
Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/.
Affiliation(s)
- I-Min A. Chen
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Victor M. Markowitz
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Ken Chu
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Iain Anderson
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Konstantinos Mavromatis
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Nikos C. Kyrpides
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Natalia N. Ivanova
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
|
28
|
Markowitz VM, Chen IMA, Chu K, Szeto E, Palaniappan K, Jacob B, Ratner A, Liolios K, Pagani I, Huntemann M, Mavromatis K, Ivanova NN, Kyrpides NC. IMG/M-HMP: a metagenome comparative analysis system for the Human Microbiome Project. PLoS One 2012; 7:e40151. [PMID: 22792232 PMCID: PMC3390314 DOI: 10.1371/journal.pone.0040151] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 02/27/2012] [Accepted: 06/01/2012] [Indexed: 11/18/2022]
Abstract
The Integrated Microbial Genomes and Metagenomes (IMG/M) resource is a data management system that supports the analysis of sequence data from microbial communities in the integrated context of all publicly available draft and complete genomes from the three domains of life as well as a large number of plasmids and viruses. IMG/M currently contains thousands of genomes and metagenome samples with billions of genes. IMG/M-HMP is an IMG/M data mart serving the US National Institutes of Health (NIH) Human Microbiome Project (HMP), focussed on HMP generated metagenome datasets, and is one of the central resources provided from the HMP Data Analysis and Coordination Center (DACC). IMG/M-HMP is available at http://www.hmpdacc-resources.org/imgm_hmp/.
Affiliation(s)
- Victor M. Markowitz
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - I-Min A. Chen
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Ken Chu
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Ernest Szeto
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Krishna Palaniappan
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Biju Jacob
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Anna Ratner
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | - Konstantinos Liolios
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Ioanna Pagani
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Marcel Huntemann
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Konstantinos Mavromatis
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Natalia N. Ivanova
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Nikos C. Kyrpides
- Microbial Genomics and Metagenomics Program, Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
|
29
|
Markowitz VM, Chen IMA, Palaniappan K, Chu K, Szeto E, Grechkin Y, Ratner A, Jacob B, Huang J, Williams P, Huntemann M, Anderson I, Mavromatis K, Ivanova NN, Kyrpides NC. IMG: the Integrated Microbial Genomes database and comparative analysis system. Nucleic Acids Res 2012; 40:D115-22. [PMID: 22194640 PMCID: PMC3245086 DOI: 10.1093/nar/gkr1044] [Citation(s) in RCA: 910] [Impact Index Per Article: 75.8] [Indexed: 11/30/2022]
Abstract
The Integrated Microbial Genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG integrates publicly available draft and complete genomes from all three domains of life with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. IMG's data content and analytical capabilities have been continuously extended through regular updates since its first release in March 2005. IMG is available at http://img.jgi.doe.gov. Companion IMG systems provide support for expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er), teaching courses and training in microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu) and analysis of genomes related to the Human Microbiome Project (IMG/HMP: http://www.hmpdacc-resources.org/img_hmp).
Affiliation(s)
- Victor M Markowitz
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA.
|
30
|
Pagani I, Liolios K, Jansson J, Chen IMA, Smirnova T, Nosrat B, Markowitz VM, Kyrpides NC. The Genomes OnLine Database (GOLD) v.4: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2011; 40:D571-9. [PMID: 22135293 PMCID: PMC3245063 DOI: 10.1093/nar/gkr1100] [Citation(s) in RCA: 375] [Impact Index Per Article: 28.8] [Indexed: 12/03/2022]
Abstract
The Genomes OnLine Database (GOLD, http://www.genomesonline.org/) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2011, GOLD, now on version 4.0, contains information for 11 472 sequencing projects, of which 2907 have been completed and their sequence data has been deposited in a public repository. Out of these complete projects, 1918 are finished and 989 are permanent drafts. Moreover, GOLD contains information for 340 metagenome studies associated with 1927 metagenome samples. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about any (x) Sequence specification and beyond.
Affiliation(s)
- Ioanna Pagani
- Department of Energy Joint Genome Institute, Microbial Genomics and Metagenomics Program, 2800 Mitchell Drive, Walnut Creek, CA, USA
|
31
|
Markowitz VM, Chen IMA, Chu K, Szeto E, Palaniappan K, Grechkin Y, Ratner A, Jacob B, Pati A, Huntemann M, Liolios K, Pagani I, Anderson I, Mavromatis K, Ivanova NN, Kyrpides NC. IMG/M: the integrated metagenome data management and comparative analysis system. Nucleic Acids Res 2011; 40:D123-9. [PMID: 22086953 PMCID: PMC3245048 DOI: 10.1093/nar/gkr975] [Citation(s) in RCA: 167] [Impact Index Per Article: 12.8] [Indexed: 12/21/2022]
Abstract
The integrated microbial genomes and metagenomes (IMG/M) system provides support for comparative analysis of microbial community aggregate genomes (metagenomes) in a comprehensive integrated context. IMG/M integrates metagenome data sets with isolate microbial genomes from the IMG system. IMG/M's data content and analytical capabilities have been extended through regular updates since its first release in 2007. IMG/M is available at http://img.jgi.doe.gov/m. A companion IMG/M system provides support for annotation and expert review of unpublished metagenomic data sets (IMG/M ER: http://img.jgi.doe.gov/mer).
Affiliation(s)
- Victor M Markowitz
- Biological Data Management and Technology Center, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California, CA 94702, USA.
|
32
|
Liolios K, Chen IMA, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM, Kyrpides NC. The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2010; 38:D346-54. [PMID: 19914934 PMCID: PMC2808860 DOI: 10.1093/nar/gkp848] [Citation(s) in RCA: 312] [Impact Index Per Article: 22.3] [Received: 09/19/2009] [Accepted: 09/22/2009] [Indexed: 11/14/2022]
Abstract
The Genomes On Line Database (GOLD) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2009, GOLD contains information for more than 5800 sequencing projects, of which 1100 have been completed and their sequence data deposited in a public repository. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about a (Meta)Genome Sequence (MIGS/MIMS) specification. GOLD is available at: http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece, at: http://gold.imbb.forth.gr/
Affiliation(s)
- Konstantinos Liolios
- Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Heraklion, Crete, Greece and Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA
| | - I-Min A. Chen
- Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Heraklion, Crete, Greece and Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA
| | - Konstantinos Mavromatis
- Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Heraklion, Crete, Greece and Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA
| | - Nektarios Tavernarakis
- Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Heraklion, Crete, Greece and Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA
| | - Philip Hugenholtz
- Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Heraklion, Crete, Greece and Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA
| | - Victor M. Markowitz
- Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Heraklion, Crete, Greece and Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA
| | - Nikos C. Kyrpides
- Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology, Heraklion, Crete, Greece and Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA
|
33
|
Markowitz VM, Chen IMA, Palaniappan K, Chu K, Szeto E, Grechkin Y, Ratner A, Anderson I, Lykidis A, Mavromatis K, Ivanova NN, Kyrpides NC. The integrated microbial genomes system: an expanding comparative analysis resource. Nucleic Acids Res 2009; 38:D382-90. [PMID: 19864254 PMCID: PMC2808961 DOI: 10.1093/nar/gkp887] [Citation(s) in RCA: 214] [Impact Index Per Article: 14.3] [Indexed: 11/17/2022]
Abstract
The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG’s data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at http://img.jgi.doe.gov.
Affiliation(s)
- Victor M Markowitz
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California, USA.
|
34
|
Mavromatis K, Ivanova NN, Chen IMA, Szeto E, Markowitz VM, Kyrpides NC. The DOE-JGI Standard Operating Procedure for the Annotations of Microbial Genomes. Stand Genomic Sci 2009; 1:63-7. [PMID: 21304638 PMCID: PMC3035208 DOI: 10.4056/sigs.632] [Citation(s) in RCA: 184] [Impact Index Per Article: 12.3] [Indexed: 11/30/2022]
Abstract
The DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP) supports gene prediction and/or functional annotation of microbial genomes towards comparative analysis with the Integrated Microbial Genome (IMG) system. DOE-JGI MAP annotation is applied on nucleotide sequence datasets included in the IMG-ER (Expert Review) version of IMG via the IMG ER submission site. Users can submit the sequence datasets consisting of one or more contigs in a multi-fasta file. DOE-JGI MAP annotation includes prediction of protein coding and RNA genes, as well as repeats and assignment of product names to these genes.
|
35
|
Markowitz VM, Mavromatis K, Ivanova NN, Chen IMA, Chu K, Kyrpides NC. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 2009; 25:2271-8. [PMID: 19561336 DOI: 10.1093/bioinformatics/btp393] [Citation(s) in RCA: 726] [Impact Index Per Article: 48.4] [Indexed: 02/02/2023]
Abstract
MOTIVATION: A rapidly increasing number of microbial genomes are sequenced by organizations worldwide and are eventually included into various public genome data resources. The quality of the annotations depends largely on the original dataset providers, with erroneous or incomplete annotations often carried over into the public resources and difficult to correct.
RESULTS: We have developed an Expert Review (ER) version of the Integrated Microbial Genomes (IMG) system, with the goal of supporting systematic and efficient revision of microbial genome annotations. IMG ER provides tools for the review and curation of annotations of both new and publicly available microbial genomes within IMG's rich integrated genome framework. New genome datasets are included into IMG ER prior to their public release either with their native annotations or with annotations generated by IMG ER's annotation pipeline. IMG ER tools allow addressing annotation problems detected with IMG's comparative analysis tools, such as genes missed by gene prediction pipelines or genes without an associated function. Over the past year, IMG ER was used for improving the annotations of about 150 microbial genomes.
Affiliation(s)
- Victor M Markowitz
- Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
|
36
|
Markowitz VM, Ivanova NN, Szeto E, Palaniappan K, Chu K, Dalevi D, Chen IMA, Grechkin Y, Dubchak I, Anderson I, Lykidis A, Mavromatis K, Hugenholtz P, Kyrpides NC. IMG/M: a data management and analysis system for metagenomes. Nucleic Acids Res 2008; 36:D534-8. [PMID: 17932063 PMCID: PMC2238950 DOI: 10.1093/nar/gkm869] [Citation(s) in RCA: 233] [Impact Index Per Article: 14.6] [Received: 08/10/2007] [Revised: 09/22/2007] [Accepted: 09/24/2007] [Indexed: 11/13/2022]
Abstract
IMG/M is a data management and analysis system for microbial community genomes (metagenomes) hosted at the Department of Energy's (DOE) Joint Genome Institute (JGI). IMG/M consists of metagenome data integrated with isolate microbial genomes from the Integrated Microbial Genomes (IMG) system. IMG/M provides IMG's comparative data analysis tools extended to handle metagenome data, together with metagenome-specific analysis tools. IMG/M is available at http://img.jgi.doe.gov/m.
Affiliation(s)
- Victor M. Markowitz
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Natalia N. Ivanova
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Ernest Szeto
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Krishna Palaniappan
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Ken Chu
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Daniel Dalevi
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - I-Min A. Chen
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Yuri Grechkin
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Inna Dubchak
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Iain Anderson
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Athanasios Lykidis
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Konstantinos Mavromatis
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Philip Hugenholtz
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
| | - Nikos C. Kyrpides
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, Department of Energy Joint Genome Institute, Microbial Ecology Program and Department of Energy Joint Genome Institute, Genome Biology Program, 2800 Mitchell Drive, Walnut Creek, USA
|
37
|
Markowitz VM, Szeto E, Palaniappan K, Grechkin Y, Chu K, Chen IMA, Dubchak I, Anderson I, Lykidis A, Mavromatis K, Ivanova NN, Kyrpides NC. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions. Nucleic Acids Res 2007; 36:D528-33. [PMID: 17933782 PMCID: PMC2238897 DOI: 10.1093/nar/gkm846] [Citation(s) in RCA: 166] [Impact Index Per Article: 9.8] [Indexed: 11/29/2022]
Abstract
The integrated microbial genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.
Affiliation(s)
- Victor M Markowitz
- Biological Data Management and Technology Center, Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, USA
|
38
|
|