Reference Citation Analysis

For: Lobb B, Kurtz DA, Moreno-Hagelsieb G, Doxey AC. Remote homology and the functions of metagenomic dark matter. Front Genet 2015;6:234. [PMID: 26257768 PMCID: PMC4508852 DOI: 10.3389/fgene.2015.00234] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 03/05/2015] [Accepted: 06/22/2015] [Indexed: 01/26/2023] Open

Cited by Other Article(s)

Vakirlis N, Kupczok A. Large-scale investigation of species-specific orphan genes in the human gut microbiome elucidates their evolutionary origins. Genome Res 2024;34:888-903. [PMID: 38977308 PMCID: PMC11293555 DOI: 10.1101/gr.278977.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 06/12/2024] [Indexed: 07/10/2024]

Elisée E, Ducrot L, Méheust R, Bastard K, Fossey-Jouenne A, Grogan G, Pelletier E, Petit JL, Stam M, de Berardinis V, Zaparucha A, Vallenet D, Vergne-Vaxelaire C. A refined picture of the native amine dehydrogenase family revealed by extensive biodiversity screening. Nat Commun 2024;15:4933. [PMID: 38858403 PMCID: PMC11164908 DOI: 10.1038/s41467-024-49009-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 05/20/2024] [Indexed: 06/12/2024] Open

Affiliation(s)

Eddy Elisée Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Laurine Ducrot Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Raphaël Méheust Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Karine Bastard School of Pharmacy, Faculty of Medicine and Health, University of Sydney, Sydney, NSW, 2006, Australia
Aurélie Fossey-Jouenne Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Gideon Grogan York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York, YO10 5DD, UK
Eric Pelletier Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Jean-Louis Petit Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Mark Stam Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Véronique de Berardinis Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Anne Zaparucha Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
David Vallenet Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France
Carine Vergne-Vaxelaire Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry, France

Llinares-López F, Berthet Q, Blondel M, Teboul O, Vert JP. Deep embedding and alignment of protein sequences. Nat Methods 2023;20:104-111. [PMID: 36522501 DOI: 10.1038/s41592-022-01700-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/30/2021] [Accepted: 10/24/2022] [Indexed: 12/23/2022]

Escudeiro P, Henry CS, Dias RP. Functional characterization of prokaryotic dark matter: the road so far and what lies ahead. Curr Res Microb Sci 2022;3:100159. [PMID: 36561390 PMCID: PMC9764257 DOI: 10.1016/j.crmicr.2022.100159] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/11/2022] [Revised: 07/18/2022] [Accepted: 08/05/2022] [Indexed: 12/25/2022] Open

Vanni C, Schechter MS, Acinas SG, Barberán A, Buttigieg PL, Casamayor EO, Delmont TO, Duarte CM, Eren AM, Finn RD, Kottmann R, Mitchell A, Sánchez P, Siren K, Steinegger M, Gloeckner FO, Fernàndez-Guerra A. Unifying the known and unknown microbial coding sequence space. eLife 2022;11:e67667. [PMID: 35356891 PMCID: PMC9132574 DOI: 10.7554/elife.67667] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 02/18/2021] [Accepted: 03/30/2022] [Indexed: 12/02/2022] Open

Affiliation(s)

Chiara Vanni Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany; Jacobs University Bremen, Bremen, Germany
Matthew S Schechter Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany; Department of Medicine, University of Chicago, Chicago, United States
Silvia G Acinas Department of Marine Biology and Oceanography, Institut de Ciències del Mar (CSIC), Barcelona, Spain
Albert Barberán Department of Environmental Science, University of Arizona, Tucson, United States
Pier Luigi Buttigieg Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Alfred Wegener Institute, Bremerhaven, Germany
Emilio O Casamayor Center for Advanced Studies of Blanes CEAB-CSIC, Spanish Council for Research, Blanes, Spain
Tom O Delmont Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
Carlos M Duarte Red Sea Research Centre and Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
A Murat Eren Department of Medicine, University of Chicago, Chicago, United States; Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, United States
Robert D Finn European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
Renzo Kottmann Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany
Alex Mitchell European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom
Pablo Sánchez Department of Marine Biology and Oceanography, Institut de Ciències del Mar (CSIC), Barcelona, Spain
Kimmo Siren Section for Evolutionary Genomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
Martin Steinegger School of Biological Sciences, Seoul National University, Seoul, Republic of Korea; Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Republic of Korea
Frank Oliver Gloeckner Jacobs University Bremen, Bremen, Germany; University of Bremen and Life Sciences and Chemistry, Bremen, Germany; Computing Center, Helmholtz Center for Polar and Marine Research, Bremerhaven, Germany
Antonio Fernàndez-Guerra Microbial Genomics and Bioinformatics Research G, Max Planck Institute for Marine Microbiology, Bremen, Germany; Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark

Functional Characterisation of Bile Metagenome: Study of Metagenomic Dark Matter. Microorganisms 2021;9:microorganisms9112201. [PMID: 34835325 PMCID: PMC8621414 DOI: 10.3390/microorganisms9112201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2021] [Revised: 10/01/2021] [Accepted: 10/11/2021] [Indexed: 11/16/2022] Open

Lobb B, Tremblay BJM, Moreno-Hagelsieb G, Doxey AC. PathFams: statistical detection of pathogen-associated protein domains. BMC Genomics 2021;22:663. [PMID: 34521345 PMCID: PMC8442362 DOI: 10.1186/s12864-021-07982-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2021] [Accepted: 09/01/2021] [Indexed: 11/10/2022] Open

Castro-Severyn J, Pardo-Esté C, Mendez KN, Fortt J, Marquez S, Molina F, Castro-Nallar E, Remonsellez F, Saavedra CP. Living to the High Extreme: Unraveling the Composition, Structure, and Functional Insights of Bacterial Communities Thriving in the Arsenic-Rich Salar de Huasco Altiplanic Ecosystem. Microbiol Spectr 2021;9:e0044421. [PMID: 34190603 PMCID: PMC8552739 DOI: 10.1128/spectrum.00444-21] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/02/2021] [Accepted: 06/07/2021] [Indexed: 01/03/2023] Open

Abstract

Microbial communities inhabiting extreme environments such as Salar de Huasco (SH) in northern Chile are adapted to thrive while exposed to several abiotic pressures and the presence of toxic elements such as arsenic (As). Hence, we aimed to uncover the role of As in shaping bacterial composition, structure, and functional potential in five different sites in this altiplanic wetland using a shotgun metagenomic approach. The sites exhibit wide gradients of As (9 to 321 mg/kg), and our results showed highly diverse communities and a clear dominance exerted by the Proteobacteria and Bacteroidetes phyla. Functional potential analyses show broadly convergent patterns, contrasting with their great taxonomic variability. As-related metabolism, as well as other functional categories such as those related to the CH4 and S cycles, differs among the five communities. Particularly, we found that the distribution and abundance of As-related genes increase as the As concentration rises. Approximately 75% of the detected genes for As metabolism belong to expulsion mechanisms; arsJ and arsP pumps are related to sites with higher As concentrations and are present almost exclusively in Proteobacteria. Furthermore, taxonomic diversity and functional potential are reflected in the 12 reconstructed high-quality metagenome assembled genomes (MAGs) belonging to the Bacteroidetes (5), Proteobacteria (5), Cyanobacteria (1), and Gemmatimonadetes (1) phyla. We conclude that SH microbial communities are diverse and possess a broad genetic repertoire to thrive under extreme conditions, including increasing concentrations of highly toxic As. Finally, this environment represents a reservoir of unknown and undescribed microorganisms, with great metabolic versatility, which needs further study.

IMPORTANCE As microbial communities inhabiting extreme environments are fundamental for maintaining ecosystems, many studies concerning composition, functionality, and interactions have been carried out. However, much is still unknown. Here, we sampled microbial communities in the Salar de Huasco, an extreme environment subjected to several abiotic stresses (high UV radiation, salinity and arsenic; low pressure and temperatures). We found that although microbes are taxonomically diverse, functional potential seems to have an important degree of convergence, suggesting high levels of adaptation. Particularly, arsenic metabolism showed differences associated with increasing concentrations of the metalloid throughout the area, and it effectively exerts a significant pressure over these organisms. Thus, the significance of this research is that we describe highly specialized communities thriving in little-explored environments subjected to several pressures, considered analogous of early Earth and other planets, that have the potential for unraveling technologies to face the repercussions of climate change in many areas of interest.

Lobb B, Tremblay BJM, Moreno-Hagelsieb G, Doxey AC. An assessment of genome annotation coverage across the bacterial tree of life. Microb Genom 2020;6. [PMID: 32124724 PMCID: PMC7200070 DOI: 10.1099/mgen.0.000341] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/13/2022] Open

Abstract

Although gene-finding in bacterial genomes is relatively straightforward, the automated assignment of gene function is still challenging, resulting in a vast quantity of hypothetical sequences of unknown function. But how prevalent are hypothetical sequences across bacteria, what proportion of genes in different bacterial genomes remain unannotated, and what factors affect annotation completeness? To address these questions, we surveyed over 27 000 bacterial genomes from the Genome Taxonomy Database, and measured genome annotation completeness as a function of annotation method, taxonomy, genome size, 'research bias' and publication date. Our analysis revealed that 52 and 79 % of the average bacterial proteome could be functionally annotated based on protein and domain-based homology searches, respectively. Annotation coverage using protein homology search varied significantly from as low as 14 % in some species to as high as 98 % in others. We found that taxonomy is a major factor influencing annotation completeness, with distinct trends observed across the microbial tree (e.g. the lowest level of completeness was found in the Patescibacteria lineage). Most lineages showed a significant association between genome size and annotation incompleteness, likely reflecting a greater degree of uncharacterized sequences in 'accessory' proteomes than in 'core' proteomes. Finally, research bias, as measured by publication volume, was also an important factor influencing genome annotation completeness, with early model organisms showing high completeness levels relative to other genomes in their own taxonomic lineages. Our work highlights the disparity in annotation coverage across the bacterial tree of life and emphasizes a need for more experimental characterization of accessory proteomes as well as understudied lineages.

Structure-Based Deep Mining Reveals First-Time Annotations for 46 Percent of the Dark Annotation Space of the 9,671-Member Superproteome of the Nucleocytoplasmic Large DNA Viruses. J Virol 2020;94:JVI.00854-20. [PMID: 32999026 DOI: 10.1128/jvi.00854-20] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/05/2020] [Accepted: 09/16/2020] [Indexed: 12/20/2022] Open

Abstract

We conducted an exhaustive search for three-dimensional structural homologs to the proteins of 20 key phylogenetically distinct nucleocytoplasmic DNA viruses (NCLDV). Structural matches covered 429 known protein domain superfamilies, with the most highly represented being ankyrin repeat, P-loop NTPase, F-box, protein kinase, and membrane occupation and recognition nexus (MORN) repeat. Domain superfamily diversity correlated with genome size, but a diversity of around 200 superfamilies appeared to correlate with an abrupt switch to paralogization. Extensive structural homology was found across the range of eukaryotic RNA polymerase II subunits and their associated basal transcription factors, with the coordinated gain and loss of clusters of subunits on a virus-by-virus basis. The total number of predicted endonucleases across the 20 NCLDV was nearly quadrupled from 36 to 132, covering much of the structural and functional diversity of endonucleases throughout the biosphere in DNA restriction, repair, and homing. Unexpected findings included capsid protein-transcription factor chimeras; endonuclease chimeras; enzymes for detoxification; antimicrobial peptides and toxin-antitoxin systems associated with symbiosis, immunity, and addiction; and novel proteins for membrane abscission and protein turnover.

IMPORTANCE We extended the known annotation space for the NCLDV by 46%, revealing high-probability structural matches for fully 45% of the 9,671 query proteins and confirming up to 98% of existing annotations per virus. The most prevalent protein families included ankyrin repeat- and MORN repeat-containing proteins, many of which included an F-box, suggesting extensive host cell modulation among the NCLDV. Regression suggested a minimum requirement for around 36 protein structural superfamilies for a viable NCLDV, and beyond around 200 superfamilies, genome expansion by the acquisition of new functions was abruptly replaced by paralogization. We found homologs to herpesvirus surface glycoprotein gB in cytoplasmic viruses. This study provided the first prediction of an endonuclease in 10 of the 20 viruses examined; the first report in a virus of a phenolic acid decarboxylase, proteasomal subunit, or cysteine knot (defensin) protein; and the first report of a prokaryotic-type ribosomal protein in a eukaryotic virus.

Buongermino Pereira M, Österlund T, Eriksson KM, Backhaus T, Axelson-Fisk M, Kristiansson E. A comprehensive survey of integron-associated genes present in metagenomes. BMC Genomics 2020;21:495. [PMID: 32689930 PMCID: PMC7370490 DOI: 10.1186/s12864-020-06830-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/03/2019] [Accepted: 06/15/2020] [Indexed: 12/19/2022] Open

Abstract

Background

Integrons are genomic elements that mediate horizontal gene transfer by inserting and removing genetic material using site-specific recombination. Integrons are commonly found in bacterial genomes, where they maintain a large and diverse set of genes that plays an important role in adaptation and evolution. Previous studies have started to characterize the wide range of biological functions present in integrons. However, the efforts have so far mainly been limited to genomes from cultivable bacteria and amplicons generated by PCR, thus targeting only a small part of the total integron diversity. Metagenomic data, generated by direct sequencing of environmental and clinical samples, provides a more holistic and unbiased analysis of integron-associated genes. However, the fragmented nature of metagenomic data has previously made such analysis highly challenging.

Results

Here, we present a systematic survey of integron-associated genes in metagenomic data. The analysis was based on a newly developed computational method where integron-associated genes were identified by detecting their associated recombination sites. By processing contiguous sequences assembled from more than 10 terabases of metagenomic data, we were able to identify 13,397 unique integron-associated genes. Metagenomes from marine microbial communities had the highest occurrence of integron-associated genes with levels more than 100-fold higher than in the human microbiome. The identified genes had a large functional diversity spanning over several functional classes. Genes associated with defense mechanisms and mobility facilitators were most overrepresented and more than five times as common in integrons compared to other bacterial genes. As many as two thirds of the genes were found to encode proteins of unknown function. Less than 1% of the genes were associated with antibiotic resistance, of which several were novel, previously undescribed, resistance gene variants.

Conclusions

Our results highlight the large functional diversity maintained by integrons present in unculturable bacteria and significantly expand the number of described integron-associated genes.

King CH, Desai H, Sylvetsky AC, LoTempio J, Ayanyan S, Carrie J, Crandall KA, Fochtman BC, Gasparyan L, Gulzar N, Howell P, Issa N, Krampis K, Mishra L, Morizono H, Pisegna JR, Rao S, Ren Y, Simonyan V, Smith K, VedBrat S, Yao MD, Mazumder R. Baseline human gut microbiota profile in healthy people and standard reporting template. PLoS One 2019;14:e0206484. [PMID: 31509535 PMCID: PMC6738582 DOI: 10.1371/journal.pone.0206484] [Citation(s) in RCA: 105] [Impact Index Per Article: 21.0] [Received: 10/10/2018] [Accepted: 08/05/2019] [Indexed: 12/19/2022] Open

Abstract

A comprehensive knowledge of the types and ratios of microbes that inhabit the healthy human gut is necessary before any kind of pre-clinical or clinical study can be performed that attempts to alter the microbiome to treat a condition or improve therapy outcome. To address this need we present an innovative scalable comprehensive analysis workflow, a healthy human reference microbiome list and abundance profile (GutFeelingKB), and a novel Fecal Biome Population Report (FecalBiome) with clinical applicability. GutFeelingKB provides a list of 157 organisms (8 phyla, 18 classes, 23 orders, 38 families, 59 genera and 109 species) that forms the baseline biome and therefore can be used as healthy controls for studies related to dysbiosis. This list can be expanded to 863 organisms if closely related proteomes are considered. The incorporation of microbiome science into routine clinical practice necessitates a standard report for comparison of an individual’s microbiome to the growing knowledgebase of “normal” microbiome data. The FecalBiome and the underlying technology of GutFeelingKB address this need. The knowledgebase can be useful to regulatory agencies for the assessment of fecal transplant and other microbiome products, as it contains a list of organisms from healthy individuals. In addition to the list of organisms and their abundances, this study also generated a collection of assembled contiguous sequences (contigs) of metagenomics dark matter. In this study, metagenomic dark matter represents sequences that cannot be mapped to any known sequence but can be assembled into contigs of 10,000 nucleotides or higher. These sequences can be used to create primers to study potential novel organisms. All data is freely available from https://hive.biochemistry.gwu.edu/gfkb and NCBI’s Short Read Archive.

Affiliation(s)

Charles H. King The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America McCormick Genomic and Proteomic Center, George Washington University, Washington, DC, United States of America
Hiral Desai The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Allison C. Sylvetsky The Department of Exercise and Nutrition Sciences, Milken Institute School of Public Health, George Washington University, Washington, DC, United States of America
Jonathan LoTempio The Institute for Biomedical Science, School of Medicine and Health Sciences, George Washington University, Washington, DC, United States of America Center for Genetic Medicine, Children’s National Medical Center, George Washington University, Washington, DC, United States of America
Shant Ayanyan The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Jill Carrie The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Keith A. Crandall Computational Biology Institute and The Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, United States of America
Brian C. Fochtman The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Lusine Gasparyan The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Naila Gulzar The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Paul Howell KamTek Inc, Frederick, Maryland, United States of America
Najy Issa The Department of Exercise and Nutrition Sciences, Milken Institute School of Public Health, George Washington University, Washington, DC, United States of America
Konstantinos Krampis Department of Biological Sciences, Hunter College, City University of New York, New York, New York, United States of America
Lopa Mishra Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC, United States of America
Hiroki Morizono Center for Genetic Medicine, Children’s National Medical Center, George Washington University, Washington, DC, United States of America
Joseph R. Pisegna Division of Gastroenterology and Hepatology VA Greater Los Angeles Healthcare System and Department of Medicine and Human Genetics, University of California, Los Angeles, Los Angeles, California, United States of America
Shuyun Rao Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC, United States of America
Yao Ren The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Vahan Simonyan The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Krista Smith The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America
Sharanjit VedBrat KamTek Inc, Frederick, Maryland, United States of America
Michael D. Yao Washington DC VA Medical Center, Gastroenterology & Hepatology Section, Washington, DC, United States of America Department of Medicine, School of Medicine and Health Sciences, George Washington University, Washington, DC, United States of America
Raja Mazumder The Department of Biochemistry & Molecular Medicine, School of Medicine and Health Sciences, George Washington University Medical Center, Washington, DC, United States of America Department of Medicine, School of Medicine and Health Sciences, George Washington University, Washington, DC, United States of America

Exploring the Evolution of Virulence Factors through Bioinformatic Data Mining. mSystems 2019;4:4/3/e00162-19. [PMID: 31117023 PMCID: PMC6529551 DOI: 10.1128/msystems.00162-19] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/02/2023] Open

Identification of new members of alkaliphilic lipases in archaea and metagenome database using reconstruction of ancestral sequences. 3 Biotech 2019;9:165. [PMID: 30997302 DOI: 10.1007/s13205-019-1693-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/29/2018] [Accepted: 02/27/2019] [Indexed: 10/27/2022] Open

Calderon D, Peña L, Suarez A, Villamil C, Ramirez-Rojas A, Anzola JM, García-Betancur JC, Cepeda ML, Uribe D, Del Portillo P, Mongui A. Recovery and functional validation of hidden soil enzymes in metagenomic libraries. Microbiologyopen 2019;8:e00572. [PMID: 30851083 PMCID: PMC6460280 DOI: 10.1002/mbo3.572] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/12/2017] [Revised: 11/01/2017] [Accepted: 11/09/2017] [Indexed: 11/10/2022] Open

Orphan Genes Shared by Pathogenic Genomes Are More Associated with Bacterial Pathogenicity. mSystems 2019;4:mSystems00290-18. [PMID: 30801025 PMCID: PMC6372840 DOI: 10.1128/msystems.00290-18] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 11/15/2018] [Accepted: 01/08/2019] [Indexed: 11/20/2022] Open

Abstract

Orphan genes (also known as ORFans [i.e., orphan open reading frames]) are new genes that enable an organism to adapt to its specific living environment. Our focus in this study is to compare ORFans between pathogens (P) and nonpathogens (NP) of the same genus. Using the pangenome idea, we have identified 130,169 ORFans in nine bacterial genera (505 genomes) and classified these ORFans into four groups: (i) SS-ORFans (P), which are only found in a single pathogenic genome; (ii) SS-ORFans (NP), which are only found in a single nonpathogenic genome; (iii) PS-ORFans (P), which are found in multiple pathogenic genomes; and (iv) NS-ORFans (NP), which are found in multiple nonpathogenic genomes. Within the same genus, pathogens do not always have more genes, more ORFans, or more pathogenicity-related genes (PRGs)—including prophages, pathogenicity islands (PAIs), virulence factors (VFs), and horizontal gene transfers (HGTs)—than nonpathogens. Interestingly, in pathogens of the nine genera, the percentages of PS-ORFans are consistently higher than those of SS-ORFans, which is not true in nonpathogens. Similarly, in pathogens of the nine genera, the percentages of PS-ORFans matching the four types of PRGs are also always higher than those of SS-ORFans, but this is not true in nonpathogens. All of these findings suggest the greater importance of PS-ORFans for bacterial pathogenicity.

IMPORTANCE Recent pangenome analyses of numerous bacterial species have suggested that each genome of a single species may have a significant fraction of its gene content unique or shared by a very few genomes (i.e., ORFans). We selected nine bacterial genera, each containing at least five pathogenic and five nonpathogenic genomes, to compare their ORFans in relation to pathogenicity-related genes. Pathogens in these genera are known to cause a number of common and devastating human diseases such as pneumonia, diphtheria, melioidosis, and tuberculosis. Thus, they are worthy of in-depth systems microbiology investigations, including the comparative study of ORFans between pathogens and nonpathogens. We provide direct evidence to suggest that ORFans shared by more pathogens are more associated with pathogenicity-related genes and thus are more important targets for development of new diagnostic markers or therapeutic drugs for bacterial infectious diseases.
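The four-way grouping described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_orfan`, the toy genome IDs, and the extra "mixed" label for ORFans found in both pathogenic and nonpathogenic genomes are assumptions made for illustration, since the abstract does not say how such cases were handled.

```python
from collections import Counter


def classify_orfan(genomes_with_gene, pathogenic):
    """Assign one ORFan to a group based on the genomes that carry it.

    genomes_with_gene: set of genome IDs in which the ORFan was found.
    pathogenic: dict mapping genome ID -> True (pathogen) or False (nonpathogen).
    """
    if not genomes_with_gene:
        raise ValueError("an ORFan must occur in at least one genome")
    labels = {pathogenic[g] for g in genomes_with_gene}
    if len(labels) != 1:
        # Assumption: a gene seen in both pathogens and nonpathogens
        # falls outside the four groups named in the abstract.
        return "mixed"
    is_pathogen = labels.pop()
    if len(genomes_with_gene) == 1:
        return "SS-ORFans (P)" if is_pathogen else "SS-ORFans (NP)"
    return "PS-ORFans (P)" if is_pathogen else "NS-ORFans (NP)"


# Hypothetical toy data: two pathogenic and two nonpathogenic genomes.
pathogenic = {"g1": True, "g2": True, "g3": False, "g4": False}
orfans = {
    "orfA": {"g1"},
    "orfB": {"g1", "g2"},
    "orfC": {"g3"},
    "orfD": {"g3", "g4"},
    "orfE": {"g2", "g3"},
}

# Tally how many ORFans fall into each group.
counts = Counter(classify_orfan(g, pathogenic) for g in orfans.values())
```

Per-group percentages, such as those compared in the abstract, would follow by dividing each count in `counts` by the total number of ORFans in the relevant set of genomes.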

Vakirlis N, Hebert AS, Opulente DA, Achaz G, Hittinger CT, Fischer G, Coon JJ, Lafontaine I. A Molecular Portrait of De Novo Genes in Yeasts. Mol Biol Evol 2019;35:631-645. [PMID: 29220506 DOI: 10.1093/molbev/msx315] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Indexed: 12/13/2022] Open

Watson AK, Lannes R, Pathmanathan JS, Méheust R, Karkar S, Colson P, Corel E, Lopez P, Bapteste E. The Methodology Behind Network Thinking: Graphs to Analyze Microbial Complexity and Evolution. Methods Mol Biol 2019;1910:271-308. [PMID: 31278668 DOI: 10.1007/978-1-4939-9074-0_9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/09/2023]

Bernard G, Pathmanathan JS, Lannes R, Lopez P, Bapteste E. Microbial Dark Matter Investigations: How Microbial Studies Transform Biological Knowledge and Empirically Sketch a Logic of Scientific Discovery. Genome Biol Evol 2018;10:707-715. [PMID: 29420719 PMCID: PMC5830969 DOI: 10.1093/gbe/evy031] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Accepted: 02/05/2018] [Indexed: 02/07/2023] Open

Discovering novel hydrolases from hot environments. Biotechnol Adv 2018;36:2077-2100. [PMID: 30266344 DOI: 10.1016/j.biotechadv.2018.09.004] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 05/30/2018] [Revised: 09/21/2018] [Accepted: 09/24/2018] [Indexed: 12/12/2022]

Keshri V, Panda A, Levasseur A, Rolain JM, Pontarotti P, Raoult D. Phylogenomic Analysis of β-Lactamase in Archaea and Bacteria Enables the Identification of Putative New Members. Genome Biol Evol 2018;10:1106-1114. [PMID: 29672703 PMCID: PMC5905574 DOI: 10.1093/gbe/evy028] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Accepted: 03/02/2018] [Indexed: 01/09/2023] Open

Discovery of novel bacterial toxins by genomics and computational biology. Toxicon 2018;147:2-12. [PMID: 29438679 DOI: 10.1016/j.toxicon.2018.02.002] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 08/29/2017] [Revised: 12/23/2017] [Accepted: 02/07/2018] [Indexed: 12/13/2022]

Two fundamentally different classes of microbial genes. Nat Microbiol 2016;2:16208. [PMID: 27819663 DOI: 10.1038/nmicrobiol.2016.208] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 06/02/2016] [Accepted: 09/20/2016] [Indexed: 01/15/2023]

Lobb B, Doxey AC. Novel function discovery through sequence and structural data mining. Curr Opin Struct Biol 2016;38:53-61. [DOI: 10.1016/j.sbi.2016.05.017] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 03/19/2016] [Revised: 05/17/2016] [Accepted: 05/24/2016] [Indexed: 01/30/2023]

Neuhaus K, Landstorfer R, Fellner L, Simon S, Schafferhans A, Goldberg T, Marx H, Ozoline ON, Rost B, Kuster B, Keim DA, Scherer S. Translatomics combined with transcriptomics and proteomics reveals novel functional, recently evolved orphan genes in Escherichia coli O157:H7 (EHEC). BMC Genomics 2016;17:133. [PMID: 26911138 PMCID: PMC4765031 DOI: 10.1186/s12864-016-2456-1] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 05/27/2015] [Accepted: 02/09/2016] [Indexed: 12/30/2022] Open

Abstract

Background

Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome).

Results

Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription across eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization.

Conclusions

These findings demonstrate that ribosomal footprinting can be used to detect novel protein coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-2456-1) contains supplementary material, which is available to authorized users.

Affiliation(s)

Klaus Neuhaus Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.
Richard Landstorfer Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.
Lea Fellner Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.
Svenja Simon Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Konstanz, Germany.
Andrea Schafferhans Department of Informatics - Bioinformatics & TUM-IAS, Technische Universität München, Boltzmannstraße 3, 85748, Garching, Germany.
Tatyana Goldberg Department of Informatics - Bioinformatics & TUM-IAS, Technische Universität München, Boltzmannstraße 3, 85748, Garching, Germany.
Harald Marx Chair of Proteomics and Bioanalytics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Emil-Erlenmeyer-Forum 5, 85354, Freising, Germany.
Olga N Ozoline Institute of Cell Biophysics, Russian Academy of Sciences, Moscow Region, 142290, Pushchino, Russia.
Burkhard Rost Department of Informatics - Bioinformatics & TUM-IAS, Technische Universität München, Boltzmannstraße 3, 85748, Garching, Germany.
Bernhard Kuster Chair of Proteomics and Bioanalytics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Emil-Erlenmeyer-Forum 5, 85354, Freising, Germany; Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technische Universität München, Gregor-Mendel-Str. 4, 85354, Freising, Germany.
Daniel A Keim Lehrstuhl für Datenanalyse und Visualisierung, Fachbereich Informatik und Informationswissenschaft, Universität Konstanz, Box 78, 78457, Konstanz, Germany.
Siegfried Scherer Lehrstuhl für Mikrobielle Ökologie, Zentralinstitut für Ernährungs- und Lebensmittelforschung, Wissenschaftszentrum Weihenstephan, Technische Universität München, Weihenstephaner Berg 3, 85354, Freising, Germany.

Petrenko P, Lobb B, Kurtz DA, Neufeld JD, Doxey AC. MetAnnotate: function-specific taxonomic profiling and comparison of metagenomes. BMC Biol 2015;13:92. [PMID: 26541816 PMCID: PMC4636000 DOI: 10.1186/s12915-015-0195-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 07/06/2015] [Accepted: 10/02/2015] [Indexed: 11/13/2022] Open