1
Manzano-Morales S, Liu Y, González-Bodí S, Huerta-Cepas J, Iranzo J. Comparison of gene clustering criteria reveals intrinsic uncertainty in pangenome analyses. Genome Biol 2023; 24:250. [PMID: 37904249 PMCID: PMC10614367 DOI: 10.1186/s13059-023-03089-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 10/16/2023] [Indexed: 11/01/2023] Open
Abstract
BACKGROUND: A key step for comparative genomics is to group open reading frames into functionally and evolutionarily meaningful gene clusters. Gene clustering is complicated by intraspecific duplications and horizontal gene transfers that are frequent in prokaryotes. In consequence, gene clustering methods must deal with a trade-off between identifying vertically transmitted representatives of multicopy gene families, which are recognizable by synteny conservation, and retrieving complete sets of species-level orthologs. We studied the implications of adopting homology, orthology, or synteny conservation as formal criteria for gene clustering by performing comparative analyses of 125 prokaryotic pangenomes.
RESULTS: Clustering criteria affect pangenome functional characterization, core genome inference, and reconstruction of ancestral gene content to different extents. Species-wise estimates of pangenome and core genome sizes change by the same factor when using different clustering criteria, allowing robust cross-species comparisons regardless of the clustering criterion. However, cross-species comparisons of genome plasticity and functional profiles are substantially affected by inconsistencies among clustering criteria. Such inconsistencies are driven not only by mobile genetic elements, but also by genes involved in defense, secondary metabolism, and other accessory functions. In some pangenome features, the variability attributed to methodological inconsistencies can even exceed the effect sizes of ecological and phylogenetic variables.
CONCLUSIONS: Choosing an appropriate criterion for gene clustering is critical to conduct unbiased pangenome analyses. We provide practical guidelines to choose the right method depending on the research goals and the quality of genome assemblies, and a benchmarking dataset to assess the robustness and reproducibility of future comparative studies.
Affiliation(s)
- Saioa Manzano-Morales
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
  - Barcelona Supercomputing Centre (BSC-CNS) - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Yang Liu
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
  - Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, Guangzhou, China
- Sara González-Bodí
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Jaime Huerta-Cepas
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Jaime Iranzo
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
  - Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, Zaragoza, Spain
2
Wang L, Wang Y, Huang X, Ma R, Li J, Wang F, Jiao N, Zhang R. Potential metabolic and genetic interaction among viruses, methanogen and methanotrophic archaea, and their syntrophic partners. ISME Communications 2022; 2:50. [PMID: 37938729 PMCID: PMC9723712 DOI: 10.1038/s43705-022-00135-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/01/2021] [Revised: 05/30/2022] [Accepted: 06/15/2022] [Indexed: 04/27/2023]
Abstract
The metabolism of methane in anoxic ecosystems is mainly mediated by methanogens and methane-oxidizing archaea (MMA), key players in global carbon cycling. Viruses are vital in regulating their host fate and ecological function. However, our knowledge about the distribution and diversity of MMA viruses and their interactions with hosts is rather limited. Here, by searching metagenomes containing mcrA (the gene coding for the α-subunit of methyl-coenzyme M reductase) from a wide variety of environments, 140 viral operational taxonomic units (vOTUs) that potentially infect methanogens or methane-oxidizing archaea were retrieved. Four MMA vOTUs (three infecting the order Methanobacteriales and one infecting the order Methanococcales) were predicted to cross-domain infect sulfate-reducing bacteria. By facilitating assimilatory sulfur reduction, MMA viruses may increase the fitness of their hosts in sulfate-depleted anoxic ecosystems and benefit from synthesis of the sulfur-containing amino acid cysteine. Moreover, cell-cell aggregation promoted by MMA viruses may be beneficial for both the viruses and their hosts by improving infectivity and environmental stress resistance, respectively. Our results suggest a potential role of viruses in the ecological and environmental adaptation of methanogens and methane-oxidizing archaea.
Affiliation(s)
- Long Wang
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
- Yinzhao Wang
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Xingyu Huang
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
- Ruijie Ma
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
- Jiangtao Li
  - State Key Laboratory of Marine Geology, Tongji University, Shanghai, China
- Fengping Wang
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Nianzhi Jiao
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
- Rui Zhang
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, China
3
Foorthuis R. On the nature and types of anomalies: a review of deviations in data. International Journal of Data Science and Analytics 2021; 12:297-331. [PMID: 34368422 PMCID: PMC8331998 DOI: 10.1007/s41060-021-00265-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/19/2020] [Accepted: 05/17/2021] [Indexed: 02/07/2023]
Abstract
Anomalies are occurrences in a dataset that are in some way unusual and do not fit the general patterns. The concept of the anomaly is typically ill defined and perceived as vague and domain-dependent. Moreover, despite some 250 years of publications on the topic, no comprehensive and concrete overviews of the different types of anomalies have hitherto been published. By means of an extensive literature review this study therefore offers the first theoretically principled and domain-independent typology of data anomalies and presents a full overview of anomaly types and subtypes. To concretely define the concept of the anomaly and its different manifestations, the typology employs five dimensions: data type, cardinality of relationship, anomaly level, data structure, and data distribution. These fundamental and data-centric dimensions naturally yield 3 broad groups, 9 basic types, and 63 subtypes of anomalies. The typology facilitates the evaluation of the functional capabilities of anomaly detection algorithms, contributes to explainable data science, and provides insights into relevant topics such as local versus global anomalies.
4
Blombach F, Matelska D, Fouqueau T, Cackett G, Werner F. Key Concepts and Challenges in Archaeal Transcription. J Mol Biol 2019; 431:4184-4201. [PMID: 31260691 DOI: 10.1016/j.jmb.2019.06.020] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 03/18/2019] [Revised: 06/18/2019] [Accepted: 06/20/2019] [Indexed: 12/17/2022]
Abstract
Transcription is enabled by RNA polymerase and general factors that allow its progress through the transcription cycle by facilitating initiation, elongation and termination. The transitions between specific stages of the transcription cycle provide opportunities for the global and gene-specific regulation of gene expression. The exact mechanisms and the extent to which the different steps of transcription are exploited for regulation vary between the domains of life, individual species and transcription units. However, a surprising degree of conservation is apparent. Similar key steps in the transcription cycle can be targeted by homologous or unrelated factors providing insights into the mechanisms of RNAP and the evolution of the transcription machinery. Archaea are bona fide prokaryotes but employ a eukaryote-like transcription system to express the information of bacteria-like genomes. Thus, archaea provide the means not only to study transcription mechanisms of interesting model systems but also to test key concepts of regulation in this arena. In this review, we discuss key principles of archaeal transcription, new questions that still await experimental investigation, and how novel integrative approaches hold great promise to fill this gap in our knowledge.
Affiliation(s)
- Fabian Blombach
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, WC1E 6BT, United Kingdom
- Dorota Matelska
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, WC1E 6BT, United Kingdom
- Thomas Fouqueau
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, WC1E 6BT, United Kingdom
- Gwenny Cackett
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, WC1E 6BT, United Kingdom
- Finn Werner
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, WC1E 6BT, United Kingdom
5
Boyd JA, Jungbluth SP, Leu AO, Evans PN, Woodcroft BJ, Chadwick GL, Orphan VJ, Amend JP, Rappé MS, Tyson GW. Divergent methyl-coenzyme M reductase genes in a deep-subseafloor Archaeoglobi. The ISME Journal 2019; 13:1269-1279. [PMID: 30651609 PMCID: PMC6474303 DOI: 10.1038/s41396-018-0343-2] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 10/07/2018] [Revised: 11/29/2018] [Accepted: 12/11/2018] [Indexed: 12/28/2022]
Abstract
The methyl-coenzyme M reductase (MCR) complex is a key enzyme in archaeal methane generation and has recently been proposed to also be involved in the oxidation of short-chain hydrocarbons including methane, butane, and potentially propane. The number of archaeal clades encoding the MCR continues to grow, suggesting that this complex was inherited from an ancient ancestor, or has undergone extensive horizontal gene transfer. Expanding the representation of MCR-encoding lineages through metagenomic approaches will help resolve the evolutionary history of this complex. Here, a near-complete Archaeoglobi metagenome-assembled genome (MAG; Ca. Polytropus marinifundus gen. nov. sp. nov.) was recovered from the deep subseafloor along the Juan de Fuca Ridge flank that encodes two divergent McrABG operons similar to those found in Ca. Bathyarchaeota and Ca. Syntrophoarchaeum MAGs. Ca. P. marinifundus is basal to members of the class Archaeoglobi, and encodes the genes for β-oxidation, potentially allowing an alkanotrophic metabolism similar to that proposed for Ca. Syntrophoarchaeum. Ca. P. marinifundus also encodes a respiratory electron transport chain that can potentially utilize nitrate, iron, and sulfur compounds as electron acceptors. Phylogenetic analysis suggests that the Ca. P. marinifundus MCR operons were horizontally transferred, changing our understanding of the evolution and distribution of this complex in the Archaea.
Affiliation(s)
- Joel A Boyd
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
- Sean P Jungbluth
  - Center for Dark Energy Biosphere Investigations, University of Southern California, Los Angeles, CA, USA
  - Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA
- Andy O Leu
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
- Paul N Evans
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
- Ben J Woodcroft
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
- Grayson L Chadwick
  - Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA
- Victoria J Orphan
  - Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA
- Jan P Amend
  - Departments of Earth Sciences and Biological Sciences, University of Southern California, Los Angeles, CA, USA
- Michael S Rappé
  - Hawaii Institute of Marine Biology, University of Hawaii at Manoa, Kaneohe, HI, USA
- Gene W Tyson
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
6
An evolving view of methane metabolism in the Archaea. Nat Rev Microbiol 2019; 17:219-232. [DOI: 10.1038/s41579-018-0136-7] [Citation(s) in RCA: 231] [Impact Index Per Article: 38.5] [Received: 12/07/2017] [Accepted: 11/26/2018] [Indexed: 11/08/2022]
7
Wolf YI, Kazlauskas D, Iranzo J, Lucía-Sanz A, Kuhn JH, Krupovic M, Dolja VV, Koonin EV. Origins and Evolution of the Global RNA Virome. mBio 2018; 9:e02329-18. [PMID: 30482837 PMCID: PMC6282212 DOI: 10.1128/mbio.02329-18] [Citation(s) in RCA: 334] [Impact Index Per Article: 47.7] [Received: 10/22/2018] [Accepted: 10/31/2018] [Indexed: 01/12/2023] Open
Abstract
Viruses with RNA genomes dominate the eukaryotic virome, reaching enormous diversity in animals and plants. The recent advances of metaviromics prompted us to perform a detailed phylogenomic reconstruction of the evolution of the dramatically expanded global RNA virome. The only universal gene among RNA viruses is the gene encoding the RNA-dependent RNA polymerase (RdRp). We developed an iterative computational procedure that alternates the RdRp phylogenetic tree construction with refinement of the underlying multiple-sequence alignments. The resulting tree encompasses 4,617 RNA virus RdRps and consists of 5 major branches; 2 of the branches include positive-sense RNA viruses, 1 is a mix of positive-sense (+) RNA and double-stranded RNA (dsRNA) viruses, and 2 consist of dsRNA and negative-sense (-) RNA viruses, respectively. This tree topology implies that dsRNA viruses evolved from +RNA viruses on at least two independent occasions, whereas -RNA viruses evolved from dsRNA viruses. Reconstruction of RNA virus evolution using the RdRp tree as the scaffold suggests that the last common ancestors of the major branches of +RNA viruses encoded only the RdRp and a single jelly-roll capsid protein. Subsequent evolution involved independent capture of additional genes, in particular, those encoding distinct RNA helicases, enabling replication of larger RNA genomes and facilitating virus genome expression and virus-host interactions. Phylogenomic analysis reveals extensive gene module exchange among diverse viruses and horizontal virus transfer between distantly related hosts. Although the network of evolutionary relationships within the RNA virome is bound to further expand, the present results call for a thorough reevaluation of the RNA virus taxonomy.
IMPORTANCE The majority of the diverse viruses infecting eukaryotes have RNA genomes, including numerous human, animal, and plant pathogens. Recent advances of metagenomics have led to the discovery of many new groups of RNA viruses in a wide range of hosts. These findings enable a far more complete reconstruction of the evolution of RNA viruses than was attainable previously. This reconstruction reveals the relationships between different Baltimore classes of viruses and indicates extensive transfer of viruses between distantly related hosts, such as plants and animals. These results call for a major revision of the existing taxonomy of RNA viruses.
Affiliation(s)
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Darius Kazlauskas
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
  - Département de Microbiologie, Institut Pasteur, Paris, France
- Jaime Iranzo
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Adriana Lucía-Sanz
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
  - Centro Nacional de Biotecnología, Madrid, Spain
- Jens H Kuhn
  - Integrated Research Facility at Fort Detrick, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Frederick, Maryland, USA
- Mart Krupovic
  - Département de Microbiologie, Institut Pasteur, Paris, France
- Valerian V Dolja
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
8
Ferrer M, Sorokin DY, Wolf YI, Ciordia S, Mena MC, Bargiela R, Koonin EV, Makarova KS. Proteomic Analysis of Methanonatronarchaeum thermophilum AMET1, a Representative of a Putative New Class of Euryarchaeota, "Methanonatronarchaeia". Genes (Basel) 2018; 9:E28. [PMID: 29360740 PMCID: PMC5852551 DOI: 10.3390/genes9020028] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/18/2017] [Revised: 01/09/2018] [Accepted: 01/10/2018] [Indexed: 01/22/2023] Open
Abstract
The recently discovered Methanonatronarchaeia are extremely halophilic and moderately thermophilic methyl-reducing methanogens representing a novel class-level lineage in the phylum Euryarchaeota related to the class Halobacteria. Here we present a detailed analysis of 1D-nano liquid chromatography-electrospray ionization tandem mass spectrometry data obtained for "Methanonatronarchaeum thermophilum" AMET1 grown in different physiological conditions, including variation of the growth temperature and substrates. Analysis of these data allows us to refine the current understanding of the key biosynthetic pathways of this triple extremophilic methanogenic euryarchaeon and identify proteins that are likely to be involved in its response to growth condition changes.
Affiliation(s)
- Dimitry Y Sorokin
  - Winogradsky Institute of Microbiology, Research Centre for Biotechnology, Russian Academy of Sciences, Prospect 60-let Octyabrya 7/2, 117312 Moscow, Russia.
  - Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629 HZ Delft, The Netherlands.
- Yuri I Wolf
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
- Sergio Ciordia
  - Proteomics Facility, Centro Nacional de Biotecnología, CSIC, 28049 Madrid, Spain.
- María C Mena
  - Proteomics Facility, Centro Nacional de Biotecnología, CSIC, 28049 Madrid, Spain.
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
- Kira S Makarova
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.