1. Konkel Z, Kubatko L, Slot JC. CLOCI: unveiling cryptic fungal gene clusters with generalized detection. Nucleic Acids Res 2024; 52:e75. [PMID: 39016185 PMCID: PMC11381361 DOI: 10.1093/nar/gkae625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 07/01/2024] [Accepted: 07/10/2024] [Indexed: 07/18/2024] Open
Abstract
Gene clusters are genomic loci that contain multiple genes that are functionally and genetically linked. Gene clusters collectively encode diverse functions, including small molecule biosynthesis, nutrient assimilation, metabolite degradation, and production of proteins essential for growth and development. Identifying gene clusters is a powerful tool for small molecule discovery and provides insight into the ecology and evolution of organisms. Current detection algorithms focus on canonical 'core' biosynthetic functions many gene clusters encode, while overlooking uncommon or unknown cluster classes. These overlooked clusters are a potential source of novel natural products and comprise an untold portion of overall gene cluster repertoires. Unbiased, function-agnostic detection algorithms therefore provide an opportunity to reveal novel classes of gene clusters and more precisely define genome organization. We present CLOCI (Co-occurrence Locus and Orthologous Cluster Identifier), an algorithm that identifies gene clusters using multiple proxies of selection for coordinated gene evolution. Our approach generalizes gene cluster detection and gene cluster family circumscription, improves detection of multiple known functional classes, and unveils non-canonical gene clusters. CLOCI is suitable for genome-enabled small molecule mining, and presents an easily tunable approach for delineating gene cluster families and homologous loci.
Affiliation(s)
- Zachary Konkel
  - Department of Plant Pathology, The Ohio State University, Columbus, OH 43210, USA
  - Center for Applied Plant Sciences, The Ohio State University, Columbus, OH 43210, USA
- Laura Kubatko
  - Department of Ecology and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
  - Department of Statistics, The Ohio State University, Columbus, OH 43210, USA
- Jason C Slot
  - Department of Plant Pathology, The Ohio State University, Columbus, OH 43210, USA
  - Center for Applied Plant Sciences, The Ohio State University, Columbus, OH 43210, USA
2. Iturbe P, Martín AS, Hamamoto H, Marcet-Houben M, Galbaldón T, Solano C, Lasa I. Noncontiguous operon atlas for the Staphylococcus aureus genome. MICROLIFE 2024; 5:uqae007. [PMID: 38651166 PMCID: PMC11034616 DOI: 10.1093/femsml/uqae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Revised: 03/20/2024] [Accepted: 04/08/2024] [Indexed: 04/25/2024]
Abstract
Bacteria synchronize the expression of genes with related functions by organizing genes into operons so that they are cotranscribed together in a single polycistronic messenger RNA. However, some cellular processes may benefit if the simultaneous production of the operon proteins coincides with the inhibition of the expression of an antagonist gene. To coordinate such situations, bacteria have evolved noncontiguous operons (NcOs), a subtype of operons that contain one or more genes that are transcribed in the opposite direction to the other operon genes. This structure results in overlapping transcripts whose expression is mutually repressed. The presence of NcOs cannot be predicted computationally and their identification requires a detailed knowledge of the bacterial transcriptome. In this study, we used direct RNA sequencing methodology to determine the NcOs map in the Staphylococcus aureus genome. We detected the presence of 18 NcOs in the genome of S. aureus and four in the genome of the lysogenic prophage 80α. The identified NcOs comprise genes involved in energy metabolism, metal acquisition and transport, toxin-antitoxin systems, and control of the phage life cycle. Using the menaquinone operon as a proof of concept, we show that disarrangement of the NcO architecture results in a reduction of bacterial fitness due to an increase in menaquinone levels and a decrease in the rate of oxygen consumption. Our study demonstrates the significance of NcO structures in bacterial physiology and emphasizes the importance of combining operon maps with transcriptomic data to uncover previously unnoticed functional relationships between neighbouring genes.
Affiliation(s)
- Pablo Iturbe
  - Laboratory of Microbial Pathogenesis, Navarrabiomed-Universidad Pública de Navarra (UPNA)-Hospital Universitario de Navarra (HUN), IdiSNA, Irunlarrea 3, Pamplona, 31008 Navarra, Spain
- Alvaro San Martín
  - Laboratory of Microbial Pathogenesis, Navarrabiomed-Universidad Pública de Navarra (UPNA)-Hospital Universitario de Navarra (HUN), IdiSNA, Irunlarrea 3, Pamplona, 31008 Navarra, Spain
- Hiroshi Hamamoto
  - Faculty of Medicine, Department of Infectious diseases, Yamagata University, 2-2-2 Lida-Nishi, 990-9585 Yamagata, Japan
- Marina Marcet-Houben
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
- Toni Galbaldón
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
  - CIBER de Enfermedades Infecciosas, Instituto de Salud Carlos III, 28029 Madrid, Spain
- Cristina Solano
  - Laboratory of Microbial Pathogenesis, Navarrabiomed-Universidad Pública de Navarra (UPNA)-Hospital Universitario de Navarra (HUN), IdiSNA, Irunlarrea 3, Pamplona, 31008 Navarra, Spain
- Iñigo Lasa
  - Laboratory of Microbial Pathogenesis, Navarrabiomed-Universidad Pública de Navarra (UPNA)-Hospital Universitario de Navarra (HUN), IdiSNA, Irunlarrea 3, Pamplona, 31008 Navarra, Spain
3. Marcet-Houben M, Cruz F, Gómez-Garrido J, Alioto TS, Nunez-Rodriguez JC, Mesanza N, Gut M, Iturritxa E, Gabaldon T. Genomics of the expanding pine pathogen Lecanosticta acicola reveals patterns of ongoing genetic admixture. mSystems 2024; 9:e0092823. [PMID: 38364101 PMCID: PMC10949461 DOI: 10.1128/msystems.00928-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 01/09/2024] [Indexed: 02/18/2024] Open
Abstract
Lecanosticta acicola is the causal agent of brown spot needle blight, which affects pine trees across the northern hemisphere. Based on marker genes and microsatellite data, two distinct lineages have been identified that were introduced into Europe on two separate occasions. Despite their overall distinct geographic distribution, they have been found to coexist in regions of northern Spain and France. Here, we present the first genome-wide study of Lecanosticta acicola, including assembly of the reference genome and a population genomics analysis of 70 natural isolates from northern Spain. We show that most of the isolates belong to the southern lineage but show signs of introgression with northern lineage isolates, indicating mating between the two lineages. We also identify phenotypic differences between the two lineages based on the activity profiles of 20 enzymes, with introgressed strains being more phenotypically similar to members of the southern lineage. In conclusion, we show ongoing genetic admixture between the two main lineages of L. acicola in a region of recent expansion. IMPORTANCE: Lecanosticta acicola is a fungal pathogen causing severe defoliation, growth reduction, and even death in more than 70 conifer species. Despite the increasing incidence of this species, little is known about its population dynamics. Two divergent lineages have been described that have now been found together in regions of France and Spain, but it is unknown how these mixed populations evolve. Here we present the first reference genome for this important plant-pathogenic fungus and use it to study the population genomics of 70 isolates from an affected forest in the north of Spain. We find signs of introgression between the two main lineages, indicating that active mating is occurring in this region, which could promote the appearance of novel traits in this species. We also study the phenotypic differences across this population based on enzymatic activities on 20 compounds.
Affiliation(s)
- Marina Marcet-Houben
  - Barcelona Supercomputing Centre (BSC-CNS), Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Centro de Investigación Biomédica En Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain
- Fernando Cruz
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Jéssica Gómez-Garrido
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Tyler S. Alioto
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Juan Carlos Nunez-Rodriguez
  - Barcelona Supercomputing Centre (BSC-CNS), Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Nebai Mesanza
  - Instituto Vasco de Investigación y Desarrollo Agrario (BRTA), Arkaute, Araba, Spain
- Marta Gut
  - CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Eugenia Iturritxa
  - Instituto Vasco de Investigación y Desarrollo Agrario (BRTA), Arkaute, Araba, Spain
- Toni Gabaldon
  - Barcelona Supercomputing Centre (BSC-CNS), Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Centro de Investigación Biomédica En Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
4. Sperschneider J, Yildirir G, Rizzi YS, Malar C M, Mayrand Nicol A, Sorwar E, Villeneuve-Laroche M, Chen ECH, Iwasaki W, Brauer EK, Bosnich W, Gutjahr C, Corradi N. Arbuscular mycorrhizal fungi heterokaryons have two nuclear populations with distinct roles in host-plant interactions. Nat Microbiol 2023; 8:2142-2153. [PMID: 37884816 DOI: 10.1038/s41564-023-01495-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 02/17/2023] [Accepted: 09/11/2023] [Indexed: 10/28/2023]
Abstract
Arbuscular mycorrhizal fungi (AMF) are prominent root symbionts that can carry thousands of nuclei deriving from two parental strains in a large syncytium. These co-existing genomes can also vary in abundance with changing environmental conditions. Here we assemble the nuclear genomes of all four publicly available AMF heterokaryons using PacBio high-fidelity and Hi-C sequencing. We find that the two co-existing genomes of these strains are phylogenetically related but differ in structure, content and epigenetics. We confirm that AMF heterokaryon genomes vary in relative abundance across conditions and show this can lead to nucleus-specific differences in expression during interactions with plants. Population analyses also reveal signatures of genetic exchange indicative of past events of sexual reproduction in these strains. This work uncovers the origin and contribution of two nuclear genomes in AMF heterokaryons and opens avenues for the improvement and environmental application of these strains.
Affiliation(s)
- Jana Sperschneider
  - Black Mountain Science and Innovation Park, CSIRO Agriculture and Food, Canberra, Australian Capital Territory, Australia
- Gokalp Yildirir
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Yanina S Rizzi
  - Plant Genetics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
  - Max-Planck-Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- Mathu Malar C
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Essam Sorwar
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Eric C H Chen
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Wataru Iwasaki
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Elizabeth K Brauer
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, Ontario, Canada
- Whynn Bosnich
  - Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, Ontario, Canada
- Caroline Gutjahr
  - Plant Genetics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
  - Max-Planck-Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
- Nicolas Corradi
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
5. Marcet-Houben M, Collado-Cala I, Fuentes-Palacios D, Gómez AD, Molina M, Garisoain-Zafra A, Chorostecki U, Gabaldón T. EvolClustDB: Exploring Eukaryotic Gene Clusters with Evolutionarily Conserved Genomic Neighbourhoods. J Mol Biol 2023:168013. [PMID: 36806474 DOI: 10.1016/j.jmb.2023.168013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 01/24/2023] [Accepted: 02/11/2023] [Indexed: 02/17/2023]
Abstract
Conservation of gene neighbourhood over evolutionary distances is generally indicative of shared regulation or functional association among genes. This concept has been broadly exploited in prokaryotes but its use on eukaryotic genomes has been limited to specific functional classes, such as biosynthetic gene clusters. We here used an evolutionary-based gene cluster discovery algorithm (EvolClust) to pre-compute evolutionarily conserved gene neighbourhoods, which can be searched, browsed and downloaded in EvolClustDB. We inferred ∼35,000 cluster families in 882 different species in genome comparisons of five taxonomically broad clades: Fungi, Plants, Metazoans, Insects and Protists. EvolClustDB allows browsing through the cluster families, as well as searching by protein, species, identifier or sequence. Visualization allows inspecting gene order per species in a phylogenetic context, so that relevant evolutionary events such as gain, loss or transfer, can be inferred. EvolClustDB is freely available, without registration, at http://evolclustdb.org/.
Affiliation(s)
- Marina Marcet-Houben
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Ismael Collado-Cala
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Diego Fuentes-Palacios
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Alicia D Gómez
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Manuel Molina
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Andrés Garisoain-Zafra
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Uciel Chorostecki
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Toni Gabaldón
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
  - Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
  - Centro de Investigación Biomédica En Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain
6. Robert NSM, Sarigol F, Zieger E, Simakov O. SYNPHONI: scale-free and phylogeny-aware reconstruction of synteny conservation and transformation across animal genomes. Bioinformatics 2022; 38:5434-5436. [PMID: 36269177 PMCID: PMC9750109 DOI: 10.1093/bioinformatics/btac695] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/22/2022] [Revised: 09/24/2022] [Accepted: 10/19/2022] [Indexed: 12/25/2022] Open
Abstract
SUMMARY: Current approaches detect conserved genomic order either at chromosomal (macrosynteny) or at subchromosomal scales (microsynteny). The latter generally requires collinearity and hard thresholds on syntenic region size, thus excluding a major proportion of syntenies with recent expansions or minor rearrangements. 'SYNPHONI' bridges the gap between micro- and macrosynteny detection, providing detailed information on both synteny conservation and transformation throughout the evolutionary history of animal genomes. AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://github.com/nsmro/SYNPHONI, implemented in Python 3.9. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fatih Sarigol
  - Department of Neurosciences and Developmental Biology, University of Vienna, Vienna A-1030, Austria
- Elisabeth Zieger
  - Department of Evolutionary Biology, University of Vienna, Vienna A-1030, Austria
7. Schall PZ, Latham KE. Cross-species meta-analysis of transcriptome changes during the morula-to-blastocyst transition: metabolic and physiological changes take center stage. Am J Physiol Cell Physiol 2021; 321:C913-C931. [PMID: 34669511 DOI: 10.1152/ajpcell.00318.2021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023]
Abstract
The morula-to-blastocyst transition (MBT) culminates with formation of inner cell mass (ICM) and trophectoderm (TE) lineages. Recent studies identified signaling pathways driving lineage specification, but some features of these pathways display significant species divergence. To better understand evolutionary conservation of the MBT, we completed a meta-analysis of RNA sequencing data from five model species and ICM-TE differences from four species. Although many genes change in expression during the MBT within any given species, the number of shared differentially expressed genes (DEGs) is comparatively small, and the number of shared ICM-TE DEGs is even smaller. DEGs related to known lineage determining pathways (e.g., POU5F1) are seen, but the most prominent pathways and functions associated with shared DEGs or shared across individual species DEG lists impact basic physiological and metabolic activities, such as TCA cycle, unfolded protein response, oxidative phosphorylation, sirtuin signaling, mitotic roles of polo-like kinases, NRF2-mediated oxidative stress, estrogen receptor signaling, apoptosis, necrosis, lipid and fatty acid metabolism, cholesterol biosynthesis, endocytosis, AMPK signaling, homeostasis, transcription, and cell death. We also observed prominent differences in transcriptome regulation between ungulates and nonungulates, particularly for ICM- and TE-enhanced mRNAs. These results extend our understanding of shared mechanisms of the MBT and formation of the ICM and TE and should better inform the selection of model species for particular applications.
Affiliation(s)
- Peter Z Schall
  - Department of Animal Science, Michigan State University, East Lansing, Michigan
  - Reproductive and Developmental Sciences Program, Michigan State University, East Lansing, Michigan
  - Comparative Medicine and Integrative Biology Program, Michigan State University, East Lansing, Michigan
- Keith E Latham
  - Department of Animal Science, Michigan State University, East Lansing, Michigan
  - Reproductive and Developmental Sciences Program, Michigan State University, East Lansing, Michigan
  - Department of Obstetrics, Gynecology, & Reproductive Biology, Michigan State University, East Lansing, Michigan
8. Foflonker F, Blaby-Haas CE. Colocality to Cofunctionality: Eukaryotic Gene Neighborhoods as a Resource for Function Discovery. Mol Biol Evol 2021; 38:650-662. [PMID: 32886760 PMCID: PMC7826186 DOI: 10.1093/molbev/msaa221] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/17/2022] Open
Abstract
Diverging from the classic paradigm of random gene order in eukaryotes, gene proximity can be leveraged to systematically identify functionally related gene neighborhoods in eukaryotes, utilizing techniques pioneered in bacteria. Current methods of identifying gene neighborhoods typically rely on sequence similarity to characterized gene products. However, this approach is not robust for nonmodel organisms like algae, which are evolutionarily distant from well-characterized model organisms. Here, we utilize a comparative genomic approach to identify evolutionarily conserved proximal orthologous gene pairs conserved across at least two taxonomic classes of green algae. A total of 317 gene neighborhoods were identified. In some cases, gene proximity appears to have been conserved since before the streptophyte–chlorophyte split, 1,000 Ma. Using functional inferences derived from reconstructed evolutionary relationships, we identified several novel functional clusters. A putative mycosporine-like amino acid, “sunscreen,” neighborhood contains genes similar to either vertebrate or cyanobacterial pathways, suggesting a novel mosaic biosynthetic pathway in green algae. One of two putative arsenic-detoxification neighborhoods includes an organoarsenical transporter (ArsJ), a glyceraldehyde 3-phosphate dehydrogenase-like gene, homologs of which are involved in arsenic detoxification in bacteria, and a novel algal-specific phosphoglycerate kinase-like gene. Mutants of the ArsJ-like transporter and phosphoglycerate kinase-like genes in Chlamydomonas reinhardtii were found to be sensitive to arsenate, providing experimental support for the role of these identified neighbors in resistance to arsenate. Potential evolutionary origins of neighborhoods are discussed, and updated annotations for formerly poorly annotated genes are presented, highlighting the potential of this strategy for functional annotation.
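The core idea in this abstract, keeping only neighboring ortholog pairs whose proximity recurs in at least two taxonomic classes, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the `max_gap` window, and the toy orthogroup and class labels are all hypothetical, and genomes are assumed to be pre-annotated as ordered lists of orthogroup IDs per contig.

```python
from collections import defaultdict


def proximal_pairs(contig, max_gap=1):
    """Unordered orthogroup pairs lying within max_gap positions of each other."""
    pairs = set()
    for i, og_a in enumerate(contig):
        for og_b in contig[i + 1 : i + 1 + max_gap]:
            if og_a != og_b:
                pairs.add(frozenset((og_a, og_b)))
    return pairs


def conserved_pairs(genomes, species_class, min_classes=2, max_gap=1):
    """Pairs that are proximal in species from at least min_classes classes."""
    classes_per_pair = defaultdict(set)
    for species, contigs in genomes.items():
        for contig in contigs:
            for pair in proximal_pairs(contig, max_gap):
                classes_per_pair[pair].add(species_class[species])
    return {p for p, classes in classes_per_pair.items() if len(classes) >= min_classes}


# Hypothetical toy input: gene order per contig, expressed as orthogroup IDs.
genomes = {
    "sp_A": [["OG1", "OG2", "OG3"]],
    "sp_B": [["OG2", "OG1", "OG9"]],
    "sp_C": [["OG3", "OG2"], ["OG1"]],
}
species_class = {"sp_A": "class_X", "sp_B": "class_Y", "sp_C": "class_X"}

print(conserved_pairs(genomes, species_class))
```

In the toy data only the OG1/OG2 pair is adjacent in species from two different classes, so it is the only pair retained; OG2/OG3 is adjacent twice but within a single class.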
9. Linard B, Ebersberger I, McGlynn SE, Glover N, Mochizuki T, Patricio M, Lecompte O, Nevers Y, Thomas PD, Gabaldón T, Sonnhammer E, Dessimoz C, Uchiyama I. Ten Years of Collaborative Progress in the Quest for Orthologs. Mol Biol Evol 2021; 38:3033-3045. [PMID: 33822172 PMCID: PMC8321534 DOI: 10.1093/molbev/msab098] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/27/2020] [Revised: 02/07/2021] [Accepted: 04/01/2021] [Indexed: 12/19/2022] Open
Abstract
Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology (evolutionary relatedness) is in many cases readily determined based on sequence similarity analysis. By contrast, whether or not two genes directly descended from a common ancestor by a speciation event (orthologs) or duplication event (paralogs) is more challenging, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and oncoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several developments to bring orthology beyond the gene unit, from domains to networks. This meeting brought to light several challenges to come: leveraging orthology computations to get the most out of the incoming avalanche of genomic data, integrating orthology from domain to biological network levels, building better gene models, and adapting orthology approaches to the broad evolutionary and genomic diversity recognized in different forms of life and viruses.
Affiliation(s)
- Benjamin Linard
  - LIRMM, University of Montpellier, CNRS, Montpellier, France
  - SPYGEN, Le Bourget-du-Lac, France
- Ingo Ebersberger
  - Institute of Cell Biology and Neuroscience, Goethe University Frankfurt, Frankfurt, Germany
  - Senckenberg Biodiversity and Climate Research Centre (S-BIKF), Frankfurt, Germany
  - LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt, Germany
- Shawn E McGlynn
  - Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan
  - Blue Marble Space Institute of Science, Seattle, WA, USA
- Natasha Glover
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Tomohiro Mochizuki
  - Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan
- Mateus Patricio
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Odile Lecompte
  - Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Fédération de Médecine Translationnelle de Strasbourg, Strasbourg, France
- Yannis Nevers
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Paul D Thomas
  - Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Toni Gabaldón
  - Barcelona Supercomputing Centre (BSC-CNS), Jordi Girona, Barcelona, Spain
  - Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Erik Sonnhammer
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Christophe Dessimoz
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Department of Computer Science, University College London, London, United Kingdom
  - Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ikuo Uchiyama
  - Department of Theoretical Biology, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
10. Chorostecki U, Molina M, Pryszcz LP, Gabaldón T. MetaPhOrs 2.0: integrative, phylogeny-based inference of orthology and paralogy across the tree of life. Nucleic Acids Res 2020; 48:W553-W557. [PMID: 32343307 PMCID: PMC7319458 DOI: 10.1093/nar/gkaa282] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 02/28/2020] [Revised: 04/01/2020] [Accepted: 04/25/2020] [Indexed: 12/23/2022] Open
Abstract
Inferring homology relationships across genes in different species is a central task in comparative genomics. Therefore, a large number of resources and methods have been developed over the years. Some public databases include phylogenetic trees of homologous gene families which can be used to further differentiate homology relationships into orthology and paralogy. MetaPhOrs is a web server that integrates phylogenetic information from different sources to provide orthology and paralogy relationships based on a common phylogeny-based predictive algorithm and associated with a consistency-based confidence score. Here we describe the latest version of the web server, which includes major new implementations and provides orthology and paralogy relationships derived from ∼8.2 million gene family trees from 13 different source repositories across ∼4000 species with sequenced genomes. The MetaPhOrs server is freely available, without registration, at http://orthology.phylomedb.org/.
Affiliation(s)
- Uciel Chorostecki
  - Barcelona Supercomputing Centre (BSC-CNS), 08034 Barcelona, Spain
  - Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
- Manuel Molina
  - Barcelona Supercomputing Centre (BSC-CNS), 08034 Barcelona, Spain
  - Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
- Leszek P Pryszcz
  - Centre for Genomic Regulation, 08003 Barcelona, Spain
  - International Institute of Molecular and Cell Biology, 4 Ks. Trojdena Street, 02-109 Warsaw, Poland
- Toni Gabaldón
  - Barcelona Supercomputing Centre (BSC-CNS), 08034 Barcelona, Spain
  - Institute for Research in Biomedicine (IRB), The Barcelona Institute of Science and Technology, 08028 Barcelona, Spain
  - ICREA, 08010 Barcelona, Spain
11.
Abstract
MOTIVATION: An important task in comparative genomics is to detect functional units by analyzing gene-context patterns. Colinear syntenic blocks (CSBs) are groups of genes that are consistently encoded in the same neighborhood and in the same order across a wide range of taxa. Such CSBs are likely essential for the regulation of gene expression in prokaryotes. Recent results indicate that colinearity can be conserved across multiple operons, thus motivating the discovery of multi-operon CSBs. This computational task raises scalability challenges in large datasets. RESULTS: We propose an efficient algorithm for the discovery of cross-strand multi-operon CSBs in large genomic datasets. The proposed algorithm uses match-point arithmetic, which is scalable for large datasets of microbial genomes in terms of running time and space requirements. The algorithm is implemented and incorporated into a tool with a graphical user interface, called CSBFinder-S. We applied CSBFinder-S to data mine 1485 prokaryotic genomes and analyzed the identified cross-strand CSBs. Our results indicate that most of the syntenic blocks are exclusively colinear. Additional results indicate that transcriptional regulation by overlapping transcriptional genes is abundant in bacteria. We demonstrate the utility of CSBFinder-S to identify common function of the gene-pair PulEF in multiple contexts, including Type 2 Secretion System, Type 4 Pilus System and DNA uptake machinery. AVAILABILITY AND IMPLEMENTATION: CSBFinder-S software and code are publicly available at https://github.com/dinasv/CSBFinder. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
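To make the notion of a colinear syntenic block concrete, the following is a naive sketch that counts fixed-length runs of gene-family IDs shared, in the same order, by several genomes. It is deliberately not the match-point arithmetic algorithm of CSBFinder-S, which the abstract does not detail; the function names, the fixed block length `k`, the `min_genomes` threshold, and the toy genomes are hypothetical. Reading a block from the opposite direction is treated as the same block, which is only a rough stand-in for strand handling.

```python
from collections import defaultdict


def canonical(block):
    """Represent a block and its reversal by one key (the smaller tuple)."""
    return min(block, tuple(reversed(block)))


def colinear_blocks(genomes, k=3, min_genomes=2):
    """Map each length-k gene-order block to the number of genomes containing it."""
    support = defaultdict(set)
    for name, replicons in genomes.items():
        for genes in replicons:
            for i in range(len(genes) - k + 1):
                support[canonical(tuple(genes[i : i + k]))].add(name)
    return {b: len(names) for b, names in support.items() if len(names) >= min_genomes}


# Hypothetical toy input: one replicon per genome, genes given as family IDs.
genomes = {
    "g1": [["A", "B", "C", "D"]],
    "g2": [["X", "C", "B", "A"]],
    "g3": [["A", "B", "D"]],
}

print(colinear_blocks(genomes, k=3, min_genomes=2))
```

Here the run A, B, C occurs in g1 and, reversed, in g2, so it is reported with support 2. Enumerating every window this way scales poorly with genome count and block length, which is the kind of cost a dedicated algorithm is meant to avoid.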
Affiliation(s)
- Dina Svetlitsky
  - Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Tal Dagan
  - Institute of Microbiology, Kiel University, Kiel 24118, Germany
- Michal Ziv-Ukelson
  - Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel