1
Maratikyathanahalli Srikanta R, Wang L, Zhu T, Deal KR, Huo N, Gu YQ, McGuire PE, Dvorak J, Luo MC. Aegilops tauschii genome assembly v6.0 with improved sequence contiguity differentiates assembly errors from genuine differences with the D subgenome of Chinese Spring wheat assembly IWGSC RefSeq v2.1. G3 (Bethesda, Md.) 2025; 15:jkaf042. [PMID: 40052782 PMCID: PMC12060248 DOI: 10.1093/g3journal/jkaf042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2024] [Accepted: 02/19/2025] [Indexed: 05/09/2025]
Abstract
Aegilops tauschii is the donor of the D subgenome of hexaploid wheat and a valuable genetic resource for wheat improvement. Several reference-quality genome sequences have been reported for A. tauschii accession AL8/78. A new genome sequence assembly (Aet v6.0) built from long Pacific Biosciences HiFi reads and employing an optical genome map constructed with a new technology is reported here for this accession. The N50 contig length of 31.81 Mb greatly exceeded that of the previous AL8/78 genome sequence assembly (Aet v5.0). Of 1,254 super-scaffolds, 92, comprising 98% of the total super-scaffold length, were anchored on a high-resolution genetic map, and pseudomolecules were assembled. The number of gaps in the pseudomolecules was reduced from 52,910 in Aet v5.0 to 351 in Aet v6.0. Gene models were transferred from the Aet v5.0 assembly into the Aet v6.0 assembly. A total of 40,447 putative orthologous gene pairs were identified between the Aet v6.0 and Chinese Spring wheat IWGSC RefSeq v2.1 D-subgenome pseudomolecules. Orthologous gene pairs were used to compare the structure of the A. tauschii and wheat D-subgenome pseudomolecules. A total of 223 structural differences were identified. They included 44 large differences in sequence orientation and 25 differences in sequence location. A technique for discriminating between assembly errors and real structural variation between closely related genomes is suggested.
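The N50 contig length quoted in this abstract is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are walked from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("no contig lengths given")


# Toy example (hypothetical lengths, not data from the assembly).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
```

A larger N50 for the same total length means the sequence is split into fewer, longer pieces, which is the sense in which the abstract compares Aet v6.0 with Aet v5.0.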
Affiliation(s)
- Le Wang
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
- Tingting Zhu
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
- Karin R Deal
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
- Naxin Huo
- Crop Improvement and Genetics Research Unit, USDA-ARS, Albany, CA 94710, USA
- Yong Q Gu
- Crop Improvement and Genetics Research Unit, USDA-ARS, Albany, CA 94710, USA
- Patrick E McGuire
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
- Jan Dvorak
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
- Ming-Cheng Luo
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
2
Jin XJ, Yu Y, Lin HY, Liu FL, Wang HF, Ma Q, Chen Y, Zhang YH, Li P. Revisiting the backbone phylogeny and inferring the evolutionary trends in inflorescence of Elsholtzieae (Lamiaceae): new insights from orthologous nuclear genes. Cladistics 2025; 41:157-176. [PMID: 39966307 DOI: 10.1111/cla.12604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 12/08/2024] [Accepted: 12/11/2024] [Indexed: 02/20/2025] Open
Abstract
The angiosperm tribe of Elsholtzieae (Lamiaceae) is characterized by complex inflorescences and has notable medicinal and economic significance. Relationships within Elsholtzieae, including the monophyly of Elsholtzia and Keiskea, and relationships among Mosla, Keiskea and Perilla, remain uncertain, hindering insights into inflorescence evolution within the tribe. Using hybridization capture sequencing and deep genome skimming data analysis, we reconstruct a phylogeny of Elsholtzieae using 279 orthologous nuclear loci from 56 species. We evaluated uncertainty among relationships using concatenation, coalescent and network approaches. Using a time-calibrated phylogeny, we reconstructed ancestral inflorescence traits to elucidate the patterns in their evolution within the tribe. Our analyses consistently support the paraphyly of the genus Elsholtzia. Phylogenetic network analyses, confirmed by PhyloNetworks and SplitsTree, showed reticulation events among the major lineages of Elsholtzieae. The unstable polyphyly of Keiskea observed in ASTRAL (accurate species tree algorithm), ML (maximum likelihood) and MP (maximum parsimony) analyses may be related to introgression from Perilla and Mosla. Based on the analyses of phylogenetic trees within Elsholtzieae, the evolutionary trajectory of inflorescences demonstrates a pattern of diversification, with specialization as one aspect of this process. Elsholtzieae support the hypothesis that compressed inflorescences evolved from larger and more complex ancestral forms through successive compressions of the inflorescence axis. Additionally, certain lineages within the tribe display a trend towards simplified inflorescences, characterized by a reduction in the number of florets. This highlights both the specialization and the diversity in the evolution of inflorescence structures within the tribe.
Affiliation(s)
- Xin-Jie Jin
- College of Life and Environmental Science, Wenzhou University, Wenzhou, 325035, Zhejiang, China
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
- Institute for Eco-environmental Research of Sanyang Wetland, Wenzhou University, Wenzhou, 325014, Zhejiang, China
- Yan Yu
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, Sichuan, China
- Han-Yang Lin
- Zhejiang Provincial Key Laboratory of Plant Evolutionary Ecology and Conservation, School of Life Sciences, Taizhou University, Taizhou, 318000, Zhejiang, China
- Feng-Luan Liu
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
- Hai-Feng Wang
- College of Life and Environmental Science, Wenzhou University, Wenzhou, 325035, Zhejiang, China
- Qing Ma
- College of Biology and Environmental Engineering, Zhejiang Shuren University, Hangzhou, 310015, Zhejiang, China
- Yang Chen
- Laboratory of Systematic and Evolutionary Botany and Biodiversity, College of Life Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Yong-Hua Zhang
- College of Life and Environmental Science, Wenzhou University, Wenzhou, 325035, Zhejiang, China
- Institute for Eco-environmental Research of Sanyang Wetland, Wenzhou University, Wenzhou, 325014, Zhejiang, China
- Pan Li
- Laboratory of Systematic and Evolutionary Botany and Biodiversity, College of Life Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
3
Feuermann M, Mi H, Gaudet P, Muruganujan A, Lewis SE, Ebert D, Mushayahama T, Thomas PD. A compendium of human gene functions derived from evolutionary modelling. Nature 2025; 640:146-154. [PMID: 40011791 PMCID: PMC11964926 DOI: 10.1038/s41586-025-08592-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Accepted: 01/03/2025] [Indexed: 02/28/2025]
Abstract
A comprehensive, computable representation of the functional repertoire of all macromolecules encoded within the human genome is a foundational resource for biology and biomedical research. The Gene Ontology Consortium has been working towards this goal by generating a structured body of information about gene functions, which now includes experimental findings reported in more than 175,000 publications for human genes and genes in experimentally tractable model organisms1,2. Here, we describe the results of a large, international effort to integrate all of these findings to create a representation of human gene functions that is as complete and accurate as possible. Specifically, we apply an expert-curated, explicit evolutionary modelling approach to all human protein-coding genes. This approach integrates available experimental information across families of related genes into models that reconstruct the gain and loss of functional characteristics over evolutionary time. The models and the resulting set of 68,667 integrated gene functions cover approximately 82% of human protein-coding genes. The functional repertoire reveals a marked preponderance of molecular regulatory functions, and the models provide insights into the evolutionary origins of human gene functions. We show that our set of descriptions of functions can improve the widely used genomic technique of Gene Ontology enrichment analysis. The experimental evidence for each functional characteristic is recorded, thereby enabling the scientific community to help review and improve the resource, which we have made publicly available.
Affiliation(s)
- Marc Feuermann
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland
- Huaiyu Mi
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California Los Angeles, Los Angeles, CA, USA
- Pascale Gaudet
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland
- Anushya Muruganujan
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California Los Angeles, Los Angeles, CA, USA
- Suzanna E Lewis
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Dustin Ebert
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California Los Angeles, Los Angeles, CA, USA
- Tremayne Mushayahama
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California Los Angeles, Los Angeles, CA, USA
- Paul D Thomas
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California Los Angeles, Los Angeles, CA, USA.
4
Singh V, Singh V. Inferring Interaction Networks from Transcriptomic Data: Methods and Applications. Methods Mol Biol 2024; 2812:11-37. [PMID: 39068355 DOI: 10.1007/978-1-0716-3886-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Transcriptomic data is a treasure trove in modern molecular biology, as it offers a comprehensive viewpoint into the intricate nuances of gene expression dynamics underlying biological systems. This genetic information must be utilized to infer biomolecular interaction networks that can provide insights into the complex regulatory mechanisms underpinning the dynamic cellular processes. Gene regulatory networks and protein-protein interaction networks are two major classes of such networks. This chapter thoroughly investigates the wide range of methodologies used for distilling insightful revelations from transcriptomic data that include association-based methods (based on correlation among expression vectors), probabilistic models (using Bayesian and Gaussian models), and interologous methods. We review different approaches for evaluating the significance of interactions based on the network topology and biological functions of the interacting molecules and discuss various strategies for the identification of functional modules. The chapter concludes by highlighting network-based techniques for prioritizing key genes, outlining the centrality-based, diffusion-based, and subgraph-based methods. The chapter provides a meticulous framework for investigating transcriptomic data to uncover the assembly of complex molecular networks for their adaptable analyses across a broad spectrum of biological domains.
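The association-based methods mentioned in this abstract link two genes when their expression vectors are strongly correlated. The sketch below illustrates the idea with Pearson correlation and a fixed cutoff; the function names, the 0.9 threshold and the toy expression values are assumptions chosen for illustration and do not come from the chapter.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors.

    Assumes both vectors have non-zero variance; a constant vector
    would cause a division by zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression values across four samples.
toy_expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 1, 3, 2],
}
toy_edges = coexpression_edges(toy_expression)
```

In this toy data only geneA and geneB are connected, because geneC correlates weakly with both. A fixed cutoff is the simplest possible rule; the chapter's discussion of evaluating the significance of interactions addresses exactly the weakness of such a choice.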
Affiliation(s)
- Vikram Singh
- Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, India
- Vikram Singh
- Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala, Himachal Pradesh, India.
5
Julca I, Tan QW, Mutwil M. Toward kingdom-wide analyses of gene expression. Trends in Plant Science 2023; 28:235-249. [PMID: 36344371 DOI: 10.1016/j.tplants.2022.09.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/29/2022] [Revised: 09/22/2022] [Accepted: 09/30/2022] [Indexed: 06/16/2023]
Abstract
Gene expression data for Archaeplastida are accumulating exponentially, with more than 300 000 RNA-sequencing (RNA-seq) experiments available for hundreds of species. The gene expression data stem from thousands of experiments that capture gene expression in various organs, tissues, cell types, (a)biotic perturbations, and genotypes. Advances in software tools make it possible to process all these data in a matter of weeks on modern office computers, giving us the possibility to study gene expression in a kingdom-wide manner for the first time. We discuss how the expression data can be accessed and processed and outline analyses that take advantage of cross-species analyses, allowing us to generate powerful and robust hypotheses about gene function and evolution.
Affiliation(s)
- Irene Julca
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Qiao Wen Tan
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Marek Mutwil
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore.
6
Landscape of Post-Transcriptional tRNA Modifications in Streptomyces albidoflavus J1074 as Portrayed by Mass Spectrometry and Genomic Data Mining. J Bacteriol 2023; 205:e0029422. [PMID: 36468867 PMCID: PMC9879100 DOI: 10.1128/jb.00294-22] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/10/2022] Open
Abstract
The actinobacterial genus Streptomyces (streptomycetes) represents one of the largest cultivable groups of bacteria, famous for their ability to produce valuable specialized (secondary) metabolites. Regulation of secondary metabolic pathways inextricably couples the latter to essential cellular processes that determine levels of amino acids, carbohydrates, phosphate, etc. Post-transcriptional tRNA modifications remain one of the least studied aspects of streptomycete physiology, albeit a few of them were recently shown to impact antibiotic production. In this study, we describe the diversity of post-transcriptional tRNA modifications in model strain Streptomyces albus (albidoflavus) J1074 by combining mass spectrometry and genomic data. Our results show that J1074 can produce more chemically distinct tRNA modifications than previously thought. An in silico approach identified orthologs for enzymes governing most of the identified tRNA modifications. Yet, genetic control of certain modifications remained elusive, suggesting early divergence of tRNA modification pathways in Streptomyces from the better studied model bacteria, such as Escherichia coli and Bacillus subtilis. As a first case in point, our data point to the presence of a non-canonical MiaE enzyme performing hydroxylation of prenylated adenosines. A further finding concerns the methylthiotransferase MiaB, which requires previous modification of adenosines by MiaA to i6A for thiomethylation to ms2i6A. We show here that the J1074 ortholog, when overexpressed, yields ms2A in a ΔmiaA background. Our results set the working ground for and justify more detailed studies of the biological significance of tRNA modification pathways in streptomycetes.
IMPORTANCE Post-transcriptional tRNA modifications (PTTMs) play an important role in maturation and functionality of tRNAs. Little is known about tRNA modifications in the antibiotic-producing actinobacterial genus Streptomyces, even though peculiar tRNA-based regulatory mechanisms operate in this taxon. We provide a first detailed description of the chemical diversity of PTTMs in the model species, S. albidoflavus J1074, and identify the most plausible genes for these PTTMs. Some of the PTTMs are described for the first time for Streptomyces. Production of certain PTTMs in J1074 appears to depend on enzymes that show no sequence similarity to known PTTM enzymes from model species. Our findings are of relevance for interrogation of the genetic basis of PTTMs in pathogenic actinobacteria, such as M. tuberculosis.
7
Duan G, Wu G, Chen X, Tian D, Li Z, Sun Y, Du Z, Hao L, Song S, Gao Y, Xiao J, Zhang Z, Bao Y, Tang B, Zhao W. HGD: an integrated homologous gene database across multiple species. Nucleic Acids Res 2022; 51:D994-D1002. [PMID: 36318261 PMCID: PMC9825607 DOI: 10.1093/nar/gkac970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 09/28/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open
Abstract
Homology is fundamental to infer genes' evolutionary processes and relationships with shared ancestry. Existing homolog gene resources vary in terms of inferring methods, homologous relationship and identifiers, posing inevitable difficulties for choosing and mapping homology results from one to another. Here, we present HGD (Homologous Gene Database, https://ngdc.cncb.ac.cn/hgd), a comprehensive homologs resource integrating multi-species, multi-resources and multi-omics, as a complement to existing resources providing public and one-stop data service. Currently, HGD houses a total of 112 383 644 homologous pairs for 37 species, including 19 animals, 16 plants and 2 microorganisms. Meanwhile, HGD integrates various annotations from public resources, including 16 909 homologs with traits, 276 670 homologs with variants, 398 573 homologs with expression and 536 852 homologs with gene ontology (GO) annotations. HGD provides a wide range of omics gene function annotations to help users gain a deeper understanding of gene function.
Affiliation(s)
- Xiaoning Chen
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,University of Chinese Academy of Sciences, Beijing 100049, China
- Dongmei Tian
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Zhaohua Li
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,University of Chinese Academy of Sciences, Beijing 100049, China
- Yanling Sun
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Zhenglin Du
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Lili Hao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Shuhui Song
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,University of Chinese Academy of Sciences, Beijing 100049, China
- Yuan Gao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,University of Chinese Academy of Sciences, Beijing 100049, China
- Jingfa Xiao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,University of Chinese Academy of Sciences, Beijing 100049, China
- Zhang Zhang
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,University of Chinese Academy of Sciences, Beijing 100049, China
- Yiming Bao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China,University of Chinese Academy of Sciences, Beijing 100049, China
- Bixia Tang
- Correspondence may also be addressed to Bixia Tang.
- Wenming Zhao
- To whom correspondence should be addressed. Tel: +86 1084097636; Fax: +86 1084097720;
8
Ramírez-Zavaleta CY, García-Barrera LJ, Rodríguez-Verástegui LL, Arrieta-Flores D, Gregorio-Jorge J. An Overview of PRR- and NLR-Mediated Immunities: Conserved Signaling Components across the Plant Kingdom That Communicate Both Pathways. Int J Mol Sci 2022; 23:12974. [PMID: 36361764 PMCID: PMC9654257 DOI: 10.3390/ijms232112974] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/29/2022] [Revised: 10/17/2022] [Accepted: 10/18/2022] [Indexed: 09/10/2023] Open
Abstract
Cell-surface-localized pattern recognition receptors (PRRs) and intracellular nucleotide-binding domain and leucine-rich repeat receptors (NLRs) are plant immune proteins that trigger an orchestrated downstream signaling in response to molecules of microbial origin or host plant origin. Historically, PRRs have been associated with pattern-triggered immunity (PTI), whereas NLRs have been involved with effector-triggered immunity (ETI). However, recent studies reveal that such binary distinction is far from being applicable to the real world. Although the perception of plant pathogens and the final mounting response are achieved by different means, central hubs involved in signaling are shared between PTI and ETI, blurring the zig-zag model of plant immunity. In this review, we not only summarize our current understanding of PRR- and NLR-mediated immunities in plants, but also highlight those signaling components that are evolutionarily conserved across the plant kingdom. Altogether, we attempt to offer an overview of how plants mediate and integrate the induction of the defense responses that comprise PTI and ETI, emphasizing the need for more evolutionary molecular plant-microbe interactions (EvoMPMI) studies that will pave the way to a better understanding of the emergence of the core molecular machinery involved in the so-called evolutionary arms race between plants and microbes.
Affiliation(s)
- Candy Yuriria Ramírez-Zavaleta
- Programa Académico de Ingeniería en Biotecnología—Cuerpo Académico Procesos Biotecnológicos, Universidad Politécnica de Tlaxcala, Av. Universidad Politécnica 1, Tepeyanco 90180, Mexico
- Laura Jeannette García-Barrera
- Instituto de Biotecnología y Ecología Aplicada (INBIOTECA), Universidad Veracruzana, Av. de las Culturas, Veracruzanas No. 101, Xalapa 91090, Mexico
- Centro de Investigación en Biotecnología Aplicada, Instituto Politécnico Nacional, Carretera Estatal Santa Inés Tecuexcomac-Tepetitla Km.1.5, Santa Inés-Tecuexcomac-Tepetitla 90700, Mexico
- Daniela Arrieta-Flores
- Programa Académico de Ingeniería en Biotecnología—Cuerpo Académico Procesos Biotecnológicos, Universidad Politécnica de Tlaxcala, Av. Universidad Politécnica 1, Tepeyanco 90180, Mexico
- Departamento de Biotecnología, Universidad Autónoma Metropolitana, Iztapalapa, Ciudad de México 09310, Mexico
- Josefat Gregorio-Jorge
- Consejo Nacional de Ciencia y Tecnología—Comisión Nacional del Agua, Av. Insurgentes Sur 1582, Col. Crédito Constructor, Del. Benito Juárez, Ciudad de México 03940, Mexico
9
Zhang X, Smith DR. An overview of online resources for intra-species detection of gene duplications. Front Genet 2022; 13:1012788. [PMID: 36313461 PMCID: PMC9606816 DOI: 10.3389/fgene.2022.1012788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
Gene duplication plays an important role in evolutionary mechanism, which can act as a new source of genetic material in genome evolution. However, detecting duplicate genes from genomic data can be challenging. Various bioinformatics resources have been developed to identify duplicate genes from single and/or multiple species. Here, we summarize the metrics used to measure sequence identity among gene duplicates within species, compare several computational approaches that have been used to predict gene duplicates, and review recent advancements of a Basic Local Alignment Search Tool (BLAST)-based web tool and database, allowing future researchers to easily identify intra-species gene duplications. This article is a quick reference guide for research tools used for detecting gene duplicates.
Affiliation(s)
- Xi Zhang
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
- Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
- David Roy Smith
- Department of Biology, Western University, London, ON, Canada
10
Monzon V, Paysan-Lafosse T, Wood V, Bateman A. Reciprocal best structure hits: using AlphaFold models to discover distant homologues. Bioinformatics Advances 2022; 2:vbac072. [PMID: 36408459 PMCID: PMC9666668 DOI: 10.1093/bioadv/vbac072] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/21/2022] [Revised: 09/16/2022] [Accepted: 10/05/2022] [Indexed: 11/17/2022]
Abstract
Motivation: The conventional methods to detect homologous protein pairs use the comparison of protein sequences. But the sequences of two homologous proteins may diverge significantly and consequently may be undetectable by standard approaches. The release of the AlphaFold 2.0 software enables the prediction of highly accurate protein structures and opens many opportunities to advance our understanding of protein functions, including the detection of homologous protein structure pairs.
Results: In this proof-of-concept work, we search for the closest homologous protein pairs using the structure models of five model organisms from the AlphaFold database. We compare the results with homologous protein pairs detected by their sequence similarity and show that the structural matching approach finds a similar set of results. In addition, we detect potential novel homologs solely with the structural matching approach, which can help to understand the function of uncharacterized proteins and make previously overlooked connections between well-characterized proteins. We also observe limitations of our implementation of the structure-based approach, particularly when handling highly disordered proteins or short protein structures. Our work shows that high accuracy protein structure models can be used to discover homologous protein pairs, and we expose areas for improvement of this structural matching approach.
Availability and Implementation: Information on the discovered homologous protein pairs can be found at the following URL: https://doi.org/10.17863/CAM.87873. The code can be accessed here: https://github.com/VivianMonzon/Reciprocal_Best_Structure_Hits.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
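The "reciprocal best hit" criterion in the title can be stated independently of how similarity is scored: protein a in proteome A and protein b in proteome B form a pair when b is the top-scoring hit for a and a is the top-scoring hit for b. The sketch below shows only that pairing logic on made-up scores; it is not the authors' pipeline (their code is at the repository linked above), and the function names, identifiers and score values are hypothetical.

```python
def best_hits(scores):
    """Map each query to its single highest-scoring target.

    `scores` maps (query, target) tuples to a similarity score where
    higher means more similar.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical similarity scores between two small proteomes.
toy_a_vs_b = {
    ("a1", "b1"): 0.9,
    ("a1", "b2"): 0.5,
    ("a2", "b2"): 0.8,
    ("a3", "b1"): 0.7,
}
toy_b_vs_a = {
    ("b1", "a1"): 0.9,
    ("b2", "a2"): 0.8,
    ("b2", "a1"): 0.4,
}
toy_pairs = reciprocal_best_hits(toy_a_vs_b, toy_b_vs_a)
```

Here a3 is left unpaired: its best hit is b1, but b1's best hit is a1. The same logic applies whether the scores come from sequence alignment or from structural comparison.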
Affiliation(s)
- Vivian Monzon
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB21 4HH, UK
- Typhaine Paysan-Lafosse
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB21 4HH, UK
- Valerie Wood
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB21 4HH, UK
11
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
- Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
- Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Spain
12
Adaptation of the gut pathobiont Enterococcus faecalis to deoxycholate and taurocholate bile acids. Sci Rep 2022; 12:8485. [PMID: 35590028 PMCID: PMC9120511 DOI: 10.1038/s41598-022-12552-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/25/2022] [Accepted: 05/11/2022] [Indexed: 11/24/2022]
Abstract
Enterococcus faecalis is a natural inhabitant of the human gastrointestinal tract. This bacterial species is subdominant in a healthy physiological state of the gut microbiota (eubiosis) in adults, but can become dominant and cause infections when the intestinal homeostasis is disrupted (dysbiosis). The relatively high concentrations of bile acids deoxycholate (DCA) and taurocholate (TCA) hallmark eubiosis and dysbiosis, respectively. This study aimed to better understand how E. faecalis adapts to DCA and TCA. We showed that DCA impairs E. faecalis growth and possibly imposes a continuous adjustment in the expression of many essential genes, including a majority of ribosomal proteins. This may account for slow growth and low levels of E. faecalis in the gut. In contrast, TCA had no detectable growth effect. The evolving transcriptome upon TCA adaptation showed the early activation of an oligopeptide permease system (opp2) followed by the adjustment of amino acid and nucleotide metabolisms. We provide evidence that TCA favors the exploitation of oligopeptide resources to fuel amino acid needs in limiting oligopeptide conditions. Altogether, our data suggest that the combined effects of decreased DCA and increased TCA concentrations can contribute to the rise of E. faecalis population during dysbiosis.
13
Tan Y, Wang C, Schneider T, Li H, de Souza RF, Tang X, Swisher Grimm KD, Hsieh TF, Wang X, Li X, Zhang D. Comparative Phylogenomic Analysis Reveals Evolutionary Genomic Changes and Novel Toxin Families in Endophytic Liberibacter Pathogens. Microbiol Spectr 2021; 9:e0050921. [PMID: 34523996 PMCID: PMC8557891 DOI: 10.1128/spectrum.00509-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/08/2021] [Accepted: 08/10/2021] [Indexed: 01/02/2023]
Abstract
Liberibacter pathogens are the causative agents of several severe crop diseases worldwide, including citrus Huanglongbing and potato zebra chip. These bacteria are endophytic and nonculturable, which makes experimental approaches challenging and highlights the need for bioinformatic analysis in advancing our understanding about Liberibacter pathogenesis. Here, we performed an in-depth comparative phylogenomic analysis of the Liberibacter pathogens and their free-living, nonpathogenic, ancestral species, aiming to identify major genomic changes and determinants associated with their evolutionary transitions in living habitats and pathogenicity. Using gene neighborhood analysis and phylogenetic classification, we systematically uncovered, annotated, and classified all prophage loci into four types, including one previously unrecognized group. We showed that these prophages originated through independent gene transfers at different evolutionary stages of Liberibacter and only the SC-type prophage was associated with the emergence of the pathogens. Using ortholog clustering, we vigorously identified two additional sets of genomic genes, which were either lost or gained in the ancestor of the pathogens. Consistent with the habitat change, the lost genes were enriched for biosynthesis of cellular building blocks. Importantly, among the gained genes, we uncovered several previously unrecognized toxins, including new toxins homologous to the EspG/VirA effectors, a YdjM phospholipase toxin, and a secreted endonuclease/exonuclease/phosphatase (EEP) protein. Our results substantially extend the knowledge of the evolutionary events and potential determinants leading to the emergence of endophytic, pathogenic Liberibacter species, which will facilitate the design of functional experiments and the development of new methods for detection and blockage of these pathogens. 
IMPORTANCE Liberibacter pathogens are associated with several severe crop diseases, including citrus Huanglongbing, the most destructive disease to the citrus industry. Currently, no effective cure or treatments are available, and no resistant citrus variety has been found. The fact that these obligate endophytic pathogens are not culturable has made it extremely challenging to experimentally uncover the genes/proteins important to Liberibacter pathogenesis. Further, earlier bioinformatics studies failed to identify key genomic determinants, such as toxins and effector proteins, that underlie the pathogenicity of the bacteria. In this study, an in-depth comparative genomic analysis of Liberibacter pathogens along with their ancestral nonpathogenic species identified the prophage loci and several novel toxins that are evolutionarily associated with the emergence of the pathogens. These results shed new light on the disease mechanism of Liberibacter pathogens and will facilitate the development of new detection and blockage methods targeting the toxins.
Affiliation(s)
- Yongjun Tan
- Department of Biology, College of Arts & Sciences, Saint Louis University, St. Louis, Missouri, USA
- Cindy Wang
- Department of Biology, College of Arts & Sciences, Saint Louis University, St. Louis, Missouri, USA
- Theresa Schneider
- Department of Biology, College of Arts & Sciences, Saint Louis University, St. Louis, Missouri, USA
- Huan Li
- Department of Biology, College of Arts & Sciences, Saint Louis University, St. Louis, Missouri, USA
- Robson Francisco de Souza
- Departamento de Microbiologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo, Brazil
- Xueming Tang
- School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Kylie D. Swisher Grimm
- United States Department of Agriculture—Agricultural Research Service, Temperate Tree Fruit and Vegetable Research Unit, Prosser, Washington, USA
- Tzung-Fu Hsieh
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, North Carolina, USA
- Plants for Human Health Institute, North Carolina State University, Kannapolis, North Carolina, USA
- Xu Wang
- Department of Pathobiology, College of Veterinary Medicine, Auburn University, Auburn, Alabama, USA
- Alabama Agricultural Experiment Station, Auburn University, Auburn, Alabama, USA
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, USA
- Xu Li
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, North Carolina, USA
- Plants for Human Health Institute, North Carolina State University, Kannapolis, North Carolina, USA
- Dapeng Zhang
- Department of Biology, College of Arts & Sciences, Saint Louis University, St. Louis, Missouri, USA
- Bioinformatics and Computational Biology Program, College of Arts & Sciences, Saint Louis University, St. Louis, Missouri, USA
14
Wang P, Mao Y, Su Y, Wang J. Comparative analysis of transcriptomic data shows the effects of multiple evolutionary selection processes on codon usage in Marsupenaeus japonicus and Marsupenaeus pulchricaudatus. BMC Genomics 2021; 22:781. [PMID: 34717552 PMCID: PMC8557549 DOI: 10.1186/s12864-021-08106-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/27/2020] [Accepted: 10/19/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Kuruma shrimp, a major commercial shrimp species in the world, has two cryptic or sibling species, Marsupenaeus japonicus and Marsupenaeus pulchricaudatus. Codon usage analysis would contribute to our understanding of the genetic and evolutionary characteristics of the two Marsupenaeus species. In this study, we analyzed codon usage and related indices using coding sequences (CDSs) from RNA-seq data. RESULTS Using CodonW 1.4.2 software, we performed the codon bias analysis of transcriptomes obtained from hepatopancreas tissues, which indicated weak codon bias. Almost all parameters had similar correlations for both species. The gene expression level (FPKM) was negatively correlated with A/T3s. We determined 12 and 14 optimal codons for M. japonicus and M. pulchricaudatus, respectively, and all optimal codons are C/G-ending. The two Marsupenaeus species had different usage frequencies of codon pairs, which contributed to further analysis of transcriptional differences between them. Orthologous genes that underwent positive selection (ω > 1) had a higher correlation coefficient than those that experienced purifying selection (ω < 1). Parity Rule 2 (PR2) and effective number of codons (ENc) plot analysis showed that the codon usage patterns of both species were influenced by both mutations and selection. Moreover, the average observed ENc value was lower than the expected value for both species, suggesting that factors other than GC may play roles in these phenomena. The results of multispecies clustering based on codon preference were consistent with traditional classification. CONCLUSIONS This study provides a relatively comprehensive understanding of the correlations among codon usage bias, gene expression, and selection pressures of CDSs for M. japonicus and M. pulchricaudatus. The genetic evolution was driven by mutations and selection pressure. Moreover, the results provide new insights into the specificities and evolutionary characteristics of the two Marsupenaeus species.
Affiliation(s)
- Panpan Wang
- Jiangsu Key Laboratory of Marine Bioresources and Environment/ Jiangsu Key Laboratory of Marine Biotechnology, Jiangsu Ocean University, Lianyungang, 222005, China
- Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang, 222005, China
- The Jiangsu Provincial Infrastructure for Conservation and Utilization of Agricultural Germplasm, Nanjing, 210014, China
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Yong Mao
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen, 361102, China
- Yongquan Su
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Jun Wang
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China
15
Large-scale phylogenomics of the genus Macrostomum (Platyhelminthes) reveals cryptic diversity and novel sexual traits. Mol Phylogenet Evol 2021; 166:107296. [PMID: 34438051 DOI: 10.1016/j.ympev.2021.107296] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/29/2021] [Revised: 07/01/2021] [Accepted: 08/19/2021] [Indexed: 02/07/2023]
Abstract
Free-living flatworms of the genus Macrostomum are small and transparent animals, representing attractive study organisms for a broad range of topics in evolutionary, developmental, and molecular biology. The genus includes the model organism M. lignano for which extensive molecular resources are available, and recently there is a growing interest in extending work to additional species in the genus. These endeavours are currently hindered because, even though >200 Macrostomum species have been taxonomically described, molecular phylogenetic information and geographic sampling remain limited. We report on a global sampling campaign aimed at increasing taxon sampling and geographic representation of the genus. Specifically, we use extensive transcriptome and single-locus data to generate phylogenomic hypotheses including 145 species. Across different phylogenetic methods and alignments used, we identify several consistent clades, while their exact grouping is less clear, possibly due to a radiation early in Macrostomum evolution. Moreover, we uncover a large undescribed diversity, with 94 of the studied species likely being new to science, and we identify multiple novel morphological traits. Furthermore, we identify cryptic speciation in a taxonomically challenging assemblage of species, suggesting that the use of molecular markers is a prerequisite for future work, and we describe the distribution of putative synapomorphies and suggest taxonomic revisions based on our finding. Our large-scale phylogenomic dataset now provides a robust foundation for comparative analyses of morphological, behavioural and molecular evolution in this genus.
16
Parreira VDSC, Santos LGC, Rodrigues ML, Passetti F. ExVe: The knowledge base of orthologous proteins identified in fungal extracellular vesicles. Comput Struct Biotechnol J 2021; 19:2286-2296. [PMID: 33995920 PMCID: PMC8102145 DOI: 10.1016/j.csbj.2021.04.031] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/18/2020] [Revised: 04/09/2021] [Accepted: 04/13/2021] [Indexed: 12/11/2022]
Abstract
Extracellular vesicles (EVs) are double-membrane particles associated with intercellular communication. Since the discovery of EV production in the fungus Cryptococcus neoformans, the importance of EV release in its physiology and pathogenicity has been investigated. To date, few studies have investigated the proteomic content of EVs from multiple fungal species. Our main objective was to use an orthology approach to compare proteins identified by EV shotgun proteomics in 8 pathogenic and 1 nonpathogenic species. Using protein information from the UniProt and FungiDB databases, we integrated data for 11,433 hits in fungal EVs with an orthology perspective, resulting in 3,834 different orthologous groups. OG6_100083 (Hsp70 Pfam domain) was the unique orthologous group that was identified for all fungal species. Proteins with this protein domain are associated with the stress response, survival and morphological changes in different fungal species. Although no pathogenic orthologous group was found, we identified 5 orthologous groups exclusive to S. cerevisiae. Using the criteria of at least 7 pathogenic fungi to define a cluster, we detected the 4 unique pathogenic orthologous groups. Taken together, our data suggest that Hsp70-related proteins might play a key role in fungal EVs, regardless of the pathogenic status. Using an orthology approach, we identified at least 4 protein domains that could be novel therapeutic targets against pathogenic fungi. Our results were compiled in the herein described ExVe database, which is publicly available at http://exve.icc.fiocruz.br.
Affiliation(s)
- Marcio L Rodrigues
- Instituto Carlos Chagas, FIOCRUZ, Rua Prof. Algacyr Munhoz Mader, 3775, CEP 81350-010, Curitiba/PR, Brazil; Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro (UFRJ), Brazil
- Fabio Passetti
- Instituto Carlos Chagas, FIOCRUZ, Rua Prof. Algacyr Munhoz Mader, 3775, CEP 81350-010, Curitiba/PR, Brazil
17
Chen Y, Song W, Xie X, Wang Z, Guan P, Peng H, Jiao Y, Ni Z, Sun Q, Guo W. A Collinearity-Incorporating Homology Inference Strategy for Connecting Emerging Assemblies in the Triticeae Tribe as a Pilot Practice in the Plant Pangenomic Era. MOLECULAR PLANT 2020; 13:1694-1708. [PMID: 32979565 DOI: 10.1016/j.molp.2020.09.019] [Citation(s) in RCA: 146] [Impact Index Per Article: 29.2] [Received: 07/06/2020] [Revised: 09/03/2020] [Accepted: 09/21/2020] [Indexed: 05/18/2023]
Abstract
Plant genome sequencing has dramatically increased, and some species even have multiple high-quality reference versions. Demands for clade-specific homology inference and analysis have increased in the pangenomic era. Here we present a novel method, GeneTribe (https://chenym1.github.io/genetribe/), for homology inference among genetically similar genomes that incorporates gene collinearity and shows better performance than traditional sequence-similarity-based methods in terms of accuracy and scalability. The Triticeae tribe is a typical allopolyploid-rich clade with complex species relationships that includes many important crops, such as wheat, barley, and rye. We built Triticeae-GeneTribe (http://wheat.cau.edu.cn/TGT/), a homology database, by integrating 12 Triticeae genomes and 3 outgroup model genomes and implemented versatile analysis and visualization functions. With macrocollinearity analysis, we were able to construct a refined model illustrating the structural rearrangements of the 4A-5A-7B chromosomes in wheat as two major translocation events. With collinearity analysis at both the macro- and microscale, we illustrated the complex evolutionary history of homologs of the wheat vernalization gene Vrn2, which evolved as a combined result of genome translocation, duplication, and polyploidization and gene loss events. Our work provides a useful practice for connecting emerging genome assemblies, with awareness of the extensive polyploidy in plants, and will help researchers efficiently exploit genome sequence resources.
Affiliation(s)
- Yongming Chen
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Wanjun Song
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China; Beijing Geek Gene Technology Co Ltd, Beijing 100193, China
- Xiaoming Xie
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Zihao Wang
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Panfeng Guan
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Huiru Peng
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Yuannian Jiao
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Zhongfu Ni
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Qixin Sun
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Weilong Guo
- Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
18
Brandies PA, Tang S, Johnson RSP, Hogg CJ, Belov K. The first Antechinus reference genome provides a resource for investigating the genetic basis of semelparity and age-related neuropathologies. GIGABYTE 2020; 2020:gigabyte7. [PMID: 36824596 PMCID: PMC9631953 DOI: 10.46471/gigabyte.7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/21/2020] [Accepted: 11/04/2020] [Indexed: 12/14/2022]
Abstract
Antechinus is a genus of mouse-like marsupials that exhibit a rare reproductive strategy known as semelparity and also naturally develop age-related neuropathologies similar to those in humans. We provide the first annotated antechinus reference genome for the brown antechinus (Antechinus stuartii). The reference genome is 3.3 Gb in size with a scaffold N50 of 73 Mb and 93.3% complete mammalian BUSCOs. Using bioinformatic methods we assign scaffolds to chromosomes and identify 0.78 Mb of Y-chromosome scaffolds. Comparative genomics revealed interesting expansions in the NMRK2 gene and the protocadherin gamma family, which have previously been associated with aging and age-related dementias, respectively. Transcriptome data displayed expression of common Alzheimer's related genes in the antechinus brain and highlighted the potential of utilising the antechinus as a future disease model. The valuable genomic resources provided herein will enable future research to explore the genetic basis of semelparity and age-related processes in the antechinus.
Affiliation(s)
- Parice A. Brandies
- School of Life and Environmental Sciences, Faculty of Science, University of Sydney, Sydney, New South Wales, Australia
- Simon Tang
- School of Life and Environmental Sciences, Faculty of Science, University of Sydney, Sydney, New South Wales, Australia
- Robert S. P. Johnson
- Zoologica: Veterinary and Zoological Consulting, Millthorpe, New South Wales, Australia
- Carolyn J. Hogg
- School of Life and Environmental Sciences, Faculty of Science, University of Sydney, Sydney, New South Wales, Australia
- Katherine Belov
- School of Life and Environmental Sciences, Faculty of Science, University of Sydney, Sydney, New South Wales, Australia
19
Hernández-Salmerón JE, Moreno-Hagelsieb G. Progress in quickly finding orthologs as reciprocal best hits: comparing blast, last, diamond and MMseqs2. BMC Genomics 2020; 21:741. [PMID: 33099302 PMCID: PMC7585182 DOI: 10.1186/s12864-020-07132-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 05/26/2020] [Accepted: 10/09/2020] [Indexed: 02/07/2023]
Abstract
BACKGROUND Finding orthologs remains an important bottleneck in comparative genomics analyses. While the authors of software for the quick comparison of protein sequences evaluate the speed of their software and compare their results against the most usual software for the task, it is not common for them to evaluate their software for more particular uses, such as finding orthologs as reciprocal best hits (RBH). Here we compared RBH results obtained using software that runs faster than blastp, namely lastal, diamond, and MMseqs2. RESULTS We found that lastal required the least time to produce results. However, it yielded fewer results than any other program when comparing the proteins encoded by evolutionarily distant genomes. The program producing the number of RBH most similar to blastp was diamond run with the "ultra-sensitive" option. However, this option was diamond's slowest, with the "very-sensitive" option offering the best balance between speed and RBH results. The speed-up of the programs was much more evident when dealing with eukaryotic genomes, which code for more numerous proteins. For example, lastal took a median of approx. 1.5% of the blastp time to run with bacterial proteomes and 0.6% with eukaryotic ones, while diamond with the very-sensitive option took 7.4% and 5.2%, respectively. Though estimated error rates were very similar among the RBH obtained with all programs, RBH obtained with MMseqs2 had the lowest error rates among the programs tested. CONCLUSIONS The fast algorithms for pairwise protein comparison produced results very similar to blast in a fraction of the time, with diamond offering the best compromise in speed, sensitivity and quality, as long as a sensitivity option other than the default was chosen.
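The reciprocal best hit criterion named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each search has already been reduced to (query, subject, bit score) tuples, and the function names and example identifiers are hypothetical.

```python
def best_hits(hits):
    """Map each query to its top-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples.
    On a tied score the first hit encountered is kept (an arbitrary choice).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits: proteome A searched against B, and B against A.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
          ("a2", "b1", 180.0), ("a2", "b2", 90.0)]
b_vs_a = [("b1", "a1", 198.0), ("b1", "a2", 175.0), ("b2", "a1", 140.0)]

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

Here a2 also hits b1 best, but b1's best hit is a1, so only the a1/b1 pair qualifies. Real analyses typically add coverage and E-value filters before this step.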
Affiliation(s)
- Gabriel Moreno-Hagelsieb
- Wilfrid Laurier University, Department of Biology, 75 University Ave W, Waterloo, N2L 3C5 ON Canada
20
Lallemand T, Leduc M, Landès C, Rizzon C, Lerat E. An Overview of Duplicated Gene Detection Methods: Why the Duplication Mechanism Has to Be Accounted for in Their Choice. Genes (Basel) 2020; 11:E1046. [PMID: 32899740 PMCID: PMC7565063 DOI: 10.3390/genes11091046] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 07/30/2020] [Revised: 09/01/2020] [Accepted: 09/02/2020] [Indexed: 12/11/2022]
Abstract
Gene duplication is an important evolutionary mechanism that provides new genetic material and thus opportunities for an organism to acquire new gene functions, with major implications such as speciation events. Various processes are known to allow a gene to be duplicated, and different models explain how duplicated genes can be maintained in genomes. Because of their importance, identifying duplicated genes is essential when studying genome evolution, but it can still be a challenge because of the various fates duplicated genes can encounter. In this review, we first describe the evolutionary processes that give rise to duplicated genes and then describe the various bioinformatic approaches that can be used to identify them in genome sequences. Indeed, these bioinformatic approaches differ according to the underlying duplication mechanism. Hence, understanding the specificity of the duplicated genes of interest is a great asset for tool selection and should be taken into account when exploring a biological question.
Affiliation(s)
- Tanguy Lallemand
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
- Martin Leduc
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
- Claudine Landès
- IRHS, Agrocampus-Ouest, INRAE, Université d’Angers, SFR 4207 QuaSaV, 49071 Beaucouzé, France
- Carène Rizzon
- Laboratoire de Mathématiques et Modélisation d’Evry (LaMME), Université d’Evry Val d’Essonne, Université Paris-Saclay, UMR CNRS 8071, ENSIIE, USC INRAE, 23 bvd de France, CEDEX, 91037 Evry Paris, France
- Emmanuelle Lerat
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, F-69622 Villeurbanne, France
21
Abstract
Knowing phylogenetic relationships among species is fundamental for many studies in biology. An accurate phylogenetic tree underpins our understanding of the major transitions in evolution, such as the emergence of new body plans or metabolism, and is key to inferring the origin of new genes, detecting molecular adaptation, understanding morphological character evolution and reconstructing demographic changes in recently diverged species. Although data are ever more plentiful and powerful analysis methods are available, there remain many challenges to reliable tree building. Here, we discuss the major steps of phylogenetic analysis, including identification of orthologous genes or proteins, multiple sequence alignment, and choice of substitution models and inference methodologies. Understanding the different sources of errors and the strategies to mitigate them is essential for assembling an accurate tree of life.
22
Abe T, Ikarashi R, Mizoguchi M, Otake M, Ikemura T. A strategy for predicting gene functions from genome and metagenome sequences on the basis of oligopeptide frequency distance. Genes Genet Syst 2020; 95:11-19. [DOI: 10.1266/ggs.19-00041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Affiliation(s)
- Takashi Abe
- Department of Information Engineering, Faculty of Engineering, Niigata University
- Ryo Ikarashi
- Department of Information Engineering, Faculty of Engineering, Niigata University
- Masaya Mizoguchi
- Department of Information Engineering, Faculty of Engineering, Niigata University
- Masashi Otake
- Department of Information Engineering, Faculty of Engineering, Niigata University
- Toshimichi Ikemura
- Department of Bioscience, Nagahama Institute of Bio-Science and Technology
23
Zhang Y, Zhang Z, Zhang H, Zhao Y, Zhang Z, Xiao J. PADS Arsenal: a database of prokaryotic defense systems related genes. Nucleic Acids Res 2020; 48:D590-D598. [PMID: 31620779 PMCID: PMC7145686 DOI: 10.1093/nar/gkz916] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 08/12/2019] [Revised: 10/03/2019] [Accepted: 10/04/2019] [Indexed: 12/16/2022]
Abstract
Defense systems are vital weapons for prokaryotes to resist heterologous DNA and survive the constant invasion of viruses, and they are widely used in biochemistry investigation and antimicrobial drug research. So far, numerous types of defense systems have been discovered, but there is no comprehensive defense systems database to organize prokaryotic defense gene datasets. To fill this gap, we unveil the prokaryotic antiviral defense system (PADS) Arsenal (https://bigd.big.ac.cn/padsarsenal), a public database dedicated to gathering, storing, analyzing and visualizing prokaryotic defense gene datasets. The initial version of PADS Arsenal integrates 18 distinctive categories of defense system with the annotation of 6,600,264 genes retrieved from 63,701 genomes across 33,390 species of archaea and bacteria. PADS Arsenal provides various ways to retrieve information on defense-system-related genes and to visualize them in multiple display modes. Moreover, an online analysis pipeline is integrated into PADS Arsenal to facilitate annotation and evolutionary analysis of defense genes. PADS Arsenal can also visualize the dynamic variation information of defense genes from pan-genome analysis. Overall, PADS Arsenal is a state-of-the-art open comprehensive resource to accelerate the research of prokaryotic defense systems.
Affiliation(s)
- Yadong Zhang
- National Genomics Data Center, Beijing 100101, China
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhewen Zhang
- National Genomics Data Center, Beijing 100101, China
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Hao Zhang
- National Genomics Data Center, Beijing 100101, China
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yongbing Zhao
- Department of Health Sciences Research, Mayo Clinic, Jacksonville, FL 32224, USA
| | - Zaichao Zhang
- Department of Biology, The University of Western Ontario, London, Ontario N6A 5B7, Canada
| | - Jingfa Xiao
- National Genomics Data Center, Beijing 100101, China
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| |
Collapse
|
24
|
Park YC, Choi SY, Kim JH, Jang CS. Molecular Functions of Rice Cytosol-Localized RING Finger Protein 1 in Response to Salt and Drought and Comparative Analysis of Its Grass Orthologs. PLANT & CELL PHYSIOLOGY 2019; 60:2394-2409. [PMID: 31292649 DOI: 10.1093/pcp/pcz133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/09/2019] [Accepted: 07/02/2019] [Indexed: 05/29/2023]
Abstract
In higher plants, the post-translational modification of target proteins via the attachment of molecules such as ubiquitin (Ub) mediates a variety of cellular functions via the Ub/26S proteasome system. Here, a really interesting new gene (RING)-H2 type E3 ligase, which regulates target proteins via the Ub/26S proteasome system, was isolated from a rice plant, and its other grass orthologs were examined to determine the evolution of its molecular function during speciation. The gene encoding Oryza sativa cytoplasmic-localized RING finger protein 1 (OsCLR1) was highly expressed under salt and drought stresses. By contrast, the three grass orthologs, SbCLR1 from Sorghum bicolor, ZmCLR1 from Zea mays and TaCLR1 from Triticum aestivum, showed different responses to these stresses. Despite these differences, all four orthologs exhibited E3 ligase activity with cytosol-targeted localization, demonstrating conserved molecular functions. Although OsCLR1-overexpressing plants showed higher survival rates under both salt and drought stresses than wild-type (WT) plants, this pattern was not observed in the other orthologs. In addition, OsCLR1-overexpressing plants exhibited lower germination rates in ABA than WT plants, whereas plants overexpressing the three CLR1 orthologs showed rates similar to the WT plants. These results indicate the positive regulation of OsCLR1 in response to salt and drought in an ABA-dependent manner. Although the molecular functions of the three CLR1 orthologs remain largely unknown, our results provide an insight into the evolutionary fate of CLR1 grass orthologs during speciation after the divergence from a common ancestor.
Affiliation(s)
- Yong Chan Park
  - Plant Genomics Laboratory, Department of Bio-Resources Sciences, Kangwon National University, Chuncheon, Republic of Korea
- Seung Young Choi
  - Plant Genomics Laboratory, Department of Bio-Resources Sciences, Kangwon National University, Chuncheon, Republic of Korea
- Jong Ho Kim
  - Plant Genomics Laboratory, Department of Bio-Resources Sciences, Kangwon National University, Chuncheon, Republic of Korea
- Cheol Seong Jang
  - Plant Genomics Laboratory, Department of Bio-Resources Sciences, Kangwon National University, Chuncheon, Republic of Korea
25
Hu X, Friedberg I. SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier. Gigascience 2019; 8:giz118. [PMID: 31648300 PMCID: PMC6812468 DOI: 10.1093/gigascience/giz118] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 02/13/2019] [Revised: 06/07/2019] [Accepted: 09/05/2019] [Indexed: 11/13/2022]
Abstract
BACKGROUND Gene homology type classification is required for many types of genome analyses, including comparative genomics, phylogenetics, and protein function annotation. Consequently, a large variety of tools have been developed to perform homology classification across genomes of different species. However, when applied to large genomic data sets, these tools require high memory and CPU usage, typically available only in computational clusters. FINDINGS Here we present a new graph-based orthology analysis tool, SwiftOrtho, which is optimized for speed and memory usage when applied to large-scale data. SwiftOrtho uses long k-mers to speed up homology search, while using a reduced amino acid alphabet and spaced seeds to compensate for the loss of sensitivity due to long k-mers. In addition, it uses an affinity propagation algorithm to reduce the memory usage when clustering large-scale orthology relationships into orthologous groups. In our tests, SwiftOrtho was the only tool that completed orthology analysis of proteins from 1,760 bacterial genomes on a computer with only 4 GB RAM. Using various standard orthology data sets, we also show that SwiftOrtho has a high accuracy. CONCLUSIONS SwiftOrtho enables the accurate comparative genomic analyses of thousands of genomes using low-memory computers. SwiftOrtho is available at https://github.com/Rinoahu/SwiftOrtho.
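The seeding idea described in this abstract (long k-mers taken over a reduced amino acid alphabet and sampled through a spaced seed) can be illustrated with a minimal sketch. The residue grouping, the seed pattern `1101011`, and all function names below are illustrative assumptions; they are not SwiftOrtho's actual alphabet, seeds, or API.

```python
from collections import defaultdict

# Hypothetical reduced alphabet: residues with similar chemistry share one symbol.
# This grouping is an illustrative assumption, not the one SwiftOrtho uses.
GROUPS = ["AG", "ST", "C", "P", "DE", "NQ", "KR", "H", "ILMV", "FWY"]
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group representative; unknown residues become 'X'."""
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


def spaced_seeds(seq, pattern="1101011"):
    """Yield (seed, start) pairs, keeping only positions marked '1' in the pattern."""
    reduced = reduce_sequence(seq)
    keep = [i for i, flag in enumerate(pattern) if flag == "1"]
    for start in range(len(reduced) - len(pattern) + 1):
        window = reduced[start:start + len(pattern)]
        yield "".join(window[i] for i in keep), start


def shared_seed_hits(query, target, pattern="1101011"):
    """Return sorted (query_pos, target_pos) pairs whose spaced seeds match."""
    index = defaultdict(list)
    for seed, pos in spaced_seeds(target, pattern):
        index[seed].append(pos)
    return sorted(
        (qpos, tpos)
        for seed, qpos in spaced_seeds(query, pattern)
        for tpos in index.get(seed, [])
    )


# The two peptides differ at five of seven residues, yet they still share a seed:
# the reduced alphabet absorbs K/R and V/I/L substitutions, and the '0' positions
# of the pattern ignore the remaining mismatches.
print(shared_seed_hits("MKVLAAG", "MRIVSGG"))  # [(0, 0)]
```

In this toy setting, the reduced alphabet and the "don't care" positions are what let a long seed tolerate substitutions, which is the sensitivity trade-off the abstract refers to.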
Affiliation(s)
- Xiao Hu
  - Department of Veterinary Microbiology and Preventive Medicine, 2118 Veterinary Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA, 50011, USA
- Iddo Friedberg
  - Department of Veterinary Microbiology and Preventive Medicine, 2118 Veterinary Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA, 50011, USA
26
Correia K, Yu SM, Mahadevan R. AYbRAH: a curated ortholog database for yeasts and fungi spanning 600 million years of evolution. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2019; 2019:5403499. [PMID: 30893420 PMCID: PMC6425859 DOI: 10.1093/database/baz022] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/30/2018] [Revised: 01/17/2019] [Accepted: 01/28/2019] [Indexed: 12/14/2022]
Abstract
Budding yeasts inhabit a range of environments by exploiting various metabolic traits. The genetic bases for these traits are mostly unknown, preventing their addition or removal in a chassis organism for metabolic engineering. Insight into the evolution of orthologs, paralogs and xenologs in the yeast pan-genome can help bridge these genotypes; however, existing phylogenomic databases do not span diverse yeasts, and sometimes cannot distinguish between these homologs. To help understand the molecular evolution of these traits in yeasts, we created Analyzing Yeasts by Reconstructing Ancestry of Homologs (AYbRAH), an open-source database of predicted and manually curated ortholog groups for 33 diverse fungi and yeasts in Dikarya, spanning 600 million years of evolution. OrthoMCL and OrthoDB were used to cluster protein sequence into ortholog and homolog groups, respectively; MAFFT and PhyML reconstructed the phylogeny of all homolog groups. Ortholog assignments for enzymes and small metabolite transporters were compared to their phylogenetic reconstruction, and curated to resolve any discrepancies. Information on homolog and ortholog groups can be viewed in the AYbRAH web portal (https://lmse.github.io/aybrah/), including functional annotations, predictions for mitochondrial localization and transmembrane domains, literature references and phylogenetic reconstructions. Ortholog assignments in AYbRAH were compared to HOGENOM, KEGG Orthology, OMA, eggNOG and PANTHER. PANTHER and OMA had the most congruent ortholog groups with AYbRAH, while the other phylogenomic databases had greater amounts of under-clustering, over-clustering or no ortholog annotations for proteins. Future plans are discussed for AYbRAH, and recommendations are made for other research communities seeking to create curated ortholog databases.
Affiliation(s)
- Kevin Correia
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, College Street, Toronto, ON, Canada
- Shi M Yu
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, College Street, Toronto, ON, Canada
- Radhakrishnan Mahadevan
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, College Street, Toronto, ON, Canada
  - Institute of Biomaterials and Biomedical Engineering, University of Toronto, College Street, Toronto, ON, Canada
27
Rey C, Veber P, Boussau B, Sémon M. CAARS: comparative assembly and annotation of RNA-Seq data. Bioinformatics 2019; 35:2199-2207. [PMID: 30452539 PMCID: PMC6596894 DOI: 10.1093/bioinformatics/bty903] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/28/2017] [Revised: 09/13/2018] [Accepted: 11/16/2018] [Indexed: 02/05/2023]
Abstract
MOTIVATION RNA sequencing (RNA-Seq) is a widely used approach to obtain transcript sequences in non-model organisms, notably for performing comparative analyses. However, current bioinformatic pipelines do not take full advantage of pre-existing reference data in related species for improving RNA-Seq assembly, annotation and gene family reconstruction. RESULTS We built an automated pipeline named CAARS to combine novel data from RNA-Seq experiments with existing multi-species gene family alignments. RNA-Seq reads are assembled into transcripts by both de novo and assisted assemblies. Then, CAARS incorporates transcripts into gene families, builds gene alignments and trees and uses phylogenetic information to classify the genes as orthologs and paralogs of existing genes. We used CAARS to assemble and annotate RNA-Seq data in rodents and fishes using distantly related genomes as reference, a difficult case for this kind of analysis. We showed CAARS assemblies are more complete and accurate than those assembled by a standard pipeline consisting of de novo assembly coupled with annotation by sequence similarity on a guide species. In addition to annotated transcripts, CAARS provides gene family alignments and trees, annotated with orthology relationships, directly usable for downstream comparative analyses. AVAILABILITY AND IMPLEMENTATION CAARS is implemented in Python and Ocaml and is freely available at https://github.com/carinerey/caars. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Carine Rey
  - UnivLyon, Université Claude Bernard Lyon 1, ENS de Lyon, CNRS UMR, INSERM U1210, LBMC, F-69007, Lyon, France
- Philippe Veber
  - UnivLyon, Université Claude Bernard Lyon 1, CNRS, UMR, LBBE, F-69100, Villeurbanne, France
- Bastien Boussau
  - UnivLyon, Université Claude Bernard Lyon 1, CNRS, UMR, LBBE, F-69100, Villeurbanne, France
- Marie Sémon
  - UnivLyon, Université Claude Bernard Lyon 1, ENS de Lyon, CNRS UMR, INSERM U1210, LBMC, F-69007, Lyon, France
28
Torres Manno MA, Pizarro MD, Prunello M, Magni C, Daurelio LD, Espariz M. GeM-Pro: a tool for genome functional mining and microbial profiling. Appl Microbiol Biotechnol 2019; 103:3123-3134. [DOI: 10.1007/s00253-019-09648-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/07/2018] [Revised: 01/11/2019] [Accepted: 01/14/2019] [Indexed: 11/30/2022]
29
Vieira GA, Prosdocimi F. Accessible molecular phylogenomics at no cost: obtaining 14 new mitogenomes for the ant subfamily Pseudomyrmecinae from public data. PeerJ 2019; 7:e6271. [PMID: 30697483 PMCID: PMC6348091 DOI: 10.7717/peerj.6271] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 11/07/2018] [Accepted: 12/10/2018] [Indexed: 11/20/2022]
Abstract
The advent of Next Generation Sequencing has reduced sequencing costs and increased the number of genomic projects across a huge range of organismal taxa, generating an unprecedented amount of publicly available genomic data. Often, researchers use only a tiny, highly relevant fraction of the genomic data they produce in their own work, which allows the remaining data to be recycled in further projects worldwide. The assembly of complete mitogenomes is frequently overlooked, though it is useful for understanding evolutionary relationships among taxa, especially those with poor mtDNA sampling at the level of genera and families. This is exactly the case for ants (Hymenoptera: Formicidae) and more specifically for the subfamily Pseudomyrmecinae, a group of arboreal ants with several cases of convergent coevolution for which no complete mitochondrial sequence was available. In this work, we assembled, annotated and performed comparative genomics analyses of 14 new complete mitochondria from Pseudomyrmecinae species relying solely on public datasets available from the Sequence Read Archive (SRA). We used all complete mitogenomes available for ants to study gene order conservation and to generate two phylogenetic trees using (i) a concatenated set of 13 mitochondrial genes and (ii) the whole mitochondrial sequences. Even though the tree topologies diverged subtly from each other (and from previous studies), our results confirm several known relationships and generate new evidence for sister clade classification inside the Pseudomyrmecinae clade. We also performed a synteny analysis for Formicidae and identified possible sites at which nucleotide insertions happened in the mitogenomes of pseudomyrmecine ants. Using a data mining/bioinformatics approach, the current work increased the number of complete mitochondrial genomes available for ants from 15 to 29, demonstrating the unique potential of public databases for mitogenomics studies. The wide applications of mitogenomes in research and the presence of mitochondrial data in different public dataset types make the "no budget mitogenomics" approach ideal for comprehensive molecular studies, especially for subsampled taxa.
Affiliation(s)
- Gabriel A. Vieira
  - Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil
- Francisco Prosdocimi
  - Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil
30
Abstract
The distinction between orthologs and paralogs, genes that started diverging by speciation versus duplication, is relevant in a wide range of contexts, most notably phylogenetic tree inference and protein function annotation. In this chapter, we provide an overview of the methods used to infer orthology and paralogy. We survey both graph-based approaches (and their various grouping strategies) and tree-based approaches, which solve the more general problem of gene/species tree reconciliation. We discuss conceptual differences among the various orthology inference methods and databases and examine the difficult issue of verifying and benchmarking orthology predictions. Finally, we review typical applications of orthologous genes, groups, and reconciled trees and conclude with thoughts on future methodological developments.
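As a concrete illustration of the graph-based family of methods mentioned in this abstract, the sketch below implements reciprocal best hits, one of the simplest pairwise criteria for proposing ortholog pairs. It is a minimal sketch and not a method taken from the chapter itself: the score dictionaries stand in for similarity-search output, and the function names are hypothetical.

```python
def best_hits(scores):
    """Return query -> best-scoring subject for a dict of (query, subject) -> score."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs in which each gene is the other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy similarity scores between genes of species A and species B (assumed values).
a_vs_b = {("a1", "b1"): 90, ("a1", "b2"): 50, ("a2", "b2"): 70, ("a3", "b1"): 60}
b_vs_a = {("b1", "a1"): 88, ("b2", "a2"): 72, ("b2", "a1"): 40}

print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1'), ('a2', 'b2')]
```

Gene `a3` is dropped because its best hit `b1` points back to `a1`. A one-directional best hit of this kind is the situation that can arise when a duplicate (paralog) is present, which is why grouping strategies and tree-based reconciliation go beyond this simple criterion.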
31
Mier P, Pérez-Pulido AJ, Andrade-Navarro MA. Automated selection of homologs to track the evolutionary history of proteins. BMC Bioinformatics 2018; 19:431. [PMID: 30453878 PMCID: PMC6245638 DOI: 10.1186/s12859-018-2457-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/25/2018] [Accepted: 10/31/2018] [Indexed: 11/26/2022]
Abstract
Background: The selection of distant homologs of a query protein under study is a common and useful application of protein sequence databases. Such sets of homologs are often applied to investigate the function of a protein and the degree to which experimental results can be transferred from one organism to another. In particular, a variety of databases facilitate static browsing for orthologs. However, these resources have limited power when identifying orthologs between taxonomically distant species. In addition, in some situations, for a given query protein, it is advantageous to compare the sets of orthologs from different specific organisms: this recursive step-wise search might give an idea of the evolutionary path of the protein as a series of consecutive steps, for example gaining or losing domains. However, a step-wise orthology search is a time-consuming task if the number of steps is high. Results: To illustrate a solution for this problem, we present the web tool ProteinPathTracker, which tracks the evolutionary history of a query protein by locating homologs in selected proteomes along several evolutionary paths. Additional functionalities include locking a region of interest to follow its evolution in the discovered homologous sequences and studying the evolution of protein function by analyzing the annotations of the homologs. Conclusions: ProteinPathTracker is an easy-to-use web tool that automatises the practice of looking for selected homologs in distant species in a straightforward way for non-expert users.
Affiliation(s)
- Pablo Mier
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128, Mainz, Germany
- Miguel A Andrade-Navarro
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128, Mainz, Germany
32
Zoranovic T, Manent J, Willoughby L, Matos de Simoes R, La Marca JE, Golenkina S, Cuiping X, Gruber S, Angjeli B, Kanitz EE, Cronin SJF, Neely GG, Wernitznig A, Humbert PO, Simpson KJ, Mitsiades CS, Richardson HE, Penninger JM. A genome-wide Drosophila epithelial tumorigenesis screen identifies Tetraspanin 29Fb as an evolutionarily conserved suppressor of Ras-driven cancer. PLoS Genet 2018; 14:e1007688. [PMID: 30325918 PMCID: PMC6203380 DOI: 10.1371/journal.pgen.1007688] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/09/2018] [Revised: 10/26/2018] [Accepted: 09/11/2018] [Indexed: 12/15/2022]
Abstract
Oncogenic mutations in the small GTPase Ras contribute to ~30% of human cancers. However, Ras mutations alone are insufficient for tumorigenesis; therefore, it is paramount to identify cooperating cancer-relevant signaling pathways. We devised an in vivo, near genome-wide functional screen in Drosophila and discovered multiple novel, evolutionarily conserved pathways controlling Ras-driven epithelial tumorigenesis. Human gene orthologs of the fly hits were significantly downregulated in thousands of primary tumors, revealing novel prognostic markers for human epithelial tumors. Of the top 100 candidate tumor suppressor genes, 80 were validated in secondary Drosophila assays, identifying many known cancer genes and multiple novel candidate genes that cooperate with Ras-driven tumorigenesis. Low expression of the confirmed hits significantly correlated with the KRASG12 mutation status and poor prognosis in pancreatic cancer. Among the novel top 80 candidate cancer genes, we mechanistically characterized the function of the top hit, the Tetraspanin family member Tsp29Fb, revealing that Tsp29Fb regulates EGFR signaling and epithelial architecture and restrains tumor growth and invasion. Our functional Drosophila screen uncovers multiple novel and evolutionarily conserved epithelial cancer genes, and experimentally confirms Tsp29Fb as a key regulator of EGFR/Ras-induced epithelial tumor growth and invasion.
Cancer involves the cooperative interaction of many gene mutations. The Ras signaling pathway is upregulated in many human cancers, but upregulated Ras signaling alone is not sufficient to induce malignant tumors. We have undertaken a genome-wide genetic screen using a transgenic RNAi library in the vinegar fly, Drosophila melanogaster, to identify tumor suppressor genes that cooperate with the Ras oncogene (RasV12) in conferring overgrown invasive tumors. We stratified the hits by analyzing the expression of human orthologs of these genes in human epithelial cancers, revealing genes that were strongly downregulated in human cancer. By conducting secondary genetic interaction tests, we validated 80 of the top 100 genes. Pathway analysis of these genes revealed that 55 fell into known pathways involved in human cancer, whereas 25 were unique genes. We then confirmed the tumor suppressor properties of one of these genes, Tsp29Fb, encoding a Tetraspanin membrane protein, and showed that Tsp29Fb functions as a tumor suppressor by inhibiting Ras signaling and by maintaining epithelial cell polarity. Altogether, our study has revealed novel Ras-cooperating tumor suppressors in Drosophila and suggests that these genes may also be involved in human cancer.
Affiliation(s)
- Tamara Zoranovic
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
- Jan Manent
  - Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Department of Biochemistry & Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, Australia
- Lee Willoughby
  - Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Ricardo Matos de Simoes
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts, United States of America
- John E. La Marca
  - Department of Biochemistry & Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, Australia
- Sofya Golenkina
  - Department of Biochemistry & Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, Australia
- Xia Cuiping
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
- Susanne Gruber
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
- Belinda Angjeli
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
- Elisabeth Eva Kanitz
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
- Shane J. F. Cronin
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
- G. Gregory Neely
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
  - The Charles Perkins Centre, School of Life & Environmental Sciences, The University of Sydney, Sydney, New South Wales, Australia
- Patrick O. Humbert
  - Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Department of Biochemistry & Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, Department of Anatomy & Neuroscience, Department of Biochemistry & Molecular Biology, and Department of Clinical Pathology, University of Melbourne, Melbourne, Victoria, Australia
- Kaylene J. Simpson
  - Sir Peter MacCallum Department of Oncology, Department of Anatomy & Neuroscience, Department of Biochemistry & Molecular Biology, and Department of Clinical Pathology, University of Melbourne, Melbourne, Victoria, Australia
  - Victorian Center for Functional Genomics, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Constantine S. Mitsiades
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts, United States of America
- Helena E. Richardson
  - Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Department of Biochemistry & Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, Department of Anatomy & Neuroscience, Department of Biochemistry & Molecular Biology, and Department of Clinical Pathology, University of Melbourne, Melbourne, Victoria, Australia
  - * E-mail: (HER); (JMP)
- Josef M. Penninger
  - IMBA, Institute of Molecular Biotechnology of the Austrian Academy of Science, Campus Vienna BioCentre, Vienna, Austria
  - * E-mail: (HER); (JMP)
33
Salazar AN, Abeel T. Approximate, simultaneous comparison of microbial genome architectures via syntenic anchoring of quiver representations. Bioinformatics 2018; 34:i732-i742. [PMID: 30423098 PMCID: PMC6129293 DOI: 10.1093/bioinformatics/bty614] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Abstract
Motivation: A long-standing limitation in comparative genomic studies is the dependency on a reference genome, which restricts the spectrum of genetic diversity that can be identified across a population of organisms. This is especially true in the microbial world, where genome architectures can vary significantly. There is therefore a need for computational methods that can simultaneously analyze the architectures of multiple genomes without introducing bias from a reference. Results: In this article, we present Ptolemy, a novel method for studying the diversity of genome architectures (such as structural variation and pan-genomes) across a collection of microbial assemblies without the need for a reference. Ptolemy is a 'top-down' approach to comparing whole-genome assemblies. Genomes are represented as labeled multi-directed graphs, known as quivers, which are then merged into a single, canonical quiver by identifying 'gene anchors' via synteny analysis. The canonical quiver represents an approximate structural alignment of all genomes in a given collection, encoding structural variation across (sub-)populations within the collection. We highlight various applications of Ptolemy by analyzing structural variation and the pan-genomes of different datasets comprising Mycobacterium, Saccharomyces, Escherichia and Shigella species. Our results show that Ptolemy is flexible and can handle both conserved and highly dynamic genome architectures. Ptolemy is user-friendly (it requires only a FASTA-formatted assembly along with a corresponding GFF-formatted file) and resource-friendly (it can align 24 genomes in ∼10 min with four CPUs and <2 GB of RAM). Availability and implementation: GitHub: https://github.com/AbeelLab/ptolemy. Supplementary information: Supplementary data are available at Bioinformatics online.
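The merging step described in this abstract can be pictured with a small sketch: each genome is an ordered list of genes, genes assigned to the same anchor collapse into one node, and every adjacency becomes a directed edge labeled with the genomes that contain it. This is a simplified illustration under stated assumptions, not Ptolemy's implementation: the anchor assignments are supplied by hand here, whereas the abstract says they are derived via synteny analysis, and both function names are hypothetical.

```python
from collections import defaultdict


def build_canonical_quiver(genomes, anchors):
    """Merge per-genome gene orders into one labeled multi-directed graph.

    genomes: dict of genome name -> ordered list of gene IDs
    anchors: dict of gene ID -> anchor label shared by matched genes
             (a gene without an anchor keeps its own ID as a private node)
    Returns a dict of (source_node, target_node) -> set of genome names.
    """
    edges = defaultdict(set)
    for name, genes in genomes.items():
        nodes = [anchors.get(gene, gene) for gene in genes]
        for source, target in zip(nodes, nodes[1:]):
            edges[(source, target)].add(name)
    return dict(edges)


def variant_edges(quiver, n_genomes):
    """Return edges not shared by every genome, a hint of structural variation."""
    return {edge: names for edge, names in quiver.items() if len(names) < n_genomes}


# Toy input (assumed): genome g2 lacks the gene that g1 carries between A and C.
genomes = {"g1": ["a1", "b1", "c1", "d1"], "g2": ["a2", "c2", "d2"]}
anchors = {"a1": "A", "a2": "A", "b1": "B", "c1": "C", "c2": "C", "d1": "D", "d2": "D"}

quiver = build_canonical_quiver(genomes, anchors)
print(sorted(variant_edges(quiver, n_genomes=2)))
# [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

The edge `('C', 'D')` is carried by both genomes and so is not reported, while the three remaining edges outline the presence/absence difference around node `B`.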
Affiliation(s)
- Alex N Salazar
  - Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Thomas Abeel
  - Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
34
Yushchuk O, Ostash I, Vlasiuk I, Gren T, Luzhetskyy A, Kalinowski J, Fedorenko V, Ostash B. Heterologous AdpA transcription factors enhance landomycin production in Streptomyces cyanogenus S136 under a broad range of growth conditions. Appl Microbiol Biotechnol 2018; 102:8419-8428. [PMID: 30056513 DOI: 10.1007/s00253-018-9249-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 05/16/2018] [Revised: 07/09/2018] [Accepted: 07/14/2018] [Indexed: 01/14/2023]
Abstract
Streptomyces cyanogenus S136 is the only known producer of landomycin A (LaA), one of the largest glycosylated angucycline antibiotics possessing strong antiproliferative properties. There is rising interest in elucidating the mechanisms of action of landomycins, which, in turn, requires access to large quantities of the pure compounds. Overproduction of LaA has been achieved in the past through manipulation of cluster-situated regulatory genes. However, other components of the LaA biosynthetic regulatory network remain unknown. To fill this gap, we elucidated the contribution of AdpA family pleiotropic regulators to landomycin production via expression of adpA genes of different origins in S. cyanogenus S136. Overexpression of the native S. cyanogenus S136 adpA ortholog had no effect on landomycin titers. At the same time, expression of several heterologous adpA genes led to significantly increased landomycin production under different cultivation conditions. Hence, heterologous adpA genes are a useful tool to enhance or activate landomycin production by S. cyanogenus. Our ongoing research effort is focused on identification of mutations that render S. cyanogenus AdpA nonfunctional.
Collapse
Affiliation(s)
- Oleksandr Yushchuk
- Department of Genetics and Biotechnology, Ivan Franko National University of Lviv, Hrushevskoho St. 4, Rm. 102, Lviv, 79005, Ukraine
| | - Iryna Ostash
- Department of Genetics and Biotechnology, Ivan Franko National University of Lviv, Hrushevskoho St. 4, Rm. 102, Lviv, 79005, Ukraine
| | - Iryna Vlasiuk
- Department of Genetics and Biotechnology, Ivan Franko National University of Lviv, Hrushevskoho St. 4, Rm. 102, Lviv, 79005, Ukraine
| | - Tetiana Gren
- Technology Platform Genomics, CeBiTec, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
| | - Andriy Luzhetskyy
- Department of Pharmaceutical Biotechnology, Helmholtz Institute for Pharmaceutical Research Saarland, Actinobacteria Metabolic Engineering Group, Saarland University, UdS Campus C2 3, 66123, Saarbrucken, Germany
| | - Joern Kalinowski
- Technology Platform Genomics, CeBiTec, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
| | - Victor Fedorenko
- Department of Genetics and Biotechnology, Ivan Franko National University of Lviv, Hrushevskoho St. 4, Rm. 102, Lviv, 79005, Ukraine
| | - Bohdan Ostash
- Department of Genetics and Biotechnology, Ivan Franko National University of Lviv, Hrushevskoho St. 4, Rm. 102, Lviv, 79005, Ukraine.
| |
|
35
|
Lindsey ARI, Kelkar YD, Wu X, Sun D, Martinson EO, Yan Z, Rugman-Jones PF, Hughes DST, Murali SC, Qu J, Dugan S, Lee SL, Chao H, Dinh H, Han Y, Doddapaneni HV, Worley KC, Muzny DM, Ye G, Gibbs RA, Richards S, Yi SV, Stouthamer R, Werren JH. Comparative genomics of the miniature wasp and pest control agent Trichogramma pretiosum. BMC Biol 2018; 16:54. [PMID: 29776407 PMCID: PMC5960102 DOI: 10.1186/s12915-018-0520-9] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 02/15/2018] [Accepted: 04/20/2018] [Indexed: 12/25/2022] Open
Abstract
Background Trichogrammatids are minute parasitoid wasps that develop within other insect eggs. They are less than half a millimeter long, smaller than some protozoans. The Trichogrammatidae are one of the earliest branching families of Chalcidoidea: a diverse superfamily of approximately half a million species of parasitoid wasps, proposed to have evolved from a miniaturized ancestor. Trichogramma are frequently used in agriculture, released as biological control agents against major moth and butterfly pests. Additionally, Trichogramma are well known for their symbiotic bacteria that induce asexual reproduction in infected females. Knowledge of the genome sequence of Trichogramma is a major step towards further understanding its biology and potential applications in pest control. Results We report the 195-Mb genome sequence of Trichogramma pretiosum and uncover signatures of miniaturization and adaptation in Trichogramma and related parasitoids. Comparative analyses reveal relatively rapid evolution of proteins involved in ribosome biogenesis and function, transcriptional regulation, and ploidy regulation. Chalcids also show loss or especially rapid evolution of 285 gene clusters conserved in other Hymenoptera, including many that are involved in signal transduction and embryonic development. Comparisons between sexual and asexual lineages of Trichogramma pretiosum reveal that there is no strong evidence for genome degradation (e.g., gene loss) in the asexual lineage, although it does contain a lower repeat content than the sexual lineage. Trichogramma shows particularly rapid genome evolution compared to other hymenopterans. We speculate these changes reflect adaptations to miniaturization, and to life as a specialized egg parasitoid. 
Conclusions The genomes of Trichogramma and related parasitoids are a valuable resource for future studies of these diverse and economically important insects, including explorations of parasitoid biology, symbiosis, asexuality, biological control, and the evolution of miniaturization. Understanding the molecular determinants of parasitism can also inform mass rearing of Trichogramma and other parasitoids for biological control.
Affiliation(s)
- Amelia R I Lindsey
- Department of Entomology, University of California Riverside, Riverside, California, 92521, USA; Present Address: Department of Biology, Indiana University, Bloomington, Indiana, 47405, USA.
| | - Yogeshwar D Kelkar
- Department of Biology, University of Rochester, Rochester, New York, 14627, USA
| | - Xin Wu
- School of Biological Sciences, Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, 30332, USA
| | - Dan Sun
- School of Biological Sciences, Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, 30332, USA
| | - Ellen O Martinson
- Department of Biology, University of Rochester, Rochester, New York, 14627, USA; Present Address: Department of Entomology, University of Georgia, Athens, Georgia, 30602, USA
| | - Zhichao Yan
- Department of Biology, University of Rochester, Rochester, New York, 14627, USA; State Key Laboratory of Rice Biology & Ministry of Agriculture Key Laboratory of Agricultural Entomology, Institute of Insect Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Paul F Rugman-Jones
- Department of Entomology, University of California Riverside, Riverside, California, 92521, USA
| | - Daniel S T Hughes
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Shwetha C Murali
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Jiaxin Qu
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Shannon Dugan
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Sandra L Lee
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Hsu Chao
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Huyen Dinh
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Yi Han
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Harsha Vardhan Doddapaneni
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Kim C Worley
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Donna M Muzny
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Gongyin Ye
- State Key Laboratory of Rice Biology & Ministry of Agriculture Key Laboratory of Agricultural Entomology, Institute of Insect Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Richard A Gibbs
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Stephen Richards
- Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA
| | - Soojin V Yi
- School of Biological Sciences, Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, 30332, USA
| | - Richard Stouthamer
- Department of Entomology, University of California Riverside, Riverside, California, 92521, USA.
| | - John H Werren
- Department of Biology, University of Rochester, Rochester, New York, 14627, USA.
| |
|
36
|
Lindsey ARI, Kelkar YD, Wu X, Sun D, Martinson EO, Yan Z, Rugman-Jones PF, Hughes DST, Murali SC, Qu J, Dugan S, Lee SL, Chao H, Dinh H, Han Y, Doddapaneni HV, Worley KC, Muzny DM, Ye G, Gibbs RA, Richards S, Yi SV, Stouthamer R, Werren JH. Comparative genomics of the miniature wasp and pest control agent Trichogramma pretiosum. BMC Biol 2018. [DOI: 10.1186/s12915-018-0520-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/28/2023] Open
|
37
|
Galpert D, Fernández A, Herrera F, Antunes A, Molina-Ruiz R, Agüero-Chapin G. Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers. BMC Bioinformatics 2018; 19:166. [PMID: 29724166 PMCID: PMC5934817 DOI: 10.1186/s12859-018-2148-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/13/2017] [Accepted: 04/04/2018] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Although several pairwise protein features were combined in a supervised big data approach, they were all, to some extent, alignment-based features, and the proposed algorithms were evaluated on a single test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. RESULTS The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, and built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Only when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management could a higher success rate (98.71%) within the twilight zone be achieved for a yeast proteome pair that underwent a whole genome duplication.
The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition. CONCLUSIONS The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes regarding the classification qualities achieved with just alignment-based similarity measures. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.
Affiliation(s)
- Deborah Galpert
- Departamento de Ciencia de la Computación, Universidad Central "Marta Abreu" de Las Villas (UCLV), 54830, Santa Clara, Cuba
| | - Alberto Fernández
- Department of Computer Science and Artificial Intelligence, Research Center on Information and Communications Technology (CITIC-UGR), University of Granada, 18071, Granada, Spain
| | - Francisco Herrera
- Department of Computer Science and Artificial Intelligence, Research Center on Information and Communications Technology (CITIC-UGR), University of Granada, 18071, Granada, Spain
| | - Agostinho Antunes
- CIIMAR/CIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n 4450-208 Matosinhos, Porto, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007, Porto, Portugal
| | - Reinaldo Molina-Ruiz
- Centro de Bioactivos Químicos (CBQ), Universidad Central "Marta Abreu" de Las Villas (UCLV), 54830, Santa Clara, Cuba
| | - Guillermin Agüero-Chapin
- CIIMAR/CIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n 4450-208 Matosinhos, Porto, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007, Porto, Portugal; Centro de Bioactivos Químicos (CBQ), Universidad Central "Marta Abreu" de Las Villas (UCLV), 54830, Santa Clara, Cuba.
| |
|
38
|
Reconstruction of the ancestral metazoan genome reveals an increase in genomic novelty. Nat Commun 2018; 9:1730. [PMID: 29712911 PMCID: PMC5928047 DOI: 10.1038/s41467-018-04136-5] [Citation(s) in RCA: 79] [Impact Index Per Article: 11.3] [Received: 11/13/2017] [Accepted: 02/28/2018] [Indexed: 12/03/2022] Open
Abstract
Understanding the emergence of the Animal Kingdom is one of the major challenges of modern evolutionary biology. Many genomic changes took place along the evolutionary lineage that gave rise to the Metazoa. Recent research has revealed the role that co-option of old genes played during this transition, but the contribution of genomic novelty has not been fully assessed. Here, using extensive genome comparisons between metazoans and multiple outgroups, we infer the minimal protein-coding genome of the first animal, in addition to other eukaryotic ancestors, and estimate the proportion of novelties in these ancient genomes. Contrary to the prevailing view, this uncovers an unprecedented increase in the extent of genomic novelty during the origin of metazoans, and identifies 25 groups of metazoan-specific genes that are essential across the Animal Kingdom. We argue that internal genomic changes were as important as external factors in the emergence of animals. Animals, the Metazoa, co-opted numerous unicellular genes in their transition to multicellularity. Here, the authors use phylogenomic analyses to infer the genome composition of the ancestor of extant animals and show there was also a burst of novel gene groups associated with this transition.
|
39
|
Thanki AS, Soranzo N, Haerty W, Davey RP. GeneSeqToFamily: a Galaxy workflow to find gene families based on the Ensembl Compara GeneTrees pipeline. Gigascience 2018; 7:1-10. [PMID: 29425291 PMCID: PMC5863215 DOI: 10.1093/gigascience/giy005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/30/2017] [Revised: 07/31/2017] [Accepted: 01/18/2018] [Indexed: 11/13/2022] Open
Abstract
Background Gene duplication is a major factor contributing to evolutionary novelty, and the contraction or expansion of gene families has often been associated with morphological, physiological, and environmental adaptations. The study of homologous genes helps us to understand the evolution of gene families. It plays a vital role in finding ancestral gene duplication events as well as identifying genes that have diverged from a common ancestor under positive selection. There are various tools available, such as MSOAR, OrthoMCL, and HomoloGene, to identify gene families and visualize syntenic information between species, providing an overview of syntenic regions evolution at the family level. Unfortunately, none of them provide information about structural changes within genes, such as the conservation of ancestral exon boundaries among multiple genomes. The Ensembl GeneTrees computational pipeline generates gene trees based on coding sequences, provides details about exon conservation, and is used in the Ensembl Compara project to discover gene families. Findings A certain amount of expertise is required to configure and run the Ensembl Compara GeneTrees pipeline via command line. Therefore, we converted this pipeline into a Galaxy workflow, called GeneSeqToFamily, and provided additional functionality. This workflow uses existing tools from the Galaxy ToolShed, as well as providing additional wrappers and tools that are required to run the workflow. Conclusions GeneSeqToFamily represents the Ensembl GeneTrees pipeline as a set of interconnected Galaxy tools, so they can be run interactively within the Galaxy's user-friendly workflow environment while still providing the flexibility to tailor the analysis by changing configurations and tools if necessary. Additional tools allow users to subsequently visualize the gene families produced by the workflow, using the Aequatus.js interactive tool, which has been developed as part of the Aequatus software project.
Affiliation(s)
- Anil S Thanki
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| | - Nicola Soranzo
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| | - Wilfried Haerty
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| | - Robert P Davey
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| |
|
40
|
Shrestha AMS, Frith MC, Asai K, Richard H. Jointly aligning a group of DNA reads improves accuracy of identifying large deletions. Nucleic Acids Res 2018; 46:e18. [PMID: 29182778 PMCID: PMC5815140 DOI: 10.1093/nar/gkx1175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/17/2017] [Revised: 09/06/2017] [Accepted: 11/16/2017] [Indexed: 01/28/2023] Open
Abstract
Performing sequence alignment to identify structural variants, such as large deletions, from genome sequencing data is a fundamental task, but current methods are far from perfect. The current practice is to independently align each DNA read to a reference genome. We show that the propensity of genomic rearrangements to accumulate in repeat-rich regions imposes severe ambiguities in these alignments, and consequently on the variant calls; with current read lengths, this affects more than one third of known large deletions in the C. Venter genome. We present a method to jointly align reads to a genome, whereby alignment ambiguity of one read can be disambiguated by other reads. We show this leads to a significant improvement in the accuracy of identifying large deletions (≥20 bases), while imposing minimal computational overhead and maintaining an overall running time that is on par with current tools. A software implementation is available as an open-source Python program called JRA at https://bitbucket.org/jointreadalignment/jra-src.
Affiliation(s)
- Anish M S Shrestha
- Department of Computational Biology and Medical Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, Japan
| | - Martin C Frith
- Department of Computational Biology and Medical Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, Japan
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26 Aomi, Koto-ku, Tokyo, Japan
- Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
| | - Kiyoshi Asai
- Department of Computational Biology and Medical Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, Japan
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26 Aomi, Koto-ku, Tokyo, Japan
| | - Hugues Richard
- Sorbonne Universités, UPMC Univ Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 place Jussieu, 75005 Paris, France
| |
|
41
|
Abstract
Computational pan-genome analysis has emerged from the rapid increase of available genome sequencing data. Starting from a microbial pan-genome, the concept has spread to a variety of species, such as plants or viruses. Characterizing a pan-genome provides insights into intra-species evolution, functions, and diversity. However, researchers face challenges such as processing and maintaining large datasets while providing accurate and efficient analysis approaches. Comparative genomics methods are required for detecting conserved and unique regions between a set of genomes. This chapter gives an overview of tools available for indexing pan-genomes, identifying the sub-regions of a pan-genome and offering a variety of downstream analysis methods. These tools are categorized into two groups, gene-based and sequence-based, according to the pan-genome identification method. We highlight the differences, advantages, and disadvantages between the tools, and provide information about the general workflow, methodology of pan-genome identification, covered functionalities, usability and availability of the tools.
Affiliation(s)
- Tina Zekic
- Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- International Research Training Group 1906, Bielefeld University, Bielefeld, Germany
| | - Guillaume Holley
- Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- International Research Training Group 1906, Bielefeld University, Bielefeld, Germany
| | - Jens Stoye
- Faculty of Technology, Bielefeld University, Bielefeld, Germany.
- Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany.
- International Research Training Group 1906, Bielefeld University, Bielefeld, Germany.
| |
|
42
|
Jahangiri-Tazehkand S, Wong L, Eslahchi C. OrthoGNC: A Software for Accurate Identification of Orthologs Based on Gene Neighborhood Conservation. GENOMICS PROTEOMICS & BIOINFORMATICS 2017; 15:361-370. [PMID: 29133277 PMCID: PMC5828658 DOI: 10.1016/j.gpb.2017.07.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/22/2017] [Revised: 07/17/2017] [Accepted: 07/28/2017] [Indexed: 11/17/2022]
Abstract
Orthology relations can be used to transfer annotations from one gene (or protein) to another. Hence, detecting orthology relations has become an important task in the post-genomic era. Various genomic events, such as duplication and horizontal gene transfer, can cause erroneous assignment of orthology relations. In closely-related species, gene neighborhood information can be used to resolve many ambiguities in orthology inference. Here we present OrthoGNC, a software for accurately predicting pairwise orthology relations based on gene neighborhood conservation. Analyses on simulated and real data reveal the high accuracy of OrthoGNC. In addition to orthology detection, OrthoGNC can be employed to investigate the conservation of genomic context among potential orthologs detected by other methods. OrthoGNC is freely available online at http://bs.ipm.ir/softwares/orthognc and http://tinyurl.com/orthoGNC.
Affiliation(s)
| | - Limsoon Wong
- School of Computing, National University of Singapore, Singapore 117417, Singapore
| | - Changiz Eslahchi
- Department of Computer Science, Shahid Beheshti University, Tehran 1983969411, Iran.
| |
|
43
|
Song H, Gao H, Liu J, Tian P, Nan Z. Comprehensive analysis of correlations among codon usage bias, gene expression, and substitution rate in Arachis duranensis and Arachis ipaënsis orthologs. Sci Rep 2017; 7:14853. [PMID: 29093502 PMCID: PMC5665869 DOI: 10.1038/s41598-017-13981-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 05/08/2017] [Accepted: 10/04/2017] [Indexed: 11/22/2022] Open
Abstract
The relationship between evolutionary rates and gene expression in model plant orthologs is well documented. However, little is known about the relationships between gene expression and evolutionary trends in Arachis orthologs. We identified 7,435 one-to-one orthologs, including 925 single-copy and 6,510 multiple-copy sequences in Arachis duranensis and Arachis ipaënsis. Codon usage was stronger for shorter polypeptides, which were encoded by codons with higher GC contents. Highly expressed coding sequences had higher codon usage bias, GC content, and expression breadth. Additionally, expression breadth was positively correlated with polypeptide length, but there was no correlation between gene expression and polypeptide length. Inferred selective pressure was also negatively correlated with both gene expression and expression breadth in all one-to-one orthologs, while positively but non-significantly correlated with gene expression in sequences with signatures of positive selection. Gene expression levels and expression breadth were significantly higher for single-copy genes than for multiple-copy genes. Similarly, the gene expression and expression breadth in sequences with signatures of purifying selection were higher than those of sequences with positive selective signatures. These results indicated that gene expression differed between single-copy and multiple-copy genes as well as sequences with signatures of positive and purifying selection.
Affiliation(s)
- Hui Song
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China.
| | - Hongjuan Gao
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
| | - Jing Liu
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
| | - Pei Tian
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
| | - Zhibiao Nan
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China.
| |
|
44
|
Nichio BTL, Marchaukoski JN, Raittz RT. New Tools in Orthology Analysis: A Brief Review of Promising Perspectives. Front Genet 2017; 8:165. [PMID: 29163633 PMCID: PMC5674930 DOI: 10.3389/fgene.2017.00165] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 06/26/2017] [Accepted: 10/16/2017] [Indexed: 11/23/2022] Open
Abstract
Nowadays, defining homology relationships among sequences is essential for biological research. Within homology, the analysis of orthologous sequences is of great importance for computational biology, annotation of genomes and for phylogenetic inference. Since 2007, with the increase in the number of new sequences being deposited in large biological databases, researchers have begun to analyse computerized methodologies and tools aimed at selecting the most promising ones in the prediction of orthologous groups. Literature in this field of research describes the problems that the majority of available tools show, such as those encountered in accuracy, time required for analysis (especially in light of the increasing volume of data being submitted, which requires faster techniques) and the automatization of the process without requiring manual intervention. Conducting our search through BMC, Google Scholar, NCBI PubMed, and Expasy, we examined more than 600 articles pursuing the most recent techniques and tools developed to solve most of the problems still existing in orthology detection. We listed the main computational tools created and developed between 2011 and 2017, taking into consideration the differences in the type of orthology analysis, outlining the main features of each tool and pointing to the problems that each one tries to address. We also observed that several tools still use as their main algorithm the BLAST "all-against-all" methodology, which entails some limitations, such as limited number of queries, computational cost, and high processing time to complete the analysis.
However, new promising tools are being developed, like OrthoVenn (which uses the Venn diagram to show the relationship of ortholog groups generated by its algorithm); or proteinOrtho (which improves the accuracy of ortholog groups); or ReMark (tackling the integration of the pipeline to turn the entry process automatic); or OrthAgogue (using algorithms developed to minimize processing time); and proteinOrtho (developed for dealing with large amounts of biological data). We made a comparison among the main features of four tools and tested them using four prokaryotic genomes. We hope that our review can be useful for researchers and will help them in selecting the most appropriate tool for their work in the field of orthology.
Affiliation(s)
| | | | - Roberto Tadeu Raittz
- Department of Bioinformatics, Professional and Technical Education Sector, Federal University of Paraná, Curitiba, Brazil
| |
|
45
|
Competitive Ability of Maize Pollen Grains Requires Paralogous Serine Threonine Protein Kinases STK1 and STK2. Genetics 2017; 207:1361-1370. [PMID: 28986443 DOI: 10.1534/genetics.117.300358] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 06/13/2017] [Accepted: 10/03/2017] [Indexed: 11/18/2022] Open
Abstract
serine threonine kinase1 (stk1) and serine threonine kinase2 (stk2) are closely related maize paralogous genes predicted to encode serine/threonine protein kinases. Pollen mutated in stk1 or stk2 competes poorly with normal pollen, pointing to a defect in pollen tube germination or growth. Both genes are expressed in pollen, but not in most other tissues. In germination media, STK1 and STK2 fluorescent fusion proteins localize to the plasma membrane of the vegetative cell. RNA-seq experiments identified 534 differentially expressed genes in stk1 mutant pollen relative to wild type. Gene ontology (GO) molecular functional analysis uncovered several differentially expressed genes with putative ribosome initiation and elongation functions, suggesting that stk1 might affect ribosome function. Of the two paralogs, stk1 may play a more important role in pollen development than stk2, as stk2 mutations have a smaller pollen transmission effect. However, stk2 does act as an enhancer of stk1 because the double mutant combination is only infrequently pollen-transmitted in double heterozygotes. We conclude that the stk paralogs play an essential role in pollen development.
|
46
|
Song H, Liu J, Song Q, Zhang Q, Tian P, Nan Z. Comprehensive Analysis of Codon Usage Bias in Seven Epichloë Species and Their Peramine-Coding Genes. Front Microbiol 2017; 8:1419. [PMID: 28798739 PMCID: PMC5529348 DOI: 10.3389/fmicb.2017.01419] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/12/2016] [Accepted: 07/13/2017] [Indexed: 11/22/2022] Open
Abstract
Codon usage bias plays an important role in shaping genomes and genes in unicellular species and multicellular species. Here, we first analyzed codon usage bias in seven Epichloë species and their peramine-coding genes. Our results showed that both natural selection and mutation pressure played a role in forming codon usage bias in seven Epichloë species. All seven Epichloë species contained a peramine-coding gene cluster. Interestingly, codon usage bias of peramine-coding genes was not affected by natural selection or mutation pressure. There were 13 codons more frequently found in Epichloë genome sequences, peramine-coding gene clusters and orthologous peramine-coding genes, all of which had a bias to end with a C nucleotide. In the seven genomes analyzed, codon usage was biased in highly expressed coding sequences (CDSs) with shorter length and higher GC content. Genes in the peramine-coding gene cluster had higher GC content at the third nucleotide position of the codon, and highly expressed genes had higher GC content at the second position. In orthologous peramine-coding CDSs, high expression level was not significantly correlated with CDS length and GC content. Analysis of selection pressure identified that the genes orthologous to peramine genes were under purifying selection. There were no differences in codon usage bias and selection pressure between peramine product genes and non-functional peramine product genes. Our results provide insights into understanding codon evolution in Epichloë species.
Affiliation(s)
- Hui Song
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
- Jing Liu
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
- Qiuyan Song
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
- Qingping Zhang
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
- Pei Tian
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
- Zhibiao Nan
- State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
|
47
|
Koshla O, Lopatniuk M, Rokytskyy I, Yushchuk O, Dacyuk Y, Fedorenko V, Luzhetskyy A, Ostash B. Properties of Streptomyces albus J1074 mutant deficient in tRNALeu UAA gene bldA. Arch Microbiol 2017; 199:1175-1183. [DOI: 10.1007/s00203-017-1389-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 03/30/2017] [Revised: 05/06/2017] [Accepted: 05/16/2017] [Indexed: 11/28/2022]
|
48
|
Peláez R, Niculcea M, Martínez A. The Mammalian Peptide Adrenomedullin Acts as a Growth Factor in Tobacco Plants. Front Physiol 2017; 8:219. [PMID: 28446879 PMCID: PMC5388738 DOI: 10.3389/fphys.2017.00219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2016] [Accepted: 03/27/2017] [Indexed: 11/29/2022] Open
Abstract
Growth factors are extracellular signals that regulate cell proliferation and total body mass. Some animal growth factors can work on plant tissues and vice versa. Here we show that the mammalian growth factor adrenomedullin (AM) induces growth in tobacco plants. Addition of synthetic AM resulted in a dose-dependent growth of tobacco calluses. Furthermore, AM transgenic plants showed enhanced survival and significant increases in stem diameter, plant height, leaf length, and weight of all organs, as well as a reduction in the time to flowering, when compared to plants transformed with the control vector. These differences were maintained when organs were dried, resulting in a mean total biomass increase of 21.3%. The levels of soluble sugars and proteins in the leaves were unchanged between genotypes. AM transgenic plants had a significantly higher expression of cyclin D3 and the transcription factor E2FB than controls, suggesting that cell cycle regulation may be part of the intracellular signaling of AM in plants. In summary, mammalian AM increases vascular plants' survival and biomass with no apparent detriment to the plants' morphological and/or biochemical properties; thus, this strategy could be useful for crop productivity improvement.
Affiliation(s)
- Alfredo Martínez
- Biomass Booster Ltd, Logroño, Spain; Oncology Area, Center for Biomedical Research of La Rioja, Logroño, Spain
|
49
|
Ruprecht C, Vaid N, Proost S, Persson S, Mutwil M. Beyond Genomics: Studying Evolution with Gene Coexpression Networks. TRENDS IN PLANT SCIENCE 2017; 22:298-307. [PMID: 28126286 DOI: 10.1016/j.tplants.2016.12.011] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 09/02/2016] [Revised: 12/06/2016] [Accepted: 12/22/2016] [Indexed: 05/08/2023]
Abstract
Understanding how genomes change as organisms become more complex is a central question in evolution. Molecular evolutionary studies typically correlate the appearance of genes and gene families with the emergence of biological pathways and morphological features. While such approaches are of great importance to understand how organisms evolve, they are also limited, as functionally related genes work together in contexts of dynamic gene networks. Since functionally related genes are often transcriptionally coregulated, gene coexpression networks present a resource to study the evolution of biological pathways. In this opinion article, we discuss recent developments in this field and how coexpression analyses can be merged with existing genomic approaches to transfer functional knowledge between species to study the appearance or extension of pathways.
Affiliation(s)
- Colin Ruprecht
- Max-Planck Institute of Colloids and Interfaces, Am Muehlenberg 1, 14476 Potsdam, Germany
- Neha Vaid
- Max-Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
- Sebastian Proost
- Max-Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
- Staffan Persson
- School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia; ARC Centre of Excellence in Plant Cell Walls, School of Biosciences, University of Melbourne, Parkville, VIC 3010, Australia
- Marek Mutwil
- Max-Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
|
50
|
Ambrosino L, Chiusano ML. Transcriptologs: A Transcriptome-Based Approach to Predict Orthology Relationships. Bioinform Biol Insights 2017; 11:1177932217690136. [PMID: 28469416 PMCID: PMC5348085 DOI: 10.1177/1177932217690136] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/19/2016] [Accepted: 12/17/2016] [Indexed: 12/17/2022] Open
Abstract
The detection of orthologs is a key approach in genomics, useful to understand gene evolution and phylogenetic relationships and essential for gene function prediction. However, a reliable annotation of the encoded protein regions is still a limiting aspect in genomics, mainly due to the lack of confirmatory experimental evidence at proteome level. Nevertheless, the current ortholog collections are generally based on protein sequence comparisons, in addition to the availability of large transcriptome sequence collections. We developed Transcriptologs, a method for the prediction of orthologs based on similarities of translated fragments from messenger RNAs of 2 species. We implemented a procedure to extend BLAST-based alignments and to define orthologs based on the Bidirectional Best Hit approach. Results from a test case on Arabidopsis thaliana and Sorghum bicolor transcript collections revealed in some cases outperformance of Transcriptologs in comparison with a classical protein-based analysis in terms of alignment quality, revealing similarities otherwise not detectable.
Affiliation(s)
- Luca Ambrosino
- Department of Agriculture, University of Naples "Federico II," Portici, Italy
- Maria Luisa Chiusano
- Department of Agriculture, University of Naples "Federico II," Portici, Italy; Research Infrastructures for Marine Biological Resources (RIMAR), Stazione Zoologica Anton Dohrn Napoli, Naples, Italy
|