1
Zhao R, Wang HH, Wang Z, Xiao X, Yin XH, Hu SY, Miao HN, Zhang YJ, Liang P, Gu SH. Omics Analysis of Odorant-Binding Proteins and Cuticle-Enriched SfruOBP18 Confers Multi-Insecticide Tolerance in Spodoptera frugiperda. Journal of Agricultural and Food Chemistry 2024. [PMID: 39373658 DOI: 10.1021/acs.jafc.4c05737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2024]
Abstract
Spodoptera frugiperda is a notorious pest that develops a high resistance to many insecticides. Recently, insect odorant-binding proteins (OBPs) have been proven to participate in insecticide resistance. However, the functional evidence supporting the cross-link between OBPs and insecticide resistance remains unexplored. Here, we identified 50 SfruOBPs from the larval transcriptome and genome. Notably, SfruOBP18 was highly expressed in the larval cuticle and could be induced to upregulate its expression by multi-insecticides. Ligand-binding assays revealed that SfruOBP18 bound strongly with four insecticides; RNAi and insecticide bioassay demonstrated that the knockdown of SfruOBP18 did not affect larval survival and development. However, it can significantly increase the larval susceptibility to multi-insecticides, suggesting an uncommon role of SfruOBP18 in multi-insecticide susceptibility. Our study provides a comprehensive understanding of SfruOBPs and furthermore proves that a larval cuticle-enriched OBP can bind with and confer larval tolerance to multi-insecticides. SfruOBP18 could be a new insecticidal target for controlling Lepidoptera pests.
Affiliation(s)
- Rui Zhao
  - Department of Entomology, China Agricultural University, Beijing 100193, China
- Huan-Huan Wang
  - Department of Entomology, China Agricultural University, Beijing 100193, China
- Zhuo Wang
  - Department of Entomology, China Agricultural University, Beijing 100193, China
  - Sanya Institute of China Agricultural University, Sanya 572024, China
- Xing Xiao
  - Department of Entomology, China Agricultural University, Beijing 100193, China
- Xin-Hui Yin
  - Department of Entomology, China Agricultural University, Beijing 100193, China
- Shi-Yuan Hu
  - Department of Entomology, China Agricultural University, Beijing 100193, China
- Hao-Nan Miao
  - Department of Entomology, China Agricultural University, Beijing 100193, China
- Yong-Jun Zhang
  - State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Pei Liang
  - Department of Entomology, China Agricultural University, Beijing 100193, China
- Shao-Hua Gu
  - Department of Entomology, China Agricultural University, Beijing 100193, China
  - Sanya Institute of China Agricultural University, Sanya 572024, China
2
Estevez-Castro CF, Rodrigues MF, Babarit A, Ferreira FV, de Andrade EG, Marois E, Cogni R, Aguiar ERGR, Marques JT, Olmo RP. Neofunctionalization driven by positive selection led to the retention of the loqs2 gene encoding an Aedes specific dsRNA binding protein. BMC Biol 2024; 22:14. [PMID: 38273313 PMCID: PMC10809485 DOI: 10.1186/s12915-024-01821-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Accepted: 01/10/2024] [Indexed: 01/27/2024]
Abstract
BACKGROUND: Mosquito borne viruses, such as dengue, Zika, yellow fever and Chikungunya, cause millions of infections every year. These viruses are mostly transmitted by two urban-adapted mosquito species, Aedes aegypti and Aedes albopictus. Although mechanistic understanding remains largely unknown, Aedes mosquitoes may have unique adaptations that lower the impact of viral infection. Recently, we reported the identification of an Aedes specific double-stranded RNA binding protein (dsRBP), named Loqs2, that is involved in the control of infection by dengue and Zika viruses in mosquitoes. Preliminary analyses suggested that the loqs2 gene is a paralog of loquacious (loqs) and r2d2, two co-factors of the RNA interference (RNAi) pathway, a major antiviral mechanism in insects. RESULTS: Here we analyzed the origin and evolution of loqs2. Our data suggest that loqs2 originated from two independent duplications of the first double-stranded RNA binding domain of loqs that occurred before the origin of the Aedes Stegomyia subgenus, around 31 million years ago. We show that the loqs2 gene is evolving under relaxed purifying selection at a faster pace than loqs, with evidence of neofunctionalization driven by positive selection. Accordingly, we observed that Loqs2 is localized mainly in the nucleus, different from R2D2 and both isoforms of Loqs that are cytoplasmic. In contrast to r2d2 and loqs, loqs2 expression is stage- and tissue-specific, restricted mostly to reproductive tissues in adult Ae. aegypti and Ae. albopictus. Transgenic mosquitoes engineered to express loqs2 ubiquitously undergo developmental arrest at larval stages that correlates with massive dysregulation of gene expression without major effects on microRNAs or other endogenous small RNAs, classically associated with RNA interference. CONCLUSIONS: Our results uncover the peculiar origin and neofunctionalization of loqs2 driven by positive selection. This study shows an example of unique adaptations in Aedes mosquitoes that could ultimately help explain their effectiveness as virus vectors.
Affiliation(s)
- Carlos F Estevez-Castro
  - Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
  - CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
- Murillo F Rodrigues
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, 97403-5289, USA
- Antinéa Babarit
  - CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
- Flávia V Ferreira
  - Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Elisa G de Andrade
  - Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
  - CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
- Eric Marois
  - CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
- Rodrigo Cogni
  - Department of Ecology, Institute of Biosciences, University of São Paulo, São Paulo, 05508-090, Brazil
- Eric R G R Aguiar
  - Department of Biological Science, Center of Biotechnology and Genetics, State University of Santa Cruz, Ilhéus, 45662-900, Brazil
- João T Marques
  - Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
  - CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
- Roenick P Olmo
  - Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
  - CNRS UPR9022, Inserm U1257, Université de Strasbourg, 67084, Strasbourg, France
3
Chang JM, Floden EW, Herrero J, Gascuel O, Di Tommaso P, Notredame C. Incorporating alignment uncertainty into Felsenstein's phylogenetic bootstrap to improve its reliability. Bioinformatics 2019; 37:1506-1514. [PMID: 30726875 PMCID: PMC8275982 DOI: 10.1093/bioinformatics/btz082] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/10/2018] [Revised: 12/12/2018] [Accepted: 02/05/2019] [Indexed: 12/30/2022]
Abstract
Motivation: Most evolutionary analyses are based on a pre-estimated multiple sequence alignment (MSA). Wong et al. established the existence of an uncertainty induced by multiple sequence alignment when reconstructing phylogenies. They were able to show that in many cases different aligners produce different phylogenies, with no simple objective criterion sufficient to distinguish among these alternatives. Results: We demonstrate that incorporating MSA-induced uncertainty into bootstrap sampling can significantly increase correlation between clade correctness and its corresponding bootstrap value. Our procedure involves concatenating several alternative multiple sequence alignments of the same sequences, produced using different commonly used aligners. We then draw bootstrap replicates while favoring columns of the more unique aligner among the concatenated aligners. We named this concatenation and bootstrapping method Weighted Partial Super Bootstrap (wpSBOOT). We show on three simulated datasets of 16, 32 and 64 tips that our method improves the predictive power of bootstrap values. We also used as a benchmark an empirical collection of 853 1-to-1 orthologous genes from seven yeast species and found wpSBOOT to significantly improve discrimination capacity between topologically correct and incorrect trees. Bootstrap values of wpSBOOT are comparable to similar readouts estimated using a single method. However, for trees reduced by 50% and 95% bootstrap thresholds, wpSBOOT yields the lowest Type I error (fewer false positives). Availability: The automated generation of replicates has been implemented in the T-Coffee package, which is available as open-source freeware from www.tcoffee.org. Supplementary information: Supplementary data are available at Bioinformatics online.
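The concatenate-then-resample idea described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the T-Coffee implementation: the function names and the per-aligner `weights` argument are hypothetical, and because the abstract does not say how aligner "uniqueness" is scored, the weights are simply supplied by the caller.

```python
import random


def concatenate_alignments(alignments):
    """Concatenate several MSAs of the same sequences column-wise.

    Each alignment is a list of equal-length gapped strings, with rows in the
    same sequence order. Returns the columns and, for each column, the index
    of the alignment it came from.
    """
    n_seqs = len(alignments[0])
    columns, owner = [], []
    for idx, msa in enumerate(alignments):
        if len(msa) != n_seqs or len({len(row) for row in msa}) != 1:
            raise ValueError("alignments must be rectangular and share rows")
        for column in zip(*msa):
            columns.append("".join(column))
            owner.append(idx)
    return columns, owner


def weighted_bootstrap_replicate(alignments, weights, n_columns, rng):
    """Draw one bootstrap replicate from the concatenated alignments.

    `weights` holds one non-negative number per alignment (hypothetical
    uniqueness scores). Each alignment's weight is shared equally among its
    columns, so a long alignment is not favoured merely for its length.
    """
    if len(weights) != len(alignments):
        raise ValueError("need exactly one weight per alignment")
    columns, owner = concatenate_alignments(alignments)
    per_alignment = [owner.count(i) for i in range(len(alignments))]
    column_weights = [weights[o] / per_alignment[o] for o in owner]
    # Sample columns with replacement, proportionally to their weights.
    sampled = rng.choices(columns, weights=column_weights, k=n_columns)
    # Transpose the sampled columns back into one string per sequence.
    return ["".join(col[i] for col in sampled) for i in range(len(alignments[0]))]
```

A call such as `weighted_bootstrap_replicate([msa_a, msa_b], [0.7, 0.3], 100, random.Random(0))` would return one resampled alignment of 100 columns; each replicate would then be passed to a tree-building program, which is outside the scope of this sketch.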
Affiliation(s)
- Jia-Ming Chang
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Evan W Floden
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Javier Herrero
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Olivier Gascuel
  - Unité Bioinformatique Evolutive, Centre de Bioinformatique, Biostatistique et Biologie Intégrative (C3BI)-USR 3756 CNRS and Institut Pasteur, Paris, France
- Paolo Di Tommaso
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Cedric Notredame
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
4
Shepherd DA, Klaere S. How Well Does Your Phylogenetic Model Fit Your Data? Syst Biol 2018; 68:157-167. [DOI: 10.1093/sysbio/syy066] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 12/11/2016] [Accepted: 10/11/2018] [Indexed: 12/27/2022]
Affiliation(s)
- Daisy A Shepherd
  - Department of Statistics, The University of Auckland, Auckland, New Zealand
- Steffen Klaere
  - Department of Statistics, The University of Auckland, Auckland, New Zealand
  - School of Biological Sciences, The University of Auckland, Auckland, New Zealand
5
Bromham L, Duchêne S, Hua X, Ritchie AM, Duchêne DA, Ho SYW. Bayesian molecular dating: opening up the black box. Biol Rev Camb Philos Soc 2017; 93:1165-1191. [DOI: 10.1111/brv.12390] [Citation(s) in RCA: 104] [Impact Index Per Article: 13.0] [Received: 01/09/2017] [Revised: 11/13/2017] [Accepted: 11/17/2017] [Indexed: 12/27/2022]
Affiliation(s)
- Lindell Bromham
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
- Sebastián Duchêne
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute; The University of Melbourne; Melbourne VIC 3010 Australia
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- Xia Hua
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
- Andrew M. Ritchie
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- David A. Duchêne
  - Macroevolution & Macroecology, Division of Ecology & Evolution, Research School of Biology; Australian National University; Canberra ACT 2601 Australia
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
- Simon Y. W. Ho
  - School of Life and Environmental Sciences; University of Sydney; Sydney NSW 2006 Australia
6
Nascimento FF, Reis MD, Yang Z. A biologist's guide to Bayesian phylogenetic analysis. Nat Ecol Evol 2017; 1:1446-1454. [PMID: 28983516 PMCID: PMC5624502 DOI: 10.1038/s41559-017-0280-x] [Citation(s) in RCA: 87] [Impact Index Per Article: 10.9] [Received: 11/03/2016] [Accepted: 07/17/2017] [Indexed: 11/09/2022]
Abstract
Bayesian methods have become very popular in molecular phylogenetics due to the availability of user-friendly software implementing sophisticated models of evolution. However, Bayesian phylogenetic models are complex, and analyses are often carried out using default settings, which may not be appropriate. Here, we summarize the major features of Bayesian phylogenetic inference and discuss Bayesian computation using Markov chain Monte Carlo (MCMC), the diagnosis of an MCMC run, and ways of summarising the MCMC sample. We discuss the specification of the prior, the choice of the substitution model, and partitioning of the data. Finally, we provide a list of common Bayesian phylogenetic software and provide recommendations as to their use.
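To make the MCMC computation mentioned in this abstract concrete, the following is a minimal Metropolis sampler in Python. Everything specific in it is an assumption chosen for illustration rather than taken from the abstract: the model (a single pairwise distance under the JC69 substitution model with an exponential prior), the hypothetical counts of 90 differing sites out of 948, and all function and parameter names.

```python
import math
import random


def log_posterior(d, x, n, prior_mean):
    """Unnormalised log posterior of distance d given x differences in n sites."""
    if d <= 0:
        return -math.inf
    # JC69: probability that a site differs between the two sequences.
    p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    if p <= 0.0:
        return -math.inf
    log_prior = -d / prior_mean  # exponential prior, constant dropped
    return log_prior + x * math.log(p) + (n - x) * math.log(1.0 - p)


def metropolis(x, n, prior_mean=0.2, step=0.05, n_iter=20000, burn_in=2000, seed=1):
    """Sample the posterior of d with a sliding-window (uniform) proposal."""
    rng = random.Random(seed)
    d = 0.5  # arbitrary starting value
    current = log_posterior(d, x, n, prior_mean)
    samples = []
    for i in range(n_iter):
        proposal = d + rng.uniform(-step, step)
        candidate = log_posterior(proposal, x, n, prior_mean)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            d, current = proposal, candidate
        if i >= burn_in:
            samples.append(d)
    return samples
```

Averaging the values returned by `metropolis(x=90, n=948)` gives an estimate of the posterior mean, and plotting them against iteration number is the simplest form of the run diagnosis the abstract refers to. The step size and burn-in length here are arbitrary choices that would normally be tuned.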
Affiliation(s)
- Fabrícia F Nascimento
  - Department of Zoology, University of Oxford, Oxford, OX1 3PS, UK
  - Department of Infectious Disease Epidemiology, Imperial College London, London, W2 1PG, UK
- Mario Dos Reis
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, E1 4NS, UK
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK
7
Adato O, Ninyo N, Gophna U, Snir S. Detecting Horizontal Gene Transfer between Closely Related Taxa. PLoS Comput Biol 2015; 11:e1004408. [PMID: 26439115 PMCID: PMC4595140 DOI: 10.1371/journal.pcbi.1004408] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 06/26/2014] [Accepted: 06/20/2015] [Indexed: 01/12/2023]
Abstract
Horizontal gene transfer (HGT), the transfer of genetic material between organisms, is crucial for genetic innovation and the evolution of genome architecture. Existing HGT detection algorithms rely on a strong phylogenetic signal distinguishing the transferred sequence from ancestral (vertically derived) genes in its recipient genome. Detecting HGT between closely related species or strains is challenging, as the phylogenetic signal is usually weak and the nucleotide composition is normally nearly identical. Nevertheless, there is a great importance in detecting HGT between congeneric species or strains, especially in clinical microbiology, where understanding the emergence of new virulent and drug-resistant strains is crucial, and often time-sensitive. We developed a novel, self-contained technique named Near HGT, based on the synteny index, to measure the divergence of a gene from its native genomic environment and used it to identify candidate HGT events between closely related strains. The method confirms candidate transferred genes based on the constant relative mutability (CRM). Using CRM, the algorithm assigns a confidence score based on “unusual” sequence divergence. A gene exhibiting exceptional deviations according to both synteny and mutability criteria is considered a validated HGT product. We first applied the technique to a set of three E. coli strains and detected several highly probable horizontally acquired genes. We then compared the method to existing HGT detection tools using a larger strain data set. When combined with additional approaches, our new algorithm provides a richer picture and brings us closer to the goal of detecting all newly acquired genes in a particular strain.
Author Summary
The transfer of genetic material between organisms, usually denoted as horizontal (or lateral) gene transfer (HGT or LGT), is a prime mechanism in microbial evolution and responsible for genetic innovation and the evolution of genome architecture. Detecting HGT between closely related species or strains is imperative as drug-resistant pathogenic strains most often acquire their virulence from closely related bacteria. The proposed method combines two evolutionary signals that were not employed in the past for this task. One is the synteny index (SI), measuring the loss of synteny in an organism, and the other is a novel concept, constant relative mutability (CRM), which maintains that genes preserve their relative evolution rate along lineages (although the latter ones may each change). We show both in simulation and real biological data that the method is sound and, in the cases examined, provides stronger sensitivity than existing methods. We therefore believe this novel approach represents a significant advance, for the first time enabling the detection of previously ignored HGT events that will bring us closer to the goal of detecting all newly acquired genes in a particular strain. Availability: The method is publicly available at http://research.haifa.ac.il/~ssagi/software/nearHGT.zip
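The synteny signal described above can be illustrated with a short Python sketch. The text here does not define the synteny index, so the sketch assumes one common formulation: for a gene present in two genomes, count the genes shared by its k-neighbourhoods (the k genes on either side, on a circular chromosome) and divide by 2k. The function names, the normalisation, and the toy genomes are all hypothetical and are not taken from the Near HGT software.

```python
def neighborhood(genome, gene, k):
    """Return the set of genes within k positions of `gene` on a circular genome."""
    n = len(genome)
    if n <= 2 * k:
        raise ValueError("genome must contain more than 2*k genes")
    i = genome.index(gene)
    return {genome[(i + offset) % n] for offset in range(-k, k + 1) if offset != 0}


def synteny_index(genome_a, genome_b, gene, k):
    """Fraction of the 2k neighbours of `gene` that are shared by both genomes.

    A value near 1 means the gene sits in the same genomic context in both
    genomes; a low value flags a gene that has lost its native neighbourhood.
    """
    shared = neighborhood(genome_a, gene, k) & neighborhood(genome_b, gene, k)
    return len(shared) / (2 * k)
```

In this toy setting, a gene that keeps its position in both gene orders scores 1.0, while a gene relocated to a different region scores much lower. Under the approach the abstract describes, such a low-synteny gene would only be a candidate and would still need to pass the separate mutability (CRM) criterion, which this sketch does not attempt.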
Affiliation(s)
- Orit Adato
  - Department of Evolutionary Biology, University of Haifa, Haifa, Israel
- Noga Ninyo
  - Department of Evolutionary Biology, University of Haifa, Haifa, Israel
- Uri Gophna
  - Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Tel-Aviv, Israel
- Sagi Snir
  - Department of Evolutionary Biology, University of Haifa, Haifa, Israel
8
Kasarda DD, Adalsteins E, Lew EJL, Lazo GR, Altenbach SB. Farinin: characterization of a novel wheat endosperm protein belonging to the prolamin superfamily. Journal of Agricultural and Food Chemistry 2013; 61:2407-17. [PMID: 23414243 DOI: 10.1021/jf3053466] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 05/24/2023]
Abstract
Starch granule surface-associated proteins were separated by HPLC and identified by direct protein sequencing. Among the proteins identified was one that consisted of two polypeptide chains of 11 and 19 kDa linked by disulfide bonds. Sequencing of tryptic peptides from each of the polypeptides revealed similarities between some of the peptides and avenin-like b proteins encoded by partial cDNAs in NCBI. To identify a contiguous sequence that matched all of the peptides, contigs encoding three avenin-like b proteins were constructed from ESTs of the cultivar Butte 86. All peptide sequences were found in a protein encoded by one of these contigs that had not been identified previously. Protein and DNA sequences indicated that the two polypeptide chains were derived from a parent protein that had been cleaved at the C-terminal position of an asparagine residue. The name farinin is suggested for this protein and other avenin-like b proteins. Evolutionary relationships of the protein are discussed and a simple computer molecular model was constructed. On the basis of its sequence, the new protein was likely to be allergenic but unlikely to be active in celiac disease.
Affiliation(s)
- Donald D Kasarda
  - Western Regional Research Center, Agricultural Research Service, U.S. Department of Agriculture, 800 Buchanan Street, Albany, California 94710, United States
9
Tóth A, Hausknecht A, Krisai-Greilhuber I, Papp T, Vágvölgyi C, Nagy LG. Iteratively refined guide trees help improving alignment and phylogenetic inference in the mushroom family Bolbitiaceae. PLoS One 2013; 8:e56143. [PMID: 23418526 PMCID: PMC3572013 DOI: 10.1371/journal.pone.0056143] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 07/27/2012] [Accepted: 01/07/2013] [Indexed: 11/19/2022]
Abstract
Reconciling traditional classifications, morphology, and the phylogenetic relationships of brown-spored agaric mushrooms has proven difficult in many groups, due to extensive convergence in morphological features. Here, we address the monophyly of the Bolbitiaceae, a family with over 700 described species and examine the higher-level relationships within the family using a newly constructed multilocus dataset (ITS, nrLSU rDNA and EF1-alpha). We tested whether the fast-evolving Internal Transcribed Spacer (ITS) sequences can be accurately aligned across the family, by comparing the outcome of two iterative alignment refining approaches (an automated and a manual) and various indel-treatment strategies. We used PRANK to align sequences in both cases. Our results suggest that--although PRANK successfully evades overmatching of gapped sites, referred previously to as alignment overmatching--it infers an unrealistically high number of indel events with natively generated guide-trees. This 'alignment undermatching' could be avoided by using more rigorous (e.g. ML) guide trees. The trees inferred in this study support the monophyly of the core Bolbitiaceae, with the exclusion of Panaeolus, Agrocybe, and some of the genera formerly placed in the family. Bolbitius and Conocybe were found monophyletic, however, Pholiotina and Galerella require redefinition. The phylogeny revealed that stipe coverage type is a poor predictor of phylogenetic relationships, indicating the need for a revision of the intrageneric relationships within Conocybe.
Affiliation(s)
- Annamária Tóth
  - Department of Microbiology, Faculty of Science and Informatics, University of Szeged, Szeged, Hungary
- Anton Hausknecht
  - Department of Systematic and Evolutionary Botany, Faculty Centre of Biodiversity, University of Vienna, Wien, Austria
- Irmgard Krisai-Greilhuber
  - Department of Systematic and Evolutionary Botany, Faculty Centre of Biodiversity, University of Vienna, Wien, Austria
- Tamás Papp
  - Department of Microbiology, Faculty of Science and Informatics, University of Szeged, Szeged, Hungary
- Csaba Vágvölgyi
  - Department of Microbiology, Faculty of Science and Informatics, University of Szeged, Szeged, Hungary
- László G. Nagy
  - Department of Microbiology, Faculty of Science and Informatics, University of Szeged, Szeged, Hungary
10
Mello B, Schrago CG. Incorrect handling of calibration information in divergence time inference: an example from volcanic islands. Ecol Evol 2012; 2:493-500. [PMID: 22822429 PMCID: PMC3399139 DOI: 10.1002/ece3.94] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 11/15/2011] [Revised: 11/25/2011] [Accepted: 11/29/2011] [Indexed: 11/21/2022]
Abstract
Divergence time studies rely on calibration information from several sources. The age of volcanic islands is one of the standard references to obtain chronological data to estimate the absolute times of lineage diversifications. This strategy assumes that cladogenesis is necessarily associated with island formation, and punctual calibrations are commonly used to date the splits of endemic island species. Here, we re-examined three studies that inferred divergence times for different Hawaiian lineages assuming fixed calibration points. We show that, by permitting probabilistic calibrations, some divergences are estimated to be significantly younger or older than the age of the island formation, thus yielding distinct ecological scenarios for the speciation process. The results highlight the importance of using calibration information correctly, as well as the possibility of incorporating volcanic island studies into a formal, biogeographical hypothesis-testing framework.
Affiliation(s)
- Beatriz Mello
  - Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
11
Bayzid MDS, Warnow T. Estimating Optimal Species Trees from Incomplete Gene Trees Under Deep Coalescence. J Comput Biol 2012; 19:591-605. [DOI: 10.1089/cmb.2012.0037] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 01/01/2023]
Affiliation(s)
- Tandy Warnow
  - Department of Computer Science, University of Texas at Austin, Austin
12
Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, Bornberg-Bauer E, Colwell LJ, de Koning APJ, Dokholyan NV, Echave J, Elofsson A, Gerloff DL, Goldstein RA, Grahnen JA, Holder MT, Lakner C, Lartillot N, Lovell SC, Naylor G, Perica T, Pollock DD, Pupko T, Regan L, Roger A, Rubinstein N, Shakhnovich E, Sjölander K, Sunyaev S, Teufel AI, Thorne JL, Thornton JW, Weinreich DM, Whelan S. The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci 2012; 21:769-85. [PMID: 22528593 PMCID: PMC3403413 DOI: 10.1002/pro.2071] [Citation(s) in RCA: 155] [Impact Index Per Article: 11.9] [Received: 03/02/2012] [Revised: 03/22/2012] [Accepted: 03/23/2012] [Indexed: 12/20/2022]
Abstract
The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction.
Affiliation(s)
- David A Liberles
- Department of Molecular Biology, University of WyomingLaramie, Wyoming 82071
| | - Sarah A Teichmann
- MRC Laboratory of Molecular BiologyHills Road, Cambridge CB2 0QH, United Kingdom
| | - Ivet Bahar
- Department of Computational and Systems Biology, School of Medicine, University of PittsburghPittsburgh, Pennsylvania 15213
| | - Ugo Bastolla
- Bioinformatics Unit. Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autonoma de Madrid28049 Cantoblanco Madrid, Spain
| | - Jesse Bloom
- Division of Basic Sciences, Fred Hutchinson Cancer Research CenterSeattle, Washington 98109
| | - Erich Bornberg-Bauer
- Evolutionary Bioinformatics Group, Institute for Evolution and Biodiversity, University of MuensterGermany
| | - Lucy J Colwell
- MRC Laboratory of Molecular BiologyHills Road, Cambridge CB2 0QH, United Kingdom
| | - A P Jason de Koning
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of ColoradoAurora, Colorado
| | - Nikolay V Dokholyan
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel HillNorth Carolina 27599
| | - Julian Echave
- Escuela de Ciencia y Tecnología, Universidad Nacional de San MartínMartín de Irigoyen 3100, 1650 San Martín, Buenos Aires, Argentina
| | - Arne Elofsson
- Department of Biochemistry and Biophysics, Center for Biomembrane Research, Stockholm Bioinformatics Center, Science for Life Laboratory, Swedish E-science Research Center, Stockholm University106 91 Stockholm, Sweden
| | - Dietlind L Gerloff
- Biomolecular Engineering Department, University of CaliforniaSanta Cruz, California 95064
| | - Richard A Goldstein
- Division of Mathematical Biology, National Institute for Medical Research (MRC)Mill Hill, London NW7 1AA, United Kingdom
| | - Johan A Grahnen
- Department of Molecular Biology, University of WyomingLaramie, Wyoming 82071
| | - Mark T Holder
- Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66045
| | - Clemens Lakner
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695
| | - Nicholas Lartillot
- Département de Biochimie, Faculté de Médecine, Université de Montréal, Montréal, QC H3T1J4, Canada
| | - Simon C Lovell
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, United Kingdom
| | - Gavin Naylor
- Department of Biology, College of Charleston, Charleston, South Carolina 29424
| | - Tina Perica
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom
| | - David D Pollock
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Colorado, Aurora, Colorado
| | - Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Lynne Regan
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven 06511
| | - Andrew Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
| | - Nimrod Rubinstein
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Eugene Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138
| | - Kimmen Sjölander
- Department of Bioengineering, University of California, Berkeley, Berkeley, California 94720
| | - Shamil Sunyaev
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115
| | - Ashley I Teufel
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071
| | - Jeffrey L Thorne
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695
| | - Joseph W Thornton
- Howard Hughes Medical Institute and Institute for Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637
| | - Daniel M Weinreich
- Department of Ecology and Evolutionary Biology, and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island 02912
| | - Simon Whelan
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, United Kingdom
|
13
|
Löytynoja A, Vilella AJ, Goldman N. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics 2012; 28:1684-91. [PMID: 22531217 PMCID: PMC3381962 DOI: 10.1093/bioinformatics/bts198] [Citation(s) in RCA: 133] [Impact Index Per Article: 10.2] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Accurate alignment of large numbers of sequences is demanding and the computational burden is further increased by downstream analyses depending on these alignments. With the abundance of sequence data, an integrative approach of adding new sequences to existing alignments without their full re-computation and maintaining the relative matching of existing sequences is an attractive option. Another current challenge is the extension of reference alignments with fragmented sequences, such as those coming from next-generation metagenomics, that contain relatively little information. Widely used methods for alignment extension are based on profile representation of reference sequences. These do not incorporate and use phylogenetic information and are affected by the composition of the reference alignment and the phylogenetic positions of query sequences. RESULTS: We have developed a method for phylogeny-aware alignment of partial-order sequence graphs and apply it here to the extension of alignments with new data. Our new method, called PAGAN, infers ancestral sequences for the reference alignment and adds new sequences in their phylogenetic context, either to predefined positions or by finding the best placement for sequences of unknown origin. Unlike profile-based alternatives, PAGAN considers the phylogenetic relatedness of the sequences and is not affected by inclusion of more diverged sequences in the reference set. Our analyses show that PAGAN outperforms alternative methods for alignment extension and provides superior accuracy for both DNA and protein data, the improvement being especially large for fragmented sequences. Moreover, PAGAN-generated alignments of noisy next-generation sequencing (NGS) sequences are accurate enough for the use of RNA-seq data in evolutionary analyses. AVAILABILITY: PAGAN is written in C++, licensed under the GPL and its source code is available at http://code.google.com/p/pagan-msa.
Affiliation(s)
- Ari Löytynoja
- EMBL-European Bioinformatics Institute, Hinxton, CB10 1SD, UK.
|
14
|
Kawrykow A, Roumanis G, Kam A, Kwak D, Leung C, Wu C, Zarour E, Sarmenta L, Blanchette M, Waldispühl J. Phylo: a citizen science approach for improving multiple sequence alignment. PLoS One 2012; 7:e31362. [PMID: 22412834 PMCID: PMC3296692 DOI: 10.1371/journal.pone.0031362] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Received: 11/18/2011] [Accepted: 01/09/2012] [Indexed: 01/07/2023] Open
Abstract
Background: Comparative genomics, or the study of the relationships of genome structure and function across different species, offers a powerful tool for studying evolution, annotating genomes, and understanding the causes of various genetic disorders. However, aligning multiple sequences of DNA, an essential intermediate step for most types of analyses, is a difficult computational task. In parallel, citizen science, an approach that takes advantage of the fact that the human brain is exquisitely tuned to solving specific types of problems, is becoming increasingly popular. There, instances of hard computational problems are dispatched to a crowd of non-expert human game players and solutions are sent back to a central server. Methodology/Principal Findings: We introduce Phylo, a human-based computing framework applying “crowd sourcing” techniques to solve the Multiple Sequence Alignment (MSA) problem. The key idea of Phylo is to convert the MSA problem into a casual game that can be played by ordinary web users with minimal prior knowledge of the biological context. We applied this strategy to improve the alignment of the promoters of disease-related genes from up to 44 vertebrate species. Since the launch in November 2010, we received more than 350,000 solutions submitted from more than 12,000 registered users. Our results show that solutions submitted contributed to improving the accuracy of up to 70% of the alignment blocks considered. Conclusions/Significance: We demonstrate that, combined with classical algorithms, crowd computing techniques can be successfully used to help improve the accuracy of MSA. More importantly, we show that an NP-hard computational problem can be embedded in a casual game that can be easily played by people without significant scientific training. This suggests that citizen science approaches can be used to exploit the billions of “human-brain peta-flops” of computation that are spent every day playing games.
Phylo is available at: http://phylo.cs.mcgill.ca.
Affiliation(s)
- Alexander Kawrykow
- School of Computer Science and McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada
|
15
|
Koskela M, Annila A. Looking for the Last Universal Common Ancestor (LUCA). Genes (Basel) 2012; 3:81-7. [PMID: 24704844 PMCID: PMC3899962 DOI: 10.3390/genes3010081] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/28/2011] [Revised: 12/18/2011] [Accepted: 12/29/2011] [Indexed: 11/22/2022] Open
Abstract
Genomic sequences across diverse species seem to align towards a common ancestry, eventually implying that eons ago some universal antecedent organism would have lived on the face of Earth. However, when evolution is understood not only as a biological process but as a general thermodynamic process, it becomes apparent that the quest for the last universal common ancestor is unattainable. Ambiguities in alignments are unavoidable because the driving forces and paths of evolution cannot be separated from each other. Thus tracking down life’s origin is by its nature a non-computable task. The thermodynamic tenet clarifies that evolution is a path-dependent process of least-time consumption of free energy. The natural process is without a demarcation line between animate and inanimate.
Affiliation(s)
- Minna Koskela
- Department of Biosciences, Viikinkaari 1, FI-00014 University of Helsinki, Finland.
| | - Arto Annila
- Department of Biosciences, Viikinkaari 1, FI-00014 University of Helsinki, Finland.
|
16
|
Löytynoja A. Alignment methods: strategies, challenges, benchmarking, and comparative overview. Methods Mol Biol 2012; 855:203-35. [PMID: 22407710 DOI: 10.1007/978-1-61779-582-4_7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 01/27/2023]
Abstract
Comparative evolutionary analyses of molecular sequences are solely based on the identities and differences detected between homologous characters. Errors in this homology statement, that is, errors in the alignment of the sequences, are likely to lead to errors in the downstream analyses. Sequence alignment and phylogenetic inference are tightly connected and many popular alignment programs use the phylogeny to divide the alignment problem into smaller tasks. They then neglect the phylogenetic tree, however, and produce alignments that are not evolutionarily meaningful. The use of phylogeny-aware methods reduces the error but the resulting alignments, with evolutionarily correct representation of homology, can challenge the existing practices and methods for viewing and visualising the sequences. The inter-dependency of alignment and phylogeny can be resolved by joint estimation of the two; methods based on statistical models allow for inferring the alignment parameters from the data and correctly take into account the uncertainty of the solution but remain computationally challenging. Widely used alignment methods are based on heuristic algorithms and are unlikely to find globally optimal solutions. The whole concept of one correct alignment for the sequences is questionable, however, as there typically exist vast numbers of alternative, roughly equally good alignments that should also be considered. This uncertainty is hidden by many popular alignment programs and is rarely correctly taken into account in the downstream analyses. The quest for finding and improving the alignment solution is complicated by the lack of suitable measures of alignment goodness. The difficulty of comparing alternative solutions also affects benchmarks of alignment methods and the results strongly depend on the measure used. As the effects of alignment error cannot be predicted, comparing the alignments' performance in downstream analyses is recommended.
Affiliation(s)
- Ari Löytynoja
- European Bioinformatics Institute (EMBL), Hinxton, UK.
|
17
|
Can sensitivity analysis help to detect long-branch attraction? Mol Phylogenet Evol 2011; 61:899-903. [DOI: 10.1016/j.ympev.2011.08.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/08/2011] [Revised: 08/01/2011] [Accepted: 08/04/2011] [Indexed: 11/19/2022]
|
19
|
Wang Z, Nilsson RH, Lopez-Giraldez F, Zhuang WY, Dai YC, Johnston PR, Townsend JP. Tasting soil fungal diversity with earth tongues: phylogenetic test of SATé alignments for environmental ITS data. PLoS One 2011; 6:e19039. [PMID: 21533038 PMCID: PMC3080880 DOI: 10.1371/journal.pone.0019039] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 01/21/2011] [Accepted: 03/22/2011] [Indexed: 12/14/2022] Open
Abstract
An abundance of novel fungal lineages have been indicated by DNA sequencing of the nuclear ribosomal ITS region from environmental samples such as soil and wood. Although phylogenetic analysis of these novel lineages is a key component of unveiling the structure and diversity of complex communities, such analyses are rare for environmental ITS data due to the difficulties of aligning this locus across significantly divergent taxa. One potential approach to this issue is simultaneous alignment and tree estimation. We targeted divergent ITS sequences of the earth tongue fungi (Geoglossomycetes), a basal class in the Ascomycota, to assess the performance of SATé, recent software that combines progressive alignment and tree building. We found that SATé performed well in generating high-quality alignments and in accurately estimating the phylogeny of earth tongue fungi. Drawing from a data set of 300 sequences of earth tongues and progressively more distant fungal lineages, 30 insufficiently identified ITS sequences from the public sequence databases were assigned to the Geoglossomycetes. The association between earth tongues and plants has been hypothesized for a long time, but hard evidence is yet to be collected. The ITS phylogeny showed that four ectomycorrhizal isolates shared a clade with Geoglossum but not with Trichoglossum earth tongues, pointing to the significant potential inherent to ecological data mining of environmental samples. Environmental sampling holds the key to many focal questions in mycology, and simultaneous alignment and tree estimation, as performed by SATé, can be a highly efficient companion in that pursuit.
Affiliation(s)
- Zheng Wang
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America.
|
20
|
Sipos B, Massingham T, Jordan GE, Goldman N. PhyloSim - Monte Carlo simulation of sequence evolution in the R statistical computing environment. BMC Bioinformatics 2011; 12:104. [PMID: 21504561 PMCID: PMC3102636 DOI: 10.1186/1471-2105-12-104] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Received: 11/05/2010] [Accepted: 04/19/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The Monte Carlo simulation of sequence evolution is routinely used to assess the performance of phylogenetic inference methods and sequence alignment algorithms. Progress in the field of molecular evolution fuels the need for more realistic and hence more complex simulations, adapted to particular situations, yet current software makes unreasonable assumptions such as homogeneous substitution dynamics or a uniform distribution of indels across the simulated sequences. This calls for an extensible simulation framework written in a high-level functional language, offering new functionality and making it easy to incorporate further complexity. RESULTS: PhyloSim is an extensible framework for the Monte Carlo simulation of sequence evolution, written in R, using the Gillespie algorithm to integrate the actions of many concurrent processes such as substitutions, insertions and deletions. Uniquely among sequence simulation tools, PhyloSim can simulate arbitrarily complex patterns of rate variation and multiple indel processes, and allows for the incorporation of selective constraints on indel events. User-defined complex patterns of mutation and selection can be easily integrated into simulations, allowing PhyloSim to be adapted to specific needs. CONCLUSIONS: Close integration with R and the wide range of features implemented offer unmatched flexibility, making it possible to simulate sequence evolution under a wide range of realistic settings. We believe that PhyloSim will be useful to future studies involving simulated alignments.
Affiliation(s)
- Botond Sipos
- EMBL-European Bioinformatics Institute, Hinxton, UK.
|
21
|
Lam TTY, Hon CC, Tang JW. Use of phylogenetics in the molecular epidemiology and evolutionary studies of viral infections. Crit Rev Clin Lab Sci 2010; 47:5-49. [PMID: 20367503 DOI: 10.3109/10408361003633318] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 12/18/2022]
Abstract
Since DNA sequencing techniques first became available almost 30 years ago, the amount of nucleic acid sequence data has increased enormously. Phylogenetics, which is widely applied to compare and analyze such data, is particularly useful for the analysis of genes from rapidly evolving viruses. It has been used extensively to describe the molecular epidemiology and transmission of the human immunodeficiency virus (HIV), the origins and subsequent evolution of the severe acute respiratory syndrome (SARS)-associated coronavirus (SCoV), and, more recently, the evolving epidemiology of avian influenza as well as seasonal and pandemic human influenza viruses. Recent advances in phylogenetic methods can infer more in-depth information about the patterns of virus emergence, adding to the conventional approaches in viral epidemiology. Examples of this information include estimations (with confidence limits) of the actual time of the origin of a new viral strain or its emergence in a new species, viral recombination and reassortment events, the rate of population size change in a viral epidemic, and how the virus spreads and evolves within a specific population and geographical region. Such sequence-derived information obtained from the phylogenetic tree can assist in the design and implementation of public health and therapeutic interventions. However, application of many of these advanced phylogenetic methods is currently limited to specialized phylogeneticists and statisticians, mainly because of their mathematical basis and their dependence on the use of a large number of computer programs. This review attempts to bridge this gap by presenting conceptual, technical, and practical aspects of applying phylogenetic methods in studies of influenza, HIV, and SCoV. It aims to provide, with minimal mathematics and statistics, a practical overview of how phylogenetic methods can be incorporated into virological studies by clinical and laboratory specialists.
Affiliation(s)
- Tommy Tsan-Yuk Lam
- School of Biological Sciences, The University of Hong Kong, Hong Kong Special Administrative Region, China
|