1
Yilmaz F, Karageorgiou C, Kim K, Pajic P, Scheer K, Human Genome Structural Variation Consortium, Beck CR, Torregrossa AM, Lee C, Gokcumen O. Reconstruction of the human amylase locus reveals ancient duplications seeding modern-day variation. Science 2024; 386:eadn0609. [PMID: 39418342 PMCID: PMC11707797 DOI: 10.1126/science.adn0609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 05/27/2024] [Accepted: 09/24/2024] [Indexed: 10/19/2024]
Abstract
Previous studies suggested that the copy number of the human salivary amylase gene, AMY1, correlates with starch-rich diets. However, evolutionary analyses are hampered by the absence of accurate, sequence-resolved haplotype variation maps. We identified 30 structurally distinct haplotypes at nucleotide resolution among 98 present-day humans, revealing that the coding sequences of AMY1 copies are evolving under negative selection. Genomic analyses of these haplotypes in archaic hominins and ancient human genomes suggest that a common three-copy haplotype, dating as far back as 800,000 years ago, has seeded rapidly evolving rearrangements through recurrent nonallelic homologous recombination. Additionally, haplotypes with more than three AMY1 copies have significantly increased in frequency among European farmers over the past 4000 years, potentially as an adaptive response to increased starch digestion.
Affiliation(s)
- Feyza Yilmaz
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Kwondo Kim
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Petar Pajic
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
- Kendra Scheer
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
- Christine R. Beck
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - University of Connecticut, Institute for Systems Genomics, Storrs, CT, USA
  - The University of Connecticut Health Center, Farmington, CT, USA
- Ann-Marie Torregrossa
  - Department of Psychology, University at Buffalo, Buffalo, NY, USA
  - University at Buffalo Center for Ingestive Behavior Research, University at Buffalo, Buffalo, NY, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
2
Zhang F, Zhang J, Sun Y. Influence of an indigenous yeast, CECA, from the Ningxia wine region of China, on the fungal and bacterial dynamics and function during Cabernet Sauvignon wine fermentation. Journal of the Science of Food and Agriculture 2024; 104:8693-8706. [PMID: 38922891 DOI: 10.1002/jsfa.13696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 04/16/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024]
Abstract
BACKGROUND Saccharomyces cerevisiae CECA is a potential indigenous Chinese wine yeast that can produce aroma and flavor in Cabernet Sauvignon wines. High-throughput sequencing combined with metabolite analysis was applied to analyze the effects of CECA inoculation on native microbial community interactions and metabolism during Cabernet Sauvignon wine fermentation. RESULTS Fermentations were performed with three different inoculation strategies: spontaneous fermentation without inoculation, inoculation with CECA after grape must sterilization, and direct inoculation of CECA. Results showed that the diversity of bacteria (P = 0.033) was more sensitive to CECA inoculation than that of fungi (P = 0.563). In addition, CECA inoculation altered the species composition of core microorganisms (relative abundance >1%) and the keystone species (accounting for the top 1% of the most important interactions), as well as of the biomarkers (linear discriminant analysis score > 3.0, P < 0.05). Furthermore, the inoculation could change the clustering of metabolites, and these differential metabolite sets were significantly correlated with four fungal taxa (Issatchenkia, Issatchenkia orientalis, Saccharomycetales, and Saccharomycetes) and two bacterial taxa (Pantoea and Tatumella ptyseos). Inoculated fermentation also altered the correlation between dominant microorganisms and aroma compounds, giving Cabernet Sauvignon wines more herbal, floral, fruity, and cheesy aromas. CONCLUSION Saccharomyces cerevisiae CECA and dimethyl dicarbonate (DMDC) inhibition treatments significantly altered the microbial community structure of Cabernet Sauvignon wines, which in turn affected the microbial-metabolite correlation. These findings will help winemakers to control the microbial dynamics and functions during wine fermentation and may be applied more widely in regionally typical wine fermentations. © 2024 Society of Chemical Industry.
Affiliation(s)
- Fang Zhang
  - School of Food Science and Engineering, Ningxia University, Yinchuan, P. R. China
- Jing Zhang
  - State Key Laboratory of Environmental Criteria and Risk Assessment, National Engineering Laboratory for Lake Pollution Control and Ecological Restoration, Chinese Research Academy of Environmental Sciences, Beijing, China
- Yue Sun
  - College of Enology and Horticulture, Ningxia University, Yinchuan, P. R. China
  - Engineering Research Center of Grape and Wine, Ministry of Education, Yinchuan, P. R. China
3
Calamari ZT, Song A, Cohen E, Akter M, Das Roy R, Hallikas O, Christensen MM, Li P, Marangoni P, Jernvall J, Klein OD. Bank vole genomics links determinate and indeterminate growth of teeth. BMC Genomics 2024; 25:1000. [PMID: 39472825 PMCID: PMC11523675 DOI: 10.1186/s12864-024-10901-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Accepted: 10/14/2024] [Indexed: 11/02/2024] Open
Abstract
BACKGROUND Continuously growing teeth are an important innovation in mammalian evolution, yet genetic regulation of continuous growth by stem cells remains incompletely understood. Dental stem cells responsible for tooth crown growth are lost at the onset of tooth root formation. Genetic signaling that initiates this loss is difficult to study with the ever-growing incisor and rooted molars of mice, the most common mammalian dental model species, because signals for root formation overlap with signals that pattern tooth size and shape (i.e., cusp patterns). Bank and prairie voles (Cricetidae, Rodentia, Glires) have evolved rooted and unrooted molars while retaining similar size and shape, providing alternative models for studying roots. RESULTS We assembled a de novo genome of Myodes glareolus, a vole with high-crowned, rooted molars, and performed genomic and transcriptomic analyses in a broad phylogenetic context of Glires (rodents and lagomorphs) to assess differential selection and evolution in tooth forming genes. Bulk transcriptomics comparisons of embryonic molar development between bank voles and mice demonstrated overall conservation of gene expression levels, with species-specific differences corresponding to the accelerated and more extensive patterning of the vole molar. We leverage convergent evolution of unrooted molars across the clade to examine changes that may underlie the evolution of unrooted molars. We identified 15 dental genes with changing synteny relationships and six dental genes undergoing positive selection across Glires, two of which were undergoing positive selection in species with unrooted molars, Dspp and Aqp1. Decreased expression of both genes in prairie voles with unrooted molars compared to bank voles supports the presence of positive selection and may underlie differences in root formation. 
CONCLUSIONS Our results support ongoing evolution of dental genes across Glires and identify candidate genes for mechanistic studies of root formation. Comparative research using the bank vole as a model species can reveal the complex evolutionary background of convergent evolution for ever-growing molars.
Affiliation(s)
- Zachary T Calamari
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY, 10010, USA
  - The Graduate Center, City University of New York, 365 Fifth Ave, New York, NY, 10016, USA
  - Program in Craniofacial Biology, Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA, 94158, USA
  - Division of Paleontology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA
- Andrew Song
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY, 10010, USA
  - Cornell University, 616 Thurston Ave, Ithaca, NY, 14853, USA
- Emily Cohen
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY, 10010, USA
  - New York University College of Dentistry, 345 E 34th St, New York, NY, 10010, USA
- Muspika Akter
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY, 10010, USA
- Rishi Das Roy
  - Institute of Biotechnology, University of Helsinki, Helsinki, FI-00014, Finland
- Outi Hallikas
  - Institute of Biotechnology, University of Helsinki, Helsinki, FI-00014, Finland
- Mona M Christensen
  - Institute of Biotechnology, University of Helsinki, Helsinki, FI-00014, Finland
- Pengyang Li
  - Program in Craniofacial Biology, Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA, 94158, USA
  - Department of Pediatrics, Cedars-Sinai Guerin Children's, 8700 Beverly Blvd., Suite 2416, Los Angeles, CA, 90048, USA
  - Department of Bioengineering, Stanford University, 443 Via Ortega, Rm 119, Stanford, CA, 94305, USA
- Pauline Marangoni
  - Program in Craniofacial Biology, Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA, 94158, USA
  - Department of Pediatrics, Cedars-Sinai Guerin Children's, 8700 Beverly Blvd., Suite 2416, Los Angeles, CA, 90048, USA
- Jukka Jernvall
  - Institute of Biotechnology, University of Helsinki, Helsinki, FI-00014, Finland
  - Department of Geosciences and Geography, University of Helsinki, Helsinki, FI-00014, Finland
- Ophir D Klein
  - Program in Craniofacial Biology, Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA, 94158, USA
  - Department of Pediatrics, Cedars-Sinai Guerin Children's, 8700 Beverly Blvd., Suite 2416, Los Angeles, CA, 90048, USA
4
Redelings BD, Holmes I, Lunter G, Pupko T, Anisimova M. Insertions and Deletions: Computational Methods, Evolutionary Dynamics, and Biological Applications. Mol Biol Evol 2024; 41:msae177. [PMID: 39172750 PMCID: PMC11385596 DOI: 10.1093/molbev/msae177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 07/02/2024] [Accepted: 07/09/2024] [Indexed: 08/24/2024] Open
Abstract
Insertions and deletions constitute the second most important source of natural genomic variation. Insertions and deletions make up to 25% of genomic variants in humans and are involved in complex evolutionary processes including genomic rearrangements, adaptation, and speciation. Recent advances in long-read sequencing technologies allow detailed inference of insertion and deletion variation in species and populations. Yet, despite their importance, evolutionary studies have traditionally ignored or mishandled insertions and deletions due to a lack of comprehensive methodologies and statistical models of insertion and deletion dynamics. Here, we discuss methods for describing insertion and deletion variation and modeling insertions and deletions over evolutionary time. We provide practical advice for tackling insertions and deletions in genomic sequences and illustrate our discussion with examples of insertion- and deletion-induced effects in human and other natural populations and their contribution to evolutionary processes. We outline promising directions for future developments in statistical methodologies that would allow researchers to analyze insertion and deletion variation and its effects in large genomic data sets and to incorporate insertions and deletions in evolutionary inference.
Affiliation(s)
- Ian Holmes
  - Department of Bioengineering, University of California, Berkeley, CA 94720, USA
  - Calico Life Sciences LLC, South San Francisco, CA 94080, USA
- Gerton Lunter
  - Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen 9713 GZ, The Netherlands
- Tal Pupko
  - The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Maria Anisimova
  - Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
5
Amalfitano A, Stocchi N, Atencio HM, Villarreal F, Ten Have A. Seqrutinator: scrutiny of large protein superfamily sequence datasets for the identification and elimination of non-functional homologues. Genome Biol 2024; 25:230. [PMID: 39187866 PMCID: PMC11346255 DOI: 10.1186/s13059-024-03371-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 08/13/2024] [Indexed: 08/28/2024] Open
Abstract
Seqrutinator is an objective, flexible pipeline that removes sequences with sequencing and/or gene model errors and sequences from pseudogenes from complex, eukaryotic protein superfamilies. Testing Seqrutinator on major superfamilies BAHD, CYP, and UGT removes only 1.94% of SwissProt entries, 14% of entries from the model plant Arabidopsis thaliana, but 80% of entries from Pinus taeda's recent complete proteome. Application of Seqrutinator on crude BAHDomes, CYPomes, and UGTomes obtained from 16 plant proteomes shows convergence of the numbers of paralogues. MSAs, phylogenies, and particularly functional clustering improve drastically upon Seqrutinator application, indicating good performance.
Affiliation(s)
- Agustín Amalfitano
  - Laboratorio de Procesamiento de Imágenes, ICyTE-CONICET-UNMdP, Mar del Plata, Argentina
- Nicolás Stocchi
  - Computational Biology and Comparative Genomics, IIB-CONICET-UNMdP, Mar del Plata, Argentina
- Hugo Marcelo Atencio
  - Banco Activo de Germoplasma de Papa Andina, EEA-Balcarce INTA, Balcarce, Argentina
- Fernando Villarreal
  - Computational Biology and Comparative Genomics, IIB-CONICET-UNMdP, Mar del Plata, Argentina
- Arjen Ten Have
  - Computational Biology and Comparative Genomics, IIB-CONICET-UNMdP, Mar del Plata, Argentina
6
Calamari ZT, Song A, Cohen E, Akter M, Roy RD, Hallikas O, Christensen MM, Li P, Marangoni P, Jernvall J, Klein OD. Vole genomics links determinate and indeterminate growth of teeth. bioRxiv 2024:2023.12.18.572015. [PMID: 38187646 PMCID: PMC10769287 DOI: 10.1101/2023.12.18.572015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Continuously growing teeth are an important innovation in mammalian evolution, yet genetic regulation of continuous growth by stem cells remains incompletely understood. Dental stem cells responsible for tooth crown growth are lost at the onset of tooth root formation. Genetic signaling that initiates this loss is difficult to study with the ever-growing incisor and rooted molars of mice, the most common mammalian dental model species, because signals for root formation overlap with signals that pattern tooth size and shape (i.e., cusp patterns). Different species of voles (Cricetidae, Rodentia, Glires) have evolved rooted and unrooted molars that have similar size and shape, providing alternative models for studying roots. We assembled a de novo genome of Myodes glareolus, a vole with high-crowned, rooted molars, and performed genomic and transcriptomic analyses in a broad phylogenetic context of Glires (rodents and lagomorphs) to assess differential selection and evolution in tooth forming genes. We identified 15 dental genes with changing synteny relationships and six dental genes undergoing positive selection across Glires, two of which were undergoing positive selection in species with unrooted molars, Dspp and Aqp1. Decreased expression of both genes in prairie voles with unrooted molars compared to bank voles supports the presence of positive selection and may underlie differences in root formation. Bulk transcriptomics analyses of embryonic molar development in bank voles also demonstrated conserved patterns of dental gene expression compared to mice, with species-specific variation likely related to developmental timing and morphological differences between mouse and vole molars. Our results support ongoing evolution of dental genes across Glires, revealing the complex evolutionary background of convergent evolution for ever-growing molars.
Affiliation(s)
- Zachary T. Calamari
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY 10010, USA
  - The Graduate Center, City University of New York, 365 Fifth Ave, New York, NY 10016, USA
  - Program in Craniofacial Biology and Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Division of Paleontology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA
- Andrew Song
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY 10010, USA
  - Cornell University, 616 Thurston Ave, Ithaca, NY 14853, USA
- Emily Cohen
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY 10010, USA
  - New York University College of Dentistry, 345 E 34th St, New York, NY 10010
- Muspika Akter
  - Baruch College, City University of New York, One Bernard Baruch Way, New York, NY 10010, USA
- Rishi Das Roy
  - Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
- Outi Hallikas
  - Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
- Mona M. Christensen
  - Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
- Pengyang Li
  - Program in Craniofacial Biology and Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Department of Pediatrics, Cedars-Sinai Guerin Children’s, 8700 Beverly Blvd., Suite 2416, Los Angeles, CA 90048, USA
- Pauline Marangoni
  - Program in Craniofacial Biology and Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Department of Pediatrics, Cedars-Sinai Guerin Children’s, 8700 Beverly Blvd., Suite 2416, Los Angeles, CA 90048, USA
- Jukka Jernvall
  - Institute of Biotechnology, University of Helsinki, FI-00014 Helsinki, Finland
  - Department of Geosciences and Geography, University of Helsinki, FI-00014 Helsinki, Finland
- Ophir D. Klein
  - Program in Craniofacial Biology and Department of Orofacial Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Department of Pediatrics, Cedars-Sinai Guerin Children’s, 8700 Beverly Blvd., Suite 2416, Los Angeles, CA 90048, USA
7
Yilmaz F, Karageorgiou C, Kim K, Pajic P, Scheer K, Beck CR, Torregrossa AM, Lee C, Gokcumen O. Paleolithic Gene Duplications Primed Adaptive Evolution of Human Amylase Locus Upon Agriculture. bioRxiv 2024:2023.11.27.568916. [PMID: 38077078 PMCID: PMC10705236 DOI: 10.1101/2023.11.27.568916] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 12/19/2023]
Abstract
Starch digestion is a cornerstone of human nutrition. The amylase genes code for the starch-digesting amylase enzyme. Previous studies suggested that the salivary amylase (AMY1) gene copy number increased in response to agricultural diets. However, the lack of nucleotide resolution of the amylase locus hindered detailed evolutionary analyses. Here, we have resolved this locus at nucleotide resolution in 98 present-day humans and identified 30 distinct haplotypes, revealing that the coding sequences of all amylase gene copies are evolving under negative selection. The phylogenetic reconstruction suggested that haplotypes with three AMY1 gene copies, prevalent across all continents and constituting about 70% of observed haplotypes, originated before the out-of-Africa migrations of ancestral modern humans. Using thousands of unique 25 base pair sequences across the amylase locus, we showed that additional AMY1 gene copies existed in the genomes of four archaic hominins, indicating that the initial duplication of this locus may have occurred as far back as 800,000 years ago. We similarly analyzed 73 ancient human genomes dating from 300 to 45,000 years ago and found that the AMY1 copy number variation observed today existed long before the advent of agriculture (~10,000 years ago), predisposing this locus to an adaptive increase in the frequency of higher amylase copy number with the spread of agriculture. Mechanistically, the common three-copy haplotypes seeded non-allelic homologous recombination events that appear to be occurring at one of the fastest rates seen for tandem repeats in the human genome. Our study provides a comprehensive population-level understanding of the genomic structure of the amylase locus, identifying the mechanisms and evolutionary history underlying its duplication and copy number variability in relation to the onset of agriculture.
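The abstract above mentions using unique 25 base pair sequences to detect extra gene copies. The sketch below illustrates the general idea of marker k-mer counting and is not the authors' pipeline: all function names are hypothetical, reverse-complement strands and sequencing errors are ignored, and the toy example uses k = 4 instead of 25 so that it fits on a few lines.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-length substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def unique_kmers(target, background, k=25):
    """Return k-mers that occur exactly once in target and never in background."""
    target_counts = Counter(kmers(target, k))
    background_set = set(kmers(background, k))
    return {km for km, n in target_counts.items()
            if n == 1 and km not in background_set}


def mean_depth(reads, markers, k=25):
    """Average number of times each marker k-mer is observed across all reads."""
    counts = Counter(km for read in reads for km in kmers(read, k)
                     if km in markers)
    return sum(counts[m] for m in markers) / len(markers)


def relative_copy_number(reads, gene_markers, control_markers, k=25):
    """Marker depth in the gene divided by depth in a single-copy control region."""
    return mean_depth(reads, gene_markers, k) / mean_depth(reads, control_markers, k)


# Toy data (hypothetical sequences, k = 4): the gene is "sequenced" three
# times for every one pass over the control region.
gene = "ACGTTGCA"
control = "GGATCCTA"
gene_markers = unique_kmers(gene, control, k=4)
control_markers = unique_kmers(control, gene, k=4)
reads = [gene, gene, gene, control]
ratio = relative_copy_number(reads, gene_markers, control_markers, k=4)
```

In this toy case the ratio comes out as three copies relative to the control. With real data, the ratio would be interpreted relative to the known copy number of the control region.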
8
Ye Y, Shum MH, Tsui JL, Yu G, Smith DK, Zhu H, Wu JT, Guan Y, Lam TTY. Robust expansion of phylogeny for fast-growing genome sequence data. PLoS Comput Biol 2024; 20:e1011871. [PMID: 38330139 PMCID: PMC10898724 DOI: 10.1371/journal.pcbi.1011871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 02/27/2024] [Accepted: 01/29/2024] [Indexed: 02/10/2024] Open
Abstract
Massive sequencing of SARS-CoV-2 genomes has created an urgent need for novel methods that use existing phylogenies to add new samples efficiently instead of relying on de novo inference. 'TIPars' was developed for this challenge, integrating parsimony analysis with pre-computed ancestral sequences. It took about 21 seconds to insert 100 SARS-CoV-2 genomes into a 100k-taxa reference tree using 1.4 gigabytes of memory. Benchmarking on four datasets, TIPars achieved the highest accuracy for phylogenies of moderately similar sequences. For highly similar and divergent scenarios, fully parsimony-based and likelihood-based phylogenetic placement methods performed best, respectively, while TIPars was second best. TIPars accomplished efficient and accurate expansion of phylogenies of both similar and divergent sequences, which would have broad biological applications beyond SARS-CoV-2. TIPars is accessible from https://tipars.hku.hk/ and source code is available at https://github.com/id-bioinfo/TIPars.
Affiliation(s)
- Yongtao Ye
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
  - Laboratory of Data Discovery for Health Limited, 19W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
- Marcus H Shum
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
  - Laboratory of Data Discovery for Health Limited, 19W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
- Joseph L Tsui
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
  - Laboratory of Data Discovery for Health Limited, 19W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
- Guangchuang Yu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, Guangdong, China
- David K Smith
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
- Huachen Zhu
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
  - Laboratory of Data Discovery for Health Limited, 19W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
  - Guangdong-Hongkong Joint Laboratory of Emerging Infectious Diseases, Joint Institute of Virology (Shantou University/The University of Hong Kong), Shantou, Guangdong, P. R. China
  - EKIH (Gewuzhikang) Pathogen Research Institute, Futian District, Shenzhen City, Guangdong, P. R. China
- Joseph T Wu
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
  - Laboratory of Data Discovery for Health Limited, 19W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
- Yi Guan
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
  - Laboratory of Data Discovery for Health Limited, 19W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
  - Guangdong-Hongkong Joint Laboratory of Emerging Infectious Diseases, Joint Institute of Virology (Shantou University/The University of Hong Kong), Shantou, Guangdong, P. R. China
  - EKIH (Gewuzhikang) Pathogen Research Institute, Futian District, Shenzhen City, Guangdong, P. R. China
- Tommy Tsan-Yuk Lam
  - State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, P. R. China
  - Laboratory of Data Discovery for Health Limited, 19W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
  - Guangdong-Hongkong Joint Laboratory of Emerging Infectious Diseases, Joint Institute of Virology (Shantou University/The University of Hong Kong), Shantou, Guangdong, P. R. China
  - EKIH (Gewuzhikang) Pathogen Research Institute, Futian District, Shenzhen City, Guangdong, P. R. China
  - Centre for Immunology & Infection Limited, 17W Hong Kong Science & Technology Parks, Hong Kong SAR, P. R. China
9
Ma B, Gong H, Xu Q, Gao Y, Guan A, Wang H, Hua K, Luo R, Jin H. Bases-dependent Rapid Phylogenetic Clustering (Bd-RPC) enables precise and efficient phylogenetic estimation in viruses. Virus Evol 2024; 10:veae005. [PMID: 38361823 PMCID: PMC10868571 DOI: 10.1093/ve/veae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 01/06/2024] [Accepted: 01/22/2024] [Indexed: 02/17/2024] Open
Abstract
Understanding phylogenetic relationships among species is essential for many biological studies, which call for an accurate phylogenetic tree to understand major evolutionary transitions. Phylogenetic analyses present a major challenge in estimation accuracy and computational efficiency, especially in the face of the recent wave of severe emerging infectious disease outbreaks. Here, we introduce a novel, efficient framework called Bases-dependent Rapid Phylogenetic Clustering (Bd-RPC) for new sample placement for viruses. In this study, a new recoding method called Frequency Vector Recoding was implemented to approximate the phylogenetic distance, and the Phylogenetic Simulated Annealing Search algorithm was developed to match the recoded distance matrix with the phylogenetic tree. Meanwhile, indels (insertions/deletions) were heuristically introduced to foreign sequence recognition for the first time. We compared Bd-RPC with recent placement software (PAGAN2, EPA-ng, TreeBeST) and evaluated it in Alphacoronavirus, Alphaherpesvirinae, and Betacoronavirus by using Split and Robinson-Foulds distances. The comparisons showed that Bd-RPC maintained the highest precision with great efficiency, demonstrating good performance in new sample placement on all three virus groups. Finally, a user-friendly website (http://www.bd-rpc.xyz) is available for users to classify new samples instantly and to facilitate exploration of phylogenetic research in viruses, and Bd-RPC is available on GitHub (http://github.com/Bin-Ma/bd-rpc).
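The abstract above evaluates placements with the Robinson-Foulds distance. As a reminder of what that metric counts, here is a minimal sketch and not the Bd-RPC implementation: trees are assumed to be nested tuples of leaf-name strings over the same leaf set, and the function names are hypothetical. The distance is the number of non-trivial leaf bipartitions found in one tree but not the other.

```python
def clades(tree):
    """Return (leaves under tree, leaf sets of every internal node in tree)."""
    if isinstance(tree, str):           # a leaf is just its name
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found |= child_found
    found.add(leaves)
    return leaves, found


def splits(tree):
    """Non-trivial bipartitions of the leaf set induced by the tree's edges."""
    all_leaves, found = clades(tree)
    result = set()
    for clade in found:
        other = all_leaves - clade
        if len(clade) > 1 and len(other) > 1:   # skip single-leaf and root splits
            result.add(frozenset([clade, other]))
    return result


def robinson_foulds(tree_a, tree_b):
    """Count bipartitions present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# Two five-taxon trees that disagree only on whether B or C groups with A.
reference = ((("A", "B"), "C"), ("D", "E"))
candidate = ((("A", "C"), "B"), ("D", "E"))
distance = robinson_foulds(reference, candidate)
```

Here the two trees share the split separating {D, E} from the rest and differ in one split each, so the distance is 2. An identical pair of trees gives 0.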
Affiliation(s)
- Bin Ma
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Huimin Gong
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Qianshuai Xu
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Yuan Gao
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Aohan Guan
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Haoyu Wang
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Kexin Hua
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Rui Luo
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
- Hui Jin
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
  - College of Veterinary Medicine, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, Hubei 430070, China
10
Mönttinen HAM, Frilander MJ, Löytynoja A. Generation of de novo miRNAs from template switching during DNA replication. Proc Natl Acad Sci U S A 2023; 120:e2310752120. [PMID: 38019864 PMCID: PMC10710096 DOI: 10.1073/pnas.2310752120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 11/01/2023] [Indexed: 12/01/2023] Open
Abstract
The mechanisms generating novel genes and genetic information are poorly known, even for microRNA (miRNA) genes with an extremely constrained design. All miRNA primary transcripts need to fold into a stem-loop structure to yield short gene products (~22 nt) that bind and repress their mRNA targets. While a substantial number of miRNA genes are ancient and highly conserved, short secondary structures coding for entirely novel miRNA genes have been shown to emerge in a lineage-specific manner. Template switching is a DNA-replication-related mutation mechanism that can introduce complex changes and generate perfect base pairing for entire hairpin structures in a single event. Here, we show that the template-switching mutations (TSMs) have participated in the emergence of over 6,000 suitable hairpin structures in the primate lineage to yield at least 18 new human miRNA genes, that is 26% of the miRNAs inferred to have arisen since the origin of primates. While the mechanism appears random, the TSM-generated miRNAs are enriched in introns where they can be expressed with their host genes. The high frequency of TSM events provides raw material for evolution. Being orders of magnitude faster than other mechanisms proposed for de novo creation of genes, TSM-generated miRNAs enable near-instant rewiring of genetic information and rapid adaptation to changing environments.
Affiliation(s)
- Heli A. M. Mönttinen
- Institute of Biotechnology, Helsinki Institute of Life Science, University of Helsinki, HelsinkiFI-000, Finland
| | - Mikko J. Frilander
- Institute of Biotechnology, Helsinki Institute of Life Science, University of Helsinki, HelsinkiFI-000, Finland
| | - Ari Löytynoja
- Institute of Biotechnology, Helsinki Institute of Life Science, University of Helsinki, HelsinkiFI-000, Finland
| |
|
11
|
Abstract
Since advances in next-generation sequencing (NGS) techniques made it possible to investigate uncultured microbiota and their genomes in an unbiased manner, many microbiome studies have reported strong evidence for close links between the microbiome and human health and disease. Bioinformatic and statistical analyses of NGS-based microbiome data are essential components of these studies, used to explore the complex composition of microbial communities and to understand the functions of community members in relation to host and environment. This chapter introduces bioinformatic analysis methods that generate taxonomic and functional feature count tables, along with a phylogenetic tree, from raw NGS microbiome data, and then introduces statistical methods and machine learning approaches for analyzing those outputs to infer the biodiversity of a microbial community and unravel host-microbiome associations. Understanding the advantages and limitations of the analysis methods will help readers use them correctly in microbiome data analysis and may open opportunities to develop new analytic techniques for microbiome research.
Affiliation(s)
- Youngchul Kim
- Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA.
| |
|
12
|
Foley G, Mora A, Ross CM, Bottoms S, Sützl L, Lamprecht ML, Zaugg J, Essebier A, Balderson B, Newell R, Thomson RES, Kobe B, Barnard RT, Guddat L, Schenk G, Carsten J, Gumulya Y, Rost B, Haltrich D, Sieber V, Gillam EMJ, Bodén M. Engineering indel and substitution variants of diverse and ancient enzymes using Graphical Representation of Ancestral Sequence Predictions (GRASP). PLoS Comput Biol 2022; 18:e1010633. [PMID: 36279274 PMCID: PMC9632902 DOI: 10.1371/journal.pcbi.1010633] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 01/07/2022] [Revised: 11/03/2022] [Accepted: 10/04/2022] [Indexed: 11/06/2022] [Open Access]
Abstract
Ancestral sequence reconstruction is a technique that is gaining widespread use in molecular evolution studies and protein engineering. Accurate reconstruction requires the ability to handle appropriately large numbers of sequences, as well as insertion and deletion (indel) events, but available approaches exhibit limitations. To address these limitations, we developed Graphical Representation of Ancestral Sequence Predictions (GRASP), which efficiently implements maximum likelihood methods to enable the inference of ancestors of families with more than 10,000 members. GRASP implements partial order graphs (POGs) to represent and infer insertion and deletion events across ancestors, enabling the identification of building blocks for protein engineering. To validate the capacity to engineer novel proteins from realistic data, we predicted ancestor sequences across three distinct enzyme families: glucose-methanol-choline (GMC) oxidoreductases, cytochromes P450, and dihydroxy/sugar acid dehydratases (DHAD). All tested ancestors demonstrated enzymatic activity. Our study demonstrates the ability of GRASP (1) to support large data sets over 10,000 sequences and (2) to employ insertions and deletions to identify building blocks for engineering biologically active ancestors, by exploring variation over evolutionary time.
Affiliation(s)
- Gabriel Foley
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Ariane Mora
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Connie M. Ross
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Scott Bottoms
- Campus Straubing for Biotechnology and Sustainability, Technische Universität München, Straubing, Germany
| | - Leander Sützl
- Institut für Lebensmitteltechnologie, Universität für Bodenkultur Wien, Vienna, Austria
| | - Marnie L. Lamprecht
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Julian Zaugg
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Alexandra Essebier
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Brad Balderson
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Rhys Newell
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Raine E. S. Thomson
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Bostjan Kobe
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, The University of Queensland, Brisbane, Australia
| | - Ross T. Barnard
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Luke Guddat
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Gerhard Schenk
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Sustainable Minerals Institute, The University of Queensland, Brisbane, Australia
| | - Jörg Carsten
- Zentralinstitut für Katalyseforschung, Technische Universität München, Munich, Germany
| | - Yosephine Gumulya
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
| | - Burkhard Rost
- Fakultät für Informatik, Technische Universität München, Munich, Germany
| | - Dietmar Haltrich
- Institut für Lebensmitteltechnologie, Universität für Bodenkultur Wien, Vienna, Austria
| | - Volker Sieber
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Campus Straubing for Biotechnology and Sustainability, Technische Universität München, Straubing, Germany
- Zentralinstitut für Katalyseforschung, Technische Universität München, Munich, Germany
| | - Elizabeth M. J. Gillam
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- * E-mail: (MB); (EMJG)
| | - Mikael Bodén
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- * E-mail: (MB); (EMJG)
| |
|
13
|
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Accepted: 08/03/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
- Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
- Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Spain
| |
|
14
|
Xiao R, Chen S, Wang X, Chen K, Hu J, Wei K, Ning Y, Xiong T, Lu F. Microbial community starters affect the profiles of volatile compounds in traditional Chinese Xiaoqu rice wine: Assement via high-throughput sequencing and gas chromatography-ion mobility spectrometry. Lebensm Wiss Technol 2022. [DOI: 10.1016/j.lwt.2022.114000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
|
15
|
Czech L, Stamatakis A, Dunthorn M, Barbera P. Metagenomic Analysis Using Phylogenetic Placement-A Review of the First Decade. Front Bioinform 2022; 2:871393. [PMID: 36304302 PMCID: PMC9580882 DOI: 10.3389/fbinf.2022.871393] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 02/08/2022] [Accepted: 04/11/2022] [Indexed: 12/20/2022] [Open Access]
Abstract
Phylogenetic placement refers to a family of tools and methods to analyze, visualize, and interpret the tsunami of metagenomic sequencing data generated by high-throughput sequencing. Compared to alternative (e.g., similarity-based) methods, it puts metabarcoding sequences into a phylogenetic context using a set of known reference sequences and taking evolutionary history into account. Thereby, one can increase the accuracy of metagenomic surveys and eliminate the requirement for having exact or close matches with existing sequence databases. Phylogenetic placement constitutes a valuable analysis tool per se, but also entails a plethora of downstream tools to interpret its results. A common use case is to analyze species communities obtained from metagenomic sequencing, for example via taxonomic assignment, diversity quantification, sample comparison, and identification of correlations with environmental variables. In this review, we provide an overview of the methods developed during the first 10 years. In particular, the goals of this review are 1) to motivate the usage of phylogenetic placement and illustrate some of its use cases, 2) to outline the full workflow, from raw sequences to publishable figures, including best practices, 3) to introduce the most common tools and methods and their capabilities, 4) to point out common placement pitfalls and misconceptions, 5) to showcase typical placement-based analyses, and how they can help to analyze, visualize, and interpret phylogenetic placement data.
Affiliation(s)
- Lucas Czech
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, United States
| | - Alexandros Stamatakis
- Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Micah Dunthorn
- Natural History Museum, University of Oslo, Oslo, Norway
| | | |
|
16
|
Chao J, Tang F, Xu L. Developments in Algorithms for Sequence Alignment: A Review. Biomolecules 2022; 12:biom12040546. [PMID: 35454135 PMCID: PMC9024764 DOI: 10.3390/biom12040546] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/11/2022] [Revised: 03/29/2022] [Accepted: 03/31/2022] [Indexed: 01/27/2023] [Open Access]
Abstract
The continuous development of sequencing technologies has enabled researchers to obtain large amounts of biological sequence data, and this has resulted in increasing demands for software that can perform sequence alignment fast and accurately. A number of algorithms and tools for sequence alignment have been designed to meet the various needs of biologists. Here, the ideas that prevail in the research of sequence alignment and some quality estimation methods for multiple sequence alignment tools are summarized.
Affiliation(s)
- Jiannan Chao
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China;
| | - Furong Tang
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China;
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
- Correspondence:
| |
|
17
|
Álvarez-Carretero S, Tamuri AU, Battini M, Nascimento FF, Carlisle E, Asher RJ, Yang Z, Donoghue PCJ, Dos Reis M. A species-level timeline of mammal evolution integrating phylogenomic data. Nature 2022; 602:263-267. [PMID: 34937052 DOI: 10.1038/s41586-021-04341-1] [Citation(s) in RCA: 99] [Impact Index Per Article: 33.0] [Received: 07/06/2021] [Accepted: 12/13/2021] [Indexed: 11/09/2022]
Abstract
High-throughput sequencing projects generate genome-scale sequence data for species-level phylogenies [1-3]. However, state-of-the-art Bayesian methods for inferring timetrees are computationally limited to small datasets and cannot exploit the growing number of available genomes [4]. In the case of mammals, molecular-clock analyses of limited datasets have produced conflicting estimates of clade ages with large uncertainties [5,6], and thus the timescale of placental mammal evolution remains contentious [7-10]. Here we develop a Bayesian molecular-clock dating approach to estimate a timetree of 4,705 mammal species integrating information from 72 mammal genomes. We show that increasingly larger phylogenomic datasets produce diversification time estimates with progressively smaller uncertainties, facilitating precise tests of macroevolutionary hypotheses. For example, we confidently reject an explosive model of placental mammal origination in the Palaeogene [8] and show that crown Placentalia originated in the Late Cretaceous with unambiguous ordinal diversification in the Palaeocene/Eocene. Our Bayesian methodology facilitates analysis of complete genomes and thousands of species within an integrated framework, making it possible to address hitherto intractable research questions on species diversifications. This approach can be used to address other contentious cases of animal and plant diversifications that require analysis of species-level phylogenomic datasets.
Affiliation(s)
- Sandra Álvarez-Carretero
- School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
- Department of Genetics, Evolution and Environment, University College London, London, UK
| | - Asif U Tamuri
- Centre for Advanced Research Computing, University College London, London, UK
- EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Matteo Battini
- School of Earth Sciences, University of Bristol, Bristol, UK
| | - Fabrícia F Nascimento
- MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, UK
| | - Emily Carlisle
- School of Earth Sciences, University of Bristol, Bristol, UK
| | - Robert J Asher
- Department of Zoology, University of Cambridge, Cambridge, UK
| | - Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London, UK
| | | | - Mario Dos Reis
- School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK.
| |
|
18
|
Template switching in DNA replication can create and maintain RNA hairpins. Proc Natl Acad Sci U S A 2022; 119:2107005119. [PMID: 35046021 PMCID: PMC8794818 DOI: 10.1073/pnas.2107005119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/14/2021] [Indexed: 11/18/2022] [Open Access]
Abstract
The evolutionary origin of RNA stem structures and the preservation of their base pairing under a spontaneous and random mutation process have puzzled theoretical evolutionary biologists. DNA replication-related template switching is a mutation mechanism that creates reverse-complement copies of sequence regions within a genome by replicating briefly along either the complementary or nascent DNA strand. Depending on the relative positions and context of the four switch points, this process may produce a reverse-complement repeat capable of forming the stem of a perfect DNA hairpin or fix the base pairing of an existing stem. Template switching is typically thought to trigger large structural changes, and its possible role in the origin and evolution of RNA genes has not been studied. Here, we show that the reconstructed ancestral histories of RNA genes contain mutation patterns consistent with DNA replication-related template switching. In addition to multibase compensatory mutations, the mechanism can explain complex sequence changes, although mutations breaking the structure rarely get fixed in evolution. Our results suggest a solution for the long-standing dilemma of RNA gene evolution and demonstrate how template switching can both create perfect stems with a single mutation event and help maintain the stem structure over time. Interestingly, template switching also provides an elegant explanation for the asymmetric base pair frequencies within RNA stems.
|
19
|
Hu X, Zhou R, Li H, Zhao X, Sun Y, Fan Y, Zhang S. Alterations of Gut Microbiome and Serum Metabolome in Coronary Artery Disease Patients Complicated With Non-alcoholic Fatty Liver Disease Are Associated With Adverse Cardiovascular Outcomes. Front Cardiovasc Med 2022; 8:805812. [PMID: 35047580 PMCID: PMC8761954 DOI: 10.3389/fcvm.2021.805812] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/31/2021] [Accepted: 11/26/2021] [Indexed: 12/12/2022] [Open Access]
Abstract
Rationale: Patients suffering from coronary artery disease (CAD) complicated with nonalcoholic fatty liver disease (NAFLD) present worse cardiovascular outcomes than CAD patients without NAFLD. The progression of CAD is recently reported to be associated with gut microbiota and microbe-derived metabolites. However, it remains unclear how the complication of NAFLD will affect gut microbiota and microbe-derived metabolites in CAD patients, and whether or not this interplay is related to the worse cardiovascular outcomes in CAD-NAFLD patients. Methods: We performed 16S rRNA sequencing and serum metabolomic analysis in 27 CAD patients with NAFLD, 81 CAD patients without NAFLD, and 24 matched healthy volunteers. Predicted functional profiling was achieved using PICRUSt2. The occurrence of cardiovascular events was assessed by a follow-up study. The association of alterations in the gut microbiome and metabolome with adverse cardiovascular events and clinical indicators was revealed by Spearman correlation analysis. Results: We discovered that the complication of NAFLD was associated with worse clinical outcomes in CAD patients and critical serum metabolome shifts. We identified 25 metabolite modules that were correlated with poor clinical outcome in CAD-NAFLD patients compared with non-NAFLD patients, represented by increased cardiac-toxic metabolites including prochloraz, brofaromine, aristolochic acid, triethanolamine, and reduced potentially beneficial metabolites including estradiol, chitotriose, palmitelaidic acid, and moxisylyte. In addition, the gut microbiome of individuals with CAD-NAFLD was changed and characterized by increased abundances of Oscillibacter ruminantium and Dialister invisus, and decreased abundances of Fusicatenibacter saccharivorans, Bacteroides ovatus and Prevotella copri. PICRUSt2 further confirmed an increase of potential pathogenic bacteria in CAD-NAFLD. 
Moreover, we found that variations in gut microbiota were critically correlated with changed circulating metabolites and clinical outcomes, suggesting that aberrant gut microbiota in CAD-NAFLD patients may sculpt a detrimental metabolome that results in adverse cardiovascular outcomes. Conclusions: Our findings suggest that NAFLD as a complication of CAD leads to worse clinical outcomes, possibly by modulating the features of the gut microbiota and circulating metabolites. We introduce the “liver-gut microbiota-heart axis” as a possible mechanism underlying this interrelationship. Our study provides new insights into the contribution of gut microbiota heterogeneity to CAD-NAFLD progression and suggests novel strategies for disease therapy.
Affiliation(s)
- Xiaomin Hu
- Department of Cardiology, Department of Medical Research Center, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
| | - Ruilin Zhou
- Department of Cardiology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
| | - Hanyu Li
- Department of Cardiology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
| | - Xinyue Zhao
- Department of Cardiology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
| | - Yueshen Sun
- Department of Cardiology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
| | - Yue Fan
- Department of Cardiology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
| | - Shuyang Zhang
- Department of Cardiology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, China
| |
|
20
|
Garcia AK, Fer E, Sephus C, Kacar B. An Integrated Method to Reconstruct Ancient Proteins. Methods Mol Biol 2022; 2569:267-281. [PMID: 36083453 DOI: 10.1007/978-1-0716-2691-7_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Proteins have played a fundamental role throughout life's history on Earth. Despite their biological importance, the ancient origin, early function, and evolution of proteins can seldom be studied directly because few of these attributes are preserved across geologic timescales. Ancestral sequence reconstruction (ASR) provides a method to infer ancestral amino acid sequences and determine the evolutionary predecessors of modern-day proteins using phylogenetic tools. Laboratory application of ASR allows ancient sequences to be deduced from genetic information available in extant organisms and then experimentally resurrected to elucidate ancestral characteristics. In this article, we provide a generalized, stepwise protocol that considers the major elements of a well-designed ASR study and details potential sources of reconstruction bias that can reduce the relevance of historical inferences. We underscore key stages in our approach so that it may be broadly utilized to reconstruct the evolutionary histories of proteins.
Affiliation(s)
- Amanda K Garcia
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
| | - Evrim Fer
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, USA
| | - Cathryn Sephus
- Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA, USA
| | - Betul Kacar
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.
| |
|
21
|
Kartali T, Nyilasi I, Kocsubé S, Patai R, Polgár TF, Zsindely N, Nagy G, Bodai L, Lipinszki Z, Vágvölgyi C, Papp T. Characterization of Four Novel dsRNA Viruses Isolated from Mucor hiemalis Strains. Viruses 2021; 13:v13112319. [PMID: 34835124 PMCID: PMC8625083 DOI: 10.3390/v13112319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Revised: 11/16/2021] [Accepted: 11/17/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
We previously screened the total nucleic acid extracts of 123 Mucor strains for the presence of dsRNA molecules without further molecular analyses. Here, we characterized five novel dsRNA genomes isolated from four different Mucor hiemalis strains with next-generation sequencing (NGS), namely Mucor hiemalis virus 1a (MhV1a) from WRL CN(M) 122; Mucor hiemalis virus 1b (MhV1b) from NRRL 3624; Mucor hiemalis virus 2 (MhV2) from NRRL 3616; and Mucor hiemalis virus 3 (MhV3) and Mucor hiemalis virus 4 (MhV4) from NRRL 3617 strains. The genomes contain two open reading frames (ORFs), which encode the coat protein (CP) and the RNA-dependent RNA polymerase (RdRp), respectively. In MhV1a and MhV1b, the RdRp is predicted to be translated as a fusion protein via a -1 ribosomal frameshift, while in MhV4 it is predicted to be translated via a rare +1 (or -2) ribosomal frameshift. In MhV2 and MhV3, the presence of a specific UAAUG pentanucleotide motif points to coupled translation termination and reinitiation. MhV1a, MhV2, and MhV3 are part of the clade representing the genus Victorivirus, while MhV4 is placed in the clade of the genus Totivirus. The detected VLPs in Mucor strains were from 33 to 36 nm in diameter. Hybridization analysis revealed that the dsRNA molecules of MhV1a-MhV4 hybridized to the corresponding molecules.
Affiliation(s)
- Tünde Kartali
- Department of Microbiology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (I.N.); (S.K.); (N.Z.); (C.V.)
- Correspondence: (T.K.); (T.P.)
| | - Ildikó Nyilasi
- Department of Microbiology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (I.N.); (S.K.); (N.Z.); (C.V.)
| | - Sándor Kocsubé
- Department of Microbiology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (I.N.); (S.K.); (N.Z.); (C.V.)
| | - Roland Patai
- Neuronal Plasticity Research Group, Institute of Biophysics, Biological Research Centre, 6726 Szeged, Hungary; (R.P.); (T.F.P.)
| | - Tamás F. Polgár
- Neuronal Plasticity Research Group, Institute of Biophysics, Biological Research Centre, 6726 Szeged, Hungary; (R.P.); (T.F.P.)
- Theoretical Medicine Doctoral School, University of Szeged, 6722 Szeged, Hungary
| | - Nóra Zsindely
- Department of Microbiology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (I.N.); (S.K.); (N.Z.); (C.V.)
| | - Gábor Nagy
- Department of Biochemistry and Molecular Biology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (G.N.); (L.B.)
| | - László Bodai
- Department of Biochemistry and Molecular Biology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (G.N.); (L.B.)
| | - Zoltán Lipinszki
- MTA SZBK Lendület Laboratory of Cell Cycle Regulation, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH), 6726 Szeged, Hungary;
| | - Csaba Vágvölgyi
- Department of Microbiology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (I.N.); (S.K.); (N.Z.); (C.V.)
| | - Tamás Papp
- Department of Microbiology, Faculty of Science and Informatics, University of Szeged, 6726 Szeged, Hungary; (I.N.); (S.K.); (N.Z.); (C.V.)
- MTA-SZTE Fungal Pathogenicity Mechanisms Research Group, Hungarian Academy of Sciences and Department of Microbiology, University of Szeged, 6726 Szeged, Hungary
- Correspondence: (T.K.); (T.P.)
| |
|
22
|
Xiao Q, Chen Y, Liu C, Robson F, Roy S, Cheng X, Wen J, Mysore K, Miller AJ, Murray JD. MtNPF6.5 mediates chloride uptake and nitrate preference in Medicago roots. EMBO J 2021; 40:e106847. [PMID: 34523752 PMCID: PMC8561640 DOI: 10.15252/embj.2020106847] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 09/22/2020] [Revised: 07/23/2021] [Accepted: 07/28/2021] [Indexed: 11/09/2022] [Open Access]
Abstract
The preference for nitrate over chloride through regulation of transporters is a fundamental feature of plant ion homeostasis. We show that Medicago truncatula MtNPF6.5, an ortholog of Arabidopsis thaliana AtNPF6.3/NRT1.1, can mediate nitrate and chloride uptake in Xenopus oocytes but is chloride selective and that its close homologue, MtNPF6.7, can transport nitrate and chloride but is nitrate selective. The MtNPF6.5 mutant showed greatly reduced chloride content relative to wild type, and MtNPF6.5 expression was repressed by high chloride, indicating a primary role for MtNPF6.5 in root chloride uptake. MtNPF6.5 and MtNPF6.7 were repressed and induced by nitrate, respectively, and these responses required the transcription factor MtNLP1. Moreover, loss of MtNLP1 prevented the rapid switch from chloride to nitrate as the main anion in nitrate-starved plants after nitrate provision, providing insight into the underlying mechanism for nitrate preference. Sequence analysis revealed three sub-types of AtNPF6.3 orthologs based on their predicted substrate-binding residues: A (chloride selective), B (nitrate selective), and C (legume specific). The absence of B-type AtNPF6.3 homologues in early diverged plant lineages suggests that they evolved from a chloride-selective MtNPF6.5-like protein.
Affiliation(s)
- Qiying Xiao
- CAS‐JIC Centre of Excellence for Plant and Microbial Science (CEPAMS), Centre for Excellence in Molecular Plant Sciences (CEMPS), Shanghai Institute of Plant Physiology and Ecology (SIPPE), Chinese Academy of Sciences, Shanghai, China
| | - Yi Chen
- John Innes Centre, Norwich Research Park, Norwich, UK
| | - Cheng‐Wu Liu
- John Innes Centre, Norwich Research Park, Norwich, UK
- Present address: School of Life Sciences, University of Science and Technology of China, Hefei, China
| | - Fran Robson
- John Innes Centre, Norwich Research Park, Norwich, UK
| | - Sonali Roy
- John Innes Centre, Norwich Research Park, Norwich, UK
- Noble Research Institute, Ardmore, OK, USA
| | | | | | | | | | - Jeremy D Murray
- CAS‐JIC Centre of Excellence for Plant and Microbial Science (CEPAMS), Centre for Excellence in Molecular Plant Sciences (CEMPS), Shanghai Institute of Plant Physiology and Ecology (SIPPE), Chinese Academy of Sciences, Shanghai, China
- John Innes Centre, Norwich Research Park, Norwich, UK
|
23
|
Piatkowski BT, Yavitt JB, Turetsky MR, Shaw AJ. Natural selection on a carbon cycling trait drives ecosystem engineering by Sphagnum (peat moss). Proc Biol Sci 2021; 288:20210609. [PMID: 34403639 DOI: 10.1098/rspb.2021.0609] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/12/2022]
Abstract
Sphagnum peat mosses have an extraordinary impact on the global carbon cycle as they control long-term carbon sequestration in boreal peatland ecosystems. Sphagnum species engineer peatlands, which harbour roughly a quarter of all terrestrial carbon, through peat accumulation by constructing their own niche that allows them to outcompete other plants. Interspecific variation in peat production, largely resulting from differences in tissue decomposability, is hypothesized to drive niche differentiation along microhabitat gradients thereby alleviating competitive pressure. However, little empirical evidence exists for the role of selection in the creation and maintenance of such gradients. In order to document how niche construction and differentiation evolved in Sphagnum, we quantified decomposability for 54 species under natural conditions and used phylogenetic comparative methods to model the evolution of this carbon cycling trait. We show that decomposability tracks the phylogenetic diversification of peat mosses, that natural selection favours different levels of decomposability corresponding to optimum niche and that divergence in this trait occurred early in the evolution of the genus prior to the divergence of most extant species. Our results demonstrate the evolution of ecosystem engineering via natural selection on an extended phenotype, of a fundamental ecosystem process, and one of the Earth's largest soil carbon pools.
Affiliation(s)
| | - Joseph B Yavitt
- Department of Natural Resources, Cornell University, Ithaca, NY 14853, USA
| | - Merritt R Turetsky
- Institute of Arctic and Alpine Research, University of Colorado, Boulder, CO 80309, USA
| | - A Jonathan Shaw
- Department of Biology, Duke University, Durham, NC 27708, USA
|
24
|
Zhang C, Zhao Y, Braun EL, Mirarab S. TAPER: Pinpointing errors in multiple sequence alignments despite varying rates of evolution. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13696] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Affiliation(s)
- Chao Zhang
- Bioinformatics and Systems Biology Program, University of California San Diego, CA, USA
| | - Yiming Zhao
- Electrical and Computer Engineering Department, University of California San Diego, CA, USA
| | - Edward L. Braun
- Department of Biology and Genetics Institute, University of Florida, Gainesville, FL, USA
| | - Siavash Mirarab
- Electrical and Computer Engineering Department, University of California San Diego, CA, USA
|
25
|
Gupta M, Zaharias P, Warnow T. Accurate Large-scale Phylogeny-Aware Alignment using BAli-Phy. Bioinformatics 2021; 37:4677-4683. [PMID: 34320635 DOI: 10.1093/bioinformatics/btab555] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/21/2021] [Revised: 06/25/2021] [Accepted: 07/27/2021] [Indexed: 11/14/2022]
Abstract
MOTIVATION: BAli-Phy, a popular Bayesian method that co-estimates multiple sequence alignments and phylogenetic trees, is a rigorous statistical method, but due to its computational requirements, it has generally been limited to relatively small datasets (at most about 100 sequences). Here we repurpose BAli-Phy as a "phylogeny-aware" alignment method: we estimate the phylogeny from the input of unaligned sequences, and then use that as a fixed tree within BAli-Phy. RESULTS: We show that this approach achieves high accuracy, greatly superior to Prank, the current most popular phylogeny-aware alignment method, and is even more accurate than MAFFT, one of the top performing alignment methods in common use. Furthermore, this approach can be used to align very large datasets (up to 1000 sequences in this study). AVAILABILITY: See https://doi.org/10.13012/B2IDB-7863273_V1 for datasets used in this study. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Maya Gupta
- University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Paul Zaharias
- University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Tandy Warnow
- University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
|
26
|
Turakhia Y, Thornlow B, Hinrichs AS, De Maio N, Gozashti L, Lanfear R, Haussler D, Corbett-Detig R. Ultrafast Sample placement on Existing tRees (UShER) enables real-time phylogenetics for the SARS-CoV-2 pandemic. Nat Genet 2021; 53:809-816. [PMID: 33972780 PMCID: PMC9248294 DOI: 10.1038/s41588-021-00862-7] [Citation(s) in RCA: 236] [Impact Index Per Article: 59.0] [Received: 09/30/2020] [Accepted: 03/31/2021] [Indexed: 02/03/2023]
Abstract
As the SARS-CoV-2 virus spreads through human populations, the unprecedented accumulation of viral genome sequences is ushering in a new era of 'genomic contact tracing', that is, using viral genomes to trace local transmission dynamics. However, because the viral phylogeny is already so large, and will undoubtedly grow many fold, placing new sequences onto the tree has emerged as a barrier to real-time genomic contact tracing. Here, we resolve this challenge by building an efficient tree-based data structure encoding the inferred evolutionary history of the virus. We demonstrate that our approach greatly improves the speed of phylogenetic placement of new samples and data visualization, making it possible to complete the placements under the constraints of real-time contact tracing. Thus, our method addresses an important need for maintaining a fully updated reference phylogeny. We make these tools available to the research community through the University of California Santa Cruz SARS-CoV-2 Genome Browser to enable rapid cross-referencing of information in new virus sequences with an ever-expanding array of molecular and structural biology data. The methods described here will empower research and genomic contact tracing for SARS-CoV-2 specifically for laboratories worldwide.
Affiliation(s)
- Yatish Turakhia
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA.
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA.
| | - Bryan Thornlow
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
| | - Landen Gozashti
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
| | - Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
| | - David Haussler
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA.
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA.
- National Research University Higher School of Economics, Moscow, Russian Federation.
|
27
|
Abstract
Multiple sequence alignment is a core first step in many bioinformatics analyses, and errors in these alignments can have negative consequences for scientific studies. In this article, we review some of the recent literature evaluating multiple sequence alignment methods and identify specific challenges that arise when performing these evaluations. In particular, we discuss the different trends observed in simulation studies and when using biological benchmarks. Overall, we find that multiple sequence alignment, far from being a "solved problem," would benefit from new attention.
Affiliation(s)
- Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA.
|
28
|
Abstract
Evolutionary analyses require sequence alignments that correctly represent evolutionary homology. Evolutionary homology and proteins' structural similarity are not the same, and sequence alignments generated with methods designed for structural matching can be seriously misleading in comparative and phylogenetic analyses. The phylogeny-aware alignment algorithm implemented in the program PRANK has been shown to produce good alignments for evolutionary inferences. Unlike other alignment programs, PRANK makes use of phylogenetic information to distinguish alignment gaps caused by insertions or deletions and, thereafter, handles the two types of events differently. As a by-product of the correct handling of insertions and deletions, PRANK can provide the inferred ancestral sequences as a part of the output and mark the alignment gaps differently depending on their origin in insertion or deletion events. As the algorithm infers the evolutionary history of the sequences, PRANK can be sensitive to errors in the guide phylogeny and violations of the underlying assumptions about the origin and patterns of gaps. To mitigate the effects of such model violations, the phylogeny-aware alignment algorithm has been re-implemented in the program PAGAN. By using sequence graphs, PAGAN can model and accumulate evidence from more complex gap structures than PRANK does, and incorporate this uncertainty in the inferred ancestral sequences. These issues are discussed in detail below and practical advice is provided for the use of PRANK and PAGAN in evolutionary analysis. The two software packages can be downloaded from http://wasabiapp.org/software.
Affiliation(s)
- Ari Löytynoja
- Institute of Biotechnology, HiLIFE, University of Helsinki, Helsinki, Finland.
|
29
|
Abstract
Wasabi is an open-source, web-based graphical environment for evolutionary sequence analysis and visualization, designed to work with multiple sequence alignments within their phylogenetic context. Its interactive user interface provides convenient access to external data sources and computational tools and is easily extendable with custom tools and pipelines using a plugin system. Wasabi stores intermediate editing and analysis steps as workflow histories and provides direct-access web links to datasets, allowing for reproducible, collaborative research, and easy dissemination of the results. In addition to shared analyses and installation-free usage, the web-based design allows Wasabi to be run as a cross-platform, stand-alone application and makes its integration to other web services straightforward. This chapter gives a detailed description and guidelines for the use of Wasabi's analysis environment. Example use cases will give step-by-step instructions for practical application of the public Wasabi, from quick data visualization to branched analysis pipelines and publishing of results. We end with a brief discussion of advanced usage of Wasabi, including command-line communication, interface extension, offline usage, and integration to local and public web services. The public Wasabi application, its source code, documentation, and other materials are available at http://wasabiapp.org.
Affiliation(s)
- Andres Veidenberg
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland.
| | - Ari Löytynoja
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
|
30
|
Turakhia Y, Thornlow B, Hinrichs AS, De Maio N, Gozashti L, Lanfear R, Haussler D, Corbett-Detig R. Ultrafast Sample Placement on Existing Trees (UShER) Empowers Real-Time Phylogenetics for the SARS-CoV-2 Pandemic. bioRxiv 2020:2020.09.26.314971. [PMID: 33024970 PMCID: PMC7536873 DOI: 10.1101/2020.09.26.314971] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 01/02/2023]
Abstract
As the SARS-CoV-2 virus spreads through human populations, the unprecedented accumulation of viral genome sequences is ushering in a new era of "genomic contact tracing", that is, using viral genome sequences to trace local transmission dynamics. However, because the viral phylogeny is already so large, and will undoubtedly grow many fold, placing new sequences onto the tree has emerged as a barrier to real-time genomic contact tracing. Here, we resolve this challenge by building an efficient, tree-based data structure encoding the inferred evolutionary history of the virus. We demonstrate that our approach improves the speed of phylogenetic placement of new samples and data visualization by orders of magnitude, making it possible to complete the placements under real-time constraints. Our method also provides the key ingredient for maintaining a fully-updated reference phylogeny. We make these tools available to the research community through the UCSC SARS-CoV-2 Genome Browser to enable rapid cross-referencing of information in new virus sequences with an ever-expanding array of molecular and structural biology data. The methods described here will empower research and genomic contact tracing for laboratories worldwide. SOFTWARE AVAILABILITY: UShER is available to users through the UCSC Genome Browser at https://genome.ucsc.edu/cgi-bin/hgPhyloPlace. The source code and detailed instructions on how to compile and run UShER are available from https://github.com/yatisht/usher.
|
31
|
An Unusual Amino Acid Substitution Within Hummingbird Cytochrome c Oxidase Alters a Key Proton-Conducting Channel. G3-GENES GENOMES GENETICS 2020; 10:2477-2485. [PMID: 32444359 PMCID: PMC7341133 DOI: 10.1534/g3.120.401312] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Hummingbirds in flight exhibit the highest mass-specific metabolic rate of all vertebrates. The bioenergetic requirements associated with sustained hovering flight raise the possibility of unique amino acid substitutions that would enhance aerobic metabolism. Here, we have identified a non-conservative substitution within the mitochondria-encoded cytochrome c oxidase subunit I (COI) that is fixed within hummingbirds, but not among other vertebrates. This unusual change is also rare among metazoans, but can be identified in several clades with diverse life histories. We performed atomistic molecular dynamics simulations using bovine and hummingbird COI models, thereby bypassing experimental limitations imposed by the inability to modify mtDNA in a site-specific manner. Intriguingly, our findings suggest that COI amino acid position 153 (bovine numbering convention) provides control over the hydration and activity of a key proton channel in COX. We discuss potential phenotypic outcomes linked to this alteration encoded by hummingbird mitochondrial genomes.
|
32
|
Freitas L, Mesquita RD, Schrago CG. Survey for positively selected coding regions in the genome of the hematophagous tsetse fly Glossina morsitans identifies candidate genes associated with feeding habits and embryonic development. Genet Mol Biol 2020; 43:e20180311. [PMID: 32555940 PMCID: PMC7288665 DOI: 10.1590/1678-4685-gmb-2018-0311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/03/2018] [Accepted: 08/23/2019] [Indexed: 11/22/2022]
Abstract
Tsetse flies are responsible for the transmission of Trypanosoma sp. to vertebrate animals in Africa, causing huge health issues and economic loss. The availability of the genome sequence of Glossina morsitans enabled the discovery of several genes related to medically important phenotypes and novel physiological features. However, a genome-wide scan for coding regions that underwent positive selection is still missing, which is surprising given the evolution of traits associated with hematophagy in this lineage. In this study, we employed an experimental design that controlled for the rate of false positives and we performed a scan of 3,318 G. morsitans genes. We found 145 genes with a significant historical signal of positive selection. These genes were categorized into 18 functional classes after careful manual annotation. Based on their attributed functions, we identified candidate genes related to feeding habits and embryonic development. When our results were contrasted with gene expression data, we confirmed that most genes that underwent adaptive molecular evolution were frequently expressed in organs associated with key physiological evolutionary innovations in the G. morsitans lineage, namely, the salivary gland, the midgut, fat body tissue, and the spermatophore.
Affiliation(s)
- Lucas Freitas
- Universidade Federal do Rio de Janeiro, Departamento de Genética, Rio de Janeiro, RJ, Brazil; Universidade Federal do Rio de Janeiro, Instituto de Química, Departamento de Bioquímica, Laboratório de Bioinformática, Rio de Janeiro, RJ, Brazil; Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular, Rio de Janeiro, RJ, Brazil
| | - Rafael D Mesquita
- Universidade Federal do Rio de Janeiro, Instituto de Química, Departamento de Bioquímica, Laboratório de Bioinformática, Rio de Janeiro, RJ, Brazil; Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular, Rio de Janeiro, RJ, Brazil
| | - Carlos G Schrago
- Universidade Federal do Rio de Janeiro, Departamento de Genética, Rio de Janeiro, RJ, Brazil
|
33
|
Tufféry P, de Vries S. The search of sequence variants using a constrained protein evolution simulation approach. Comput Struct Biotechnol J 2020; 18:1790-1799. [PMID: 32695271 PMCID: PMC7355721 DOI: 10.1016/j.csbj.2020.06.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2020] [Revised: 05/15/2020] [Accepted: 06/09/2020] [Indexed: 10/25/2022]
Abstract
Protein engineering and candidate therapeutic peptide optimization are processes in which the identification of relevant sequence variants is critical. Starting from one amino-acid sequence, the choice of substitutions must meet the objectives of not disrupting the structure of the protein and not impacting the main functional properties of the starting entity, while also enhancing some expected property such as thermal stability or resistance to degradation. Here, we introduce a new approach to sequence evolution that focuses on the objective of not disrupting the structure of the initial protein by embedding a point-to-point control on the preservation of the local structure at each position in the sequence. For 6 mini-proteins, we find that, starting from a single sequence, our simple approach intrinsically contains information about site-specific rate heterogeneity of substitution, and that it is able to reproduce sequence diversity as can be observed in the sequences available in the Uniref repository. We show that our approach is able to provide information about positions not to substitute and about substitutions not to perform at a given position to maintain structure integrity. Overall, our results demonstrate that point-to-point preservation of the local structure along a sequence is an important determinant of sequence evolution.
Affiliation(s)
- Pierre Tufféry
- Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, RPBS, F-75013 Paris, France
| | - Sjoerd de Vries
- Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, RPBS, F-75013 Paris, France
|
34
|
Kemler M, Denchev TT, Denchev CM, Begerow D, Piątek M, Lutz M. Host preference and sorus location correlate with parasite phylogeny in the smut fungal genus Microbotryum (Basidiomycota, Microbotryales). Mycol Prog 2020. [DOI: 10.1007/s11557-020-01571-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/13/2023]
Abstract
The smut fungal genus Microbotryum (Microbotryales, Pucciniomycotina) contains species that parasitize plants from many different lineages of euasterids, with host specificity of individual parasite species in general being exceptionally high. Additionally, it has been shown that the location of spore production in some species is related to spore dispersal. In this phylogenetic study based on ITS and LSU rDNA data of 57 Microbotryum spp., host spectra and sorus location are mapped on the phylogeny of Microbotryum species in order to understand the macroevolutionary patterns of these two traits. We find that monophyletic parasite clades correspond well with monophyletic host clades and also that monophyletic parasite groups in general produce their spores in the same plant organ. Ancestral state reconstruction inferred the most probable ancestral trait for sorus location being leaves and the most probable ancestral host family for the genus Microbotryum as being the Polygonaceae. According to molecular analyses, a newly sequenced specimen of Ustilago ducellieri, a seed parasite on Arenaria leptoclados, previously treated as synonym of Microbotryum duriaeanum, belongs to a lineage distinct from specimens of M. duriaeanum. A new combination, Microbotryum ducellieri, is accordingly proposed. Taxonomic implications of the presented analyses for the genera Bauhinus and Haradaea are briefly discussed.
|
35
|
Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform 2020; 20:1160-1166. [PMID: 28968734 PMCID: PMC6781576 DOI: 10.1093/bib/bbx108] [Citation(s) in RCA: 4436] [Impact Index Per Article: 887.2] [Received: 06/30/2017] [Revised: 07/27/2017] [Indexed: 11/28/2022]
Abstract
This article describes several features in the MAFFT online service for multiple sequence alignment (MSA). As a result of recent advances in sequencing technologies, huge numbers of biological sequences are available and the need for MSAs with large numbers of sequences is increasing. To extract biologically relevant information from such data, sophistication of algorithms is necessary but not sufficient. Intuitive and interactive tools for experimental biologists to semiautomatically handle large data are becoming important. We are working on development of MAFFT toward these two directions. Here, we explain (i) the Web interface for recently developed options for large data and (ii) interactive usage to refine sequence data sets and MSAs.
Affiliation(s)
- Kazutaka Katoh
- Corresponding author: Kazutaka Katoh, 3-1 Yamadaoka, Suita, Osaka 565-0871, JAPAN. E-mail:
| | | | | |
|
36
|
Saurav K, Borbone N, Burgsdorf I, Teta R, Caso A, Bar-Shalom R, Esposito G, Britstein M, Steindler L, Costantino V. Identification of Quorum Sensing Activators and Inhibitors in The Marine Sponge Sarcotragus spinosulus. Mar Drugs 2020; 18:md18020127. [PMID: 32093216 PMCID: PMC7074164 DOI: 10.3390/md18020127] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/13/2019] [Revised: 02/11/2020] [Accepted: 02/18/2020] [Indexed: 12/27/2022]
Abstract
Marine sponges, a well-documented prolific source of natural products, harbor highly diverse microbial communities. Their extracts were previously shown to contain quorum sensing (QS) signal molecules of the N-acyl homoserine lactone (AHL) type, known to orchestrate bacterial gene regulation. Some bacteria and eukaryotic organisms are known to produce molecules that can interfere with QS signaling, thus affecting microbial genetic regulation and function. In the present study, we established the production of both QS signal molecules as well as QS inhibitory (QSI) molecules in the sponge species Sarcotragus spinosulus. A total of eighteen saturated acyl chain AHLs were identified along with six unsaturated acyl chain AHLs. Bioassay-guided purification led to the isolation of two brominated metabolites with QSI activity. The structures of these compounds were elucidated by comparative spectral analysis of 1H NMR and HR-MS data and were identified as 3-bromo-4-methoxyphenethylamine (1) and 5,6-dibromo-N,N-dimethyltryptamine (2). The QSI activity of compounds 1 and 2 was evaluated using reporter gene assays for long- and short-chain AHL signals (Escherichia coli pSB1075 and E. coli pSB401, respectively). QSI activity was further confirmed by measuring dose-dependent inhibition of proteolytic activity and pyocyanin production in Pseudomonas aeruginosa PAO1. The obtained results show the coexistence of QS and QSI in S. spinosulus, a complex signal network that may mediate the orchestrated function of the microbiome within the sponge holobiont.
Affiliation(s)
- Kumar Saurav
- Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Mt. Carmel 31905, Haifa, Israel; (K.S.); (I.B.); (R.B.-S.); (M.B.); (L.S.)
- The Blue Chemistry Lab, Dipartimento di Farmacia, Università degli Studi di Napoli Federico II, Via D. Montesano 49, 80131, Napoli, Italy; (N.B.); (R.T.); (A.C.); (G.E.)
- Laboratory of Algal Biotechnology-Centre Algatech, Institute of Microbiology of the Czech Academy of Sciences, Opatovický mlýn, Novohradská 237, 379 81 Třeboň, Czech Republic
| | - Nicola Borbone
- The Blue Chemistry Lab, Dipartimento di Farmacia, Università degli Studi di Napoli Federico II, Via D. Montesano 49, 80131, Napoli, Italy; (N.B.); (R.T.); (A.C.); (G.E.)
| | - Ilia Burgsdorf
- Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Mt. Carmel 31905, Haifa, Israel; (K.S.); (I.B.); (R.B.-S.); (M.B.); (L.S.)
| | - Roberta Teta
- The Blue Chemistry Lab, Dipartimento di Farmacia, Università degli Studi di Napoli Federico II, Via D. Montesano 49, 80131, Napoli, Italy; (N.B.); (R.T.); (A.C.); (G.E.)
| | - Alessia Caso
- The Blue Chemistry Lab, Dipartimento di Farmacia, Università degli Studi di Napoli Federico II, Via D. Montesano 49, 80131, Napoli, Italy; (N.B.); (R.T.); (A.C.); (G.E.)
| | - Rinat Bar-Shalom
- Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Mt. Carmel 31905, Haifa, Israel; (K.S.); (I.B.); (R.B.-S.); (M.B.); (L.S.)
| | - Germana Esposito
- The Blue Chemistry Lab, Dipartimento di Farmacia, Università degli Studi di Napoli Federico II, Via D. Montesano 49, 80131, Napoli, Italy; (N.B.); (R.T.); (A.C.); (G.E.)
| | - Maya Britstein
- Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Mt. Carmel 31905, Haifa, Israel; (K.S.); (I.B.); (R.B.-S.); (M.B.); (L.S.)
| | - Laura Steindler
- Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Mt. Carmel 31905, Haifa, Israel; (K.S.); (I.B.); (R.B.-S.); (M.B.); (L.S.)
| | - Valeria Costantino
- The Blue Chemistry Lab, Dipartimento di Farmacia, Università degli Studi di Napoli Federico II, Via D. Montesano 49, 80131, Napoli, Italy; (N.B.); (R.T.); (A.C.); (G.E.)
- Correspondence: Tel.: +39-081-678-504
|
37
|
Arnold B, Sohail M, Wadsworth C, Corander J, Hanage WP, Sunyaev S, Grad YH. Fine-Scale Haplotype Structure Reveals Strong Signatures of Positive Selection in a Recombining Bacterial Pathogen. Mol Biol Evol 2020; 37:417-428. [PMID: 31589312 PMCID: PMC6993868 DOI: 10.1093/molbev/msz225] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Identifying genetic variation in bacteria that has been shaped by ecological differences remains an important challenge. For recombining bacteria, the sign and strength of linkage provide a unique lens into ongoing selection. We show that derived alleles <300 bp apart in Neisseria gonorrhoeae exhibit more coupling linkage than repulsion linkage, a pattern that cannot be explained by limited recombination or neutrality as these couplings are significantly stronger for nonsynonymous alleles than synonymous alleles. This general pattern is driven by a small fraction of highly diverse genes, many of which exhibit evidence of interspecies horizontal gene transfer and an excess of intermediate frequency alleles. Extensive simulations show that two distinct forms of positive selection can create these patterns of genetic variation: directional selection on horizontally transferred alleles or balancing selection that maintains distinct haplotypes in the presence of recombination. Our results establish a framework for identifying patterns of selection in fine-scale haplotype structure that indicate specific ecological processes in species that recombine with distantly related lineages or possess coexisting adaptive haplotypes.
Affiliation(s)
- Brian Arnold
- Division of Informatics, Faculty of Arts and Sciences, Harvard University, Cambridge, MA
- Center for Communicable Disease Dynamics, Harvard T. H. Chan School of Public Health, Boston, MA
| | - Mashaal Sohail
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA
| | - Crista Wadsworth
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA
| | - Jukka Corander
- Department of Biostatistics, University of Oslo, Oslo, Norway
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
| | - William P Hanage
- Center for Communicable Disease Dynamics, Harvard T. H. Chan School of Public Health, Boston, MA
| | - Shamil Sunyaev
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA
| | - Yonatan H Grad
- Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA
- Division of Infectious Diseases, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
| |
|
38
|
Oberhofer M, Hess J, Leutgeb M, Gössnitzer F, Rattei T, Wawrosch C, Zotchev SB. Exploring Actinobacteria Associated With Rhizosphere and Endosphere of the Native Alpine Medicinal Plant Leontopodium nivale Subspecies alpinum. Front Microbiol 2019; 10:2531. [PMID: 31781058 PMCID: PMC6857621 DOI: 10.3389/fmicb.2019.02531] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 05/08/2019] [Accepted: 10/21/2019] [Indexed: 11/24/2022]
Abstract
The rhizosphere of plants is enriched in nutrients facilitating growth of microorganisms, some of which are recruited as endophytes. Endophytes, especially Actinobacteria, are known to produce a plethora of bioactive compounds. We hypothesized that Leontopodium nivale subsp. alpinum (Edelweiss), a rare alpine medicinal plant, may serve as an as-yet untapped source of uncommon Actinobacteria associated with this plant. Rhizosphere soil of native Alpine plants was used, after physical and chemical pre-treatments, for isolating Actinobacteria. Isolates were selected based on morphology and identified by 16S rRNA gene-based barcoding. The resulting 77 Actinobacteria isolates represented the genera Actinokineospora, Kitasatospora, Asanoa, Microbacterium, Micromonospora, Micrococcus, Mycobacterium, Nocardia, and Streptomyces. In parallel, Edelweiss plants from the same location were surface-sterilized, separated into leaves, roots, rhizomes, and inflorescences, and pooled within tissues before genomic DNA extraction. Metagenomic 16S rRNA gene amplicons confirmed large numbers of actinobacterial operational taxonomic units (OTUs), descending in diversity from roots to rhizomes, leaves and inflorescences. These metagenomic data, when queried with isolate sequences, revealed an overlap between the two datasets, suggesting recruitment of soil bacteria by the plant. Moreover, this study uncovered a profound diversity of uncultured Actinobacteria from Rubrobacteridae, Thermoleophilales, Acidimicrobiales and unclassified Actinobacteria specifically in belowground tissues, which may be exploited by a targeted isolation approach in the future.
Affiliation(s)
- Martina Oberhofer
- Pharmaceutical Biotechnology, Department of Pharmacognosy, University of Vienna, Vienna, Austria
| | - Jaqueline Hess
- Division of Systematic and Evolutionary Botany, Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | - Marlene Leutgeb
- Pharmaceutical Biotechnology, Department of Pharmacognosy, University of Vienna, Vienna, Austria
| | - Florian Gössnitzer
- Pharmaceutical Biotechnology, Department of Pharmacognosy, University of Vienna, Vienna, Austria
| | - Thomas Rattei
- Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria
| | - Christoph Wawrosch
- Pharmaceutical Biotechnology, Department of Pharmacognosy, University of Vienna, Vienna, Austria
| | - Sergey B. Zotchev
- Pharmaceutical Biotechnology, Department of Pharmacognosy, University of Vienna, Vienna, Austria
| |
|
39
|
Ashkenazy H, Levy Karin E, Mertens Z, Cartwright RA, Pupko T. SpartaABC: a web server to simulate sequences with indel parameters inferred using an approximate Bayesian computation algorithm. Nucleic Acids Res 2017; 45:W453-W457. [PMID: 28460062 PMCID: PMC5570005 DOI: 10.1093/nar/gkx322] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/06/2017] [Accepted: 04/15/2017] [Indexed: 11/22/2022]
Abstract
Many analyses for the detection of biological phenomena rely on a multiple sequence alignment as input. The results of such analyses are often further studied through parametric bootstrap procedures, using sequence simulators. One of the problems with conducting such simulation studies is that users currently have no means to decide which insertion and deletion (indel) parameters to choose, so that the resulting sequences mimic biological data. Here, we present SpartaABC, a web server that aims to solve this issue. SpartaABC implements an approximate-Bayesian-computation rejection algorithm to infer indel parameters from sequence data. It does so by extracting summary statistics from the input. It then performs numerous sequence simulations under randomly sampled indel parameters. By computing a distance between the summary statistics extracted from the input and each simulation, SpartaABC retains only parameters behind simulations close to the real data. As output, SpartaABC provides point estimates and approximate posterior distributions of the indel parameters. In addition, SpartaABC allows simulating sequences with the inferred indel parameters. To this end, the sequence simulators, Dawg 2.0 and INDELible were integrated. Using SpartaABC we demonstrate the differences in indel dynamics among three protein-coding genes across mammalian orthologs. SpartaABC is freely available for use at http://spartaabc.tau.ac.il/webserver.
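The rejection scheme described in this abstract (sample parameters, simulate, compare summary statistics, keep the closest simulations) can be sketched in a few lines. This is a toy illustration only, not SpartaABC's code: the geometric indel-length simulator, the two summary statistics, the uniform prior, and every function name below are assumptions made for brevity.

```python
import random
import statistics


def simulate_indel_lengths(p, n, rng):
    """Toy simulator (assumption): n indel lengths drawn from a geometric law with parameter p."""
    lengths = []
    for _ in range(n):
        k = 1
        while rng.random() > p:
            k += 1
        lengths.append(k)
    return lengths


def summary(lengths):
    """Summary statistics: mean indel length and fraction of length-1 indels."""
    frac_one = sum(1 for x in lengths if x == 1) / len(lengths)
    return (statistics.mean(lengths), frac_one)


def distance(s1, s2):
    """Euclidean distance between two summary-statistic vectors."""
    return sum((a - b) ** 2 for a, b in zip(s1, s2)) ** 0.5


def abc_rejection(observed, n_sims=2000, keep_fraction=0.05, seed=0):
    """Return the sampled parameters whose simulations lie closest to the observed data."""
    rng = random.Random(seed)
    target = summary(observed)
    draws = []
    for _ in range(n_sims):
        p = rng.uniform(0.05, 0.95)  # draw a parameter from a uniform prior
        sim = simulate_indel_lengths(p, len(observed), rng)
        draws.append((distance(summary(sim), target), p))
    draws.sort()  # smallest distance first
    n_keep = max(1, int(n_sims * keep_fraction))
    return [p for _, p in draws[:n_keep]]


# "Observed" data generated with a known parameter so the estimate can be checked.
observed = simulate_indel_lengths(0.4, 200, random.Random(42))
posterior = abc_rejection(observed)
print(len(posterior), statistics.mean(posterior))
```

The retained parameters form the approximate posterior sample; their mean serves as a point estimate, and the spread of the sample indicates the uncertainty.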
Affiliation(s)
- Haim Ashkenazy
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
| | - Eli Levy Karin
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Department of Molecular Biology and Ecology of Plants, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
| | - Zach Mertens
- The Biodesign Institute, Arizona State University, Tempe, AZ 85287-5301, USA
| | - Reed A Cartwright
- The Biodesign Institute, Arizona State University, Tempe, AZ 85287-5301, USA
- School of Life Sciences, Arizona State University, Tempe, AZ 85287-5301, USA
| | - Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
| |
|
40
|
Vialle RA, Tamuri AU, Goldman N. Alignment Modulates Ancestral Sequence Reconstruction Accuracy. Mol Biol Evol 2018; 35:1783-1797. [PMID: 29618097 PMCID: PMC5995191 DOI: 10.1093/molbev/msy055] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Indexed: 12/17/2022]
Abstract
Accurate reconstruction of ancestral states is a critical evolutionary analysis when studying ancient proteins and comparing biochemical properties between parental or extinct species and their extant relatives. It relies on multiple sequence alignment (MSA) which may introduce biases, and it remains unknown how MSA methodological approaches impact ancestral sequence reconstruction (ASR). Here, we investigate how MSA methodology modulates ASR using a simulation study of various evolutionary scenarios. We evaluate the accuracy of ancestral protein sequence reconstruction for simulated data and compare reconstruction outcomes using different alignment methods. Our results reveal biases introduced not only by aligner algorithms and assumptions, but also tree topology and the rate of insertions and deletions. Under many conditions we find no substantial differences between the MSAs. However, increasing the difficulty for the aligners can significantly impact ASR. The MAFFT consistency aligners and PRANK variants exhibit the best performance, whereas FSA displays limited performance. We also discover a bias towards reconstructed sequences longer than the true ancestors, deriving from a preference for inferring insertions, in almost all MSA methodological approaches. In addition, we find measures of MSA quality generally correlate highly with reconstruction accuracy. Thus, we show MSA methodological differences can affect the quality of reconstructions and propose MSA methods should be selected with care to accurately determine ancestral states with confidence.
Affiliation(s)
- Ricardo Assunção Vialle
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Department of Genetics and Molecular Biology, Laboratory of Human and Medical Genetics, Federal University of Pará, Belém, Pará, Brazil
| | - Asif U Tamuri
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Research IT Services, University College London, London, United Kingdom
| | - Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| |
|
41
|
Kirpach J, Colone A, Bürckert JP, Faison WJ, Dubois ARSX, Sinner R, Reye AL, Muller CP. Detection of a Low Level and Heterogeneous B Cell Immune Response in Peripheral Blood of Acute Borreliosis Patients With High Throughput Sequencing. Front Immunol 2019; 10:1105. [PMID: 31156648 PMCID: PMC6532064 DOI: 10.3389/fimmu.2019.01105] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/29/2018] [Accepted: 04/30/2019] [Indexed: 01/08/2023]
Abstract
The molecular diagnosis of acute Borreliosis is complicated, and better strategies to improve the diagnostic processes are warranted. High Throughput Sequencing (HTS) of human B cell repertoires after, e.g., Dengue virus infection or influenza vaccination revealed antigen-associated “CDR3 signatures” which may have the potential to support diagnosis in infectious diseases. The human B cell immune response to Borrelia burgdorferi sensu lato (the causative agent of Borreliosis) has mainly been studied at the antibody level, while less attention has been given to the cellular part of the humoral immune response. There are indications that Borrelia actively influence the B cell immune response and that it is therefore not directly comparable to responses induced by other infections. The main goal of this study was to identify B cell features that could be used to support diagnosis of Borreliosis. Therefore, we characterized the B cell immune response in these patients by combining multicolor flow cytometry, single Borrelia-reactive B cell receptor (BCR) sequencing, and B cell repertoire deep sequencing. Our phenotyping experiments showed that there is no significant difference between B cell subpopulations of acute Borreliosis patients and controls. BCR sequences from individual epitope-reactive B cells had little in common with each other. HTS showed, however, a higher complementarity determining region 3 (CDR3) amino acid (aa) sequence overlap between samples from different timepoints in patients as compared to controls. This indicates that HTS is sensitive enough to detect ongoing B cell immune responses in these patients. Although each individual's repertoire was dominated by rather unique clones, clustering of bulk BCR repertoire sequences revealed a higher overlap of IgG BCR repertoire sequences between acute patients than controls. Although we identified a few Borrelia-associated CDR3aa sequences, they seem to be rather unique for each patient and therefore not suitable as biomarkers.
Affiliation(s)
- Josiane Kirpach
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| | - Alessia Colone
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| | - Jean-Philippe Bürckert
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| | - William J Faison
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| | - Axel R S X Dubois
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| | - Regina Sinner
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| | - Anna L Reye
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| | - Claude P Muller
- Vaccinology and B Cell Immunology, Infectious Diseases Research Unit, Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
| |
|
42
|
Lloyd Evans D, Joshi SV, Wang J. Whole chloroplast genome and gene locus phylogenies reveal the taxonomic placement and relationship of Tripidium (Panicoideae: Andropogoneae) to sugarcane. BMC Evol Biol 2019; 19:33. [PMID: 30683070 PMCID: PMC6347779 DOI: 10.1186/s12862-019-1356-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 12/03/2017] [Accepted: 01/03/2019] [Indexed: 11/13/2022]
Abstract
Background: For over 50 years, attempts have been made to introgress agronomically useful traits from Erianthus sect. Ripidium (Tripidium) species into sugarcane based on both genera being part of the ‘Saccharum Complex’, an interbreeding group of species believed to be involved in the origins of sugarcane. However, recent low copy number gene studies indicate that Tripidium and Saccharum are more divergent than previously thought. The extent of genus Tripidium has not been fully explored and many species that should be included in Tripidium are still classified as Saccharum. Moreover, Tripidium is currently defined as incertae sedis within the Andropogoneae, though it has been suggested that members of this genus are related to the Germainiinae. Results: Eight newly-sequenced chloroplasts from potential Tripidium species were combined in a phylogenetic study with 46 members of the Panicoideae, including seven Saccharum accessions, two Miscanthidium and three Miscanthus species. A robust chloroplast phylogeny was generated and comparison with a gene locus phylogeny clearly places a monophyletic Tripidium clade outside the bounds of the Saccharinae. A key to the currently identified Tripidium species is presented. Conclusion: For the first time, we have undertaken a large-scale whole plastid study of eight newly assembled Tripidium accessions and a gene locus study of five Tripidium accessions. Our findings show that Tripidium and Saccharum are 8 million years divergent, last sharing a common ancestor 12 million years ago. We demonstrate that four species should be removed from Saccharum/Erianthus and included in genus Tripidium. In a genome context, we show that Tripidium evolved from a common ancestor with an extended Germainiinae clade formed from Germainia, Eriochrysis, Apocopis, Pogonatherum and Imperata. We re-define the ‘Saccharum complex’ to a group of genera that can interbreed in the wild and extend the Saccharinae to include Sarga along with Sorghastrum, Microstegium vimineum and Polytrias (but excluding Sorghum). Monophyly of genus Tripidium is confirmed and the genus is expanded to include Tripidium arundinaceum, Tripidium procerum, Tripidium kanashiroi and Tripidium rufipilum. As a consequence, these species are excluded from genus Saccharum. Moreover, we demonstrate that genus Tripidium is distinct from the Germainiinae.
Affiliation(s)
- Dyfed Lloyd Evans
- South African Sugarcane Research Institute, 170 Flanders Drive, Private Bag X02, Mount Edgecombe, Durban, 4300, South Africa
- School of Life Sciences, College of Agriculture, Engineering and Science, University of Kwa-Zulu Natal, Private Bag X54001, Durban, 4000, South Africa
- BeauSci Ltd., Waterbeach, Cambridge, CB25 9TL, UK
| | - Shailesh V Joshi
- South African Sugarcane Research Institute, 170 Flanders Drive, Private Bag X02, Mount Edgecombe, Durban, 4300, South Africa
- School of Life Sciences, College of Agriculture, Engineering and Science, University of Kwa-Zulu Natal, Private Bag X54001, Durban, 4000, South Africa
| | - Jianping Wang
- Agronomy Department, University of Florida, Gainesville, FL, USA
- Center for Genomics and Biotechnology, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, Fuzhou, China
- Plant Molecular and Biology Program, Genetics Institute, University of Florida, Gainesville, FL, USA
| |
|
43
|
Ashkenazy H, Sela I, Levy Karin E, Landan G, Pupko T. Multiple Sequence Alignment Averaging Improves Phylogeny Reconstruction. Syst Biol 2019; 68:117-130. [PMID: 29771363 PMCID: PMC6657586 DOI: 10.1093/sysbio/syy036] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/12/2017] [Revised: 05/07/2018] [Accepted: 05/09/2018] [Indexed: 01/11/2023]
Abstract
The classic methodology of inferring a phylogenetic tree from sequence data is composed of two steps. First, a multiple sequence alignment (MSA) is computed. Then, a tree is reconstructed assuming the MSA is correct. Yet, inferred MSAs were shown to be inaccurate and alignment errors reduce tree inference accuracy. It was previously proposed that filtering unreliable alignment regions can increase the accuracy of tree inference. However, it was also demonstrated that the benefit of this filtering is often obscured by the resulting loss of phylogenetic signal. In this work we explore an approach, in which instead of relying on a single MSA, we generate a large set of alternative MSAs and concatenate them into a single SuperMSA. By doing so, we account for phylogenetic signals contained in columns that are not present in the single MSA computed by alignment algorithms. Using simulations, we demonstrate that this approach results, on average, in more accurate trees compared to 1) using an unfiltered MSA and 2) using a single MSA with weights assigned to columns according to their reliability. Next, we explore in which regions of the MSA space our approach is expected to be beneficial. Finally, we provide a simple criterion for deciding whether or not the extra effort of computing a SuperMSA and inferring a tree from it is beneficial. Based on these assessments, we expect our methodology to be useful for many cases in which diverged sequences are analyzed. The option to generate such a SuperMSA is available at http://guidance.tau.ac.il.
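The concatenation step described in this abstract can be illustrated with a short sketch. It assumes each alternative MSA is a dict mapping a taxon name to its aligned row; the function name `concatenate_msas` and the toy alignments are hypothetical, and this is not the GUIDANCE implementation.

```python
def concatenate_msas(msas):
    """Join alternative alignments of the same sequences column-wise into one SuperMSA.

    msas: list of dicts {taxon: aligned_row}; each dict is one alternative MSA
    (a representation assumed for this sketch).
    """
    if not msas:
        raise ValueError("need at least one alignment")
    taxa = sorted(msas[0])
    for msa in msas:
        if sorted(msa) != taxa:
            raise ValueError("all alignments must contain the same taxa")
        if len({len(row) for row in msa.values()}) != 1:
            raise ValueError("rows within an alignment must have equal length")
    # Each taxon's rows are appended in the same MSA order, so columns stay aligned.
    return {t: "".join(msa[t] for msa in msas) for t in taxa}


# Two alternative alignments of the same three sequences (toy data).
msa_a = {"sp1": "AC-GT", "sp2": "ACGGT", "sp3": "A--GT"}
msa_b = {"sp1": "ACG-T", "sp2": "ACGGT", "sp3": "A-G-T"}
super_msa = concatenate_msas([msa_a, msa_b])
for taxon, row in super_msa.items():
    print(taxon, row)
```

Columns on which the alternative alignments agree appear once per MSA in the concatenation and so carry more weight in tree inference, while columns found in only some alternatives still contribute their signal.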
Affiliation(s)
- Haim Ashkenazy
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Tel Aviv, Israel
| | - Itamar Sela
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Eli Levy Karin
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Tel Aviv, Israel
- Department of Molecular Biology & Ecology of Plants, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
| | - Giddy Landan
- Institute of Microbiology, Christian-Albrechts-University of Kiel, 24118 Kiel, Germany
| | - Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Tel Aviv, Israel
| |
|
44
|
Savriama Y, Valtonen M, Kammonen JI, Rastas P, Smolander OP, Lyyski A, Häkkinen TJ, Corfe IJ, Gerber S, Salazar-Ciudad I, Paulin L, Holm L, Löytynoja A, Auvinen P, Jernvall J. Bracketing phenogenotypic limits of mammalian hybridization. R Soc Open Sci 2018; 5:180903. [PMID: 30564397 PMCID: PMC6281900 DOI: 10.1098/rsos.180903] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/06/2018] [Accepted: 10/29/2018] [Indexed: 05/09/2023]
Abstract
An increasing number of mammalian species have been shown to have a history of hybridization and introgression based on genetic analyses. Only relatively few fossils, however, preserve genetic material, and morphology must be used to identify the species and determine whether morphologically intermediate fossils could represent hybrids. Because dental and cranial fossils are typically the key body parts studied in mammalian palaeontology, here we bracket the potential for phenotypically extreme hybridizations by examining uniquely preserved cranio-dental material of a captive hybrid between grey and ringed seals. We analysed how distinct these species are genetically and morphologically, how easy it is to identify the hybrids using morphology and whether comparable hybridizations happen in the wild. We show that the genetic distance between these species is more than twice the modern human-Neanderthal distance, but still within that of morphologically similar species pairs known to hybridize. By contrast, morphological and developmental analyses show grey and ringed seals to be highly disparate, and that the hybrid is a predictable intermediate. Genetic analyses of the parent populations reveal introgression in the wild, suggesting that grey-ringed seal hybridization is not limited to captivity. Taken together, we postulate that there is considerable potential for mammalian hybridization between phenotypically disparate taxa.
Affiliation(s)
- Yoland Savriama
- Developmental Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Mia Valtonen
- Developmental Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
- Department of Environmental and Biological Sciences, University of Eastern Finland, PO Box 111, 80101 Joensuu, Finland
| | - Juhana I. Kammonen
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Pasi Rastas
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Olli-Pekka Smolander
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Annina Lyyski
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Teemu J. Häkkinen
- Developmental Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Ian J. Corfe
- Developmental Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Sylvain Gerber
- Institut Systématique Evolution Biodiversité (ISYEB), Muséum national d'Histoire naturelle, CNRS, Sorbonne Université, EPHE, 45 rue Buffon, CP 50, 75005 Paris, France
| | - Isaac Salazar-Ciudad
- Developmental Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
- Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Spain
| | - Lars Paulin
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Liisa Holm
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
- Faculty of Biological and Environmental Sciences, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
| | - Ari Löytynoja
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
- Author for correspondence: Ari Löytynoja
| | - Petri Auvinen
- Genome Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
- Author for correspondence: Petri Auvinen
| | - Jukka Jernvall
- Developmental Biology Program, Institute of Biotechnology, University of Helsinki, PO Box 56, 00014 Helsinki, Finland
- Author for correspondence: Jukka Jernvall
| |
|
45
|
Cohanim AB, Amsalem E, Saad R, Shoemaker D, Privman E. Evolution of Olfactory Functions on the Fire Ant Social Chromosome. Genome Biol Evol 2018; 10:2947-2960. [PMID: 30239696 PMCID: PMC6279166 DOI: 10.1093/gbe/evy204] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Accepted: 09/14/2018] [Indexed: 12/16/2022]
Abstract
Understanding the molecular evolutionary basis of social behavior is a major challenge in evolutionary biology. Social insects evolved a complex language of chemical signals to coordinate thousands of individuals. In the fire ant Solenopsis invicta, chemical signals are involved in the determination of a polymorphic social organization. Single-queen (monogyne) or multiqueen (polygyne) social structure is determined by the "social chromosome," a nonrecombining region containing ∼504 genes with two distinct haplotypes, SB and Sb. Monogyne queens are always SB/SB, while polygyne queens are always SB/Sb. Workers discriminate monogyne from polygyne queens based on olfactory cues. Here, we took an evolutionary genomics approach to search for candidate genes in the social chromosome that could be responsible for this discrimination. We compared the SB and Sb haplotypes and analyzed the evolutionary rates since their divergence. Notably, we identified a cluster of 23 odorant receptors in the nonrecombining region of the social chromosome that stands out in terms of nonsynonymous changes in both haplotypes. The cluster includes twelve genes formed by recent Solenopsis-specific duplications. We found evidence for positive selection on several tree branches and significant differences between the SB and Sb haplotypes of these genes. The most dramatic difference is the complete deletion of two of these genes in Sb. These results suggest that the evolution of polygyne social organization involved adaptations in olfactory genes, and they open the way for functional studies of the molecular mechanisms underlying social behavior.
Affiliation(s)
- Amir B Cohanim
- Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Israel
| | - Etya Amsalem
- Department of Entomology, Huck Institutes of the Life Sciences, Pennsylvania State University
| | - Rana Saad
- Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Israel
| | - DeWayne Shoemaker
- Department of Entomology and Plant Pathology, University of Tennessee
| | - Eyal Privman
- Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Israel
| |
|
46
|
Saad R, Cohanim AB, Kosloff M, Privman E. Neofunctionalization in Ligand Binding Sites of Ant Olfactory Receptors. Genome Biol Evol 2018; 10:2490-2500. [PMID: 29982411 PMCID: PMC6161762 DOI: 10.1093/gbe/evy131] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Accepted: 06/22/2018] [Indexed: 12/12/2022]
Abstract
Chemical communication is fundamental for the operation of insect societies. Their diverse vocabulary of chemical signals requires a correspondingly diverse set of chemosensory receptors. Insect olfactory receptors (ORs) are the largest family of chemosensory receptors. The OR family is characterized by frequent expansions of subfamilies, in which duplicated ORs may adapt to detect new signals through positive selection on their amino acid sequence. Ants are an extreme example with ∼400 ORs per genome—the highest number in insects. Presumably, this reflects an increased complexity of chemical communication. Here, we examined gene duplications and positive selection on ant ORs. We reconstructed the hymenopteran OR gene tree, including five ant species, and inferred positive selection along every branch using the branch-site test, a total of 3326 tests. We find more positive selection in branches following species-specific duplications. We identified amino acid sites targeted by positive selection, and mapped them onto a structural model of insect ORs. Seventeen sites were under positive selection in six or more branches, forming two clusters on the extracellular side of the receptor, on either side of a cleft in the structure. This region was previously implicated in ligand activation, suggesting that the concentration of positively selected sites in this region is related to adaptive evolution of ligand binding sites or allosteric transmission of ligand activation. These results provide insights into the specific OR subfamilies and individual residues that facilitated adaptive evolution of olfactory functions, potentially explaining the elaboration of chemical signaling in ant societies.
Affiliation(s)
- Rana Saad
- Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Israel
| | - Amir B Cohanim
- Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Israel
| | - Mickey Kosloff
- Department of Human Biology, University of Haifa, Israel
| | - Eyal Privman
- Department of Evolutionary and Environmental Biology, Institute of Evolution, University of Haifa, Israel
| |
|
47
|
Bürckert JP, Faison WJ, Mustin DE, Dubois ARSX, Sinner R, Hunewald O, Wienecke-Baldacchino A, Brieger A, Muller CP. High-throughput sequencing of murine immunoglobulin heavy chain repertoires using single side unique molecular identifiers on an Ion Torrent PGM. Oncotarget 2018; 9:30225-30239. [PMID: 30100985 PMCID: PMC6084394 DOI: 10.18632/oncotarget.25493] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/31/2017] [Accepted: 05/07/2018] [Indexed: 11/25/2022]
Abstract
With the advent of high-throughput sequencing (HTS), profiling immunoglobulin (IG) repertoires has become an essential part of immunological research. Advances in sequencing technology enable the IonTorrent Personal Genome Machine (PGM) to cover the full length of IG mRNA transcripts. Nucleotide insertions and deletions (indels) are the dominant errors of the PGM sequencing platform and can critically influence IG repertoire assessments. Here, we present a PGM-tailored IG repertoire sequencing approach combining error correction through unique molecular identifier (UID) barcoding and indel detection through ImMunoGeneTics (IMGT), the most commonly used sequence alignment database for IG sequences. Using artificially falsified sequences for benchmarking, we found that IMGT's underlying algorithms efficiently detect 98% of the introduced indels. Undetected indels are either located at the end of the sequences or produce masked frameshifts with an insertion and a deletion in close proximity. The complementarity-determining region 3 (CDR3) sequences are returned correctly for up to 3 insertions or 3 deletions through conservative culling. We further show that our PGM-tailored unique molecular identifiers result in highly accurate HTS data if combined with the presented processing strategy. In this regard, considering only sequences with at least two copies, from datasets with UID families of at least 3 reads, results in correct sequences with over 99% confidence. Finally, we show that the protocol can readily be used to generate homogeneous datasets for bulk sequencing of murine bone marrow samples. Taken together, this approach will help to establish benchtop-scale sequencing of IG heavy chain transcripts in the field of IG repertoire research.
Affiliation(s)
- Jean-Philippe Bürckert
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- William J Faison
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Danielle E Mustin
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Axel R S X Dubois
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Regina Sinner
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Oliver Hunewald
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Anne Brieger
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Claude P Muller
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
|
48
|
Snyman SJ, Komape DM, Khanyi H, van den Berg J, Cilliers D, Lloyd Evans D, Barnard S, Siebert SJ. Assessing the Likelihood of Gene Flow From Sugarcane (Saccharum Hybrids) to Wild Relatives in South Africa. Front Bioeng Biotechnol 2018; 6:72. [PMID: 29930938 PMCID: PMC5999724 DOI: 10.3389/fbioe.2018.00072] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/23/2018] [Accepted: 05/17/2018] [Indexed: 01/17/2023]
Abstract
Pre-commercialization studies on environmental biosafety of genetically modified (GM) crops are necessary to evaluate the potential for sexual hybridization with related plant species that occur in the release area. The aim of the study was a preliminary assessment of factors that may contribute to gene flow from sugarcane (Saccharum hybrids) to indigenous relatives in the sugarcane production regions of Mpumalanga and KwaZulu-Natal provinces, South Africa. In the first instance, an assessment of Saccharum wild relatives was conducted based on existing phylogenies and literature surveys. The prevalence, spatial overlap, proximity, distribution potential, and flowering times of wild relatives in sugarcane production regions were assessed for Imperata, Sorghum, Cleistachne, and Miscanthidium species, based on the above and on herbaria records and field surveys. Eleven species were selected for spatial analyses based on their presence within the sugarcane cultivation region: four species in the Saccharinae and seven in the Sorghinae. Secondly, fragments of the nuclear internal transcribed spacer (ITS) regions of the 5.8S ribosomal gene and two chloroplast genes, ribulose-bisphosphate carboxylase (rbcL) and maturase K (matK), were sequenced or assembled from short read data to confirm relatedness between Saccharum hybrids and their wild relatives. Phylogenetic analyses of the ITS cassette showed that the closest wild relative species to commercial sugarcane were Miscanthidium capense, Miscanthidium junceum, and Narenga porphyrocoma. Sorghum was found to be more distantly related to Saccharum than previously described. Based on the phylogeny described in our study, the only species to highlight in terms of evolutionary divergence times from Saccharum are those within the genus Miscanthidium, most especially M. capense and M. junceum, which are only 3 million years divergent from Saccharum. Field assessment of pollen viability of 13 commercial sugarcane cultivars using two stains, iodine potassium iodide (IKI) and triphenyl tetrazolium chloride, showed decreasing pollen viability (from 85 to 0%) from the north to the south eastern regions of the study area. Future work will include other aspects influencing gene flow such as cytological compatibility and introgression between sugarcane and Miscanthidium species.
Affiliation(s)
- Sandy J Snyman
  - Crop Biology Resource Centre, South African Sugarcane Research Institute, Mount Edgecombe, South Africa; Department of Biology, School of Life Sciences, University of KwaZulu-Natal, Westville, South Africa
- Dennis M Komape
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom, South Africa
- Hlobisile Khanyi
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom, South Africa
- Johnnie van den Berg
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom, South Africa
- Dirk Cilliers
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom, South Africa
- Dyfed Lloyd Evans
  - Crop Biology Resource Centre, South African Sugarcane Research Institute, Mount Edgecombe, South Africa; Department of Biology, School of Life Sciences, University of KwaZulu-Natal, Westville, South Africa; BeauSci Ltd., Waterbeach, Cambridge, United Kingdom
- Sandra Barnard
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom, South Africa
- Stefan J Siebert
  - Unit for Environmental Sciences and Management, North-West University, Potchefstroom, South Africa
|
49
|
Nettling M, Treutler H, Cerquides J, Grosse I. Unrealistic phylogenetic trees may improve phylogenetic footprinting. Bioinformatics 2018; 33:1639-1646. [PMID: 28130227 PMCID: PMC5447242 DOI: 10.1093/bioinformatics/btx033] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/02/2016] [Accepted: 01/19/2017] [Indexed: 01/10/2023]
Abstract
Motivation: The computational investigation of DNA binding motifs from binding sites is one of the classic tasks in bioinformatics and a prerequisite for understanding gene regulation as a whole. Due to the development of sequencing technologies and the increasing number of available genomes, approaches based on phylogenetic footprinting become increasingly attractive. Phylogenetic footprinting requires phylogenetic trees with attached substitution probabilities for quantifying the evolution of binding sites, but these trees and substitution probabilities are typically not known and cannot be estimated easily. Results: Here, we investigate the influence of phylogenetic trees with different substitution probabilities on the classification performance of phylogenetic footprinting using synthetic and real data. For synthetic data we find that the classification performance is highest when the substitution probability used for phylogenetic footprinting is similar to that used for data generation. For real data, however, we typically find that the classification performance of phylogenetic footprinting surprisingly increases with increasing substitution probabilities and is often highest for unrealistically high substitution probabilities close to one. This finding suggests that choosing realistic model assumptions might not always yield optimal predictions in general and that choosing unrealistically high substitution probabilities close to one might actually improve the classification performance of phylogenetic footprinting. Availability and Implementation: The proposed PF is implemented in JAVA and can be downloaded from https://github.com/mgledi/PhyFoo. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Nettling
  - Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany
- Hendrik Treutler
  - Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany
- Jesus Cerquides
  - Institut d'Investigació en Intel·ligència Artificial, IIIA-CSIC, Campus UAB, Cerdanyola, Spain
- Ivo Grosse
  - Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany; German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
|
50
|
Lee H, Kingsford C. Kourami: graph-guided assembly for novel human leukocyte antigen allele discovery. Genome Biol 2018; 19:16. [PMID: 29415772 PMCID: PMC5804087 DOI: 10.1186/s13059-018-1388-2] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 05/17/2017] [Accepted: 01/08/2018] [Indexed: 01/07/2023]
Abstract
Accurate typing of human leukocyte antigen (HLA) is important because HLA genes play important roles in immune responses and disease genesis. Previously available computational methods are database-matching approaches and their outputs are inherently limited by the completeness of already known types, making them unsuitable for discovery of novel alleles. We have developed a graph-guided assembly technique for classical HLA genes, which can construct allele sequences given high-coverage whole-genome sequencing data. Our method delivers highly accurate HLA typing, comparable to the current state-of-the-art methods. Using various data, we also demonstrate that our method can type novel alleles.
Affiliation(s)
- Heewook Lee
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Carl Kingsford
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
|