1
Weibel CA, Wheeler AL, James JE, Willis SM, McShea H, Masel J. The protein domains of vertebrate species in which selection is more effective have greater intrinsic structural disorder. eLife 2024; 12:RP87335. [PMID: 39239703 PMCID: PMC11379457 DOI: 10.7554/elife.87335] [Indexed: 09/07/2024] Open
Abstract
The nearly neutral theory of molecular evolution posits variation among species in the effectiveness of selection. In an idealized model, the census population size determines both this minimum magnitude of the selection coefficient required for deleterious variants to be reliably purged, and the amount of neutral diversity. Empirically, an 'effective population size' is often estimated from the amount of putatively neutral genetic diversity and is assumed to also capture a species' effectiveness of selection. A potentially more direct measure of the effectiveness of selection is the degree to which selection maintains preferred codons. However, past metrics that compare codon bias across species are confounded by among-species variation in %GC content and/or amino acid composition. Here, we propose a new Codon Adaptation Index of Species (CAIS), based on Kullback-Leibler divergence, that corrects for both confounders. We demonstrate the use of CAIS correlations, as well as the Effective Number of Codons, to show that the protein domains of more highly adapted vertebrate species evolve higher intrinsic structural disorder.
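As a rough illustration of the quantity this abstract builds on, the sketch below computes a Kullback-Leibler divergence between observed and expected synonymous codon frequencies. It is not the CAIS formula, which the abstract does not give: the helper names `kl_divergence` and `codon_frequencies`, the toy sequence, and the equal-usage expectation are all assumptions made for this example.

```python
import math
from collections import Counter


def kl_divergence(observed, expected):
    """Kullback-Leibler divergence D(observed || expected), in nats.

    Both arguments map the same keys to probabilities that sum to 1.
    Terms with observed probability 0 contribute 0 by convention;
    expected probabilities must be non-zero wherever observed is non-zero.
    """
    return sum(p * math.log(p / expected[k]) for k, p in observed.items() if p > 0)


def codon_frequencies(cds, codons):
    """Relative frequencies of the given synonymous codons in an in-frame CDS."""
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    total = sum(counts[c] for c in codons)
    return {c: counts[c] / total for c in codons}


# Toy example (hypothetical data): the two lysine codons, compared with an
# assumed expectation of equal usage.
lys = ["AAA", "AAG"]
observed = codon_frequencies("AAGAAGAAGAAA", lys)  # three AAG, one AAA
expected = {"AAA": 0.5, "AAG": 0.5}
divergence = kl_divergence(observed, expected)
```

A divergence of zero means observed usage matches the expectation exactly; larger values mean stronger departure from it. In practice the expectation would presumably come from a model of nucleotide composition rather than from equal usage.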
Affiliation(s)
- Catherine A Weibel
  - Department of Mathematics, University of Arizona, Tucson, United States
  - Department of Physics, University of Arizona, Tucson, United States
- Andrew L Wheeler
  - Genetics Graduate Interdisciplinary Program, University of Arizona, Tucson, United States
- Jennifer E James
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
- Sara M Willis
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
- Hanon McShea
  - Department of Earth System Science, Stanford University, Stanford, United States
- Joanna Masel
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
2
Qiu Y, Kang YM, Korfmann C, Pouyet F, Eckford A, Palazzo AF. The GC-content at the 5' ends of human protein-coding genes is undergoing mutational decay. Genome Biol 2024; 25:219. [PMID: 39138526 PMCID: PMC11323403 DOI: 10.1186/s13059-024-03364-x] [Received: 05/28/2024] [Accepted: 07/31/2024] [Indexed: 08/15/2024] Open
Abstract
BACKGROUND: In vertebrates, most protein-coding genes have a peak of GC-content near their 5' transcriptional start site (TSS). This feature promotes both the efficient nuclear export and translation of mRNAs. Despite the importance of GC-content for RNA metabolism, its general features, origin, and maintenance remain mysterious. We investigate the evolutionary forces shaping GC-content at the TSS of genes through both comparative genomic analysis of nucleotide substitution rates between different species and examination of human de novo mutations.
RESULTS: Our data suggest that GC-peaks at TSSs were present in the last common ancestor of amniotes, and likely that of vertebrates. We observe that in apes and rodents, where recombination is directed away from TSSs by PRDM9, GC-content at the 5' end of protein-coding genes is currently undergoing mutational decay. In canids, which lack PRDM9 and perform recombination at TSSs, GC-content at the 5' end of protein-coding genes is increasing. We show that these patterns extend into the 5' end of the open reading frame, thus impacting synonymous codon position choices.
CONCLUSIONS: Our results indicate that the dynamics of this GC-peak in amniotes are largely shaped by historic patterns of recombination. Since decay of GC-content towards the mutation rate equilibrium is the default state for non-functional DNA, the observed decrease in GC-content at TSSs in apes and rodents indicates that the GC-peak is not being maintained by selection on most protein-coding genes in those species.
Affiliation(s)
- Yi Qiu
  - Department of Biochemistry, University of Toronto, Toronto, Ontario, M5G1M1, Canada
- Yoon Mo Kang
  - Department of Biochemistry, University of Toronto, Toronto, Ontario, M5G1M1, Canada
- Christopher Korfmann
  - Department of Electrical Engineering and Computer Science, York University, Toronto, Ontario, M3J1P3, Canada
- Fanny Pouyet
  - Laboratoire Interdisciplinaire des Sciences du Numérique, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- Andrew Eckford
  - Department of Electrical Engineering and Computer Science, York University, Toronto, Ontario, M3J1P3, Canada
- Alexander F Palazzo
  - Department of Biochemistry, University of Toronto, Toronto, Ontario, M5G1M1, Canada
3
Grant AR, Johnson KP, Stanley EL, Baldwin-Brown J, Kolenčík S, Allen JM. Rapid Targeted Assembly of the Proteome Reveals Evolutionary Variation of GC Content in Avian Lice. Bioinform Biol Insights 2024; 18:11779322241257991. [PMID: 38860163 PMCID: PMC11163934 DOI: 10.1177/11779322241257991] [Received: 09/25/2023] [Accepted: 05/02/2024] [Indexed: 06/12/2024] Open
Abstract
Nucleotide base composition plays an influential role in the molecular mechanisms involved in gene function, phenotype, and amino acid composition. GC content (proportion of guanine and cytosine in DNA sequences) shows a high level of variation within and among species. Many studies measure GC content in a small number of genes, which may not be representative of genome-wide GC variation. One challenge when assembling extensive genomic data sets for these studies is the significant amount of resources (monetary and computational) associated with data processing, and many bioinformatic tools have not been optimized for resource efficiency. Using a high-performance computing (HPC) cluster, we manipulated resources provided to the targeted gene assembly program, automated target restricted assembly method (aTRAM), to determine an optimum way to run the program to maximize resource use. Using our optimum assembly approach, we assembled and measured GC content of all of the protein-coding genes of a diverse group of parasitic feather lice. Of the 499 426 genes assembled across 57 species, feather lice were GC-poor (mean GC = 42.96%) with a significant amount of variation within and between species (GC range = 19.57%-73.33%). We found a significant correlation between GC content and standard deviation per taxon for overall GC and GC3, which could indicate selection for G and C nucleotides in some species. Phylogenetic signal of GC content was detected in both GC and GC3. This research provides a large-scale investigation of GC content in parasitic lice laying the foundation for understanding the basis of variation in base composition across species.
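The two summary statistics named in this abstract, overall GC content and GC3 (GC content at third codon positions), can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy coding sequence are assumptions, and ambiguous bases such as N are simply ignored.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / acgt


def gc3_content(cds):
    """GC content at third codon positions of an in-frame coding sequence."""
    # Slicing from index 2 in steps of 3 keeps only third codon positions;
    # a trailing partial codon never contributes a third position.
    return gc_content(cds[2::3])


# Toy coding sequence (hypothetical): ATG GCC AAA TAA
cds = "ATGGCCAAATAA"
overall_gc = gc_content(cds)   # 4 of 12 bases are G or C
third_gc = gc3_content(cds)    # third positions are G, C, A, A
```

GC3 is often reported alongside overall GC because third codon positions are mostly synonymous and so are less constrained by the encoded amino acid sequence.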
Affiliation(s)
- Avery R Grant
  - Department of Biology, University of Nevada, Reno, Reno, NV, USA
- Kevin P Johnson
  - Illinois Natural History Survey, Prairie Research Institute, University of Illinois at Urbana-Champaign, Champaign, IL, USA
- Edward L Stanley
  - Department of Natural History, Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
- Stanislav Kolenčík
  - Faculty of Mathematics, Natural Sciences, and Information Technologies, University of Primorska, Koper, Slovenia
- Julie M Allen
  - Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
4
Joseph J, Prentout D, Laverré A, Tricou T, Duret L. High prevalence of PRDM9-independent recombination hotspots in placental mammals. Proc Natl Acad Sci U S A 2024; 121:e2401973121. [PMID: 38809707 PMCID: PMC11161765 DOI: 10.1073/pnas.2401973121] [Received: 02/08/2024] [Accepted: 04/26/2024] [Indexed: 05/31/2024] Open
Abstract
In many mammals, recombination events are concentrated in hotspots directed by a sequence-specific DNA-binding protein named PRDM9. Intriguingly, PRDM9 has been lost several times in vertebrates, and notably among mammals, it has been pseudogenized in the ancestor of canids. In the absence of PRDM9, recombination hotspots tend to occur in promoter-like features such as CpG islands. It has thus been proposed that one role of PRDM9 could be to direct recombination away from PRDM9-independent hotspots. However, the ability of PRDM9 to direct recombination hotspots has been assessed in only a handful of species, and a clear picture of how much recombination occurs outside of PRDM9-directed hotspots in mammals is still lacking. In this study, we derived an estimator of past recombination activity based on signatures of GC-biased gene conversion in substitution patterns. We quantified recombination activity in PRDM9-independent hotspots in 52 species of boreoeutherian mammals. We observe a wide range of recombination rates at these loci: several species (such as mice, humans, some felids, or cetaceans) show a deficit of recombination, while a majority of mammals display a clear peak of recombination. Our results demonstrate that PRDM9-directed and PRDM9-independent hotspots can coexist in mammals and that their coexistence appears to be the rule rather than the exception. Additionally, we show that the location of PRDM9-independent hotspots is relatively more stable than that of PRDM9-directed hotspots, but that PRDM9-independent hotspots nevertheless evolve slowly in concert with DNA hypomethylation.
Affiliation(s)
- Julien Joseph
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
- Djivan Prentout
  - Department of Biological Sciences, Columbia University, New York, NY 10027
- Alexandre Laverré
  - Department of Ecology and Evolution, University of Lausanne, Lausanne CH-1015, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland
- Théo Tricou
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
- Laurent Duret
  - Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR 5558, Villeurbanne 69100, France
5
Kotari I, Kosiol C, Borges R. The Patterns of Codon Usage between Chordates and Arthropods are Different but Co-evolving with Mutational Biases. Mol Biol Evol 2024; 41:msae080. [PMID: 38667829 PMCID: PMC11108087 DOI: 10.1093/molbev/msae080] [Received: 05/05/2023] [Revised: 03/22/2024] [Accepted: 04/15/2024] [Indexed: 05/22/2024] Open
Abstract
Different frequencies amongst codons that encode the same amino acid (i.e. synonymous codons) have been observed in multiple species. Studies focused on uncovering the forces that drive such codon usage showed that a combined effect of mutational biases and translational selection works to produce different frequencies of synonymous codons. However, only a few have been able to measure and distinguish between these forces, which may leave similar traces on the coding regions. Here, we have developed a codon model that allows the disentangling of mutation, selection on amino acids and synonymous codons, and GC-biased gene conversion (gBGC), which we employed on an extensive dataset of 415 chordates and 191 arthropods. We found that chordates need 15 more synonymous codon categories than arthropods to explain the empirical codon frequencies, which suggests that the extent of codon usage can vary greatly between animal phyla. Moreover, methylation at CpG sites seems to partially explain these patterns of codon usage in chordates but not in arthropods. Despite the differences between the two phyla, our findings demonstrate that in both, GC-rich codons are disfavored when mutations are GC-biased, and the opposite is true when mutations are AT-biased. This indicates that selection on the genomic coding regions might act primarily to stabilize their GC/AT content on a genome-wide level. Our study shows that the degree of synonymous codon usage varies considerably among animals, but is likely governed by a common underlying dynamic.
Affiliation(s)
- Ioanna Kotari
  - Institut für Populationsgenetik, University of Veterinary Medicine, Veterinärplatz 1, Vienna 1210, Austria
  - Vienna Graduate School of Population Genetics, Vienna, Austria
- Carolin Kosiol
  - Centre for Biological Diversity, School of Biology, University of St Andrews, Fife KY16 9TH, UK
- Rui Borges
  - Institut für Populationsgenetik, University of Veterinary Medicine, Veterinärplatz 1, Vienna 1210, Austria
6
Sgarlata GM, Rasolondraibe E, Salmona J, Le Pors B, Ralantoharijaona T, Rakotonanahary A, Jan F, Manzi S, Iribar A, Zaonarivelo JR, Volasoa Andriaholinirina N, Rasoloharijaona S, Chikhi L. The genomic diversity of the Eliurus genus in northern Madagascar with a putative new species. Mol Phylogenet Evol 2024; 193:107997. [PMID: 38128795 DOI: 10.1016/j.ympev.2023.107997] [Received: 02/10/2023] [Revised: 12/06/2023] [Accepted: 12/18/2023] [Indexed: 12/23/2023]
Abstract
Madagascar exhibits an extraordinarily high level of species richness and endemism, while being severely threatened by habitat loss and fragmentation (HL&F). In the face of these threats to biodiversity, conservation effort can be directed, for instance, toward documenting species that are still unknown to science, or toward investigating how species respond to HL&F. The tufted-tail rat genus (Eliurus spp.) is the most speciose genus of endemic rodents in Madagascar, with 13 described species, which occupy two major habitat types: dry or humid forests. The large species diversity and association with specific habitat types make the Eliurus genus a suitable model for investigating species adaptation to new environments, as well as response to HL&F (dry vs humid). In the present study, we investigated Eliurus spp. genomic diversity across northern Madagascar, a region covered by both dry and humid fragmented forests. From the mitochondrial DNA (mtDNA) and nuclear genomic (RAD-seq) data of 124 Eliurus individuals sampled in poorly studied forests of northern Madagascar, we identified an undescribed Eliurus taxon (Eliurus sp. nova). We tested the hypothesis of a new Eliurus species using several approaches: i) DNA barcoding; ii) phylogenetic inferences; iii) species delimitation tests based on the Multi-Species Coalescent (MSC) model; iv) genealogical divergence index (gdi); v) an ad-hoc test of isolation-by-distance within versus between sister-taxa; vi) comparisons of %GC content patterns; and vii) morphological analyses. All analyses support the recognition of the undescribed lineage as a putative distinct species. In addition, we show that Eliurus myoxinus, a species known from the dry forests of western Madagascar, is, surprisingly, found mostly in humid forests in northern Madagascar. In conclusion, we discuss the implications of such findings in the context of Eliurus species evolution and diversification, and use the distribution of northern Eliurus species as a proxy for reconstructing past changes in forest cover and vegetation type in northern Madagascar.
Affiliation(s)
- Emmanuel Rasolondraibe
  - Département de Biologie Animale et Ecologie, Faculté des Sciences, Université de Mahajanga, Mahajanga, Madagascar
- Jordi Salmona
  - Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156 Oeiras, Portugal; Centre de Recherche sur la Biodiversité et l'Environnement (CRBE), Université de Toulouse, CNRS, IRD, Toulouse INP, Université Toulouse 3 - Paul Sabatier (UT3), Toulouse, France
- Barbara Le Pors
  - Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156 Oeiras, Portugal
- Tantely Ralantoharijaona
  - Département de Biologie Animale et Ecologie, Faculté des Sciences, Université de Mahajanga, Mahajanga, Madagascar
- Ando Rakotonanahary
  - Département de Biologie Animale et Ecologie, Faculté des Sciences, Université de Mahajanga, Mahajanga, Madagascar
- Fabien Jan
  - Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156 Oeiras, Portugal
- Sophie Manzi
  - Centre de Recherche sur la Biodiversité et l'Environnement (CRBE), Université de Toulouse, CNRS, IRD, Toulouse INP, Université Toulouse 3 - Paul Sabatier (UT3), Toulouse, France
- Amaia Iribar
  - Centre de Recherche sur la Biodiversité et l'Environnement (CRBE), Université de Toulouse, CNRS, IRD, Toulouse INP, Université Toulouse 3 - Paul Sabatier (UT3), Toulouse, France
- John Rigobert Zaonarivelo
  - Département des Sciences de la Nature et de l'Environnement, Université d'Antsiranana, 201 Antsiranana, Madagascar
- Solofonirina Rasoloharijaona
  - Département de Biologie Animale et Ecologie, Faculté des Sciences, Université de Mahajanga, Mahajanga, Madagascar
- Lounès Chikhi
  - Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156 Oeiras, Portugal; Centre de Recherche sur la Biodiversité et l'Environnement (CRBE), Université de Toulouse, CNRS, IRD, Toulouse INP, Université Toulouse 3 - Paul Sabatier (UT3), Toulouse, France
7
Kyriacou RG, Mulhair PO, Holland PWH. GC Content Across Insect Genomes: Phylogenetic Patterns, Causes and Consequences. J Mol Evol 2024; 92:138-152. [PMID: 38491221 PMCID: PMC10978632 DOI: 10.1007/s00239-024-10160-5] [Received: 09/13/2023] [Accepted: 02/06/2024] [Indexed: 03/18/2024]
Abstract
The proportions of A:T and G:C nucleotide pairs are often unequal and can vary greatly between animal species and along chromosomes. The causes and consequences of this variation are incompletely understood. The recent release of high-quality genome sequences from the Darwin Tree of Life and other large-scale genome projects provides an opportunity for GC heterogeneity to be compared across a large number of insect species. Here we analyse GC content along chromosomes, and within protein-coding genes and codons, of 150 insect species from four holometabolous orders: Coleoptera, Diptera, Hymenoptera, and Lepidoptera. We find that protein-coding sequences have higher GC content than the genome average, and that Lepidoptera generally have higher GC content than the other three insect orders examined. GC content is higher in small chromosomes in most Lepidoptera species, but this pattern is less consistent in other orders. GC content also increases towards subtelomeric regions within protein-coding genes in Diptera, Coleoptera and Lepidoptera. Two species of Diptera, Bombylius major and B. discolor, have very atypical genomes with a ubiquitous increase in AT content, especially at third codon positions. Despite dramatic AT-biased codon usage, we find no evidence that this has driven divergent protein evolution. We argue that the GC landscape of Lepidoptera, Diptera and Coleoptera genomes is influenced by GC-biased gene conversion, strongest in Lepidoptera, with some outlier taxa affected drastically by counteracting processes.
Affiliation(s)
- Riccardo G Kyriacou
  - Department of Biology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
- Peter O Mulhair
  - Department of Biology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
- Peter W H Holland
  - Department of Biology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
8
Liu Y, Liang N, Xian Q, Zhang W. GC heterogeneity reveals sequence-structures evolution of angiosperm ITS2. BMC Plant Biol 2023; 23:608. [PMID: 38036992 PMCID: PMC10691020 DOI: 10.1186/s12870-023-04634-9] [Received: 04/21/2023] [Accepted: 11/26/2023] [Indexed: 12/02/2023]
Abstract
BACKGROUND: Although GC variation constitutes a fundamental element of genome and species diversity, the precise mechanisms driving it remain unclear. The abundant sequence data available for the ITS2, a commonly employed phylogenetic marker in plants, offer an exceptional resource for exploring the GC variation across angiosperms.
RESULTS: A comprehensive selection of 8666 species, comprising 165 genera, 63 families, and 30 orders, was used for the analyses. The alignment of ITS2 sequence-structures and partitioning of secondary structures into paired and unpaired regions were performed using 4SALE. Substitution rates and frequencies among GC base-pairs in the paired regions of ITS2 were calculated using RNA-specific models in the PHASE package. The results showed that the distribution of ITS2 GC contents on the angiosperm phylogeny was heterogeneous, but their increase was generally associated with ITS2 sequence homogenization, thereby supporting the occurrence of GC-biased gene conversion (gBGC) during the concerted evolution of ITS2. Additionally, the GC content in the paired regions of the ITS2 secondary structure was significantly higher than that of the unpaired regions, indicating the selection of GC for thermodynamic stability. Furthermore, the RNA substitution models demonstrated that base-pair transformations favored both the elevation and fixation of GC in the paired regions, providing further support for gBGC.
CONCLUSIONS: Our findings highlight the significance of secondary structure in GC investigation and demonstrate that both gBGC and structure-based selection are influential factors driving angiosperm ITS2 GC content.
Affiliation(s)
- Yubo Liu
  - Marine College, Shandong University, Weihai, 264209, China
  - Division of Physical Biology, CAS Key Laboratory of Interfacial Physics and Technology, Shanghai Institute of Applied Physics, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, 201800, China
- Nan Liang
  - Marine College, Shandong University, Weihai, 264209, China
  - Allergy Department, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China
- Qing Xian
  - Marine College, Shandong University, Weihai, 264209, China
- Wei Zhang
  - Marine College, Shandong University, Weihai, 264209, China
9
Molteni C, Forni D, Cagliani R, Bravo IG, Sironi M. Evolution and diversity of nucleotide and dinucleotide composition in poxviruses. J Gen Virol 2023; 104. [PMID: 37792576 DOI: 10.1099/jgv.0.001897] [Indexed: 10/06/2023] Open
Abstract
Poxviruses (family Poxviridae) have long dsDNA genomes and infect a wide range of hosts, including insects, birds, reptiles and mammals. These viruses have substantial incidence, prevalence and disease burden in humans and in other animals. Nucleotide and dinucleotide composition, mostly CpG and TpA, have been extensively studied in viral genomes because of their evolutionary and functional implications. Here we analysed the nucleotide and dinucleotide composition, as well as codon usage bias, of a set of representative poxvirus genomes with a very diverse host spectrum. After correcting for overall nucleotide composition, entomopoxviruses displayed low overall GC content, no enrichment in TpA and large variation in CpG enrichment, while chordopoxviruses showed large variation in nucleotide composition, no obvious depletion in CpG and a weak trend for TpA depletion in GC-rich genomes. Overall, intergenome variation in dinucleotide composition in poxviruses is largely accounted for by variation in overall genomic GC levels. Nonetheless, using vaccinia virus as a model, we found that genes expressed at the earliest times in infection are more CpG-depleted than genes expressed at later stages. This observation has parallels in betaherpesviruses (also large dsDNA viruses) and suggests an antiviral role for the innate immune system (e.g. via the zinc-finger antiviral protein ZAP) in the early phases of poxvirus infection. We also analysed codon usage bias in poxviruses and observed that it is mostly determined by genomic GC content, and that stratification by host taxonomy does not contribute to explaining codon usage bias diversity. By analysis of within-species diversity, we show that genomic GC content is the result of mutational biases. Poxvirus genomes that encode a DNA ligase are significantly AT-richer than those that do not, suggesting that DNA repair systems shape mutation biases. Our data shed light on the evolution of poxviruses and inform strategies for their genetic manipulation for therapeutic purposes.
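Dinucleotide enrichment "after correcting for overall nucleotide composition" is commonly expressed as an observed/expected odds ratio: the dinucleotide's frequency divided by the product of its two component base frequencies. The sketch below shows that common single-strand form only. Whether the authors used exactly this estimator is not stated in the abstract, and the function name and toy sequence are assumptions made for this example.

```python
def dinucleotide_odds_ratio(seq, dinuc):
    """Observed/expected ratio for a dinucleotide such as "CG" or "TA".

    Expected frequency is the product of the two single-base frequencies,
    so values below 1 indicate depletion and values above 1 enrichment.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter dinucleotide")
    first, second = dinuc
    expected = (seq.count(first) / n) * (seq.count(second) / n)
    if expected == 0:
        raise ValueError("a component base of the dinucleotide is absent")
    # Count with a sliding window so overlapping occurrences (e.g. "AA" in
    # "AAA") are all counted; str.count would skip overlaps.
    hits = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    observed = hits / (n - 1)
    return observed / expected


# Toy sequence (hypothetical): CpG occurs twice among 7 dinucleotide positions.
cpg_ratio = dinucleotide_odds_ratio("ACGTACGT", "CG")
```

Genome-scale analyses often use a strand-symmetrized variant that pools the sequence with its reverse complement; the single-strand form here is kept only for brevity.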
Affiliation(s)
- Cristian Molteni
  - Scientific Institute IRCCS E. MEDEA, Bioinformatics, Bosisio Parini, Italy
- Diego Forni
  - Scientific Institute IRCCS E. MEDEA, Bioinformatics, Bosisio Parini, Italy
- Rachele Cagliani
  - Scientific Institute IRCCS E. MEDEA, Bioinformatics, Bosisio Parini, Italy
- Ignacio G Bravo
  - Laboratoire MIVEGEC (Univ Montpellier CNRS, IRD), Centre National de la Recherche Scientifique, Montpellier, France
- Manuela Sironi
  - Scientific Institute IRCCS E. MEDEA, Bioinformatics, Bosisio Parini, Italy
10
Liu A, Wang N, Xie G, Li Y, Yan X, Li X, Zhu Z, Li Z, Yang J, Meng F, Dou M, Chen W, Ma N, Jiang Y, Gao Y, Wang Y. GC-biased gene conversion drives accelerated evolution of ultraconserved elements in mammalian and avian genomes. Genome Res 2023; 33:1673-1689. [PMID: 37884342 PMCID: PMC10691551 DOI: 10.1101/gr.277784.123] [Received: 02/08/2023] [Accepted: 08/23/2023] [Indexed: 10/28/2023]
Abstract
Ultraconserved elements (UCEs) are the most conserved regions among the genomes of evolutionarily distant species and are thought to play critical biological functions. However, some UCEs rapidly evolved in specific lineages, and whether they contributed to adaptive evolution is still controversial. Here, using an increased number of sequenced genomes with high taxonomic coverage, we identified 2191 mammalian UCEs and 5938 avian UCEs from 95 mammal and 94 bird genomes, respectively. Our results show that these UCEs are functionally constrained and that their adjacent genes are prone to widespread expression with low expression diversity across tissues. Functional enrichment of mammalian and avian UCEs shows different trends, indicating that UCEs may contribute to adaptive evolution of taxa. Focusing on lineage-specific accelerated evolution, we discover that the proportion of fast-evolving UCEs in nine mammalian and 10 avian test lineages ranges from 0.19% to 13.2%. Notably, up to 62.1% of fast-evolving UCEs in test lineages are much more likely to result from GC-biased gene conversion (gBGC). A single cervid-specific gBGC region embracing the uc.359 allele significantly alters the expression of Nova1 and other neural-related genes in the rat brain. Combined with the altered regulatory activity of ancient gBGC-induced fast-evolving UCEs in eutherians, our results provide evidence that synergy between gBGC and selection shaped lineage-specific substitution patterns, even in the most constrained regulatory elements. In summary, our results show that gBGC played an important role in facilitating lineage-specific accelerated evolution of UCEs, and further support the idea that a combination of multiple evolutionary forces shapes adaptive evolution.
Collapse
Affiliation(s)
- Anguo Liu
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Nini Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Faculty of Mathematics and Natural Sciences, University of Cologne, and Cologne Excellence Cluster for Cellular Stress Responses in Aging-Associated Diseases (CECAD), University Hospital Cologne, Cologne 50931, Germany
| | - Guoxiang Xie
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yang Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Xixi Yan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Xinmei Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Zhenliang Zhu
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Zhuohui Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Jing Yang
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Fanxin Meng
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Mingle Dou
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Weihuang Chen
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Nange Ma
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Center for Functional Genomics, Institute of Future Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yuanpeng Gao
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yu Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
| |
|
11
|
Näsvall K, Boman J, Talla V, Backström N. Base Composition, Codon Usage, and Patterns of Gene Sequence Evolution in Butterflies. Genome Biol Evol 2023; 15:evad150. [PMID: 37565492 PMCID: PMC10462419 DOI: 10.1093/gbe/evad150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 07/17/2023] [Accepted: 08/08/2023] [Indexed: 08/12/2023] Open
Abstract
Coding sequence evolution is influenced by both natural selection and neutral evolutionary forces. In many species, the effects of mutation bias, codon usage, and GC-biased gene conversion (gBGC) on gene sequence evolution have not been detailed. Quantification of how these forces shape substitution patterns is therefore necessary to understand the strength and direction of natural selection. Here, we used comparative genomics to investigate the association between base composition and codon usage bias on gene sequence evolution in butterflies and moths (Lepidoptera), including an in-depth analysis of underlying patterns and processes in one species, Leptidea sinapis. The data revealed significant G/C to A/T substitution bias at third codon position with some variation in the strength among different butterfly lineages. However, the substitution bias was lower than expected from previously estimated mutation rate ratios, partly due to the influence of gBGC. We found that A/T-ending codons were overrepresented in most species, but there was a positive association between the magnitude of codon usage bias and GC-content in third codon positions. In addition, the tRNA-gene population in L. sinapis showed higher GC-content at third codon positions compared to coding sequences in general and less overrepresentation of A/T-ending codons. There was an inverse relationship between synonymous substitutions and codon usage bias indicating selection on synonymous sites. We conclude that the evolutionary rate in Lepidoptera is affected by a complex interaction between underlying G/C -> A/T mutation bias and partly counteracting fixation biases, predominantly conferred by overall purifying selection, gBGC, and selection on codon usage.
Affiliation(s)
- Karin Näsvall
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Uppsala, Sweden
| | - Jesper Boman
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Uppsala, Sweden
| | - Venkat Talla
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Uppsala, Sweden
| | - Niclas Backström
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Uppsala, Sweden
| |
|
12
|
Li ZL, Buck M. A proteome-scale analysis of vertebrate protein amino acid occurrence: Thermoadaptation shows a correlation with protein solvation but less so with dynamics. Proteins 2023; 91:3-15. [PMID: 36053994 PMCID: PMC10087973 DOI: 10.1002/prot.26404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2022] [Revised: 07/06/2022] [Accepted: 07/25/2022] [Indexed: 12/15/2022]
Abstract
Despite differences in behaviors and living conditions, vertebrate organisms share the great majority of proteins, often with subtle differences in amino acid sequence. Here, we present a simple way to analyze the difference in amino acid occurrence by comparing highly homologous proteins on a subproteome level between several vertebrate model organisms. Specifically, we use this method to identify a pattern of amino acid conservation as well as a shift in amino acid occurrence between homeotherms (warm-blooded species) and poikilotherms (cold-blooded species). Importantly, this general analysis and a specific example further establish a broad correlation, if not likely connection between the thermal adaptation of protein sequences and two of their physical features: on average a change in their protein dynamics and, even more strongly, in their solvation. For poikilotherms, such as frog and fish, the lower body temperature is expected to increase the protein-protein interaction due to a decrease in protein internal dynamics. In order to counteract the tendency for enhanced binding caused by low temperatures, poikilotherms enhance the solvation of their proteins by favoring polar amino acids. This feature appears to dominate over possible changes in dynamics for some proteins. The results suggest that a general trend for amino acid choice is part of the mechanism for thermoadaptation of vertebrate organisms at the molecular level.
Affiliation(s)
- Zhen-Lu Li
- School of Life Science, Tianjin University, Tianjin, China; Department of Physiology and Biophysics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
| | - Matthias Buck
- Department of Physiology and Biophysics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA; Departments of Pharmacology and of Neurosciences, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
| |
|
13
|
Card DC, Van Camp AG, Santonastaso T, Jensen-Seaman MI, Anthony NM, Edwards SV. Structure and evolution of the squamate major histocompatibility complex as revealed by two Anolis lizard genomes. Front Genet 2022; 13:979746. [PMID: 36425073 PMCID: PMC9679377 DOI: 10.3389/fgene.2022.979746] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/27/2022] [Accepted: 10/20/2022] [Indexed: 11/10/2022] Open
Abstract
The major histocompatibility complex (MHC) is an important genomic region for adaptive immunity and has long been studied in ecological and evolutionary contexts, such as disease resistance and mate and kin selection. The MHC has been investigated extensively in mammals and birds but far less so in squamate reptiles, the third major radiation of amniotes. We localized the core MHC genomic region in two squamate species, the green anole (Anolis carolinensis) and brown anole (A. sagrei), and provide the first detailed characterization of the squamate MHC, including the presence and ordering of known MHC genes in these species and comparative assessments of genomic structure and composition in MHC regions. We find that the Anolis MHC, located on chromosome 2 in both species, contains homologs of many previously identified mammalian MHC genes in a single core MHC region. The repetitive element composition in anole MHC regions was similar to that observed in mammals but had important distinctions, such as higher proportions of DNA transposons. Moreover, longer introns and intergenic regions result in a much larger squamate MHC region (11.7 Mb and 24.6 Mb in the green and brown anole, respectively). Evolutionary analyses of MHC homologs of anoles and other representative amniotes uncovered generally monophyletic relationships between species-specific homologs and a loss of the peptide-binding domain exon 2 in one of two mhc2β gene homologs of each anole species. Signals of diversifying selection in each anole species were evident across codons of mhc1, many of which appear functionally relevant given known structures of this protein from the green anole, chicken, and human. Altogether, our investigation fills a major gap in understanding of amniote MHC diversity and evolution and provides an important foundation for future squamate-specific or vertebrate-wide investigations of the MHC.
Affiliation(s)
- Daren C. Card
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, United States
- Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States
- *Correspondence: Daren C. Card
| | - Andrew G. Van Camp
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, United States
- Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States
| | - Trenten Santonastaso
- Department of Biological Sciences, University of New Orleans, New Orleans, LA, United States
| | - Nicola M. Anthony
- Department of Biological Sciences, University of New Orleans, New Orleans, LA, United States
| | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, United States
- Museum of Comparative Zoology, Harvard University, Cambridge, MA, United States
| |
|
14
|
Mika K, Whittington CM, McAllan BM, Lynch VJ. Gene expression phylogenies and ancestral transcriptome reconstruction resolves major transitions in the origins of pregnancy. eLife 2022; 11:e74297. [PMID: 35770963 PMCID: PMC9275820 DOI: 10.7554/elife.74297] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/30/2021] [Accepted: 06/29/2022] [Indexed: 11/13/2022] Open
Abstract
Structural and physiological changes in the female reproductive system underlie the origins of pregnancy in multiple vertebrate lineages. In mammals, the glandular portion of the lower reproductive tract has transformed into a structure specialized for supporting fetal development. These specializations range from relatively simple maternal nutrient provisioning in egg-laying monotremes to an elaborate suite of traits that support intimate maternal-fetal interactions in Eutherians. Among these traits are the maternal decidua and fetal component of the placenta, but there is considerable uncertainty about how these structures evolved. Previously, we showed that changes in uterine gene expression contribute to several evolutionary innovations during the origins of pregnancy (Mika et al., 2021b). Here, we reconstruct the evolution of entire transcriptomes ('ancestral transcriptome reconstruction') and show that maternal gene expression profiles are correlated with degree of placental invasion. These results indicate that an epitheliochorial-like placenta evolved early in the mammalian stem-lineage and that the ancestor of Eutherians had a hemochorial placenta, and suggest maternal control of placental invasiveness. These data resolve major transitions in the evolution of pregnancy and indicate that ancestral transcriptome reconstruction can be used to study the function of ancestral cell, tissue, and organ systems.
Affiliation(s)
- Katelyn Mika
- Department of Human Genetics, University of Chicago, Chicago, United States
- Department of Organismal Biology and Anatomy, University of Chicago, Chicago, United States
| | - Vincent J Lynch
- Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, New York, United States
| |
|
15
|
Wilcox JJS, Arca-Ruibal B, Samour J, Mateuta V, Idaghdour Y, Boissinot S. Linked-Read Sequencing of Eight Falcons Reveals a Unique Genomic Architecture in Flux. Genome Biol Evol 2022; 14:evac090. [PMID: 35700227 PMCID: PMC9214253 DOI: 10.1093/gbe/evac090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Revised: 05/27/2022] [Accepted: 06/06/2022] [Indexed: 11/12/2022] Open
Abstract
Falcons are diverse birds of cultural and economic importance. They have undergone major lineage-specific chromosomal rearrangements, resulting in greatly-reduced chromosome counts relative to other birds. Here, we use 10X Genomics linked reads to provide new high-contiguity genomes for two gyrfalcons, a saker falcon, a lanner falcon, three subspecies of peregrine falcons, and the common kestrel. Assisted by a transcriptome sequenced from 22 gyrfalcon tissues, we annotate these genomes for a variety of genomic features, estimate historical demography, and then investigate genomic equilibrium in the context of falcon-specific chromosomal rearrangements. We find that falcon genomes are not in AT-GC equilibrium with a bias in substitutions towards higher AT content; this bias is predominantly but not exclusively driven by hypermutability of CpG sites. Small indels and large structural variants were also biased towards insertions rather than deletions. Patterns of disequilibrium were linked to chromosomal rearrangements: falcons have lost GC content in regions that have fused to larger chromosomes from microchromosomes and gained GC content in regions of macrochromosomes that have translocated to microchromosomes. Inserted bases have accumulated on regions ancestrally belonging to microchromosomes, consistent with insertion-biased gene conversion. We also find an excess of interspersed repeats on regions of microchromosomes that have fused to macrochromosomes. Our results reveal that falcon genomes are in a state of flux. They further suggest that many of the key differences between microchromosomes and macrochromosomes are driven by differences in chromosome size, and indicate a clear role for recombination and biased-gene-conversion in determining genomic equilibrium.
Affiliation(s)
- Justin J S Wilcox
- Center for Genomics & Systems Biology, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | - Jaime Samour
- Wildlife Management and Falcon Medicine and Breeding Consultancy, Abu Dhabi, United Arab Emirates
| | - Youssef Idaghdour
- Center for Genomics & Systems Biology, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
- Biology Program, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | - Stéphane Boissinot
- Center for Genomics & Systems Biology, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
- Biology Program, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| |
|
16
|
Matschiner M, Barth JMI, Tørresen OK, Star B, Baalsrud HT, Brieuc MSO, Pampoulie C, Bradbury I, Jakobsen KS, Jentoft S. Supergene origin and maintenance in Atlantic cod. Nat Ecol Evol 2022; 6:469-481. [PMID: 35177802 PMCID: PMC8986531 DOI: 10.1038/s41559-022-01661-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 03/02/2021] [Accepted: 01/10/2022] [Indexed: 12/19/2022]
Abstract
Supergenes are sets of genes that are inherited as a single marker and encode complex phenotypes through their joint action. They are identified in an increasing number of organisms, yet their origins and evolution remain enigmatic. In Atlantic cod, four megabase-scale supergenes have been identified and linked to migratory lifestyle and environmental adaptations. Here we investigate the origin and maintenance of these four supergenes through analysis of whole-genome-sequencing data, including a new long-read-based genome assembly for a non-migratory Atlantic cod individual. We corroborate the finding that chromosomal inversions underlie all four supergenes, and we show that they originated at different times between 0.40 and 1.66 million years ago. We reveal gene flux between supergene haplotypes where migratory and stationary Atlantic cod co-occur and conclude that this gene flux is driven by gene conversion, on the basis of an increase in GC content in exchanged sites. Additionally, we find evidence for double crossover between supergene haplotypes, leading to the exchange of an ~275 kilobase fragment with genes potentially involved in adaptation to low salinity in the Baltic Sea. Our results suggest that supergenes can be maintained over long timescales in the same way as hybridizing species, through the selective purging of introduced genetic variation. Atlantic cod carries four supergenes linked to migratory lifestyle and environmental adaptations. Using whole-genome sequencing, the authors show that the genome inversions that underlie the supergenes originated at different times and show gene flux between supergene haplotypes.
Affiliation(s)
- Michael Matschiner
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway; Department of Palaeontology and Museum, University of Zurich, Zurich, Switzerland; Natural History Museum, University of Oslo, Oslo, Norway
| | - Julia Maria Isis Barth
- Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
| | - Ole Kristian Tørresen
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
| | - Bastiaan Star
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
| | - Helle Tessand Baalsrud
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
| | - Marine Servane Ono Brieuc
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
| | - Ian Bradbury
- Fisheries and Oceans Canada, St John's, Newfoundland and Labrador, Canada
| | - Kjetill Sigurd Jakobsen
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
| | - Sissel Jentoft
- Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biosciences, University of Oslo, Oslo, Norway
| |
|
17
|
Latrille T, Lartillot N. An Improved Codon Modeling Approach for Accurate Estimation of the Mutation Bias. Mol Biol Evol 2022; 39:6503505. [PMID: 35021218 PMCID: PMC8831783 DOI: 10.1093/molbev/msac005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022] Open
Abstract
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation–selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions.
Affiliation(s)
- T Latrille
- CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, Villeurbanne, F-69622, France; École Normale Supérieure de Lyon, Université de Lyon, Université Lyon 1, Lyon, France
| | - N Lartillot
- CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, Villeurbanne, F-69622, France
| |
|
18
|
Chakraborty S, Basumatary P, Nath D, Paul S, Uddin A. Compositional features and pattern of codon usage for mitochondrial CO genes among reptiles. Mitochondrion 2021; 62:111-121. [PMID: 34793987 DOI: 10.1016/j.mito.2021.11.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2021] [Revised: 11/02/2021] [Accepted: 11/10/2021] [Indexed: 11/27/2022]
Abstract
The phenomenon of non-random occurrence of synonymous nucleotide triplets (codons) in the coding sequences of genes is the codon usage bias (CUB). In this study, we used a bioinformatic toolkit to analyze the compositional pattern and CUB of the mitogenes COI, COII and COIII across different orders of reptiles. Estimation of overall base composition in the protein-coding sequences of COI, COII and COIII genes of the reptilian orders revealed an uneven usage of nucleotides. The overall count of A nucleotide was found to be the highest while the overall count of G nucleotide was the least. The CO genes across the three reptilian orders were prominently AT biased. Comparison of the GC proportion at each codon position showed that GC1 percentage ranked the highest in all three CO genes of the reptilian orders. SCUO values indicated weaker CUB, while considerable variation of SCUO values existed in the three CO genes across the studied reptiles. Relative synonymous codon usage (RSCU) values indicated that mostly the A-ending codons were preferred. Based on the neutrality plot, mutational responsive index and translational selection, we could conclude that natural selection was the major evolutionary force in COI, COII and COIII genes in the studied reptilian orders. However, correspondence analysis, parity plot and correlation studies indicated the existence of mutation pressure as well on the CO genes.
Affiliation(s)
- Supriyo Chakraborty
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
| | - Durbba Nath
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
| | - Sunanda Paul
- Department of Biotechnology, Assam University, Silchar 788011, Assam, India
| | - Arif Uddin
- Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Algapur, Hailakandi 788150, Assam, India
| |
|
19
|
Huttener R, Thorrez L, Veld TI, Granvik M, Van Lommel L, Waelkens E, Derua R, Lemaire K, Goyvaerts L, De Coster S, Buyse J, Schuit F. Sequencing refractory regions in bird genomes are hotspots for accelerated protein evolution. BMC Ecol Evol 2021; 21:176. [PMID: 34537008 PMCID: PMC8449477 DOI: 10.1186/s12862-021-01905-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/16/2020] [Accepted: 08/31/2021] [Indexed: 11/29/2022] Open
Abstract
Background: Approximately 1000 protein-encoding genes common to vertebrates are still unannotated in avian genomes. Are these genes evolutionarily lost, or have they not yet been found for technical reasons? Using genome landscapes as a tool to visualize large-scale regional effects of genome evolution, we reexamined this question. Results: On the basis of gene annotation in non-avian vertebrate genomes, we established a list of 15,135 common vertebrate genes. Of these, 1026 were not found in any of eight examined bird genomes. Visualizing regional genome effects by our sliding window approach showed that the majority of these “missing” genes can be clustered to 14 regions of the human reference genome. In these clusters, an additional 1517 genes (often gene fragments) were underrepresented in bird genomes. The clusters of “missing” genes coincided with regions of very high GC content, particularly in avian genomes, making them “hidden” because of incomplete sequencing. Moreover, proteins encoded by genes in these sequencing refractory regions showed signs of accelerated protein evolution. As a proof of principle for this idea we experimentally characterized the mRNA and protein products of four “hidden” bird genes that are crucial for energy homeostasis in skeletal muscle: ALDOA, ENO3, PYGM and SLC2A4. Conclusions: At least part of the “missing” genes in bird genomes can be attributed to an artifact caused by the difficulty of sequencing regions with extreme GC% (“hidden” genes). Biologically, these “hidden” genes are of interest as they encode proteins that evolve more rapidly than the genome-wide average. Finally, we show that four of these “hidden” genes encode key proteins for energy metabolism in flight muscle.
Affiliation(s)
- R Huttener
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| | - L Thorrez
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium; Tissue Engineering Laboratory, Department of Development and Regeneration, KU Leuven Campus Kulak, Kortrijk, Belgium
| | - T In't Veld
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| | - M Granvik
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| | - L Van Lommel
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| | - E Waelkens
- Laboratory of Protein Phosphorylation and Proteomics, KU Leuven, Leuven, Belgium
| | - R Derua
- Laboratory of Protein Phosphorylation and Proteomics, KU Leuven, Leuven, Belgium
| | - K Lemaire
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| | - L Goyvaerts
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| | - S De Coster
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| | - J Buyse
- Laboratory of Livestock Physiology, Department of Biosystems, KU Leuven, Leuven, Belgium
| | - F Schuit
- Gene Expression Unit, Department of Cellular and Molecular Medicine, KU Leuven, Herestraat 49, O&N1, bus 901, 3000, Leuven, Belgium
| |
|
20
|
Riba A, Fumagalli MR, Caselle M, Osella M. A Model-Driven Quantitative Analysis of Retrotransposon Distributions in the Human Genome. Genome Biol Evol 2021; 12:2045-2059. [PMID: 32986810 PMCID: PMC7750997 DOI: 10.1093/gbe/evaa201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/19/2020] [Indexed: 12/21/2022] Open
Abstract
Retrotransposons, DNA sequences capable of creating copies of themselves, compose about half of the human genome and played a central role in the evolution of mammals. Their current position in the host genome is the result of the retrotranscription process and of the following host genome evolution. We apply a model from statistical physics to show that the genomic distribution of the two most populated classes of retrotransposons in human deviates from random placement, and that this deviation increases with time. The time dependence suggests a major role of the host genome dynamics in shaping the current retrotransposon distributions. Focusing on a neutral scenario, we show that a simple model based on random placement followed by genome expansion and sequence duplications can reproduce the empirical retrotransposon distributions, even though more complex and possibly selective mechanisms may have contributed. Besides the inherent interest in understanding the origin of current retrotransposon distributions, this work sets a general analytical framework to analyze quantitatively the effects of genome evolutionary dynamics on the distribution of genomic elements.
Affiliation(s)
| | - Maria Rita Fumagalli
- Institute of Biophysics - CNR, National Research Council, Genova, Italy; Department of Environmental Science and Policy, Center for Complexity and Biosystems, University of Milan, Milano, Italy
| | - Michele Caselle
- Department of Physics and INFN, University of Torino, Torino, Italy
| | - Matteo Osella
- Department of Physics and INFN, University of Torino, Torino, Italy
| |
|
21
|
Srikulnath K, Ahmad SF, Singchat W, Panthum T. Why Do Some Vertebrates Have Microchromosomes? Cells 2021; 10:2182. [PMID: 34571831 PMCID: PMC8466491 DOI: 10.3390/cells10092182] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/22/2021] [Revised: 08/17/2021] [Accepted: 08/17/2021] [Indexed: 12/27/2022] Open
Abstract
With more than 70,000 living species, vertebrates have a huge impact on the field of biology and research, including karyotype evolution. One prominent aspect of many vertebrate karyotypes is the enigmatic occurrence of tiny and often cytogenetically indistinguishable microchromosomes, which possess distinctive features compared to macrochromosomes. Why certain vertebrate species carry these microchromosomes in some lineages while others do not, and how they evolve remain open questions. New studies have shown that microchromosomes exhibit certain unique characteristics of genome structure and organization, such as high gene densities, low heterochromatin levels, and high rates of recombination. Our review focuses on recent concepts to expand current knowledge on the dynamic nature of karyotype evolution in vertebrates, raising important questions regarding the evolutionary origins and ramifications of microchromosomes. We introduce the basic karyotypic features to clarify the size, shape, and morphology of macro- and microchromosomes and report their distribution across different lineages. Finally, we characterize the mechanisms of different evolutionary forces underlying the origin and evolution of microchromosomes.
Affiliation(s)
- Kornsorn Srikulnath
- Animal Genomics and Bioresource Research Center (AGB Research Center), Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- The International Undergraduate Program in Bioscience and Technology, Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Amphibian Research Center, Hiroshima University, 1-3-1, Kagamiyama, Higashihiroshima 739-8526, Japan
- Syed Farhan Ahmad
- Animal Genomics and Bioresource Research Center (AGB Research Center), Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- The International Undergraduate Program in Bioscience and Technology, Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Worapong Singchat
- Animal Genomics and Bioresource Research Center (AGB Research Center), Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Thitipong Panthum
- Animal Genomics and Bioresource Research Center (AGB Research Center), Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, 50 Ngamwongwan, Chatuchak, Bangkok 10900, Thailand
22
Abstract
Recombination increases the local GC-content in genomic regions through GC-biased gene conversion (gBGC). The recent discovery of a large genomic region with extreme GC-content in the fat sand rat Psammomys obesus provides a model to study the effects of gBGC on chromosome evolution. Here, we compare the GC-content and GC-to-AT substitution patterns across protein-coding genes of four gerbil species and two murine rodents (mouse and rat). We find that the known high-GC region is present in all the gerbils, and is characterized by high substitution rates for all mutational categories (AT-to-GC, GC-to-AT, and GC-conservative) both at synonymous and nonsynonymous sites. A higher AT-to-GC than GC-to-AT rate is consistent with the high GC-content. Additionally, we find more than 300 genes outside the known region with outlying values of AT-to-GC synonymous substitution rates in gerbils. Of these, over 30% are organized into at least 17 large clusters observable at the megabase-scale. The unusual GC-skewed substitution pattern suggests the evolution of genomic regions with very high recombination rates in the gerbil lineage, which can lead to a runaway increase in GC-content. Our results imply that rapid evolution of GC-content is possible in mammals, with gerbil species providing a powerful model to study the mechanisms of gBGC.
Affiliation(s)
- Rodrigo Pracana
- Department of Zoology, University of Oxford, Oxford, United Kingdom
- John F Mulley
- School of Natural Sciences, Bangor University, Bangor, Gwynedd, United Kingdom
23
Gao NL, He Z, Zhu Q, Jiang P, Hu S, Chen WH. Selection for Cheaper Amino Acids Drives Nucleotide Usage at the Start of Translation in Eukaryotic Genes. Genomics Proteomics Bioinformatics 2021; 19:949-957. [PMID: 33741525 PMCID: PMC9403032 DOI: 10.1016/j.gpb.2021.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2018] [Revised: 05/30/2019] [Accepted: 08/18/2019] [Indexed: 12/04/2022]
Abstract
Coding regions have complex interactions among multiple selective forces, which are manifested as biases in nucleotide composition. Previous studies have revealed a decreasing GC gradient from the 5′-end to 3′-end of coding regions in various organisms. We confirmed that this gradient is universal in eukaryotic genes, but the decrease only starts from the ∼ 25th codon. This trend is mostly found in nonsynonymous (ns) sites at which the GC gradient is universal across the eukaryotic genome. Increased GC contents at ns sites result in cheaper amino acids, indicating a universal selection for energy efficiency toward the N-termini of encoded proteins. Within a genome, the decreasing GC gradient is intensified from lowly to highly expressed genes (more and more protein products), further supporting this hypothesis. This reveals a conserved selective constraint for cheaper amino acids at the translation start that drives the increased GC contents at ns sites. Elevated GC contents can facilitate transcription but result in a more stable local secondary structure around the start codon and subsequently impede translation initiation. Conversely, the GC gradients at four-fold and two-fold synonymous sites vary across species. They could decrease or increase, suggesting different constraints acting at the GC contents of different codon sites in different species. This study reveals that the overall GC contents at the translation start are consequences of complex interactions among several major biological processes that shape the nucleotide sequences, especially efficient energy usage.
Affiliation(s)
- Na L Gao
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China; Institute for Computer Science and Cluster of Excellence on Plant Sciences, Heinrich Heine University, Duesseldorf 40225, Germany
- Zilong He
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Interdisciplinary Innovation Institute of Medicine and Engineering, Beihang University, Beijing 100191, China
- Qianhui Zhu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Puzi Jiang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Songnian Hu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Wei-Hua Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
24
Borůvková V, Howell WM, Matoulek D, Symonová R. Quantitative Approach to Fish Cytogenetics in the Context of Vertebrate Genome Evolution. Genes (Basel) 2021; 12:312. [PMID: 33671814 PMCID: PMC7926999 DOI: 10.3390/genes12020312] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/23/2020] [Revised: 02/01/2021] [Accepted: 02/17/2021] [Indexed: 01/14/2023] Open
Abstract
Our novel Python-based tool EVANGELIST allows the visualization of GC and repeat percentages along chromosomes in sequenced genomes and has enabled us to perform quantitative large-scale analyses at the chromosome level in fish and other vertebrates. This is a different approach from the prevailing analyses, i.e., analyses of GC% in the coding sequences, which make up no more than 2% of the human genome. We identified GC content (GC%) elevations in microchromosomes in ancient fish lineages, similar to avian microchromosomes, and a large variability in the relationship between chromosome size and GC% across fish lineages. This raises the question of to what extent chromosome size drives GC%, as posited by the currently accepted explanation based on the recombination rate. We ascribe the differences found across fishes to varying GC% of repetitive sequences. Generally, our results suggest that the GC% of repeats and the proportion of repeats are independent of chromosome size. This leaves room for another mechanism driving GC evolution in vertebrates.
Affiliation(s)
- Veronika Borůvková
- Faculty of Science, University of Hradec Kralove, 500 03 Hradec Kralove, Czech Republic
- W. Mike Howell
- Department of Biological and Environmental Sciences, Samford University, Birmingham, AL 35226, USA
- Dominik Matoulek
- Faculty of Science, University of Hradec Kralove, 500 03 Hradec Kralove, Czech Republic
- Radka Symonová
- Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
25
Abstract
Drosophila melanogaster, a small dipteran of African origin, represents one of the best-studied model organisms. Early work in this system has uniquely shed light on the basic principles of genetics and resulted in a versatile collection of genetic tools that make it possible to uncover mechanistic links between genotype and phenotype. Moreover, given its worldwide distribution in diverse habitats and its moderate genome size, Drosophila has proven very powerful for population genetics inference and was one of the first eukaryotes whose genome was fully sequenced. In this book chapter, we provide a brief historical overview of research in Drosophila and then focus on recent advances during the genomic era. After describing different types and sources of genomic data, we discuss mechanisms of neutral evolution including the demographic history of Drosophila and the effects of recombination and biased gene conversion. Then, we review recent advances in detecting genome-wide signals of selection, such as soft and hard selective sweeps. We further provide a brief introduction to background selection, selection of noncoding DNA and codon usage and focus on the role of structural variants, such as transposable elements and chromosomal inversions, during the adaptive process. Finally, we discuss how genomic data help to dissect neutral and adaptive evolutionary mechanisms that shape genetic and phenotypic variation in natural populations along environmental gradients. In summary, this book chapter serves as a starting point for Drosophila population genomics and provides an introduction to the system and an overview of data sources, important population genetic concepts and recent advances in the field.
26
Dai Y, Pracana R, Holland PWH. Divergent genes in gerbils: prevalence, relation to GC-biased substitution, and phenotypic relevance. BMC Evol Biol 2020; 20:134. [PMID: 33076817 PMCID: PMC7574485 DOI: 10.1186/s12862-020-01696-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/17/2020] [Accepted: 09/29/2020] [Indexed: 11/25/2022] Open
Abstract
Background: Two gerbil species, sand rat (Psammomys obesus) and Mongolian jird (Meriones unguiculatus), can become obese and show signs of metabolic dysregulation when maintained on standard laboratory diets. The genetic basis of this phenotype is unknown. Recently, genome sequencing has uncovered very unusual regions of high guanine and cytosine (GC) content scattered across the sand rat genome, most likely generated by extreme and localized biased gene conversion. A key pancreatic transcription factor PDX1 is encoded by a gene in the most extreme GC-rich region, is remarkably divergent and exhibits altered biochemical properties. Here, we ask if gerbils have proteins in addition to PDX1 that are aberrantly divergent in amino acid sequence, whether they have also become divergent due to GC-biased nucleotide changes, and whether these proteins could plausibly be connected to metabolic dysfunction exhibited by gerbils. Results: We analyzed ~10,000 proteins with 1-to-1 orthologues in human and rodents and identified 50 proteins that accumulated unusually high levels of amino acid change in the sand rat and 41 in Mongolian jird. We show that more than half of the aberrantly divergent proteins are associated with GC-biased nucleotide change and many are in previously defined high GC regions. We highlight four aberrantly divergent gerbil proteins, PDX1, INSR, MEDAG and SPP1, that may plausibly be associated with dietary metabolism. Conclusions: We show that through the course of gerbil evolution, many aberrantly divergent proteins have accumulated in the gerbil lineage, and GC-biased nucleotide substitution rather than positive selection is the likely cause of extreme divergence in more than half of these. Some proteins carry putatively deleterious changes that could be associated with metabolic and physiological phenotypes observed in some gerbil species. We propose that these animals provide a useful model to study the ‘tug-of-war’ between natural selection and the excessive accumulation of deleterious substitutions through biased gene conversion.
Affiliation(s)
- Yichen Dai
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
- Rodrigo Pracana
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
- Peter W H Holland
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
27
Huang J, Flouri T, Yang Z. A Simulation Study to Examine the Information Content in Phylogenomic Data Sets under the Multispecies Coalescent Model. Mol Biol Evol 2020; 37:3211-3224. [DOI: 10.1093/molbev/msaa166] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/05/2023] Open
Abstract
We use computer simulation to examine the information content in multilocus data sets for inference under the multispecies coalescent model. Inference problems considered include estimation of evolutionary parameters (such as species divergence times, population sizes, and cross-species introgression probabilities), species tree estimation, and species delimitation based on Bayesian comparison of delimitation models. We found that the number of loci is the most influential factor for almost all inference problems examined. Although the number of sequences per species does not appear to be important to species tree estimation, it is very influential to species delimitation. Increasing the number of sites and the per-site mutation rate both increase the mutation rate for the whole locus and these have the same effect on estimation of parameters, but the sequence length has a greater effect than the per-site mutation rate for species tree estimation. We discuss the computational costs when the data size increases and provide guidelines concerning the subsampling of genomic data to enable the application of full-likelihood methods of inference.
Affiliation(s)
- Jun Huang
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Department of Mathematics, Beijing Jiaotong University, Beijing, P.R. China
- Tomáš Flouri
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
28
Naim DM, Nor SAM, Mahboob S. Reassessment of species distribution and occurrence of mud crab (Scylla spp., Portunidae) in Malaysia through morphological and molecular identification. Saudi J Biol Sci 2020; 27:643-652. [PMID: 32210683 PMCID: PMC6997873 DOI: 10.1016/j.sjbs.2019.11.030] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/21/2019] [Revised: 11/14/2019] [Accepted: 11/24/2019] [Indexed: 12/01/2022] Open
Abstract
This study used genetic and morphometric approaches to assess the molecular and morphometric differentiation among commercially important species of mud crab. Molecular analyses were based on a 542 bp fragment of the mitochondrial DNA COI gene from 249 individuals of the genus Scylla collected in nine states in Malaysia, representing four marine regions: the South China Sea, Sulu Sea, Straits of Singapore and Straits of Malacca. Four specimens were obtained from Indonesia to make the analysis more robust. For species delimitation, the Automatic Barcode Gap Discovery (ABGD) method was employed through its web interface. Phylogenetic analysis was implemented using Neighbour-Joining (NJ) and Maximum Parsimony (MP) methods. The inter- and intraspecific genetic distances (Ds) were computed using the Kimura 2-parameter distance in MEGA version 5.05. All samples were genetically and morphologically identified and clustered into four distinct species. Among the species, S. olivacea was the most abundant (n = 111), whereas the occurrence of S. paramamosain in Malaysia was very low (n = 29). No individual of S. serrata from Malaysia was recorded in this study. Both genetic distance and phylogenetic approaches showed a consistent monophyletic association among all specimens analysed. The present study reports a reassessment of all species within the genus Scylla in Malaysia and could serve as a reference for subsequent research, mainly on mariculture and conservation efforts for the species.
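For readers unfamiliar with the Kimura 2-parameter distance named in this abstract, the following is a minimal Python sketch of the standard formula d = -0.5 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is an illustration only and is not the MEGA implementation used in the study; the function name `k2p_distance` and the decision to skip gaps and ambiguous bases are assumptions made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration. Sites where either sequence
    has a gap or ambiguous base are skipped (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Calling `k2p_distance` on two aligned COI fragments returns the estimated number of substitutions per site; intra- and interspecific distances can then be compared by applying it to pairs within and between species.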
Affiliation(s)
- Darlina Md Naim
- School of Biological Sciences, Universiti Sains Malaysia, 11800 Pulau Pinang, Malaysia
- Siti Azizah Mohd Nor
- Institute of Marine Biotechnology, Universiti Malaysia Terengganu, 21030 Kuala Terengganu, Malaysia
- Shahid Mahboob
- Department of Zoology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
29
Karin BR, Gamble T, Jackman TR. Optimizing Phylogenomics with Rapidly Evolving Long Exons: Comparison with Anchored Hybrid Enrichment and Ultraconserved Elements. Mol Biol Evol 2020; 37:904-922. [PMID: 31710677 PMCID: PMC7038749 DOI: 10.1093/molbev/msz263] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/17/2022] Open
Abstract
Marker selection has emerged as an important component of phylogenomic study design due to rising concerns of the effects of gene tree estimation error, model misspecification, and data-type differences. Researchers must balance various trade-offs associated with locus length and evolutionary rate among other factors. The most commonly used reduced representation data sets for phylogenomics are ultraconserved elements (UCEs) and Anchored Hybrid Enrichment (AHE). Here, we introduce Rapidly Evolving Long Exon Capture (RELEC), a new set of loci that targets single exons that are both rapidly evolving (evolutionary rate faster than RAG1) and relatively long in length (>1,500 bp), while at the same time avoiding paralogy issues across amniotes. We compare the RELEC data set to UCEs and AHE in squamate reptiles by aligning and analyzing orthologous sequences from 17 squamate genomes, composed of 10 snakes and 7 lizards. The RELEC data set (179 loci) outperforms AHE and UCEs by maximizing per-locus genetic variation while maintaining presence and orthology across a range of evolutionary scales. RELEC markers show higher phylogenetic informativeness than UCE and AHE loci, and RELEC gene trees show greater similarity to the species tree than AHE or UCE gene trees. Furthermore, with fewer loci, RELEC remains computationally tractable for full Bayesian coalescent species tree analyses. We contrast RELEC to and discuss important aspects of comparable methods, and demonstrate how RELEC may be the most effective set of loci for resolving difficult nodes and rapid radiations. We provide several resources for capturing or extracting RELEC loci from other amniote groups.
Affiliation(s)
- Benjamin R Karin
- Department of Biology, Villanova University, Villanova, PA
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA
- Tony Gamble
- Department of Biological Sciences, Marquette University, Milwaukee, WI
- Milwaukee Public Museum, Milwaukee, WI
- Bell Museum of Natural History, University of Minnesota, St. Paul, MN
- Todd R Jackman
- Department of Biology, Villanova University, Villanova, PA
30
White ND, Braun MJ. Extracting phylogenetic signal from phylogenomic data: Higher-level relationships of the nightbirds (Strisores). Mol Phylogenet Evol 2019; 141:106611. [DOI: 10.1016/j.ympev.2019.106611] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/07/2019] [Revised: 09/04/2019] [Accepted: 09/06/2019] [Indexed: 12/22/2022]
31
Borges R, Szöllősi GJ, Kosiol C. Quantifying GC-Biased Gene Conversion in Great Ape Genomes Using Polymorphism-Aware Models. Genetics 2019; 212:1321-1336. [PMID: 31147380 PMCID: PMC6707462 DOI: 10.1534/genetics.119.302074] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/13/2018] [Accepted: 05/20/2019] [Indexed: 11/18/2022] Open
Abstract
As multi-individual population-scale data become available, more complex modeling strategies are needed to quantify genome-wide patterns of nucleotide usage and associated mechanisms of evolution. Recently, the multivariate neutral Moran model was proposed. However, it was shown to be insufficient to explain the distribution of alleles in great apes. Here, we propose a new model that includes allelic selection. Our theoretical results constitute the basis of a new Bayesian framework to estimate mutation rates and selection coefficients from population data. We apply the new framework to a great ape dataset, where we found patterns of allelic selection that match those of genome-wide GC-biased gene conversion (gBGC). In particular, we show that great apes have patterns of allelic selection that vary in intensity, a feature that we correlated with great apes' distinct demographies. We also demonstrate that the AT/GC toggling effect decreases the probability of a substitution, promoting more polymorphisms in the base composition of great ape genomes. We further assess the impact of GC-bias in molecular analysis, and find that mutation rates and genetic distances are estimated with bias when gBGC is not properly accounted for. Our results contribute to the discussion on the tempo and mode of gBGC evolution, while stressing the need for gBGC-aware models in population genetics and phylogenetics.
Affiliation(s)
- Rui Borges
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Wien, Austria
- Gergely J Szöllősi
- Department of Biological Physics, MTA-ELTE "Lendulet" Evolutionary Genomics Research Group, Eötvös University, Pázmány P. stny. 1A, Budapest 1117, Hungary
- Carolin Kosiol
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Wien, Austria
- Centre for Biological Diversity, School of Biology, University of St Andrews, Fife KY16 9TH, UK
32
Huttener R, Thorrez L, In't Veld T, Granvik M, Snoeck L, Van Lommel L, Schuit F. GC content of vertebrate exome landscapes reveal areas of accelerated protein evolution. BMC Evol Biol 2019; 19:144. [PMID: 31311498 PMCID: PMC6636035 DOI: 10.1186/s12862-019-1469-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/18/2019] [Accepted: 06/26/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Rapid accumulation of vertebrate genome sequences renders comparative genomics a powerful approach to study macro-evolutionary events. The assessment of phylogenetic relationships between species routinely depends on the analysis of sequence homology at the nucleotide or protein level. RESULTS: We analyzed mRNA GC content, codon usage and divergence of orthologous proteins in 55 vertebrate genomes. Data were visualized in genome-wide landscapes using a sliding window approach. Landscapes of GC content reveal both evolutionary conservation of clustered genes, and lineage-specific changes, so that it was possible to construct a phylogenetic tree that closely matched the classic "tree of life". Landscapes of GC content also strongly correlated to landscapes of amino acid usage: positive correlation with glycine, alanine, arginine and proline and negative correlation with phenylalanine, tyrosine, methionine, isoleucine, asparagine and lysine. Peaks of GC content correlated strongly with increased protein divergence. CONCLUSIONS: Landscapes of base and amino acid composition of the coding genome open a new approach in comparative genomics, allowing identification of discrete regions in which protein evolution accelerated over deep evolutionary time. Insight into the evolution of genome structure may spur novel studies assessing the evolutionary benefit of genes in particular genomic regions.
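The sliding-window idea mentioned in this abstract can be illustrated with a small Python sketch that computes GC fraction in overlapping windows along a nucleotide sequence. This is an assumption-laden illustration rather than the authors' pipeline: the abstract does not state the window definition (it may well be over ordered genes rather than bases), and the names `gc_fraction` and `sliding_gc` and the `window` and `step` parameters are hypothetical.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return float("nan")  # window contains no unambiguous bases
    return (seq.count("G") + seq.count("C")) / acgt


def sliding_gc(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (start, GC fraction) for each full window along seq.

    `window` and `step` are in bases; both are illustrative choices,
    not values taken from the cited study.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Plotting the returned fractions against their start positions gives a simple GC "landscape"; the same smoothing could be applied to a list of per-gene GC values ordered by genomic position if windows over genes are preferred.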
Affiliation(s)
- R Huttener
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
- L Thorrez
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium; Tissue Engineering Laboratory, Dept of Development and Regeneration, KU Leuven, Kortrijk, Belgium
- T In't Veld
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
- M Granvik
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
- L Snoeck
- Tissue Engineering Laboratory, Dept of Development and Regeneration, KU Leuven, Kortrijk, Belgium
- L Van Lommel
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
- F Schuit
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
33
Bourgeois Y, Ruggiero RP, Manthey JD, Boissinot S. Recent Secondary Contacts, Linked Selection, and Variable Recombination Rates Shape Genomic Diversity in the Model Species Anolis carolinensis. Genome Biol Evol 2019; 11:2009-2022. [PMID: 31134281 PMCID: PMC6681179 DOI: 10.1093/gbe/evz110] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Accepted: 05/23/2019] [Indexed: 12/14/2022] Open
Abstract
A better understanding of how selection and neutral processes affect genomic diversity is essential for insight into the mechanisms driving adaptation and speciation. However, the evolutionary processes affecting variation at a genomic scale have not been investigated in most vertebrate lineages. Here, we present the first population genomics survey using whole genome resequencing in the green anole (Anolis carolinensis). Anoles have been intensively studied to understand mechanisms underlying adaptation and speciation. The green anole in particular is an important model to study genome evolution. We quantified how demography, recombination, and selection have led to the current genetic diversity of the green anole by using whole-genome resequencing of five genetic clusters covering the entire species range. The differentiation of green anole populations is consistent with a northward expansion from South Florida followed by genetic isolation and subsequent gene flow among adjacent genetic clusters. Dispersal out-of-Florida was accompanied by a drastic population bottleneck followed by a rapid population expansion. This event was accompanied by male-biased dispersal and/or selective sweeps on the X chromosome. We show that the interaction between linked selection and recombination is the main contributor to the genomic landscape of differentiation in the anole genome.
Affiliation(s)
- Joseph D Manthey
- New York University Abu Dhabi, United Arab Emirates
- Department of Biological Sciences, Texas Tech University
34
Dai Y, Holland PWH. The Interaction of Natural Selection and GC Skew May Drive the Fast Evolution of a Sand Rat Homeobox Gene. Mol Biol Evol 2019; 36:1473-1480. [PMID: 30968125 PMCID: PMC6573468 DOI: 10.1093/molbev/msz080] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/27/2022] Open
Abstract
Several processes can lead to strong GC skew in localized genomic regions. In most cases, GC skew should not affect conserved amino acids because natural selection will purge deleterious alleles. However, in the gerbil subfamily of rodents, several conserved genes have undergone radical alteration in association with strong GC skew. An extreme example concerns the highly conserved homeobox gene Pdx1, which is uniquely divergent and GC rich in the sand rat Psammomys obesus and close relatives. Here, we investigate the antagonistic interplay between very rare amino acid changes driven by GC skew and the force of natural selection. Using ectopic protein expression in cell culture, pulse-chase labeling, in vitro mutagenesis, and drug treatment, we compare properties of mouse and sand rat Pdx1 proteins. We find that amino acid change driven by GC skew resulted in altered protein stability, with a significantly longer protein half-life for sand rat Pdx1. Using a reversible inhibitor of the 26S proteasome, MG132, we find that sand rat and mouse Pdx1 are both degraded through the ubiquitin proteasome pathway. However, in vitro mutagenesis reveals this pathway operates through different amino acid residues. We propose that GC skew caused loss of a key ubiquitination site, conserved through vertebrate evolution, and that sand rat Pdx1 evolved or fixed a new ubiquitination site to compensate. Our results give molecular insight into the power of natural selection in the face of maladaptive changes driven by strong GC skew.
Affiliation(s)
- Yichen Dai
- Department of Zoology, University of Oxford, Oxford, United Kingdom
|
35
|
Southworth J, Armitage P, Fallon B, Dawson H, Bryk J, Carr M. Patterns of Ancestral Animal Codon Usage Bias Revealed through Holozoan Protists. Mol Biol Evol 2018; 35:2499-2511. [PMID: 30169693 PMCID: PMC6188563 DOI: 10.1093/molbev/msy157] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/27/2023] Open
Abstract
Choanoflagellates and filastereans are the closest known single-celled relatives of Metazoa within Holozoa and provide insight into how animals evolved from their unicellular ancestors. Codon usage bias has been extensively studied in metazoans, with both natural selection and mutation pressure playing important roles in different species. The disparate nature of metazoan codon usage patterns prevents the reconstruction of ancestral traits. However, traits conserved across holozoan protists highlight characteristics in the unicellular ancestors of Metazoa. Presented here are the patterns of codon usage in the choanoflagellates Monosiga brevicollis and Salpingoeca rosetta, as well as the filasterean Capsaspora owczarzaki. Codon usage is shown to be remarkably conserved. Highly biased genes preferentially use GC-ending codons; however, there is limited evidence that this is driven by local mutation pressure. The analyses presented provide strong evidence that natural selection, for both translational accuracy and efficiency, dominates codon usage bias in holozoan protists. In particular, the signature of selection for translational accuracy can be detected even in the most weakly biased genes. Biased codon usage is shown to have coevolved with the tRNA species, with optimal codons showing complementary binding to the highest copy number tRNA genes. Furthermore, tRNA modification is shown to be a common feature for amino acids with higher levels of degeneracy, and highly biased genes show a strong preference for using modified tRNAs in translation. The translationally optimal codons defined here will be of benefit to future transgenics work in holozoan protists, as their use should maximise protein yields from edited transgenes.
Affiliation(s)
- Jade Southworth
- Department of Biological and Geographical Sciences, University of Huddersfield, Huddersfield, United Kingdom
| | - Paul Armitage
- Department of Biological and Geographical Sciences, University of Huddersfield, Huddersfield, United Kingdom
| | - Brandon Fallon
- Department of Biological and Geographical Sciences, University of Huddersfield, Huddersfield, United Kingdom
| | - Holly Dawson
- Department of Biological and Geographical Sciences, University of Huddersfield, Huddersfield, United Kingdom
| | - Jaroslaw Bryk
- Department of Biological and Geographical Sciences, University of Huddersfield, Huddersfield, United Kingdom
| | - Martin Carr
- Department of Biological and Geographical Sciences, University of Huddersfield, Huddersfield, United Kingdom
|
36
|
Bast J, Parker DJ, Dumas Z, Jalvingh KM, Tran Van P, Jaron KS, Figuet E, Brandt A, Galtier N, Schwander T. Consequences of Asexuality in Natural Populations: Insights from Stick Insects. Mol Biol Evol 2018; 35:1668-1677. [PMID: 29659991 PMCID: PMC5995167 DOI: 10.1093/molbev/msy058] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 12/16/2022] Open
Abstract
Recombination is a fundamental process with significant impacts on genome evolution. Predicted consequences of the loss of recombination include a reduced effectiveness of selection, changes in the amount of neutral polymorphisms segregating in populations, and an arrest of GC-biased gene conversion. Although these consequences are empirically well documented for nonrecombining genome portions, it remains largely unknown if they extend to the whole genome scale in asexual organisms. We identify the consequences of asexuality using de novo transcriptomes of five independently derived, obligately asexual lineages of stick insects, and their sexual sister-species. We find strong evidence for higher rates of deleterious mutation accumulation, lower levels of segregating polymorphisms and arrested GC-biased gene conversion in asexuals as compared with sexuals. Taken together, our study conclusively shows that predicted consequences of genome evolution under asexuality can indeed be found in natural populations.
Affiliation(s)
- Jens Bast
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Darren J Parker
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland.,Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Zoé Dumas
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Kirsten M Jalvingh
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Patrick Tran Van
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland.,Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Kamil S Jaron
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland.,Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Emeric Figuet
- Institute of Evolutionary Sciences, University of Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Alexander Brandt
- Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, University of Goettingen, Goettingen, Germany
| | - Nicolas Galtier
- Institute of Evolutionary Sciences, University of Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Tanja Schwander
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
|
37
|
Molecular cytogenetic characterization of repetitive sequences comprising centromeric heterochromatin in three Anseriformes species. PLoS One 2019; 14:e0214028. [PMID: 30913221 PMCID: PMC6435179 DOI: 10.1371/journal.pone.0214028] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/14/2019] [Accepted: 03/05/2019] [Indexed: 01/22/2023] Open
Abstract
The highly repetitive DNA sequence of centromeric heterochromatin is an effective molecular cytogenetic marker for investigating genomic compartmentalization between macrochromosomes and microchromosomes in birds. We isolated four repetitive sequence families of centromeric heterochromatin from three Anseriformes species, viz., domestic duck (Anas platyrhynchos, APL), bean goose (Anser fabalis, AFA), and whooper swan (Cygnus cygnus, CCY), and characterized the sequences by a molecular cytogenetic approach. The 190-bp APL-HaeIII and 101-bp AFA-HinfI-S sequences were localized in almost all chromosomes of A. platyrhynchos and A. fabalis, respectively. However, the 192-bp AFA-HinfI-L and 290-bp CCY-ApaI sequences were distributed in almost all microchromosomes of A. fabalis and in approximately 10 microchromosomes of C. cygnus, respectively. APL-HaeIII, AFA-HinfI-L, and CCY-ApaI showed partial sequence homology with the chicken nuclear-membrane-associated (CNM) repeat families, which were localized primarily to the centromeric regions of microchromosomes in Galliformes, suggesting that ancestral sequences of the CNM repeat families are observed in the common ancestors of Anseriformes and Galliformes. These results collectively raise the possibility that homogenization of centromeric heterochromatin occurred between microchromosomes in Anseriformes and Galliformes; however, homogenization between macrochromosomes and microchromosomes also occurred in some centromeric repetitive sequences.
|
38
|
Kasai F, O'Brien PCM, Ferguson-Smith MA. Squamate Chromosome Size and GC Content Assessed by Flow Karyotyping. Cytogenet Genome Res 2019; 157:46-52. [PMID: 30904910 DOI: 10.1159/000497265] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022] Open
Abstract
Chromosome homologies in reptiles have been investigated extensively by gene mapping and chromosome painting. Relative chromosome size can be estimated roughly from conventional karyotypes, but chromosome GC content cannot be evaluated by any of these approaches. However, GC content can be obtained by whole-genome sequencing, although complete data are available only for a limited number of reptilian species. Chromosomes can be characterized by size and GC content in bivariate flow karyotypes, in which the distribution of peaks represents the differences. We have analysed flow karyotypes from 9 representative squamate species and show chromosome profiles for each species based on the relationship between size and GC content. Our results reveal that the GC content of macrochromosomes is invariable in the 9 species. A higher GC content was found in microchromosomes, similar to profiles previously determined in crocodile, turtle, and chicken. The findings suggest that karyotype evolution in reptiles is characterized by unique features of chromosome GC content.
|
39
|
Galtier N, Roux C, Rousselle M, Romiguier J, Figuet E, Glémin S, Bierne N, Duret L. Codon Usage Bias in Animals: Disentangling the Effects of Natural Selection, Effective Population Size, and GC-Biased Gene Conversion. Mol Biol Evol 2018; 35:1092-1103. [PMID: 29390090 DOI: 10.1093/molbev/msy015] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Indexed: 12/13/2022] Open
Abstract
Selection on codon usage bias is well documented in a number of microorganisms. Whether codon usage is also generally shaped by natural selection in large organisms, despite their relatively small effective population size (Ne), is unclear. In animals, the population genetics of codon usage bias has only been studied in a handful of model organisms so far, and can be affected by confounding, nonadaptive processes such as GC-biased gene conversion and experimental artefacts. Using population transcriptomics data, we analyzed the relationship between codon usage, gene expression, allele frequency distribution, and recombination rate in 30 nonmodel species of animals, each from a different family, covering a wide range of effective population sizes. We disentangled the effects of translational selection and GC-biased gene conversion on codon usage by separately analyzing GC-conservative and GC-changing mutations. We report evidence for effective translational selection on codon usage in large-Ne species of animals, but not in small-Ne ones, in agreement with the nearly neutral theory of molecular evolution. C- and T-ending codons tend to be preferred over synonymous G- and A-ending ones, for reasons that remain to be determined. In contrast, we uncovered a conspicuous effect of GC-biased gene conversion, which is widespread in animals and the main force determining the fate of AT↔GC mutations. Intriguingly, the strength of its effect was uncorrelated with Ne.
Affiliation(s)
- Nicolas Galtier
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Camille Roux
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France.,Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland.,UMR 8198 - Evo-Eco-Paleo, CNRS, Université de Lille-Sciences et Technologies, Villeneuve d'Ascq, France
| | - Marjolaine Rousselle
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Jonathan Romiguier
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France.,Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Emeric Figuet
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Sylvain Glémin
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France.,Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Nicolas Bierne
- UMR5554, Institut des Sciences de l'Evolution, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
| | - Laurent Duret
- Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, CNRS, Université de Lyon, Université Lyon 1, Villeurbanne, France
|
40
|
Bohlin J, Pettersson JHO. Evolution of Genomic Base Composition: From Single Cell Microbes to Multicellular Animals. Comput Struct Biotechnol J 2019; 17:362-370. [PMID: 30949307 PMCID: PMC6429543 DOI: 10.1016/j.csbj.2019.03.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 09/28/2018] [Revised: 02/28/2019] [Accepted: 03/01/2019] [Indexed: 01/07/2023] Open
Abstract
Whole genome sequencing (WGS) of thousands of microbial genomes has provided considerable insight into evolutionary mechanisms in the microbial world. While substantially fewer eukaryotic genomes are available for analysis, the number is rapidly increasing. This mini-review broadly summarizes the evolutionary dynamics of base composition in the different domains of life from the perspective of prokaryotes. Common and different evolutionary mechanisms influencing genomic base composition in eukaryotes and prokaryotes are discussed. The data currently available suggest that while there are similarities, there are also striking differences in how genomic base composition has evolved within prokaryotes and eukaryotes. For instance, homologous recombination appears to increase GC content locally in eukaryotes due to a non-selective process termed GC-biased gene conversion (gBGC). For prokaryotes, on the other hand, increase in genomic GC content seems to be driven by the environment and selection. We find that similar phenomena observed for some organisms in each respective domain may be caused by very different mechanisms: while gBGC and recombination rates appear to explain the negative correlation between GC3 (GC content based on the third codon nucleotides) and genome size in some eukaryotes, uptake of AT-rich DNA sequences is the main reason for a similar negative correlation observed in prokaryotes. We provide further examples that indicate that base composition in prokaryotes and eukaryotes has evolved under very different constraints.
Affiliation(s)
- Jon Bohlin
- Norwegian Institute of Public Health, Division of Infection Control and Environmental Health, Department of Infectious Disease Epidemiology and Modelling, Lovisenberggata 8, 0456 Oslo, Norway.,Centre for Fertility and Health, Norwegian Institute of Public Health, PO-Box 222 Skøyen, N-0213 Oslo, Norway.,Norwegian University of Life Sciences, Faculty of Veterinary Sciences, Production Animal Clinical Sciences, Ullevålsveien 72, 0454 Oslo, Norway
| | - John H-O Pettersson
- Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School the University of Sydney, New South Wales 2006, Australia.,Zoonosis Science Center, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden.,Public Health Agency of Sweden, Nobels väg 18, SE-171 82 Solna, Sweden
|
41
|
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, Jones G, Knowles LL, Lamichhaney S, Marcussen T, Morlon H, Nakhleh LK, Oxelman B, Pfeil B, Schliep A, Wahlberg N, Werneck FP, Wiedenhoeft J, Willows-Munro S, Edwards SV. Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ 2019; 7:e6399. [PMID: 30783571 PMCID: PMC6378093 DOI: 10.7717/peerj.6399] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/05/2018] [Accepted: 01/07/2019] [Indexed: 12/23/2022] Open
Abstract
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.
Affiliation(s)
- Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
| | - Alexandre Antonelli
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Gothenburg Botanical Garden, Göteborg, Sweden
| | - Christine D. Bacon
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Krzysztof Bartoszek
- Department of Computer and Information Science, Linköping University, Linköping, Sweden
| | - Mozes P. K. Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
| | - Stella Huynh
- Institut de Biologie, Université de Neuchâtel, Neuchâtel, Switzerland
| | - Graham Jones
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - L. Lacey Knowles
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
| | - Sangeet Lamichhaney
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
| | - Thomas Marcussen
- Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
| | - Hélène Morlon
- Institut de Biologie, Ecole Normale Supérieure de Paris, Paris, France
| | - Luay K. Nakhleh
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Bengt Oxelman
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Bernard Pfeil
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Alexander Schliep
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
| | | | - Fernanda P. Werneck
- Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisa da Amazônia, Manaus, AM, Brazil
| | - John Wiedenhoeft
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
- Department of Computer Science, Rutgers University, Piscataway, NJ, USA
| | - Sandi Willows-Munro
- School of Life Sciences, University of Kwazulu-Natal, Pietermaritzburg, South Africa
| | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
|
42
|
van der Hoek MD, Madsen O, Keijer J, van der Leij FR. Evolutionary analysis of the carnitine- and choline acyltransferases suggests distinct evolution of CPT2 versus CPT1 and related variants. Biochim Biophys Acta Mol Cell Biol Lipids 2018; 1863:909-918. [PMID: 29730527 DOI: 10.1016/j.bbalip.2018.05.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/08/2018] [Revised: 04/24/2018] [Accepted: 05/03/2018] [Indexed: 10/17/2022]
Abstract
Carnitine/choline acyltransferases play diverse roles in energy metabolism and neuronal signalling. Our knowledge of their evolutionary relationships, important for functional understanding, is incomplete. Therefore, we aimed to determine the evolutionary relationships of these eukaryotic transferases. We performed extensive phylogenetic and intron position analyses. We found that mammalian intramitochondrial CPT2 is most closely related to cytosolic yeast carnitine transferases (Sc-YAT1 and 2), whereas the other members of the family are related to intraorganellar yeast Sc-CAT2. Therefore, the cytosolically active CPT1 more closely resembles intramitochondrial ancestors than CPT2. The choline acetyltransferase is closely related to carnitine acetyltransferase and shows lower evolutionary rates than long chain acyltransferases. In the CPT1 family several duplications occurred during animal radiation, leading to the isoforms CPT1A, CPT1B and CPT1C. In addition, we found five CPT1-like genes in Caenorhabditis elegans that strongly group to the CPT1 family. The long branch leading to mammalian brain isoform CPT1C suggests that either strong positive or relaxed evolution has taken place on this node. The presented evolutionary delineation of carnitine/choline acyltransferases adds to current knowledge on their functions and provides tangible leads for further experimental research.
Affiliation(s)
- Marjanne D van der Hoek
- Applied Research Centre Food and Dairy, Van Hall Larenstein University of Applied Sciences, P.O. box 1528, 8901BV Leeuwarden, The Netherlands; Human and Animal Physiology, Wageningen University, P.O. box 338, 6700AH Wageningen, The Netherlands
| | - Ole Madsen
- Animal Breeding and Genomics Centre, Wageningen University, P.O. box 338, 6700AH Wageningen, The Netherlands
| | - Jaap Keijer
- Human and Animal Physiology, Wageningen University, P.O. box 338, 6700AH Wageningen, The Netherlands
| | - Feike R van der Leij
- Applied Research Centre Food and Dairy, Van Hall Larenstein University of Applied Sciences, P.O. box 1528, 8901BV Leeuwarden, The Netherlands.
|
43
|
Mazumdar P, Binti Othman R, Mebus K, Ramakrishnan N, Ann Harikrishna J. Codon usage and codon pair patterns in non-grass monocot genomes. Ann Bot 2017; 120:893-909. [PMID: 29155926 PMCID: PMC5710610 DOI: 10.1093/aob/mcx112] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 01/17/2017] [Accepted: 09/19/2017] [Indexed: 05/19/2023]
Abstract
BACKGROUND AND AIMS: Studies on codon usage in monocots have focused on grasses, and observed patterns of this taxon were generalized to all monocot species. Here, non-grass monocot species were analysed to investigate the differences between grass and non-grass monocots. METHODS: First, studies of codon usage in monocots were reviewed. The current information was then extended regarding codon usage, as well as codon-pair context bias, using four completely sequenced non-grass monocot genomes (Musa acuminata, Musa balbisiana, Phoenix dactylifera and Spirodela polyrhiza) for which comparable transcriptome datasets are available. Measurements were taken regarding relative synonymous codon usage, effective number of codons, derived optimal codon and GC content and then the relationships investigated to infer the underlying evolutionary forces. KEY RESULTS: The research identified optimal codons, rare codons and preferred codon-pair context in the non-grass monocot species studied. In contrast to the bimodal distribution of GC3 (GC content in third codon position) in grasses, non-grass monocots showed a unimodal distribution. Disproportionate use of G and C (and of A and T) in two- and four-codon amino acids detected in the analysis rules out the mutational bias hypothesis as an explanation of genomic variation in GC content. There was found to be a positive relationship between CAI (codon adaptation index; predicts the level of expression of a gene) and GC3. In addition, a strong correlation was observed between coding and genomic GC content and negative correlation of GC3 with gene length, indicating a strong impact of GC-biased gene conversion (gBGC) in shaping codon usage and nucleotide composition in non-grass monocots. CONCLUSION: Optimal codons in these non-grass monocots show a preference for G/C in the third codon position. These results support the concept that codon usage and nucleotide composition in non-grass monocots are mainly driven by gBGC.
Affiliation(s)
- Purabi Mazumdar
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
| | - RofinaYasmin Binti Othman
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
| | - Katharina Mebus
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
| | - N Ramakrishnan
- Electrical and Computer System Engineering, School of Engineering, Monash University Malaysia, Bandar Sunway, Malaysia
| | - Jennifer Ann Harikrishna
- Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
|
44
|
Assaf ZJ, Tilk S, Park J, Siegal ML, Petrov DA. Deep sequencing of natural and experimental populations of Drosophila melanogaster reveals biases in the spectrum of new mutations. Genome Res 2017; 27:1988-2000. [PMID: 29079675 PMCID: PMC5741049 DOI: 10.1101/gr.219956.116] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 12/19/2016] [Accepted: 10/20/2017] [Indexed: 11/25/2022]
Abstract
Mutations provide the raw material of evolution, and thus our ability to study evolution depends fundamentally on having precise measurements of mutational rates and patterns. We generate a data set for this purpose using (1) de novo mutations from mutation accumulation experiments and (2) extremely rare polymorphisms from natural populations. The first, mutation accumulation (MA) lines, are the product of maintaining flies in tiny populations for many generations, therefore rendering natural selection ineffective and allowing new mutations to accrue in the genome. The second, rare genetic variation from natural populations, allows the study of mutation because extremely rare polymorphisms are relatively unaffected by the filter of natural selection. We use both methods in Drosophila melanogaster, first generating our own novel data set of sequenced MA lines and performing a meta-analysis of all published MA mutations (∼2000 events) and then identifying a high quality set of ∼70,000 extremely rare (≤0.1%) polymorphisms that are fully validated with resequencing. We use these data sets to precisely measure mutational rates and patterns. Highlights of our results include a high rate of multinucleotide mutation events at both short (∼5 bp) and long (∼1 kb) genomic distances, evidence that mutation drives GC content lower in already GC-poor regions, and the use of our precise context-dependent mutation rates to predict long-term evolutionary patterns at synonymous sites. We also show that de novo mutations from independent MA experiments display similar patterns of single nucleotide mutation and well match the patterns of mutation found in natural populations.
Affiliation(s)
- Zoe June Assaf
- Department of Genetics, Stanford University, Stanford, California 94305, USA.,Department of Biology, Stanford University, Stanford, California 94305, USA
| | - Susanne Tilk
- Department of Biology, Stanford University, Stanford, California 94305, USA
| | - Jane Park
- Department of Biology, Stanford University, Stanford, California 94305, USA
| | - Mark L Siegal
- Department of Biology, New York University, New York, New York 10003, USA
| | - Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, California 94305, USA
|
45
|
Inferring the shallow phylogeny of true salamanders (Salamandra) by multiple phylogenomic approaches. Mol Phylogenet Evol 2017; 115:16-26. [DOI: 10.1016/j.ympev.2017.07.009] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 02/03/2017] [Revised: 05/31/2017] [Accepted: 07/13/2017] [Indexed: 01/31/2023]
|
46
|
Bossert S, Murray EA, Blaimer BB, Danforth BN. The impact of GC bias on phylogenetic accuracy using targeted enrichment phylogenomic data. Mol Phylogenet Evol 2017; 111:149-157. [PMID: 28390323 DOI: 10.1016/j.ympev.2017.03.022] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 02/01/2017] [Revised: 03/06/2017] [Accepted: 03/24/2017] [Indexed: 01/08/2023]
Abstract
The field of sequence-based phylogenetic analyses is currently being transformed by novel hybrid-based targeted enrichment methods, such as the use of ultraconserved elements (UCEs). Rather than analyzing relationships among organisms using a small number of genes, these methods now allow us to evaluate relationships with many hundreds to thousands of individual gene loci. However, the inclusion of thousands of loci does not necessarily overcome the long-standing challenge of incongruence among phylogenetic trees derived from different genes or gene regions. One factor that impacts the level of incongruence in phylogenomic data sets is the level of GC bias. GC-rich gene regions are prone to higher recombination rates than AT-rich regions, driven by a process referred to as "GC-biased gene conversion". As a result, high GC content can be negatively associated with phylogenetic accuracy, but the extent to which this impacts incongruence among UCEs is currently unstudied. We investigated the impact of GC content on phylogeny reconstruction using in silico captured UCE data for the corbiculate bees (Hymenoptera: Apidae). The phylogeny of this group has been the subject of extensive study, and incongruence among gene trees is thought to be a source of phylogenetic error. We conducted coalescent- and concatenation-based analyses of 810 individual gene loci from all 13 currently available bee genomes, including 8 corbiculate taxa. Both coalescent- and concatenation-based methods converged on a single topology for the corbiculate tribes. In contrast to concatenation, the coalescent-based methods revealed significant topological conflict at nodes involving the orchid bees (Euglossini) and honeybees (Apini). Partitioning the loci by GC content reveals decreasing support for the inferred topology with increasing GC bias. Based on the results of this study, we report the first evidence that GC-biased gene conversion may contribute to topological incongruence in studies based on ultraconserved elements.
Affiliation(s)
- Silas Bossert
- Department of Entomology, Cornell University, Ithaca, New York, USA
- Bonnie B Blaimer
- Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Bryan N Danforth
- Department of Entomology, Cornell University, Ithaca, New York, USA
47
Patterns of cross-contamination in a multispecies population genomic project: detection, quantification, impact, and solutions. BMC Biol 2017; 15:25. [PMID: 28356154 PMCID: PMC5370491 DOI: 10.1186/s12915-017-0366-6] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 12/06/2016] [Accepted: 03/13/2017] [Indexed: 01/06/2023] Open
Abstract
Background: Contamination is a well-known but often neglected problem in molecular biology. Here, we investigated the prevalence of cross-contamination among 446 samples from 116 distinct species of animals, which were processed in the same laboratory and subjected to subcontracted transcriptome sequencing. Results: Using cytochrome oxidase 1 as a barcode, we identified a minimum of 782 events of between-species contamination, with approximately 80% of our samples being affected. An analysis of laboratory metadata revealed a strong effect of the sequencing center: nearly all the detected events of between-species contamination involved species that were sent the same day to the same company. We introduce new methods to address the amount of within-species, between-individual contamination, and to correct for this problem when calling genotypes from base read counts. Conclusions: We report evidence for pervasive within-species contamination in this data set, and show that classical population genomic statistics, such as synonymous diversity, the ratio of non-synonymous to synonymous diversity, inbreeding coefficient FIT, and Tajima’s D, are sensitive to this problem to various extents. Control analyses suggest that our published results are probably robust to the problem of contamination. Recommendations on how to prevent or avoid contamination in large-scale population genomics/molecular ecology are provided based on this analysis.
48
Lisachov AP, Trifonov VA, Giovannotti M, Ferguson-Smith MA, Borodin PM. Immunocytological analysis of meiotic recombination in two anole lizards (Squamata, Dactyloidae). Comparative Cytogenetics 2017; 11:129-141. [PMID: 28919954 PMCID: PMC5599703 DOI: 10.3897/compcytogen.v11i1.10916] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/25/2016] [Accepted: 01/16/2017] [Indexed: 05/13/2023]
Abstract
Although the evolutionary importance of meiotic recombination is not disputed, the significance of interspecies differences in recombination rates and recombination landscapes remains under-appreciated. Recombination rates and distribution of chiasmata have been examined cytologically in many mammalian species, whereas data on other vertebrates are scarce. Immunolocalization of the synaptonemal complex protein SYCP3, centromere proteins, and the mismatch-repair protein MLH1, which is associated with the most common type of recombination nodules, was used to analyze the pattern of meiotic recombination in males of two species of iguanian lizards, Anolis carolinensis Voigt, 1832 and Deiroptyx coelestinus (Cope, 1862). These species are separated by a relatively long evolutionary history although they retain the ancestral iguanian karyotype. In both species similar and extremely uneven distributions of MLH1 foci along the macrochromosome bivalents were detected: approximately 90% of crossovers were located at the distal 20% of the chromosome arm length. Almost total suppression of recombination in the intermediate and proximal regions of the chromosome arms contradicts the hypothesis that "homogenous recombination" is responsible for the low variation in GC content across the anole genome. It also leads to strong linkage disequilibrium between the genes located in these regions, which may benefit conservation of co-adaptive gene arrays responsible for the ecological adaptations of the anoles.
Affiliation(s)
- Artem P. Lisachov
- Institute of Cytology and Genetics, Russian Academy of Sciences, Siberian Branch, Novosibirsk 630090, Russia
- Vladimir A. Trifonov
- Institute of Molecular and Cellular Biology, Russian Academy of Sciences, Siberian Branch, Novosibirsk 630090, Russia
- Novosibirsk State University, Novosibirsk 630090, Russia
- Massimo Giovannotti
- Dipartimento di Scienze della Vita e dell’Ambiente, Università Politecnica delle Marche, via Brecce Bianche, 60131 Ancona, Italy
- Malcolm A. Ferguson-Smith
- Cambridge Resource Centre for Comparative Genomics, Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
49
Romiguier J, Roux C. Analytical Biases Associated with GC-Content in Molecular Evolution. Front Genet 2017; 8:16. [PMID: 28261263 PMCID: PMC5309256 DOI: 10.3389/fgene.2017.00016] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 12/01/2016] [Accepted: 02/06/2017] [Indexed: 12/19/2022] Open
Abstract
Molecular evolution is being revolutionized by high-throughput sequencing, which makes an increasing amount of genome-wide data available for multiple species. While base composition, summarized by GC-content, is one of the first metrics measured in genomes, its genomic distribution is a frequently neglected feature in downstream analyses based on DNA sequence comparisons. Here, we show how base composition heterogeneity among loci and taxa can bias common molecular evolution analyses such as phylogenetic tree reconstruction, detection of natural selection and estimation of codon usage. We then discuss the biological, technical and methodological causes of these GC-associated biases and suggest approaches to overcome them.
Affiliation(s)
- Jonathan Romiguier
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Camille Roux
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
50
Priyam M, Tripathy M, Rai U, Ghorai SM. Tracing the evolutionary lineage of pattern recognition receptor homologues in vertebrates: An insight into reptilian immunity via de novo sequencing of the wall lizard splenic transcriptome. Vet Immunol Immunopathol 2016; 172:26-37. [DOI: 10.1016/j.vetimm.2016.03.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 09/17/2015] [Revised: 03/01/2016] [Accepted: 03/02/2016] [Indexed: 10/22/2022]