1. López-Cortegano E, Chebib J, Jonas A, Vock A, Künzel S, Keightley PD, Tautz D. The rate and spectrum of new mutations in mice inferred by long-read sequencing. Genome Res 2025; 35:43-54. [PMID: 39622636 PMCID: PMC11789640 DOI: 10.1101/gr.279982.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Accepted: 11/26/2024] [Indexed: 01/12/2025]
Abstract
All forms of genetic variation originate from new mutations, making it crucial to understand their rates and mechanisms. Here, we use long-read sequencing from Pacific Biosciences (PacBio) to investigate de novo mutations that accumulated in 12 inbred mouse lines derived from three commonly used inbred strains (C3H, C57BL/6, and FVB) maintained for 8 to 15 generations in a mutation accumulation (MA) experiment. We built chromosome-level genome assemblies based on the MA line founders' genomes and then employed a combination of read and assembly-based methods to call the complete spectrum of new mutations. On average, there are about 45 mutations per haploid genome per generation, about half of which (54%) are insertions and deletions shorter than 50 bp (indels). The remainder are single-nucleotide mutations (SNMs; 44%) and large structural mutations (SMs; 2%). We found that the degree of DNA repetitiveness is positively correlated with SNM and indel rates and that a substantial fraction of SMs can be explained by homology-dependent mechanisms associated with repeat sequences. Most (90%) indels can be attributed to microsatellite contractions and expansions, and there is a marked bias toward 4 bp indels. Among the different types of SMs, tandem repeat mutations have the highest mutation rate, followed by insertions of transposable elements (TEs). We uncover a rich landscape of active TEs, notable differences in their spectrum among MA lines and strains, and a high rate of gene retroposition. Our study offers novel insights into mammalian genome evolution and highlights the importance of repetitive elements in shaping genomic diversity.
Affiliation(s)
- Eugenio López-Cortegano
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Jobran Chebib
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Anika Jonas
  - Department for Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Anastasia Vock
  - Department for Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Sven Künzel
  - Department for Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Peter D Keightley
  - Institute of Ecology and Evolution, University of Edinburgh, Edinburgh EH9 3FL, United Kingdom
- Diethard Tautz
  - Department for Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
2. Zhang W, Guenther A, Gao Y, Ullrich K, Huettel B, Ahmad A, Duan L, Wei K, Tautz D. Full-length RNA transcript sequencing traces brain isoform diversity in house mouse natural populations. Genome Res 2024; 34:2118-2132. [PMID: 39288994 PMCID: PMC11610456 DOI: 10.1101/gr.279166.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Accepted: 09/10/2024] [Indexed: 09/19/2024]
Abstract
The ability to generate multiple RNA transcript isoforms from the same gene is a general phenomenon in eukaryotes. However, the complexity and diversity of alternative isoforms in natural populations remain largely unexplored. Using a newly developed full-length transcript enrichment protocol with 5' CAP selection, we sequenced full-length RNA transcripts of 48 individuals from outbred populations and subspecies of Mus musculus, and from the closely related sister species Mus spretus and Mus spicilegus as outgroups. The data set represents the most extensive full-length high-quality isoform catalog at the population level to date. In total, we reliably identify 117,728 distinct isoforms, of which only 51% were previously annotated. We show that the population-specific distribution pattern of isoforms is phylogenetically informative and reflects the segregating single nucleotide polymorphism (SNP) diversity between the populations. We find that ancient housekeeping genes are a major source of the overall isoform diversity, and that the generation of alternative first exons plays a major role in generating new isoforms. Given that our data allow us to distinguish between population-specific isoforms and isoforms that are conserved across multiple populations, it is possible to refine the annotation of the reference mouse genome to a set of about 40,000 isoforms that should be most relevant for comparative functional analysis across species.
Affiliation(s)
- Wenyu Zhang
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an 710129, China
  - Research and Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518063, China
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Ploen 24306, Germany
- Anja Guenther
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Ploen 24306, Germany
  - Research Group Behavioral Ecology of Individual Differences, Max Planck Institute for Evolutionary Biology, Ploen 24306, Germany
- Yuanxiao Gao
  - School of Mathematics and Data Science, Shaanxi University of Science and Technology, Xi'an 710021, China
- Kristian Ullrich
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Ploen 24306, Germany
- Bruno Huettel
  - Max-Planck-Genome-Centre Cologne, MPI for Plant Breeding Research, Cologne 50829, Germany
- Aftab Ahmad
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an 710129, China
- Lei Duan
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an 710129, China
- Kaizong Wei
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an 710129, China
- Diethard Tautz
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Ploen 24306, Germany
3. Castellanos MDP, Wickramasinghe CD, Betrán E. The roles of gene duplications in the dynamics of evolutionary conflicts. Proc Biol Sci 2024; 291:20240555. [PMID: 38865605 DOI: 10.1098/rspb.2024.0555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 04/02/2024] [Indexed: 06/14/2024] Open
Abstract
Evolutionary conflicts occur when there is antagonistic selection between different individuals of the same or different species, life stages or between levels of biological organization. Remarkably, conflicts can occur within species or within genomes. In the dynamics of evolutionary conflicts, gene duplications can play a major role because they can bring very specific changes to the genome: changes in protein dose, the generation of novel paralogues with different functions or expression patterns or the evolution of small antisense RNAs. As we describe here, by having those effects, gene duplication might spark evolutionary conflict or fuel the arms race dynamics that take place during conflicts. Interestingly, gene duplication can also contribute to the resolution of a within-locus evolutionary conflict by partitioning the functions of the gene that is under an evolutionary trade-off. In this review, we focus on intraspecific conflicts, including sexual conflict, and illustrate the various roles of gene duplications with a compilation of examples. These examples reveal the level of complexity and the differences in the patterns of gene duplications within genomes under different conflicts. These examples also reveal the gene ontologies involved in conflict and the genomic location of the elements of the conflict. The examples provide a blueprint for the direct study of these conflicts or the exploration of the presence of similar conflicts in other lineages.
Affiliation(s)
- Esther Betrán
  - Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA
4. Yan Y, Tian Y, Wu Z, Zhang K, Yang R. Interchromosomal Colocalization with Parental Genes Is Linked to the Function and Evolution of Mammalian Retrocopies. Mol Biol Evol 2023; 40:msad265. [PMID: 38060983 PMCID: PMC10733166 DOI: 10.1093/molbev/msad265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 10/25/2023] [Accepted: 11/29/2023] [Indexed: 12/22/2023] Open
Abstract
Retrocopies are gene duplicates arising from reverse transcription of mature mRNA transcripts and their insertion back into the genome. Although retrocopies were long regarded as processed pseudogenes, more and more functional ones have been discovered. How the stripped-down retrocopies recover expression capability and become functional paralogs continually intrigues evolutionary biologists. Here, we investigated the function and evolution of retrocopies in the context of 3D genome organization. By mapping retrocopy-parent pairs onto sequencing-based and imaging-based chromatin contact maps in human and mouse cell lines and onto Hi-C interaction maps in 5 other mammals, we found that retrocopies and their parental genes show a higher-than-expected interchromosomal colocalization frequency. The spatial interactions between retrocopies and parental genes occur frequently at loci in active subcompartments and near nuclear speckles. Accordingly, colocalized retrocopies are more actively transcribed and translated and are more evolutionarily conserved than noncolocalized ones. The active transcription of colocalized retrocopies may result from their permissive epigenetic environment and shared regulatory elements with parental genes. Population genetic analysis of retroposed gene copy number variants in human populations revealed that retrocopy insertions are not entirely random in regard to interchromosomal interactions and that colocalized retroposed gene copy number variants are more likely to reach high frequencies, suggesting that both insertion bias and natural selection contribute to the colocalization of retrocopy-parent pairs. Further dissection implies that reduced selection efficacy, rather than positive selection, contributes to the elevated allele frequency of colocalized retroposed gene copy number variants. Overall, our results hint at a role of interchromosomal colocalization in the "resurrection" of initially neutral retrocopies.
Affiliation(s)
- Yubin Yan
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Yuhan Tian
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zefeng Wu
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Kunling Zhang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ruolin Yang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
5. Ma H, Wang M, Zhang YE, Tan S. The power of "controllers": Transposon-mediated duplicated genes evolve towards neofunctionalization. J Genet Genomics 2023; 50:462-472. [PMID: 37068629 DOI: 10.1016/j.jgg.2023.04.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/22/2022] [Revised: 04/04/2023] [Accepted: 04/05/2023] [Indexed: 04/19/2023]
Abstract
Since the discovery of the first transposon by Dr. Barbara McClintock, the prevalence and diversity of transposable elements (TEs) have been gradually recognized. As fundamental genetic components, TEs drive organismal evolution not only by contributing functional sequences (e.g., regulatory elements or "controllers" as phrased by Dr. McClintock) but also by shuffling genomic sequences. In the latter respect, TE-mediated gene duplications have contributed to the origination of new genes and attracted extensive interest. In response to the development of this field, we herein attempt to provide an overview of TE-mediated duplication by focusing on common rules emerging across duplications generated by different TE types. Specifically, despite the huge divergence of transposition machinery across TEs, we identify three common features of various TE-mediated duplication mechanisms, including end bypass, template switching, and recurrent transposition. These three features lead to one common functional outcome, namely, TE-mediated duplicates tend to be subjected to exon shuffling and neofunctionalization. Therefore, the intrinsic properties of the mutational mechanism constrain the evolutionary trajectories of these duplicates. We finally discuss the future of this field including an in-depth characterization of both the duplication mechanisms and functions of TE-mediated duplicates.
Affiliation(s)
- Huijing Ma
  - Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Mengxia Wang
  - Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yong E Zhang
  - Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650223, China; Chinese Institute for Brain Research, Beijing 102206, China
- Shengjun Tan
  - Key Laboratory of Zoological Systematics and Evolution & State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
6. Pokrovac I, Pezer Ž. Recent advances and current challenges in population genomics of structural variation in animals and plants. Front Genet 2022; 13:1060898. [PMID: 36523759 PMCID: PMC9745067 DOI: 10.3389/fgene.2022.1060898] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/03/2022] [Accepted: 11/15/2022] [Indexed: 05/02/2024] Open
Abstract
The field of population genomics has seen a surge of studies on genomic structural variation over the past two decades. These studies showed that structural variation is taxonomically ubiquitous and represents a dominant form of genetic variation within species. Recent advances in technology, especially the development of long-read sequencing platforms, have enabled the discovery of structural variants (SVs) in previously inaccessible genomic regions, which unlocked additional structural variation for population studies and revealed that more SVs contribute to evolution than previously perceived. An increasing number of studies suggest that SVs of all types and sizes may have a large effect on phenotype and consequently a major impact on rapid adaptation, population divergence, and speciation. However, the functional effect of the vast majority of SVs is unknown and the field generally lacks evidence on the phenotypic consequences of most SVs that are suggested to have adaptive potential. Non-human genomes are heavily under-represented in population-scale studies of SVs. We argue that more research on other species is needed to objectively estimate the contribution of SVs to evolution. We discuss technical challenges associated with SV detection and outline the most recent advances towards more representative reference genomes, which open a new era in population-scale studies of structural variation.
Affiliation(s)
- Željka Pezer
  - Laboratory for Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
7. Chen J, Zhong J, He X, Li X, Ni P, Safner T, Šprem N, Han J. The de novo assembly of a European wild boar genome revealed unique patterns of chromosomal structural variations and segmental duplications. Anim Genet 2022; 53:281-292. [PMID: 35238061 PMCID: PMC9314987 DOI: 10.1111/age.13181] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/22/2021] [Revised: 02/12/2022] [Accepted: 02/12/2022] [Indexed: 02/05/2023]
Abstract
The rapid progress of sequencing technology has greatly facilitated the de novo genome assembly of pig breeds. However, the assembly of the wild boar genome is still lacking, hampering our understanding of chromosomal and genomic evolution during domestication from wild boars into domestic pigs. Here, we sequenced and de novo assembled a European wild boar genome (ASM2165605v1) using the long‐range information provided by 10× Linked‐Reads sequencing. We achieved a high‐quality assembly with contig N50 of 26.09 Mb. Additionally, 1.64% of the contigs (222) with lengths from 107.65 kb to 75.36 Mb covered 90.3% of the total genome size of ASM2165605v1 (~2.5 Gb). Mapping analysis revealed that the contigs can fill 24.73% (93/376) of the gaps present in the orthologous regions of the updated pig reference genome (Sscrofa11.1). We further improved the contigs into chromosome level with a reference‐assistant scaffolding method. Using the ‘assembly‐to‐assembly’ approach, we identified intra‐chromosomal large structural variations (SVs, length >1 kb) between ASM2165605v1 and Sscrofa11.1 assemblies. Interestingly, we found that the number of SV events on the X chromosome deviated significantly from the linear models fitting autosomes (R² > 0.64, p < 0.001). Specifically, deletions and insertions were deficient on the X chromosome by 66.14 and 58.41% respectively, whereas duplications and inversions were excessive on the X chromosome by 71.96 and 107.61% respectively. We further used the large segmental duplications (SDs, >1 kb) events as a proxy to understand the large‐scale inter‐chromosomal evolution, by resolving parental‐derived relationships for SD pairs. We revealed a significant excess of SD movements from the X chromosome to autosomes (p < 0.001), consistent with the expectation of meiotic sex chromosome inactivation. Enrichment analyses indicated that the genes within derived SD copies on autosomes were significantly related to biological processes involving the nervous system, lipid biosynthesis and sperm motility (p < 0.01). Together, our analyses of the de novo assembly of ASM2165605v1 provide insight into the SVs between European wild boar and domestic pig, in addition to the ongoing process of meiotic sex chromosome inactivation in driving inter‐chromosomal interaction between the sex chromosome and autosomes.
Affiliation(s)
- Jianhai Chen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Jie Zhong
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Xuefei He
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Xiaoyu Li
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Pan Ni
  - Animal Husbandry and Veterinary Institute of Keqiao District, Shaoxing, Zhejiang, China
- Toni Safner
  - Faculty of Agriculture, University of Zagreb, Zagreb, Croatia; Centre of Excellence for Biodiversity and Molecular Plant Breeding (CoE CroP-BioDiv), Zagreb, Croatia
- Nikica Šprem
  - Faculty of Agriculture, University of Zagreb, Zagreb, Croatia
- Jianlin Han
  - International Livestock Research Institute, Nairobi, Kenya; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China