1. Lesack KJ, Wasmuth JD. The impact of FASTQ and alignment read order on structural variant calling from long-read sequencing data. PeerJ 2024; 12:e17101. [PMID: 38500526 PMCID: PMC10946394 DOI: 10.7717/peerj.17101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 02/21/2024] [Indexed: 03/20/2024] Open
Abstract
Background: Structural variant (SV) calling from DNA sequencing data has been challenging due to several factors, including the ambiguity of short-read alignments, multiple complex SVs in the same genomic region, and the lack of "truth" datasets for benchmarking. Additionally, caller choice, parameter settings, and alignment method are known to affect SV calling. However, the impact of FASTQ read order on SV calling has not been explored for long-read data.
Results: Here, we used PacBio DNA sequencing data from 15 Caenorhabditis elegans strains and four Arabidopsis thaliana ecotypes to evaluate the sensitivity of different SV callers to FASTQ read order. Comparisons of variant call format files generated from the original and permutated FASTQ files demonstrated that the order of input data affected the SVs predicted by each caller. In particular, pbsv was highly sensitive to the order of the input data, especially at the highest depths where over 70% of the SV calls generated from pairs of differently ordered FASTQ files were in disagreement. These results demonstrate that read order sensitivity is a complex, multifactorial process, as the differences observed both within and between species varied considerably according to the specific combination of aligner, SV caller, and sequencing depth. In addition to the SV callers being sensitive to the input data order, the SAMtools alignment sorting algorithm was identified as a source of variability following read order randomization.
Conclusion: The results of this study highlight the sensitivity of SV calling to the order of reads encoded in FASTQ files, which has not been recognized in long-read approaches. These findings have implications for the replication of SV studies and the development of consistent SV calling protocols. Our study suggests that researchers should pay attention to the input order sensitivity of read alignment sorting methods when analyzing long-read sequencing data for SV calling, as mitigating a source of variability could facilitate future replication work. These results also raise important questions surrounding the relationship between SV caller read order sensitivity and tool performance. Therefore, tool developers should also consider input order sensitivity as a potential source of variability during the development and benchmarking of new and improved methods for SV calling.
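The read-order experiment summarized in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`read_fastq_records`, `permute_fastq`, `disagreement_fraction`) are hypothetical, the parser assumes simple four-line FASTQ records with no wrapped sequences, and the disagreement measure (calls found in only one of two call sets, divided by all distinct calls) is an assumed definition that may differ from the one used in the paper.

```python
import random


def read_fastq_records(lines):
    """Group FASTQ lines into 4-line records (header, sequence, '+', qualities).

    Assumption: unwrapped records of exactly four non-empty lines each.
    """
    cleaned = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(cleaned) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of four lines")
    return [tuple(cleaned[i:i + 4]) for i in range(0, len(cleaned), 4)]


def permute_fastq(lines, seed):
    """Return the same records in a reproducible random order."""
    records = read_fastq_records(lines)
    rng = random.Random(seed)  # fixed seed so the permutation can be repeated
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled


def disagreement_fraction(calls_a, calls_b):
    """Fraction of distinct calls present in only one of the two call sets.

    Hypothetical metric for illustration: each call is any hashable key,
    for example a (chromosome, position, SV type) tuple.
    """
    union = calls_a | calls_b
    if not union:
        return 0.0
    return len(calls_a ^ calls_b) / len(union)
```

In practice the permuted records would be written back to a FASTQ file, aligned, sorted, and passed to an SV caller, and the resulting call sets from the original and permuted inputs would then be compared.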
Affiliation(s)
- Kyle J. Lesack
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
- James D. Wasmuth
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
2. Ferguson S, Jones A, Murray K, Andrew RL, Schwessinger B, Bothwell H, Borevitz J. Exploring the role of polymorphic interspecies structural variants in reproductive isolation and adaptive divergence in Eucalyptus. Gigascience 2024; 13:giae029. [PMID: 38869149 PMCID: PMC11170218 DOI: 10.1093/gigascience/giae029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2023] [Revised: 03/11/2024] [Accepted: 05/14/2024] [Indexed: 06/14/2024] Open
Abstract
Structural variations (SVs) play a significant role in speciation and adaptation in many species, yet few studies have explored the prevalence and impact of different categories of SVs. We conducted a comparative analysis of long-read assembled reference genomes of closely related Eucalyptus species to identify candidate SVs potentially influencing speciation and adaptation. Interspecies SVs can be either fixed differences or polymorphic in one or both species. To describe SV patterns, we employed short-read whole-genome sequencing on over 600 individuals of Eucalyptus melliodora and Eucalyptus sideroxylon, along with recent high-quality genome assemblies. We aligned reads and genotyped interspecies SVs predicted between species reference genomes. Our results revealed that 49,756 of 58,025 and 39,536 of 47,064 interspecies SVs could be typed with short reads in E. melliodora and E. sideroxylon, respectively. Focusing on inversions and translocations, symmetric SVs that are readily genotyped within both populations, 24 were found to be structural divergences, 2,623 structural polymorphisms, and 928 shared structural polymorphisms. We assessed the functional significance of fixed interspecies SVs by examining differences in estimated recombination rates and genetic differentiation between species, revealing a complex history of natural selection. Shared structural polymorphisms displayed enrichment of potentially adaptive genes. Understanding how different classes of genetic mutations contribute to genetic diversity and reproductive barriers is essential for understanding how organisms enhance fitness, adapt to changing environments, and diversify. Our findings reveal the prevalence of interspecies SVs and elucidate their role in genetic differentiation, adaptive evolution, and species divergence within and between populations.
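As a quick worked check of the genotyping counts quoted in this abstract, the following Python sketch computes the share of interspecies SVs that could be typed with short reads in each species. Only the four counts come from the abstract; the helper name `typed_fraction` and the dictionary layout are illustrative assumptions.

```python
def typed_fraction(typed, total):
    """Return the fraction of SVs that could be genotyped."""
    if total <= 0:
        raise ValueError("total must be positive")
    return typed / total


# (typed, total) counts as reported in the abstract.
counts = {
    "E. melliodora": (49_756, 58_025),
    "E. sideroxylon": (39_536, 47_064),
}

# Percentage of interspecies SVs typed per species, rounded to one decimal.
percentages = {
    species: round(100 * typed_fraction(typed, total), 1)
    for species, (typed, total) in counts.items()
}

for species, pct in percentages.items():
    print(f"{species}: {pct}% of interspecies SVs typed")
```

Under these counts, roughly 86% and 84% of the predicted interspecies SVs were typed in E. melliodora and E. sideroxylon, respectively.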
Affiliation(s)
- Scott Ferguson
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
- Ashley Jones
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
- Kevin Murray
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, 72076 Germany
- Rose L Andrew
  - Botany & N.C.W. Beadle Herbarium, School of Environmental and Rural Science, University of New England, Armidale, NSW 2351, Australia
- Benjamin Schwessinger
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
- Helen Bothwell
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
  - Warnell School of Forestry & Natural Resources, University of Georgia, Athens 30602 GA, United States
- Justin Borevitz
  - Research School of Biology, Australian National University, Canberra, Australian Capital Territory, 2600 Australia
3. Pokrovac I, Pezer Ž. Recent advances and current challenges in population genomics of structural variation in animals and plants. Front Genet 2022; 13:1060898. [PMID: 36523759 PMCID: PMC9745067 DOI: 10.3389/fgene.2022.1060898] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/03/2022] [Accepted: 11/15/2022] [Indexed: 05/02/2024] Open
Abstract
The field of population genomics has seen a surge of studies on genomic structural variation over the past two decades. These studies witnessed that structural variation is taxonomically ubiquitous and represents a dominant form of genetic variation within species. Recent advances in technology, especially the development of long-read sequencing platforms, have enabled the discovery of structural variants (SVs) in previously inaccessible genomic regions which unlocked additional structural variation for population studies and revealed that more SVs contribute to evolution than previously perceived. An increasing number of studies suggest that SVs of all types and sizes may have a large effect on phenotype and consequently major impact on rapid adaptation, population divergence, and speciation. However, the functional effect of the vast majority of SVs is unknown and the field generally lacks evidence on the phenotypic consequences of most SVs that are suggested to have adaptive potential. Non-human genomes are heavily under-represented in population-scale studies of SVs. We argue that more research on other species is needed to objectively estimate the contribution of SVs to evolution. We discuss technical challenges associated with SV detection and outline the most recent advances towards more representative reference genomes, which opens a new era in population-scale studies of structural variation.
Affiliation(s)
- Željka Pezer
  - Laboratory for Evolutionary Genetics, Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
4. Saitou M, Masuda N, Gokcumen O. Similarity-Based Analysis of Allele Frequency Distribution among Multiple Populations Identifies Adaptive Genomic Structural Variants. Mol Biol Evol 2022; 39:msab313. [PMID: 34718708 PMCID: PMC8896759 DOI: 10.1093/molbev/msab313] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022] Open
Abstract
Structural variants have a considerable impact on human genomic diversity. However, their evolutionary history remains mostly unexplored. Here, we developed a new method to identify potentially adaptive structural variants based on a similarity-based analysis that incorporates genotype frequency data from 26 populations simultaneously. Using this method, we analyzed 57,629 structural variants and identified 576 structural variants that show unusual population differentiation. Of these putatively adaptive structural variants, we further showed that 24 variants are multiallelic and overlap with coding sequences, and 20 variants are significantly associated with GWAS traits. Closer inspection of the haplotypic variation associated with these putatively adaptive and functional structural variants reveals deviations from neutral expectations due to: 1) population differentiation of rapidly evolving multiallelic variants, 2) incomplete sweeps, and 3) recent population-specific negative selection. Overall, our study provides new methodological insights, documents hundreds of putatively adaptive variants, and introduces evolutionary models that may better explain the complex evolution of structural variants.
Affiliation(s)
- Marie Saitou
  - Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
  - Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, USA
- Naoki Masuda
  - Department of Mathematics, University at Buffalo, State University of New York, Buffalo, NY, USA
  - Computational and Data-Enabled Science and Engineering Program, University at Buffalo, State University of New York, Buffalo, NY, USA
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
5. Gorkovskiy A, Verstrepen KJ. The Role of Structural Variation in Adaptation and Evolution of Yeast and Other Fungi. Genes (Basel) 2021; 12:699. [PMID: 34066718 PMCID: PMC8150848 DOI: 10.3390/genes12050699] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/14/2021] [Revised: 04/30/2021] [Accepted: 05/04/2021] [Indexed: 01/12/2023] Open
Abstract
Mutations in DNA can be limited to one or a few nucleotides, or encompass larger deletions, insertions, duplications, inversions and translocations that span long stretches of DNA or even full chromosomes. These so-called structural variations (SVs) can alter the gene copy number, modify open reading frames, change regulatory sequences or chromatin structure and thus result in major phenotypic changes. As some of the best-known examples of SV are linked to severe genetic disorders, this type of mutation has traditionally been regarded as negative and of little importance for adaptive evolution. However, the advent of genomic technologies uncovered the ubiquity of SVs even in healthy organisms. Moreover, experimental evolution studies suggest that SV is an important driver of evolution and adaptation to new environments. Here, we provide an overview of the causes and consequences of SV and their role in adaptation, with specific emphasis on fungi since these have proven to be excellent models to study SV.
Affiliation(s)
- Anton Gorkovskiy
  - Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics (CMPG), KU Leuven, Gaston Geenslaan 1, 3001 Leuven, Belgium
  - Laboratory for Systems Biology, VIB-KU Leuven Center for Microbiology, Bio-Incubator, Gaston Geenslaan 1, 3001 Leuven, Belgium
- Kevin J. Verstrepen
  - Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics (CMPG), KU Leuven, Gaston Geenslaan 1, 3001 Leuven, Belgium
  - Laboratory for Systems Biology, VIB-KU Leuven Center for Microbiology, Bio-Incubator, Gaston Geenslaan 1, 3001 Leuven, Belgium
6. Du H, Zheng X, Zhao Q, Hu Z, Wang H, Zhou L, Liu JF. Analysis of Structural Variants Reveal Novel Selective Regions in the Genome of Meishan Pigs by Whole Genome Sequencing. Front Genet 2021; 12:550676. [PMID: 33613628 PMCID: PMC7890942 DOI: 10.3389/fgene.2021.550676] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/10/2020] [Accepted: 01/15/2021] [Indexed: 12/17/2022] Open
Abstract
Structural variants (SVs) represent essential forms of genetic variation, and they are associated with various phenotypic traits in a wide range of important livestock species. However, the distribution of SVs in the pig genome has not been fully characterized, and the function of SVs in the economic traits of pig has rarely been studied, especially for most domestic pig breeds. Meishan pig is one of the most famous Chinese domestic pig breeds, with excellent reproductive performance. Here, to explore the genome characters of Meishan pig, we construct an SV map of porcine using whole-genome sequencing data and report 33,698 SVs in 305 individuals of 55 globally distributed pig breeds. We perform selective signature analysis using these SVs, and a number of candidate variants are successfully identified. Especially for the Meishan pig, 64 novel significant selection regions are detected in its genome. A 140-bp deletion in the Indoleamine 2,3-Dioxygenase 2 (IDO2) gene, is shown to be associated with reproduction traits in Meishan pig. In addition, we detect two duplications only existing in Meishan pig. Moreover, the two duplications are separately located in cytochrome P450 family 2 subfamily J member 2 (CYP2J2) gene and phospholipase A2 group IVA (PLA2G4A) gene, which are related to the reproduction trait. Our study provides new insights into the role of selection in SVs' evolution and how SVs contribute to phenotypic variation in pigs.
Affiliation(s)
- Heng Du
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Xianrui Zheng
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Qiqi Zhao
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Zhengzheng Hu
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Haifei Wang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
- Lei Zhou
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
- Jian-Feng Liu
  - National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, China
7. Swidah R, Auxillos J, Liu W, Jones S, Chan TF, Dai J, Cai Y. SCRaMbLE-in: A Fast and Efficient Method to Diversify and Improve the Yields of Heterologous Pathways in Synthetic Yeast. Methods Mol Biol 2020; 2205:305-327. [PMID: 32809206 DOI: 10.1007/978-1-0716-0908-8_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
The synthetic chromosome rearrangement and modification by LoxP-mediated evolution (SCRaMbLE) system is a key component of the synthetic yeast genome (Sc2.0) project, an international effort to construct an entire synthetic genome in yeast. SCRaMbLE involves the introduction of thousands of symmetrical LoxP (LoxPsym) recombination sites downstream of every nonessential gene in all 16 chromosomes, enabling numerous genome rearrangements in the form of deletions, inversions, duplications, and translocations by the Cre-LoxPsym recombination system. We highlight a two-step protocol for SCRaMbLE-in (Liu, Nat Commun 9(1):1936, 2018), a recombinase-based combinatorial method to expedite genetic engineering and exogenous pathway optimization, using a synthetic β-carotene pathway as an example. First, an in vitro phase uses a recombinase toolkit to diversify gene expression by integrating various regulatory elements into the target pathway. This combinatorial pathway library can be transformed directly into yeast for traditional screening. Once an optimized pathway which is flanked by LoxPsym sites is identified, it is transformed into Sc2.0 yeast for the in vivo SCRaMbLE phase, where LoxPsym sites in the synthetic yeast genome and Cre recombinase catalyze massive genome rearrangements. We describe all the conditions necessary to perform SCRaMbLE and post-SCRaMbLE experiments including screening, spot test analysis, and PCRTag analysis to elucidate genotype-phenotype relationships.
Affiliation(s)
- Reem Swidah
  - Manchester Institute of Biotechnology (MIB), School of Chemistry, The University of Manchester, Manchester, UK
- Jamie Auxillos
  - Manchester Institute of Biotechnology (MIB), School of Chemistry, The University of Manchester, Manchester, UK
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Wei Liu
  - MRC Laboratory of Molecular Biology, Cambridge, UK
- Sally Jones
  - Manchester Institute of Biotechnology (MIB), School of Chemistry, The University of Manchester, Manchester, UK
- Ting-Fung Chan
  - School of Life Sciences and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong SAR, China
- Junbiao Dai
  - Center for Synthetic Genomics, Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yizhi Cai
  - Manchester Institute of Biotechnology (MIB), School of Chemistry, The University of Manchester, Manchester, UK
  - Center for Synthetic Genomics, Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
8. Polar bear evolution is marked by rapid changes in gene copy number in response to dietary shift. Proc Natl Acad Sci U S A 2019; 116:13446-13451. [PMID: 31209046 DOI: 10.1073/pnas.1901093116] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 12/22/2022] Open
Abstract
Polar bear (Ursus maritimus) and brown bear (Ursus arctos) are recently diverged species that inhabit vastly differing habitats. Thus, analysis of the polar bear and brown bear genomes represents a unique opportunity to investigate the evolutionary mechanisms and genetic underpinnings of rapid ecological adaptation in mammals. Copy number (CN) differences in genomic regions between closely related species can underlie adaptive phenotypes and this form of genetic variation has not been explored in the context of polar bear evolution. Here, we analyzed the CN profiles of 17 polar bears, 9 brown bears, and 2 black bears (Ursus americanus). We identified an average of 318 genes per individual that showed evidence of CN variation (CNV). Nearly 200 genes displayed species-specific CN differences between polar bear and brown bear species. Principal component analysis of gene CN provides strong evidence that CNV evolved rapidly in the polar bear lineage and mainly resulted in CN loss. Olfactory receptors composed 47% of CN differentiated genes, with the majority of these genes being at lower CN in the polar bear. Additionally, we found significantly fewer copies of several genes involved in fatty acid metabolism as well as AMY1B, the salivary amylase-encoding gene in the polar bear. These results suggest that natural selection shaped patterns of CNV in response to the transition from an omnivorous to primarily carnivorous diet during polar bear evolution. Our analyses of CNV shed light on the genomic underpinnings of ecological adaptation during polar bear evolution.
9. Ma L, Li Y, Chen X, Ding M, Wu Y, Yuan YJ. SCRaMbLE generates evolved yeasts with increased alkali tolerance. Microb Cell Fact 2019; 18:52. [PMID: 30857530 PMCID: PMC6410612 DOI: 10.1186/s12934-019-1102-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 12/20/2018] [Accepted: 03/04/2019] [Indexed: 11/27/2022] Open
Abstract
Background: Strains with increased alkali tolerance have broad applications in industry, especially for bioremediation, biodegradation, biocontrol and production of bio-based chemicals. A novel synthetic chromosome recombination and modification by LoxP-mediated evolution (SCRaMbLE) system has been introduced in the synthetic yeast genome (Sc 2.0), which enables generation of a yeast library with massive structural variations and potentially drives phenotypic evolution. The structural variations including deletion, inversion and duplication have been detected within synthetic yeast chromosomes.
Results: Haploid yeast strains harboring either one (synV) or two (synV and synX) synthetic chromosomes were subjected to SCRaMbLE. Seven evolved strains with increased alkali tolerance at pH 8.0 were generated through multiple independent SCRaMbLE experiments. Various structural variations were detected in evolved yeast strains by PCRTag analysis and whole genome sequencing, including two complex structural variations. One possessed an inversion of 20,743 base pairs within which YEL060C (PRB1) was deleted simultaneously, while another contained a duplication region of 9091 base pairs in length with a deletion aside. Moreover, a common deletion region with length of 11,448 base pairs was mapped in four of the alkali-tolerant strains. We further validated that the deletion of YER161C (SPT2) within the deleted region could increase alkali tolerance in Saccharomyces cerevisiae.
Conclusions: The SCRaMbLE system provides a simple and efficient way to generate evolved yeast strains with enhanced alkali tolerance. Deletion of YER161C (SPT2) mapped by SCRaMbLE can improve alkali tolerance in S. cerevisiae. This study enriches our understanding of alkali tolerance in yeast and provides a standard workflow for the application of the SCRaMbLE system to generate various phenotypes that may be interesting for industry and extend understanding of the phenotype-genotype relationship.
Electronic supplementary material: The online version of this article (10.1186/s12934-019-1102-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Lu Ma
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China
  - Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
- Yunxiang Li
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China
  - Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
- Xinyu Chen
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China
  - Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
- Mingzhu Ding
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China
  - Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
- Yi Wu
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China
  - Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
- Ying-Jin Yuan
  - Frontier Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, 300072, China
  - Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, China
10. Pezer Ž, Chung AG, Karn RC, Laukaitis CM. Analysis of Copy Number Variation in the Abp Gene Regions of Two House Mouse Subspecies Suggests Divergence during the Gene Family Expansions. Genome Biol Evol 2018; 9:3858091. [PMID: 28575204 PMCID: PMC5513543 DOI: 10.1093/gbe/evx099] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Accepted: 05/26/2017] [Indexed: 12/26/2022] Open
Abstract
The Androgen-binding protein (Abp) gene region of the mouse genome contains 64 genes, some encoding pheromones that influence assortative mating between mice from different subspecies. Using CNVnator and quantitative PCR, we explored copy number variation in this gene family in natural populations of Mus musculus domesticus (Mmd) and Mus musculus musculus (Mmm), two subspecies of house mice that form a narrow hybrid zone in Central Europe. We found that copy number variation in the center of the Abp gene region is very common in wild Mmd, primarily representing the presence/absence of the final duplications described for the mouse genome. Clustering of Mmd individuals based on this variation did not reflect their geographical origin, suggesting no population divergence in the Abp gene cluster. However, copy number variation patterns differ substantially between Mmd and other mouse taxa. Large blocks of Abp genes are absent in Mmm, Mus musculus castaneus and an outgroup, Mus spretus, although with differences in variation and breakpoint locations. Our analysis calls into question the reliance on a reference genome for interpreting the detailed organization of genes in taxa more distant from the Mmd reference genome. The polymorphic nature of the gene family expansion in all four taxa suggests that the number of Abp genes, especially in the central gene region, is not critical to the survival and reproduction of the mouse. However, Abp haplotypes of variable length may serve as a source of raw genetic material for new signals influencing reproductive communication and thus speciation of mice.
Affiliation(s)
- Željka Pezer
  - Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Ruđer Bošković Institute, Zagreb, Croatia
- Amanda G Chung
  - Department of Medicine, College of Medicine, University of Arizona
- Robert C Karn
  - Department of Medicine, College of Medicine, University of Arizona
11. Fan S, Hansen MEB, Lo Y, Tishkoff SA. Going global by adapting local: A review of recent human adaptation. Science 2017; 354:54-59. [PMID: 27846491 DOI: 10.1126/science.aaf5098] [Citation(s) in RCA: 200] [Impact Index Per Article: 25.0] [Indexed: 12/24/2022]
Abstract
The spread of modern humans across the globe has led to genetic adaptations to diverse local environments. Recent developments in genomic technologies, statistical analyses, and expanded sampled populations have led to improved identification and fine-mapping of genetic variants associated with adaptations to regional living conditions and dietary practices. Ongoing efforts in sequencing genomes of indigenous populations, accompanied by the growing availability of "-omics" and ancient DNA data, promises a new era in our understanding of recent human evolution and the origins of variable traits and disease risks.
Affiliation(s)
- Shaohua Fan
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Matthew E B Hansen
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yancy Lo
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sarah A Tishkoff
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
12. Olsson M, Kierczak M, Karlsson Å, Jabłońska J, Leegwater P, Koltookian M, Abadie J, De Citres CD, Thomas A, Hedhammar Å, Tintle L, Lindblad-Toh K, Meadows JRS. Absolute quantification reveals the stable transmission of a high copy number variant linked to autoinflammatory disease. BMC Genomics 2016; 17:299. [PMID: 27107962 PMCID: PMC4841964 DOI: 10.1186/s12864-016-2619-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/15/2015] [Accepted: 04/13/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Dissecting the role copy number variants (CNVs) play in disease pathogenesis is directly reliant on accurate methods for quantification. The Shar-Pei dog breed is predisposed to a complex autoinflammatory disease with numerous clinical manifestations. One such sign, recurrent fever, was previously shown to be significantly associated with a novel, but unstable CNV (CNV_16.1). Droplet digital PCR (ddPCR) offers a new mechanism for CNV detection via absolute quantification with the promise of added precision and reliability. The aim of this study was to evaluate ddPCR in relation to quantitative PCR (qPCR) and to assess the suitability of the favoured method as a genetic test for Shar-Pei Autoinflammatory Disease (SPAID). RESULTS One hundred and ninety-six individuals were assayed using both PCR methods at two CNV positions (CNV_14.3 and CNV_16.1). The digital method revealed a striking result. The CNVs did not follow a continuum of alleles as previously reported; rather, the alleles were stable and pedigree analysis showed they adhered to Mendelian segregation. Subsequent analysis of ddPCR case/control data confirmed that both CNVs remained significantly associated with the subphenotype of fever, and also with the encompassing SPAID complex (p < 0.001). In addition, harbouring CNV_16.1 allele five (CNV_16.1|5) resulted in a four-fold increase in the odds for SPAID (p < 0.001). The inclusion of a genetic marker for CNV_16.1 in a genome-wide association test revealed that this variant explained 9.7% of genetic variance and 25.8% of the additive genetic heritability of this autoinflammatory disease. CONCLUSIONS This data shows the utility of the ddPCR method to resolve cryptic copy number inheritance patterns and so open avenues of genetic testing. In its current form, the ddPCR test presented here could be used in canine breeding to reduce the number of homozygote CNV_16.1|5 individuals and thereby to reduce the prevalence of disease in this breed.
Affiliation(s)
- M Olsson
  - Department of Medicine, Rheumatology Unit, Karolinska Institute, Stockholm, Sweden
- M Kierczak
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Å Karlsson
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- J Jabłońska
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- P Leegwater
  - Department of Clinical Sciences of Companion Animals, Utrecht University, Utrecht, Netherlands
- M Koltookian
  - Broad Institute of MIT and Harvard, Boston, MA, USA
- J Abadie
  - LUNAM University, Oniris, AMaROC Unit, Nantes, F-44307, France
- A Thomas
  - ANTAGENE Animal Genetics Laboratory, La Tour de Salvagny, Lyon, 69, France
- Å Hedhammar
  - Department of Clinical Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
- L Tintle
  - Wurtsboro Veterinary Clinic, Wurtsboro, New York, USA
- K Lindblad-Toh
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden; Broad Institute of MIT and Harvard, Boston, MA, USA
- J R S Meadows
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
|
13
|
Phan L, Hsu J, Tri LQM, Willi M, Mansour T, Kai Y, Garner J, Lopez J, Busby B. dbVar structural variant cluster set for data analysis and variant comparison. F1000Res 2016; 5:673. [PMID: 28357035 PMCID: PMC5345777 DOI: 10.12688/f1000research.8290.2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Accepted: 11/01/2016] [Indexed: 11/20/2022] Open
Abstract
dbVar houses over 3 million submitted structural variants (SSV) from 120 human studies including copy number variations (CNV), insertions, deletions, inversions, translocations, and complex chromosomal rearrangements. Users can submit multiple SSVs to dbVar that are presumably identical, but were ascertained by different platforms and samples, to calculate whether the variant is rare or common in the population and allow for cross validation. However, because SSV genomic location reporting can vary – including fuzzy locations where the start and/or end points are not precisely known – analysis, comparison, annotation, and reporting of SSVs across studies can be difficult. This project was initiated by the Structural Variant Comparison Group for the purpose of generating a non-redundant set of genomic regions defined by counts of concordance for all human SSVs placed on RefSeq assembly GRCh38 (RefSeq accession GCF_000001405.26). We intend that the availability of these regions, called structural variant clusters (SVCs), will facilitate the analysis, annotation, and exchange of SV data and allow for simplified display in genomic sequence viewers for improved variant interpretation. Sets of SVCs were generated by variant type for each of the 120 studies as well as for a combined set across all studies. Starting from 3.64 million SSVs, 2.5 million and 3.4 million non-redundant SVCs with count ≥ 1 were generated by variant type for each study and across all studies, respectively. In addition, we have developed utilities for annotating, searching, and filtering SVC data in GVF format for computing summary statistics, exporting data for genomic viewers, and annotating the SVC using external data sources.
Affiliation(s)
- Lon Phan
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Jeffrey Hsu
  - Cleveland Clinic Lerner Research Institute, Cleveland, OH, USA
- Le Quang Minh Tri
  - Department of Biotechnology, Ho Chi Minh City International University, Ho Chi Minh, Vietnam
- Michaela Willi
  - Laboratory of Genetics and Physiology, National Institute of Diabetes, Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA; Division of Bioinformatics, Biocenter, Medical University Innsbruck, Innsbruck, Austria
- Tamer Mansour
  - Lab for Data Intensive Biology, Department of Population Health and Reproduction, University of California, Davis, CA, USA; Department of Clinical Pathology, University of Mansoura, Mansoura, Egypt
- Yan Kai
  - Cancer Epigenetics Laboratory, Department of Anatomy and Regenerative Biology, The George Washington University, Washington, DC, USA; Department of Physics, The George Washington University, Washington, DC, USA
- John Garner
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- John Lopez
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Ben Busby
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
|
14
|
Affiliation(s)
- Mario Cáceres
  - Institució Catalana de Recerca i Estudis Avançats (ICREA) and Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
|