201
Mendivil Ramos O, Ferrier DEK. Mechanisms of Gene Duplication and Translocation and Progress towards Understanding Their Relative Contributions to Animal Genome Evolution. International Journal of Evolutionary Biology 2012; 2012:846421. [PMID: 22919542 PMCID: PMC3420103 DOI: 10.1155/2012/846421] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 03/26/2012] [Revised: 05/30/2012] [Accepted: 06/27/2012] [Indexed: 01/10/2023]
Abstract
Duplication of genetic material is clearly a major route to genetic change, with consequences for both evolution and disease. A variety of forms and mechanisms of duplication are recognised, operating across scales from a few base pairs up to entire genomes. With the ever-increasing amounts of gene and genome sequence data that are becoming available, our understanding of the extent of duplication is greatly improving, in terms of both the scales of duplication events and their rates of occurrence. An accurate understanding of these processes is vital if we are to properly understand important events in evolution as well as mechanisms operating at the level of genome organisation. Here we will focus on duplication in animal genomes and how the duplicated sequences are distributed, with the aim of maintaining a focus on principles of evolution and organisation that are most directly applicable to the shaping of our own genome.
Affiliation(s)
- David E. K. Ferrier
- The Scottish Oceans Institute, School of Biology, University of St Andrews, East Sands, Fife KY16 8LB, UK
202
Du R, Lu C, Jiang Z, Li S, Ma R, An H, Xu M, An Y, Xia Y, Jin L, Wang X, Zhang F. Efficient typing of copy number variations in a segmental duplication-mediated rearrangement hotspot using multiplex competitive amplification. J Hum Genet 2012; 57:545-551. [PMID: 22673690 DOI: 10.1038/jhg.2012.66] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 12/17/2022]
Abstract
Local genomic architecture, such as segmental duplications (SDs), can induce copy number variation (CNV) hotspots in the human genome, many of which manifest as genomic disorders. Significant technological advances have been achieved for genome-wide CNV investigations, but these costly methods are not suitable for genotyping certain disease-associated CNVs or other loci of interest in populations. Recently, two independent studies showed that the murine meiosis expressed gene 1 (Meig1) was critical to spermatogenesis. We found that the human orthologue MEIG1 is flanked by an SD pair, between which non-allelic homologous recombination (NAHR) can cause recurrent CNVs. To study this potential CNV hotspot and its role in spermatogenesis, we developed a new CNV genotyping method, AccuCopy, based on multiplex competitive amplification to investigate 320 patients with spermatogenic impairment and 93 healthy controls. Three MEIG1 duplications (two in patients and one in controls) were identified, whereas no deletion was found. As NAHR results in more recurrent deletions than duplications at a locus, the overrepresentation of recurrent MEIG1 duplications suggests a potential purifying selection operating on this hotspot, possibly via fecundity. We also showed that AccuCopy is an efficient and reliable method for multiplex CNV genotyping.
Affiliation(s)
- Renqian Du
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
203
Carrigan MA, Uryasev O, Davis RP, Zhai L, Hurley TD, Benner SA. The natural history of class I primate alcohol dehydrogenases includes gene duplication, gene loss, and gene conversion. PLoS One 2012; 7:e41175. [PMID: 22859968 PMCID: PMC3409193 DOI: 10.1371/journal.pone.0041175] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 04/18/2012] [Accepted: 06/18/2012] [Indexed: 01/29/2023]
Abstract
BACKGROUND Gene duplication is a source of molecular innovation throughout evolution. However, even with massive amounts of genome sequence data, correlating gene duplication with speciation and other events in natural history can be difficult. This is especially true in its most interesting cases, where rapid and multiple duplications are likely to reflect adaptation to rapidly changing environments and lifestyles. This may be so for Class I of alcohol dehydrogenases (ADH1s), where multiple duplications occurred in primate lineages in Old and New World monkeys (OWMs and NWMs) and hominoids. METHODOLOGY/PRINCIPAL FINDINGS To build a preferred model for the natural history of ADH1s, we determined the sequences of nine new ADH1 genes, finding for the first time multiple paralogs in various prosimians (lemurs, strepsirhines). Database mining then identified novel ADH1 paralogs in both macaque (an OWM) and marmoset (a NWM). These were used with the previously identified human paralogs to resolve controversies relating to dates of duplication and gene conversion in the ADH1 family. Central to these controversies are differences in the topologies of trees generated from exonic (coding) sequences and intronic sequences. CONCLUSIONS/SIGNIFICANCE We provide evidence that gene conversions are the primary source of difference, using molecular clock dating of duplications and analyses of microinsertions and deletions (micro-indels). The tree topology inferred from intron sequences appears to more correctly represent the natural history of ADH1s, with the ADH1 paralogs in platyrrhines (NWMs) and catarrhines (OWMs and hominoids) having arisen by duplications shortly predating the divergence of OWMs and NWMs. We also conclude that paralogs in lemurs arose independently. Finally, we identify errors in database interpretation as the source of controversies concerning gene conversion. These analyses provide a model for the natural history of ADH1s that posits four ADH1 paralogs in the ancestor of Catarrhine and Platyrrhine primates, followed by the loss of an ADH1 paralog in the human lineage.
Affiliation(s)
- Matthew A Carrigan
- Foundation for Applied Molecular Evolution, Gainesville, Florida, United States of America.
204
Mácha J, Teichmanová R, Sater AK, Wells DE, Tlapáková T, Zimmerman LB, Krylov V. Deep ancestry of mammalian X chromosome revealed by comparison with the basal tetrapod Xenopus tropicalis. BMC Genomics 2012; 13:315. [PMID: 22800176 PMCID: PMC3472169 DOI: 10.1186/1471-2164-13-315] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 01/30/2012] [Accepted: 06/25/2012] [Indexed: 01/08/2023]
Abstract
BACKGROUND The X and Y sex chromosomes are conspicuous features of placental mammal genomes. Mammalian sex chromosomes arose from an ordinary pair of autosomes after the proto-Y acquired a male-determining gene and degenerated due to suppression of X-Y recombination. Analysis of earlier steps in X chromosome evolution has been hampered by the long interval between the origins of teleost and amniote lineages as well as scarcity of X chromosome orthologs in incomplete avian genome assemblies. RESULTS This study clarifies the genesis and remodelling of the Eutherian X chromosome by using a combination of sequence analysis, meiotic map information, and cytogenetic localization to compare amniote genome organization with that of the amphibian Xenopus tropicalis. Nearly all orthologs of human X genes localize to X. tropicalis chromosomes 2 and 8, consistent with an ancestral X-conserved region and a single X-added region precursor. This finding contradicts a previous hypothesis of three evolutionary strata in this region. Homologies between human, opossum, chicken and frog chromosomes suggest a single X-added region predecessor in therian mammals, corresponding to opossum chromosomes 4 and 7. A more ancient X-added ancestral region, currently extant as a major part of chicken chromosome 1, is likely to have been present in the progenitor of synapsids and sauropsids. Analysis of X chromosome gene content emphasizes conservation of single protein-coding genes and the role of tandem arrays in formation of novel genes. CONCLUSIONS Chromosomal regions orthologous to Therian X chromosomes have been located in the genome of the frog X. tropicalis. These X chromosome ancestral components experienced a series of fusion and breakage events to give rise to avian autosomes and mammalian sex chromosomes. The simple diploid genome of the early branching tetrapod X. tropicalis and its robust synteny to amniotes greatly enhance studies of vertebrate chromosome evolution.
Affiliation(s)
- Jaroslav Mácha
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Radka Teichmanová
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Amy K Sater
- Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
- Dan E Wells
- Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
- Tereza Tlapáková
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Lyle B Zimmerman
- Division of Developmental Biology, MRC-National Institute for Medical Research, Mill Hill, London, NW7 1AA, UK
- Vladimír Krylov
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
205
Mannaert A, Downing T, Imamura H, Dujardin JC. Adaptive mechanisms in pathogens: universal aneuploidy in Leishmania. Trends Parasitol 2012; 28:370-6. [PMID: 22789456 DOI: 10.1016/j.pt.2012.06.003] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Received: 04/26/2012] [Revised: 06/14/2012] [Accepted: 06/14/2012] [Indexed: 02/07/2023]
Abstract
Genomic stability and maintenance of the correct chromosome number are assumed to be essential for normal development in eukaryotes. Aneuploidy is usually associated with severe abnormalities and decrease of cell fitness, but some organisms appear to rely on aneuploidy for rapid adaptation to changing environments. This phenomenon is mostly described in pathogenic fungi and cancer cells. However, recent genome studies highlight the importance of Leishmania as a new model for studies on aneuploidy. Several reports revealed extensive variation in chromosome copy number, indicating that aneuploidy is a constitutive feature of this protozoan parasite genus. Aneuploidy appears to be beneficial in organisms that are primarily asexual, unicellular, and that undergo sporadic epidemic expansions, including common pathogens as well as cancer.
Affiliation(s)
- An Mannaert
- Unit of Molecular Parasitology, Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium
206
Abstract
RAD51 is important for restarting stalled replication forks and for repairing DNA double-strand breaks (DSBs) through a pathway called homology-directed repair (HDR). However, analysis of the consequences of specific RAD51 mutants has been difficult since they are toxic. Here we report on the dominant effects of two human RAD51 mutants defective for ATP binding (K133A) or ATP hydrolysis (K133R) expressed in mouse embryonic stem (ES) cells that also expressed normal mouse RAD51 from the other chromosome. These cells were defective for restarting stalled replication forks and repairing breaks. They were also hypersensitive to camptothecin, a genotoxin that generates breaks specifically at the replication fork. In addition, these cells exhibited a wide range of structural chromosomal changes that included multiple breakpoints within the same chromosome. Thus, ATP binding and hydrolysis are essential for chromosomal maintenance. Fusion of RAD51 to a fluorescent tag (enhanced green fluorescent protein [eGFP]) allowed visualization of these proteins at sites of replication and repair. We found very low levels of mutant protein present at these sites compared to normal protein, suggesting that low levels of mutant protein were sufficient for disruption of RAD51 activity and generation of chromosomal rearrangements.
207
Marotta M, Piontkivska H, Tanaka H. Molecular trajectories leading to the alternative fates of duplicate genes. PLoS One 2012; 7:e38958. [PMID: 22720000 PMCID: PMC3375281 DOI: 10.1371/journal.pone.0038958] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/21/2012] [Accepted: 05/14/2012] [Indexed: 11/21/2022]
Abstract
Gene duplication generates extra gene copies in which mutations can accumulate without risking the function of pre-existing genes. Such mutations modify duplicates and contribute to evolutionary novelties. However, the vast majority of duplicates appear to be short-lived and experience duplicate silencing within a few million years. Little is known about the molecular mechanisms leading to these alternative fates. Here we delineate differing molecular trajectories of a relatively recent duplication event between humans and chimpanzees by investigating molecular properties of a single duplicate: DNA sequences, gene expression and promoter activities. The inverted duplication of the Glutathione S-transferase Theta 2 (GSTT2) gene had occurred at least 7 million years ago in the common ancestor of African great apes and is preserved in chimpanzees (Pan troglodytes), whereas a deletion polymorphism is prevalent in humans. The alternative fates are associated with expression divergence between these species, and reduced expression in humans is regulated by silencing mutations that have been propagated between duplicates by gene conversion. In contrast, selective constraint preserved duplicate divergence in chimpanzees. The difference in evolutionary processes left a unique DNA footprint in which dying duplicates are significantly more similar to each other (99.4%) than preserved ones. Such molecular trajectories could provide insights for the mechanisms underlying duplicate life and death in extant genomes.
Affiliation(s)
- Michael Marotta
- Department of Molecular Genetics, Cleveland Clinic Foundation, Cleveland, Ohio, United States of America
- Helen Piontkivska
- Department of Biological Sciences, Kent State University, Kent, Ohio, United States of America
- Hisashi Tanaka
- Department of Molecular Genetics, Cleveland Clinic Foundation, Cleveland, Ohio, United States of America
208
Charrier C, Joshi K, Coutinho-Budd J, Kim JE, Lambert N, de Marchena J, Jin WL, Vanderhaeghen P, Ghosh A, Sassa T, Polleux F. Inhibition of SRGAP2 function by its human-specific paralogs induces neoteny during spine maturation. Cell 2012; 149:923-35. [PMID: 22559944 DOI: 10.1016/j.cell.2012.03.034] [Citation(s) in RCA: 305] [Impact Index Per Article: 23.5] [Received: 12/12/2011] [Revised: 02/28/2012] [Accepted: 03/01/2012] [Indexed: 12/28/2022]
Abstract
Structural genomic variations represent a major driving force of evolution, and a burst of large segmental gene duplications occurred in the human lineage during its separation from nonhuman primates. SRGAP2, a gene recently implicated in neocortical development, has undergone two human-specific duplications. Here, we find that both duplications (SRGAP2B and SRGAP2C) are partial and encode a truncated F-BAR domain. SRGAP2C is expressed in the developing and adult human brain and dimerizes with ancestral SRGAP2 to inhibit its function. In the mouse neocortex, SRGAP2 promotes spine maturation and limits spine density. Expression of SRGAP2C phenocopies SRGAP2 deficiency. It underlies sustained radial migration and leads to the emergence of human-specific features, including neoteny during spine maturation and increased density of longer spines. These results suggest that inhibition of SRGAP2 function by its human-specific paralogs has contributed to the evolution of the human neocortex and plays an important role during human brain development.
Affiliation(s)
- Cécile Charrier
- Department of Cell Biology, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA 92037, USA
209
Simmons AD, Carvalho CMB, Lupski JR. What have studies of genomic disorders taught us about our genome? Methods Mol Biol 2012; 838:1-27. [PMID: 22228005 DOI: 10.1007/978-1-61779-507-7_1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/05/2023]
Abstract
The elucidation of genomic disorders began with molecular technologies that enabled detection of genomic changes which were (a) smaller than those resolved by traditional cytogenetics (less than 5 Mb) and (b) larger than what could be determined by conventional gel electrophoresis. Methods such as pulsed field gel electrophoresis (PFGE) and fluorescent in situ hybridization (FISH) could resolve such changes but were limited to locus-specific studies. The study of genomic disorders has rapidly advanced with the development of array-based techniques. These enabled examination of the entire human genome at a higher level of resolution, thus allowing elucidation of the basis of many new disorders, of the mechanisms underlying genomic changes that can result in copy number variation (CNV), and, most importantly, a deeper understanding of the characteristics, features, and plasticity of our genome. In this chapter, we focus on the structural and architectural features of the genome that can potentially result in genomic instability, delineate how mechanisms such as NAHR, NHEJ, and FoSTeS/MMBIR lead to disease-causing rearrangements, and briefly describe the relationship between the leading methods presently used in studying genomic disorders. We end with a discussion of our new understanding of our genome, including the contribution of new mutation CNV to disease, the abundance of mosaicism, the extent of subtelomeric rearrangements, the frequency of de novo rearrangements associated with sporadic birth defects, the occurrence of balanced and unbalanced translocations, the increasing discovery of insertional translocations, and the exploration of complex rearrangements and exonic CNVs. In the postgenomic era, our understanding of the genome has advanced very rapidly as the level of technical resolution has become higher. This leads to a greater understanding of the effects of rearrangements present both in healthy subjects and individuals with clinically relevant phenotypes.
210
Bailey J. Lessons from chimpanzee-based research on human disease: the implications of genetic differences. Altern Lab Anim 2012; 39:527-40. [PMID: 22243397 DOI: 10.1177/026119291103900608] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 01/03/2023]
Abstract
Assertions that the use of chimpanzees to investigate human diseases is valid scientifically are frequently based on a reported 98-99% genetic similarity between the species. Critical analyses of the relevance of chimpanzee studies to human biology, however, indicate that this genetic similarity does not result in sufficient physiological similarity for the chimpanzee to constitute a good model for research, and furthermore, that chimpanzee data do not translate well to progress in clinical practice for humans. Leading examples include the minimal citations of chimpanzee research that is relevant to human medicine, the highly different pathology of HIV/AIDS and hepatitis C virus infection in the two species, the lack of correlation in the efficacy of vaccines and treatments between chimpanzees and humans, and the fact that chimpanzees are not useful for research on human cancer. The major molecular differences underlying these inter-species phenotypic disparities have been revealed by comparative genomics and molecular biology - there are key differences in all aspects of gene expression and protein function, from chromosome and chromatin structure to post-translational modification. The collective effects of these differences are striking, extensive and widespread, and they show that the superficial similarity between human and chimpanzee genetic sequences is of little consequence for biomedical research. The extrapolation of biomedical data from the chimpanzee to the human is therefore highly unreliable, and the use of the chimpanzee must be considered of little value, particularly given the breadth and potential of alternative methods of enquiry that are currently available to science.
211
Bekpen C, Tastekin I, Siswara P, Akdis CA, Eichler EE. Primate segmental duplication creates novel promoters for the LRRC37 gene family within the 17q21.31 inversion polymorphism region. Genome Res 2012; 22:1050-8. [PMID: 22419166 PMCID: PMC3371713 DOI: 10.1101/gr.134098.111] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/25/2022]
Abstract
The LRRC37 gene family maps to a complex region of the human genome and has been subjected to multiple rounds of segmental duplication. We investigate the expression and regulation of this gene family in multiple tissues and organisms and show a testis-specific expression of this gene family in mouse but a more ubiquitous pattern of expression among primates. Evolutionary and phylogenetic analyses support a model in which new alternative promoters have been acquired during primate evolution. We identify two promoters, Cl8 and particularly Cl3, both of which are highly active in the cerebellum and fetal brain in human and have been duplicated from a promoter region of two unrelated genes, BPTF and DND1, respectively. Two of these more broadly expressed gene family members, LRRC37A1 and A4, define the boundary of a common human inversion polymorphism mapping to chromosome 17q21.31 (the MAPT locus)—a region associated with risk for frontal temporal dementia, Parkinsonism, and intellectual disability. We propose that the regulation of the LRRC37 family occurred in a stepwise manner, acquiring foreign promoters from BPTF and DND1 via segmental duplication. This unusual evolutionary trajectory altered the regulation of the LRRC37 family, leading to increased expression in the fetal brain and cerebellum.
212
Abbasi AA, Hanif H. Phylogenetic history of paralogous gene quartets on human chromosomes 1, 2, 8 and 20 provides no evidence in favor of the vertebrate octoploidy hypothesis. Mol Phylogenet Evol 2012; 63:922-7. [PMID: 22425707 DOI: 10.1016/j.ympev.2012.02.028] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 09/22/2011] [Revised: 02/08/2012] [Accepted: 02/27/2012] [Indexed: 01/24/2023]
Abstract
Fourfold paralogy regions in the human genome have been considered historical remnants of whole-genome duplication events predicted to have occurred early in vertebrate evolution. Taking advantage of the well-annotated and high-quality human genomic sequence map as well as the ever-increasing accessibility of large-scale genomic sequence data from a diverse range of animal species, we investigated the prediction that the ancestral vertebrate genome was shaped by two rapid rounds of whole-genome duplication within a period of 10 million years. Both the map self-comparison approach and a phylogenetic analysis revealed that gene families identified as tetralogous on human chromosomes 1/2/8/20 arose by small-scale duplication events that occurred at widely different time points in animal evolution. Furthermore, the data discount the likelihood that tree topologies of the form ((A,B)(C,D)) are best explained by the octoploidy hypothesis. We instead propose that such symmetrical tree patterns are also consistent with local duplications and rearrangement events.
Affiliation(s)
- Amir Ali Abbasi
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan.
213
Weise A, Mrasek K, Klein E, Mulatinho M, Llerena JC, Hardekopf D, Pekova S, Bhatt S, Kosyakova N, Liehr T. Microdeletion and microduplication syndromes. J Histochem Cytochem 2012; 60:346-58. [PMID: 22396478 DOI: 10.1369/0022155412440001] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.2] [Indexed: 11/22/2022]
Abstract
The widespread use of whole genome analysis based on array comparative genomic hybridization in diagnostics and research has led to a continuously growing number of microdeletion and microduplication syndromes (MMSs) connected to certain phenotypes. These MMSs also include increasing instances in which the critical region can be reciprocally deleted or duplicated. This review catalogues the currently known MMSs and the corresponding critical regions including phenotypic consequences. Besides the pathogenic pathways leading to such rearrangements, the different detection methods and their limitations are discussed. Finally, the databases available for distinguishing between reported benign or pathogenic copy number alterations are highlighted. Overall, a review of MMSs that previously were also denoted "genomic disorders" or "contiguous gene syndromes" is given.
Affiliation(s)
- Anja Weise
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Jena, Germany.
214
Molina O, Anton E, Vidal F, Blanco J. High rates of de novo 15q11q13 inversions in human spermatozoa. Mol Cytogenet 2012; 5:11. [PMID: 22309495 PMCID: PMC3293048 DOI: 10.1186/1755-8166-5-11] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/19/2011] [Accepted: 02/06/2012] [Indexed: 11/22/2022]
Abstract
Low-Copy Repeats predispose the 15q11-q13 region to non-allelic homologous recombination. We have already demonstrated that a significant percentage of Prader-Willi syndrome (PWS) fathers have an increased susceptibility to generate 15q11q13 deletions in spermatozoa, suggesting the participation of intrachromatid exchanges. This work has been focused on assessing the incidence of de novo 15q11q13 inversions in spermatozoa of control donors and PWS fathers in order to determine the basal rates of inversions and to confirm the intrachromatid mechanism as the main cause of 15q11q13 anomalies. Semen samples from 10 control donors and 16 PWS fathers were processed and analyzed by triple-color FISH. Three differentially labeled BAC-clones were used: one proximal and two distal of the 15q11-q13 region. Signal associations allowed the discrimination between normal and inverted haplotypes, which were confirmed by laser-scanning confocal microscopy. Two types of inversions were detected which correspond to the segments involved in Class I and II PWS deletions. No significant differences were observed in the mean frequencies of inversions between controls and PWS fathers (3.59% ± 0.46 and 9.51% ± 0.87 vs 3.06% ± 0.33 and 10.07% ± 0.74). Individual comparisons showed significant increases of inversions in four PWS fathers (P < 0.05) previously reported as patients with increases of 15q11q13 deletions. Results suggest that the incidence of heterozygous inversion carriers in the general population could reach significant values. This situation could have important implications, as they have been described as predisposing haplotypes for genomic disorders. As a whole, results confirm the high instability of the 15q11-q13 region, which is prone to different types of de novo reorganizations by intrachromatid NAHR.
Affiliation(s)
- Oscar Molina
- Unitat de Biologia Cel·lular (Facultat de Biociències), Universitat Autònoma de Barcelona, 08193-Bellaterra (Cerdanyola del Vallès), SPAIN.
215
Derrien T, Estellé J, Marco Sola S, Knowles DG, Raineri E, Guigó R, Ribeca P. Fast computation and applications of genome mappability. PLoS One 2012; 7:e30377. [PMID: 22276185 PMCID: PMC3261895 DOI: 10.1371/journal.pone.0030377] [Citation(s) in RCA: 327] [Impact Index Per Article: 25.2] [Received: 01/28/2011] [Accepted: 12/19/2011] [Indexed: 01/17/2023]
Abstract
We present a fast mapping-based algorithm to compute the mappability of each region of a reference genome up to a specified number of mismatches. Knowing the mappability of a genome is crucial for the interpretation of massively parallel sequencing experiments. We investigate the properties of the mappability of eukaryotic DNA/RNA both as a whole and at the level of the gene family, providing for various organisms tracks which allow the mappability information to be visually explored. In addition, we show that mappability varies greatly between species and gene classes. Finally, we suggest several practical applications where mappability can be used to refine the analysis of high-throughput sequencing data (SNP calling, gene expression quantification and paired-end experiments). This work highlights mappability as an important concept which deserves to be taken into full account, in particular when massively parallel sequencing technologies are employed. The GEM mappability program belongs to the GEM (GEnome Multitool) suite of programs, which can be freely downloaded for any use from its website (http://gemlibrary.sourceforge.net).
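As a rough illustration of the quantity this abstract discusses, the sketch below scores each position of a toy sequence by the inverse of how often the k-mer starting there occurs. It is not the GEM algorithm: it assumes exact matches only (zero mismatches), scans the forward strand only, and uses a brute-force count. The function name `kmer_mappability` and the toy sequence are hypothetical.

```python
from collections import Counter


def kmer_mappability(genome: str, k: int) -> list[float]:
    """Score each k-mer start position as 1 / (exact occurrences of that k-mer).

    Assumption for this sketch: mappability is the inverse of the k-mer's
    occurrence count, with no mismatches and no reverse-complement handling.
    """
    if k <= 0 or k > len(genome):
        return []
    # Collect the k-mer that starts at every valid position.
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    # Count how many times each distinct k-mer appears in the sequence.
    counts = Counter(kmers)
    # A unique k-mer scores 1.0; a k-mer seen n times scores 1/n.
    return [1.0 / counts[kmer] for kmer in kmers]


# Toy sequence (hypothetical): "ACGT" occurs twice, every other 4-mer once.
toy = "ACGTACGTTT"
scores = kmer_mappability(toy, 4)
```

Positions with a score of 1.0 are uniquely mappable at this k; lower scores mark repeated sequence, where a read of that length cannot be placed unambiguously.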
Affiliation(s)
- Thomas Derrien
- Institut de Génétique et Développement (IGDR), Université Rennes 1, Rennes, France
- * E-mail: (TD); (PR)
- Jordi Estellé
- Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain
- David G. Knowles
- Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain
- Roderic Guigó
- Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain
- Paolo Ribeca
- Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain
- * E-mail: (TD); (PR)
216
Uddin M, Sturge M, Peddle L, O'Rielly DD, Rahman P. Genome-wide signatures of 'rearrangement hotspots' within segmental duplications in humans. PLoS One 2011; 6:e28853. [PMID: 22194928 PMCID: PMC3237539 DOI: 10.1371/journal.pone.0028853] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 06/19/2011] [Accepted: 11/16/2011] [Indexed: 11/19/2022]
Abstract
The primary objective of this study was to create a genome-wide, high-resolution map (i.e., >100 bp) of ‘rearrangement hotspots’ that can facilitate the identification of regions capable of mediating de novo deletions or duplications in humans. A hierarchical method was employed to fragment segmental duplications (SDs) into multiple smaller SD units. Combining an end-space-free pairwise alignment algorithm with a ‘seed and extend’ approach, we exhaustively searched 409 million alignments to detect complex structural rearrangements within the reference-guided assembly of the NA18507 human genome (18× coverage), including the previously identified novel 4.8 Mb sequence from de novo assembly within this genome. We identified 1,963 rearrangement hotspots within SDs, which encompass 166 genes and display an enrichment of duplicated gene nucleotide variants (DNVs). These regions are correlated with increased non-allelic homologous recombination (NAHR) event frequency, which presumably represents the origin of copy number variations (CNVs) and pathogenic duplications/deletions. Analysis revealed that 20% of the detected hotspots are clustered within the proximal and distal SD breakpoints flanked by the pathogenic deletions/duplications that have been mapped for 24 NAHR-mediated genomic disorders. FISH validation of selected complex regions revealed 94% concordance with in silico localization of the highly homologous derivatives. Other results from this study indicate that intra-chromosomal recombination is enhanced in genic compared with agenic duplicated regions, and that gene desert regions comprising SDs may represent reservoirs for the creation of novel genes. The generation of genome-wide signatures of ‘rearrangement hotspots’, which likely serve as templates for NAHR, may provide a powerful approach towards understanding the underlying mutational mechanism(s) for the development of constitutional and acquired diseases.
Affiliation(s)
- Mohammed Uddin
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Mitch Sturge
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Lynette Peddle
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Darren D. O'Rielly
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
- Proton Rahman
- Faculty of Medicine, Discipline of Medicine and Genetics, Memorial University, St. John's, Newfoundland, Canada
|
217
|
Baker RH, Kuehl JV, Wilkinson GS. The Enhancer of split complex arose prior to the diversification of schizophoran flies and is strongly conserved between Drosophila and stalk-eyed flies (Diopsidae). BMC Evol Biol 2011; 11:354. [PMID: 22151427 PMCID: PMC3261227 DOI: 10.1186/1471-2148-11-354] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/15/2011] [Accepted: 12/08/2011] [Indexed: 02/03/2023]
Abstract
Background: In Drosophila, the Enhancer of split complex (E(spl)-C) comprises 11 bHLH and Bearded genes that function during Notch signaling to repress proneural identity in the developing peripheral nervous system. Comparison with other insects indicates that the basal state for Diptera is a single bHLH and Bearded homolog and that the expansion of the gene complex occurred in the lineage leading to Drosophila. However, comparative genomic data from other fly species that would elucidate the origin and sequence of gene duplication for the complex is lacking. Therefore, in order to examine the evolutionary history of the complex within Diptera, we reconstructed, using several fosmid clones, the entire E(spl)-complex in the stalk-eyed fly, Teleopsis dalmanni, and collected additional homologs of E(spl)-C genes from searches of dipteran EST databases and the Glossina morsitans genome assembly. Results: Comparison of the Teleopsis E(spl)-C gene organization with Drosophila indicates complete conservation in gene number and orientation between the species except that T. dalmanni contains a duplicated copy of E(spl)m5 that is not present in Drosophila. Phylogenetic analysis of E(spl)-complex bHLH and Bearded genes for several dipteran species clearly demonstrates that all members of the complex were present prior to the diversification of schizophoran flies. Comparison of upstream regulatory elements and 3' UTR domains between the species also reveals strong conservation for many of the genes and identifies several novel characteristics of E(spl)-C regulatory evolution including the discovery of a previously unidentified, highly conserved SPS+A domain between E(spl)mγ and E(spl)mβ. Conclusion: Identifying the phylogenetic origin of E(spl)-C genes and their associated regulatory DNA is essential to understanding the functional significance of this well-studied gene complex. Results from this study provide numerous insights into the evolutionary history of the complex and will help refine the focus of studies examining the adaptive consequences of this gene expansion.
Affiliation(s)
- Richard H Baker
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA.
|
218
|
Alvarez CE, Akey JM. Copy number variation in the domestic dog. Mamm Genome 2011; 23:144-63. [PMID: 22138850 DOI: 10.1007/s00335-011-9369-8] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 08/28/2011] [Accepted: 10/09/2011] [Indexed: 12/13/2022]
Abstract
Differences in the content and organization of DNA, collectively referred to as structural variation, have emerged as a major source of genetic and phenotypic diversity within and between species. In addition, structural variation provides an important substrate for evolutionary innovations. Here, we review recent progress in characterizing patterns of canine structural variation within and between breeds, and in correlating copy number variants (CNVs) with phenotypes. Because of the extensive phenotypic diversity that exists within and between breeds and the tantalizing examples of canine CNVs that influence traits such as skin wrinkling in Shar-Pei, dorsal hair ridge in Rhodesian and Thai Ridgebacks, and short limbs in many breeds such as Dachshunds and Corgis, we argue that domesticated dogs are uniquely poised to contribute novel insights into CNV biology. As new technologies continue to be developed and refined, the field of canine genomics is on the precipice of a deeper understanding of how structural variation and CNVs contribute to canine genetic diversity, phenotypic variation, and disease susceptibility.
Affiliation(s)
- Carlos E Alvarez
- The Center for Human and Molecular Genetics, The Research Institute at Nationwide Children's Hospital, 700 Children's Drive, W491, Columbus, OH 43205, USA.
|
219
|
Koroteev MV, Miller J. Scale-free duplication dynamics: a model for ultraduplication. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:061919. [PMID: 22304128 DOI: 10.1103/physreve.84.061919] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/17/2010] [Revised: 07/04/2011] [Indexed: 05/31/2023]
Abstract
Empirical studies of the genome-wide length distribution of duplicated sequences have revealed an algebraic tail common to nearly all clades. The decay of the tail is often well approximated by a single exponent that takes values within a limited range. We propose and study here scale-free duplication dynamics, a class of model for genome sequence evolution that generates the observed shapes of this distribution. A transition between self-similar and non-self-similar regimes is exhibited. Our model accounts plausibly for the observed form of the algebraic tail, which is not produced by standard models for generating long-range sequence correlations.
Affiliation(s)
- M V Koroteev
- Physics and Biology Unit, Okinawa Institute of Science and Technology Suzaki 12-22, Uruma, Okinawa 904-2234, Japan
|
220
|
Abstract
This review summarizes aspects of the extensive literature on the patterns and processes underpinning chromosomal evolution in vertebrates, especially placental mammals. It highlights the growing synergy between molecular cytogenetics and comparative genomics, particularly with respect to fully or partially sequenced genomes, and provides novel insights into changes in chromosome number and structure across deep divisions of the vertebrate tree of life. The examination of basal numbers in the deeper branches of the vertebrate tree suggests a haploid (n) chromosome number of 10-13 in an ancestral vertebrate, with modest increases in tetrapods and amniotes, most probably by chromosomal fissioning. Information drawn largely from cross-species chromosome painting in the data-dense Placentalia permits the confident reconstruction of an ancestral karyotype comprising n=23 chromosomes that is similarly retained in Boreoeutheria. Using in silico genome-wide scans that include the newly released frog genome, we show that of the nine ancient syntenies detected in conserved karyotypes of extant placentals (thought likely to reflect the structure of ancestral chromosomes), the human syntenic segmental associations 3p/21, 4pq/8p, 7a/16p, 14/15, 12qt/22q and 12pq/22qt predate the divergence of tetrapods. These findings underscore the enhanced quality of ancestral reconstructions based on the integrative molecular cytogenetic and comparative genomic approaches that collectively highlight a pattern of conserved syntenic associations that extends back ∼360 million years.
|
221
|
Farré M, Bosch M, López-Giráldez F, Ponsà M, Ruiz-Herrera A. Assessing the role of tandem repeats in shaping the genomic architecture of great apes. PLoS One 2011; 6:e27239. [PMID: 22076140 PMCID: PMC3208591 DOI: 10.1371/journal.pone.0027239] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 04/06/2011] [Accepted: 10/12/2011] [Indexed: 11/18/2022]
Abstract
Background: Ancestral reconstructions of mammalian genomes have revealed that evolutionary breakpoint regions are clustered in regions that are more prone to break and reorganize. What is still unclear to evolutionary biologists is whether these regions are physically unstable due solely to sequence composition and/or genome organization, or whether they represent genomic areas where the selection against breakpoints is minimal. Methodology and Principal Findings: Here we present a comprehensive study of the distribution of tandem repeats in great apes. We analyzed the distribution of tandem repeats in relation to the localization of evolutionary breakpoint regions in the human, chimpanzee, orangutan and macaque genomes. We observed an accumulation of tandem repeats in the genomic regions implicated in chromosomal reorganizations. In the case of the human genome, our analyses revealed that evolutionary breakpoint regions contained more base pairs implicated in tandem repeats than synteny blocks did, with the AAAT motif being the most frequently involved in evolutionary regions. We found that those AAAT repeats located in evolutionary regions were preferentially associated with Alu elements. Significance: Our observations provide evidence for the role of tandem repeats in shaping mammalian genome architecture. We hypothesize that an accumulation of specific tandem repeats in evolutionary regions can promote genome instability by altering the state of the chromatin conformation or by promoting the insertion of transposable elements.
Affiliation(s)
- Marta Farré
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Francesc López-Giráldez
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Montserrat Ponsà
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Aurora Ruiz-Herrera
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Institut de Biotecnologia i Biomedicina (IBB), Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
|
222
|
|
223
|
Zhang YE, Landback P, Vibranovski MD, Long M. Accelerated recruitment of new brain development genes into the human genome. PLoS Biol 2011; 9:e1001179. [PMID: 22028629 PMCID: PMC3196496 DOI: 10.1371/journal.pbio.1001179] [Citation(s) in RCA: 119] [Impact Index Per Article: 8.5] [Received: 03/25/2011] [Accepted: 09/08/2011] [Indexed: 11/24/2022]
Abstract
How the human brain evolved has attracted tremendous interest for decades. Motivated by case studies of primate-specific genes implicated in brain function, we examined whether or not the young genes, those emerging genome-wide in the lineages specific to the primates or rodents, showed distinct spatial and temporal patterns of transcription compared to old genes, which had existed before the primate and rodent split. We found consistent patterns across different sources of expression data: there is a significantly larger proportion of young genes expressed in the fetal or infant brain of humans than in mouse, and more young genes in humans have expression biased toward early developing brains than old genes. Most of these young genes are expressed in the evolutionarily newest part of the human brain, the neocortex. Remarkably, we also identified a number of human-specific genes which are expressed in the prefrontal cortex, which is implicated in complex cognitive behaviors. The young genes upregulated in the early developing human brain play diverse functional roles, with a significant enrichment of transcription factors. Genes originating from different mechanisms show a similar expression bias in the developing brain. Moreover, we found that the young genes upregulated in early brain development showed rapid protein evolution compared to old genes also expressed in the fetal brain. Strikingly, genes expressed in the neocortex arose soon after its morphological origin. These four lines of evidence suggest that positive selection for brain function may have contributed to the origination of young genes expressed in the developing brain. These data demonstrate a striking recruitment of new genes into the early development of the human brain.
Affiliation(s)
- Yong E. Zhang
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Patrick Landback
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Maria D. Vibranovski
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
|
224
|
Cooper DN, Bacolla A, Férec C, Vasquez KM, Kehrer-Sawatzki H, Chen JM. On the sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease. Hum Mutat 2011; 32:1075-99. [PMID: 21853507 PMCID: PMC3177966 DOI: 10.1002/humu.21557] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.4] [Received: 03/27/2011] [Accepted: 06/17/2011] [Indexed: 12/21/2022]
Abstract
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The human genome is now recognized to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here, we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of noncanonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair and may serve to increase mutation frequencies in generalized fashion (i.e., both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom.
|
225
|
Shammas MA. Repetitive sequences, genomic instability and Barrett's esophageal adenocarcinoma. Mob Genet Elements 2011; 1:208-212. [PMID: 22479688 DOI: 10.4161/mge.1.3.17456] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/30/2011] [Accepted: 07/22/2011] [Indexed: 11/19/2022]
Abstract
Barrett's esophageal adenocarcinoma (BAC) is a cancer associated with heartburn. If gastroesophageal reflux is not treated, exposure to acid over the years leads to a premalignant condition known as Barrett's esophagus (BE), which then progresses through low-grade and high-grade dysplasias to Barrett's adenocarcinoma. Genomic instability, which seems to arise early at the BE stage, leads to the accrual of mutational changes that underlie the succession of histological and physiological changes associated with this disease. Genomic instability is therefore an important target for the prevention and treatment of cancer, and it is important to elucidate the mechanisms associated with this problem. We have shown that elevated/deregulated homologous recombination mediates genomic instability in cancer. Recently we also demonstrated that the mutational rates of individual chromosomes in BAC cells correlate with their ALU frequency. The aims of this article are to briefly discuss different types of repetitive sequences and highlight their importance in the physiology of normal and cancer cells, especially BAC.
Affiliation(s)
- Masood A Shammas
- Department of Medical Oncology; Harvard (Dana Farber) Cancer Institute; Boston, MA USA; VA Boston Healthcare System; West Roxbury, MA USA
|
226
|
Ezawa K, Ikeo K, Gojobori T, Saitou N. Evolutionary patterns of recently emerged animal duplogs. Genome Biol Evol 2011; 3:1119-35. [PMID: 21859807 PMCID: PMC3194840 DOI: 10.1093/gbe/evr074] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/13/2022]
Abstract
Duplogs, or intraspecies paralogs, constitute an important portion of eukaryote genomes and serve as a major source of functional innovation. We conducted detailed analyses of recently emerged animal duplogs. Genome data of three vertebrate species (Homo sapiens, Mus musculus, and Danio rerio), Caenorhabditis elegans, and two Drosophila species (Drosophila melanogaster and D. pseudoobscura) were used. Duplication events were divided into six age-groups according to the synonymous distance (dS) up to 0.6. Duplogs were classified into four equal-sized classes on physical distances and into three classes on relative orientations. We observed the following shared characteristics among intrachromosomal multiexon duplogs: 1) inverted duplogs account for 20-50%, and about half of the physically most distant 25%; 2) except for C. elegans, the composition of physical distances, that of relative orientations, and the proportion of inverted duplogs in each physical distance category are more or less uniform; 3) except for C. elegans, the characteristics of the youngest (dS < 0.01) duplogs are similar to the overall characteristics of the entire set. These results suggest that intrachromosomal duplogs with fairly long physical distances were generated at once, rather than resulting from tandem duplications and subsequent genomic rearrangements. This is different from the three well-known modes of gene duplication: tandem duplication, retrotransposition, and genome duplication. We term this new mode "drift" duplication. Drift duplication has been producing duplicate copies at paces comparable with tandem duplications since the common ancestor of vertebrates, and it may have already operated in the common ancestor of bilateral animals.
Affiliation(s)
- Kiyoshi Ezawa
- Division of Population Genetics, National Institute of Genetics, Mishima, Japan
|
227
|
Algebraic distribution of segmental duplication lengths in whole-genome sequence self-alignments. PLoS One 2011; 6:e18464. [PMID: 21779315 PMCID: PMC3136455 DOI: 10.1371/journal.pone.0018464] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/17/2010] [Accepted: 03/08/2011] [Indexed: 01/25/2023]
Abstract
Distributions of duplicated sequences from genome self-alignment are characterized, including forward and backward alignments in bacteria and eukaryotes. A Markovian process without auto-correlation should generate an exponential distribution expected from local effects of point mutation and selection on localised function; however, the observed distributions show substantial deviation from exponential form – they are roughly algebraic instead – suggesting a novel kind of long-distance correlation that must be non-local in origin.
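The contrast between an exponential and an algebraic (power-law) tail can be checked informally: exponential counts fall on a straight line when log(count) is plotted against length, while algebraic counts fall on a straight line when log(count) is plotted against log(length). The sketch below compares the two straight-line fits. It is an illustrative heuristic only, not the analysis used in the paper; the names `r_squared` and `tail_shape` and the synthetic data are hypothetical.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def tail_shape(lengths, counts):
    """Label a length distribution by whichever axis scaling is more linear.

    Assumes all lengths and counts are positive and not all equal.
    """
    log_counts = [math.log(c) for c in counts]
    # Exponential decay: log(count) is linear in length.
    semilog_fit = r_squared(lengths, log_counts)
    # Algebraic decay: log(count) is linear in log(length).
    loglog_fit = r_squared([math.log(l) for l in lengths], log_counts)
    return "algebraic" if loglog_fit > semilog_fit else "exponential"


# Synthetic examples (not data from the paper).
lengths = [10, 20, 40, 80, 160]
power_law_counts = [1e6 * l ** -3 for l in lengths]
exponential_counts = [1e6 * math.exp(-l / 20) for l in lengths]
```

On real self-alignment data a proper comparison would need a likelihood-based fit and a careful choice of the minimum length; the R² comparison here only conveys the idea.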
|
228
|
Chung D, Kuan PF, Li B, Sanalkumar R, Liang K, Bresnick EH, Dewey C, Keleş S. Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data. PLoS Comput Biol 2011; 7:e1002111. [PMID: 21779159 PMCID: PMC3136429 DOI: 10.1371/journal.pcbi.1002111] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.6] [Received: 01/20/2011] [Accepted: 05/18/2011] [Indexed: 11/19/2022]
Abstract
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
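The idea of allocating multi-reads as fractional counts can be sketched as follows. The abstract does not specify the weighting scheme, so this example makes a loud assumption: each multi-read is split across its candidate locations in proportion to the uni-read count already observed there, plus a pseudocount. The function name `allocate_multireads` and its parameters are hypothetical and do not reflect the authors' implementation.

```python
def allocate_multireads(uni_counts, multi_reads, pseudocount=1.0):
    """Split each multi-read across its candidate locations.

    uni_counts:  dict mapping location -> number of uniquely mapped reads.
    multi_reads: list of lists; each inner list holds the candidate
                 locations of one multi-read.
    Assumed weighting (not from the paper): weight at a location is its
    uni-read count plus a pseudocount, normalised to sum to 1 per read.
    """
    totals = {loc: float(count) for loc, count in uni_counts.items()}
    for candidates in multi_reads:
        weights = [uni_counts.get(loc, 0) + pseudocount for loc in candidates]
        norm = sum(weights)
        for loc, weight in zip(candidates, weights):
            # Each read contributes exactly one count in total.
            totals[loc] = totals.get(loc, 0.0) + weight / norm
    return totals


# One multi-read that aligns equally well to locations "A" and "B".
result = allocate_multireads({"A": 3, "B": 0}, [["A", "B"]])
```

In this toy case location "A" receives most of the read because it already has uni-read support, while "B" still receives a small share, so a peak lying entirely in a repeated region is not discarded outright.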
Affiliation(s)
- Dongjun Chung
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
- Pei Fen Kuan
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Bo Li
- Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin, United States of America
- Rajendran Sanalkumar
- Wisconsin Institutes for Medical Research, UW Carbone Cancer Center, Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States of America
- Kun Liang
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
- Emery H. Bresnick
- Wisconsin Institutes for Medical Research, UW Carbone Cancer Center, Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States of America
- Colin Dewey
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin, United States of America
- Sündüz Keleş
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
|
229
|
Transcriptional variations mediated by an alternative promoter of the FPR3 gene. Mamm Genome 2011; 22:621-33. [PMID: 21717223 DOI: 10.1007/s00335-011-9341-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 04/21/2010] [Accepted: 05/20/2011] [Indexed: 10/18/2022]
Abstract
Formyl peptide receptor 3 (FPR3) is a potential player in innate immunity and appears with FPR2 as an FPR cluster during primate evolution. Comparative genome analyses indicate that a segmental duplication (SD) event upstream of the FPR3 gene after the divergence of New and Old World monkeys led to the emergence of an alternative promoter. In this study we combined computational and experimental approaches to identify an FPR3 gene that is controlled by an alternative promoter derived during an SD event. Its transcriptional activity was detected by quantitative reverse transcription polymerase chain reaction. Human alternative transcripts (FPR3-1 and FPR3-2) showed tissue-specific patterns with strong expression in lung or uterus, while the FPR3-1 transcript of rhesus macaque is broadly expressed in various tissues. Overall, transcriptional variations of FPR3 are mediated by an alternative promoter that emerged during primate evolution.
|
230
|
Abstract
The past two decades have witnessed tremendous advances in noninvasive and postmortem neuroscientific techniques, advances that have made it possible, for the first time, to compare in detail the organization of the human brain to that of other primates. Studies comparing humans to chimpanzees and other great apes reveal that human brain evolution was not merely a matter of enlargement, but involved changes at all levels of organization that have been examined. These include the cellular and laminar organization of cortical areas; the higher order organization of the cortex, as reflected in the expansion of association cortex (in absolute terms, as well as relative to primary areas); the distribution of long-distance cortical connections; and hemispheric asymmetry. Additionally, genetic differences between humans and other primates have proven to be more extensive than previously thought, raising the possibility that human brain evolution involved significant modifications of neurophysiology and cerebral energy metabolism.
Affiliation(s)
- Todd M Preuss
- Division of Neuropathology and Neurodegenerative Diseases and Center for Translational Social Neuroscience, Yerkes National Primate Research Center, Emory University, Atlanta, Georgia 30329, USA.
|
231
|
Ventura M, Catacchio CR, Alkan C, Marques-Bonet T, Sajjadian S, Graves TA, Hormozdiari F, Navarro A, Malig M, Baker C, Lee C, Turner EH, Chen L, Kidd JM, Archidiacono N, Shendure J, Wilson RK, Eichler EE. Gorilla genome structural variation reveals evolutionary parallelisms with chimpanzee. Genome Res 2011; 21:1640-9. [PMID: 21685127 DOI: 10.1101/gr.124461.111] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Indexed: 11/25/2022]
Abstract
Structural variation has played an important role in the evolutionary restructuring of human and great ape genomes. Recent analyses have suggested that the genomes of chimpanzee and human have been particularly enriched for this form of genetic variation. Here, we set out to assess the extent of structural variation in the gorilla lineage by generating 10-fold genomic sequence coverage from a western lowland gorilla and integrating these data into a physical and cytogenetic framework of structural variation. We discovered and validated over 7665 structural changes within the gorilla lineage, including sequence resolution of inversions, deletions, duplications, and mobile element insertions. A comparison with human and other ape genomes shows that the gorilla genome has been subjected to the highest rate of segmental duplication. We show that both the gorilla and chimpanzee genomes have experienced independent yet convergent patterns of structural mutation that have not occurred in humans, including the formation of subtelomeric heterochromatic caps, the hyperexpansion of segmental duplications, and bursts of retroviral integrations. Our analysis suggests that the chimpanzee and gorilla genomes are structurally more derived than either orangutan or human genomes.
Affiliation(s)
- Mario Ventura
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
|
232
|
Irwin DM, Biegel JM, Stewart CB. Evolution of the mammalian lysozyme gene family. BMC Evol Biol 2011; 11:166. [PMID: 21676251 PMCID: PMC3141428 DOI: 10.1186/1471-2148-11-166] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 10/17/2010] [Accepted: 06/15/2011] [Indexed: 02/08/2023]
Abstract
BACKGROUND: Lysozyme c (chicken-type lysozyme) has an important role in host defense, and has been extensively studied as a model in molecular biology, enzymology, protein chemistry, and crystallography. Traditionally, lysozyme c has been considered to be part of a small family that includes genes for two other proteins, lactalbumin, which is found only in mammals, and calcium-binding lysozyme, which is found in only a few species of birds and mammals. More recently, additional testes-expressed members of this family have been identified in human and mouse, suggesting that the mammalian lysozyme gene family is larger than previously known. RESULTS: Here we characterize the extent and diversity of the lysozyme gene family in the genomes of phylogenetically diverse mammals, and show that this family contains at least eight different genes that likely duplicated prior to the diversification of extant mammals. These duplicated genes have largely been maintained, both in intron-exon structure and in genomic context, throughout mammalian evolution. CONCLUSIONS: The mammalian lysozyme gene family is much larger than previously appreciated and consists of at least eight distinct genes scattered around the genome. Since the lysozyme c and lactalbumin proteins have acquired very different functions during evolution, it is likely that many of the other members of the lysozyme-like family will also have diverse and unexpected biological properties.
Affiliation(s)
- David M Irwin
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada
- Banting and Best Diabetes Centre, University of Toronto, Toronto, Canada
- Jason M Biegel
- Department of Biological Sciences, University at Albany, State University of New York, Albany, New York 12222, USA
- Caro-Beth Stewart
- Department of Biological Sciences, University at Albany, State University of New York, Albany, New York 12222, USA
|
233
|
Ramalingam A, Zhou XG, Fiedler SD, Brawner SJ, Joyce JM, Liu HY, Yu S. 16p13.11 duplication is a risk factor for a wide spectrum of neuropsychiatric disorders. J Hum Genet 2011; 56:541-4. [PMID: 21614007 DOI: 10.1038/jhg.2011.42] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.2] [Indexed: 01/10/2023]
Abstract
The chromosome 16p13.11 heterozygous deletion is associated with a diverse array of neuropsychiatric disorders, including intellectual disabilities, autism, schizophrenia, epilepsy and attention-deficit hyperactivity disorder. However, the clinical significance of its reciprocal duplication is not yet clearly defined. We evaluated 1645 consecutive pediatric patients with various developmental disorders by high-resolution microarray-based comparative genomic hybridization and identified four deletions and eight duplications within the 16p13.11 region, representing ∼0.73% (12/1645) of the patients analyzed. Recurrent clinical features in these patients include mental retardation/intellectual disability, autism, seizures, dysmorphic features or multiple congenital anomalies. Our data expand the spectrum of clinical findings in patients with these genomic abnormalities and provide further support for the pathogenic involvement of this duplication in patients who carry it.
Affiliation(s)
- Arivudainambi Ramalingam
- Department of Pathology, Children's Mercy Hospitals and Clinics and University of Missouri-Kansas City School of Medicine, Kansas City, MO 64108, USA
|
234
|
Villa N, Bentivegna A, Ertel A, Redaelli S, Colombo C, Nacinovich R, Broggi F, Lissoni S, Bungaro S, Addya S, Fortina P, Dalprà L. A de novo supernumerary genomic discontinuous ring chromosome 21 in a child with mild intellectual disability. Am J Med Genet A 2011; 155A:1425-31. [PMID: 21574245 DOI: 10.1002/ajmg.a.34010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2010] [Accepted: 02/17/2011] [Indexed: 11/10/2022]
Abstract
Small supernumerary marker chromosomes (sSMCs) are structurally abnormal extra chromosomes that cannot be unambiguously identified or characterized by conventional banding techniques alone, and they are generally equal in size to or smaller than chromosome 20 of the same metaphase spread. Small supernumerary ring chromosomes (sSRCs), a smaller class of marker chromosomes, comprise about 10% of the cases. For various reasons these marker chromosomes have been the most difficult to characterize; although specific syndromes have not yet been defined, 60% of cases are associated with an abnormal phenotype. The chromosomal material involved, the degree and tissue distribution of mosaicism, and the possible presence of uniparental disomy are the important factors determining whether or not the ring chromosome will give rise to symptoms. Using conventional and molecular cytogenetic approaches, we identified a de novo chromosome 21 sSRC in a child with speech delay and mild intellectual disability. By using aCGH analysis and SNP arrays, we report the presence of two discontinuous regions of chromosome 21 and the paternal origin of the sSRC. A thorough neuropsychiatric evaluation is also provided. Only a few other cases of complex discontinuous ring chromosomes have been described in detail.
Affiliation(s)
- Nicoletta Villa
- Medical Genetics Laboratory, S. Gerardo Hospital, Monza, Italy
|
235
|
Polymorphic family of injected pseudokinases is paramount in Toxoplasma virulence. Proc Natl Acad Sci U S A 2011; 108:9625-30. [PMID: 21436047 DOI: 10.1073/pnas.1015980108] [Citation(s) in RCA: 196] [Impact Index Per Article: 14.0] [Indexed: 11/18/2022] Open
Abstract
Toxoplasma gondii, an obligate intracellular parasite of the phylum Apicomplexa, has the unusual ability to infect virtually any warm-blooded animal. It is an extraordinarily successful parasite, infecting an estimated 30% of humans worldwide. The outcome of Toxoplasma infection is highly dependent on allelic differences in the large number of effectors that the parasite secretes into the host cell. Here, we show that the largest determinant of the virulence difference between two of the most common strains of Toxoplasma is the ROP5 locus. This is an unusual segment of the Toxoplasma genome consisting of a family of 4-10 tandem, highly divergent genes encoding pseudokinases that are injected directly into host cells. Given their hypothesized catalytic inactivity, it is striking that deletion of the ROP5 cluster in a highly virulent strain caused a complete loss of virulence, showing that ROP5 proteins are, in fact, indispensable for Toxoplasma to cause disease in mice. We find that copy number at this locus varies among the three major Toxoplasma lineages and that extensive polymorphism is clustered into hotspots within the ROP5 pseudokinase domain. We propose that the ROP5 locus represents an unusual evolutionary strategy for sampling of sequence space in which the gene encoding an important enzyme has been (i) catalytically inactivated, (ii) expanded in number, and (iii) subject to strong positive selection. Such a strategy likely contributes to Toxoplasma's successful adaptation to a wide host range and has resulted in dramatic differences in virulence.
|
236
|
Kurahashi H, Inagaki H, Ohye T, Kogo H, Tsutsumi M, Kato T, Tong M, Emanuel BS. The constitutional t(11;22): implications for a novel mechanism responsible for gross chromosomal rearrangements. Clin Genet 2011; 78:299-309. [PMID: 20507342 DOI: 10.1111/j.1399-0004.2010.01445.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Indexed: 12/13/2022]
Abstract
The constitutional t(11;22)(q23;q11) is the most common recurrent non-Robertsonian translocation in humans. The breakpoint sequences of both chromosomes are characterized by several hundred base pairs of palindromic AT-rich repeats (PATRRs). Similar PATRRs have also been identified at the breakpoints of other nonrecurrent translocations, suggesting that PATRR-mediated chromosomal translocation represents one of the universal pathways for gross chromosomal rearrangement in the human genome. We propose that PATRRs have the potential to form cruciform structures through intrastrand-base pairing in single-stranded DNA, creating a source of genomic instability and leading to translocations. Indeed, de novo examples of the t(11;22) are detected at a high frequency in sperm from normal healthy males. This review synthesizes recent data illustrating a novel paradigm for an apparent spermatogenesis-specific translocation mechanism. This observation has important implications pertaining to the predominantly paternal origin of de novo gross chromosomal rearrangements in humans.
Affiliation(s)
- H Kurahashi
- Division of Molecular Genetics, Institute for Comprehensive Medical Science, Fujita Health University, Toyoake, Aichi, Japan.
|
237
|
Levasseur A, Pontarotti P. The role of duplications in the evolution of genomes highlights the need for evolutionary-based approaches in comparative genomics. Biol Direct 2011; 6:11. [PMID: 21333002 PMCID: PMC3052240 DOI: 10.1186/1745-6150-6-11] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.1] [Received: 09/28/2010] [Accepted: 02/18/2011] [Indexed: 12/02/2022] Open
Abstract
Understanding the evolutionary plasticity of the genome requires a global, comparative approach in which genetic events are considered both in a phylogenetic framework and with regard to population genetics and environmental variables. In the mechanisms that generate adaptive and non-adaptive changes in genomes, segmental duplications (duplication of individual genes or genomic regions) and polyploidization (whole genome duplications) are well-known driving forces. The probability of fixation and maintenance of duplicates depends on many variables, including population sizes and selection regimes experienced by the corresponding genes: a combination of stochastic and adaptive mechanisms has shaped all genomes. A survey of experimental work shows that the distinction made between fixation and maintenance of duplicates still needs to be conceptualized and mathematically modeled. Here we review the mechanisms that increase or decrease the probability of fixation or maintenance of duplicated genes, and examine the outcome of these events on the adaptation of the organisms. Reviewers: This article was reviewed by Dr. Etienne Joly, Dr. Lutz Walter and Dr. W. Ford Doolittle.
Affiliation(s)
- Anthony Levasseur
- INRA, UMR1163 de Biotechnologie des Champignons Filamenteux, IFR86-BAIM, Universités de Provence et de la Méditerranée, ESIL, 163 avenue de Luminy, CP 925, 13288 Marseille Cedex 09, France.
|
238
|
Pramanik S, Cui X, Wang HY, Chimge NO, Hu G, Shen L, Gao R, Li H. Segmental duplication as one of the driving forces underlying the diversity of the human immunoglobulin heavy chain variable gene region. BMC Genomics 2011; 12:78. [PMID: 21272357 PMCID: PMC3042411 DOI: 10.1186/1471-2164-12-78] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 06/23/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022] Open
Abstract
Background: Segmental duplication and deletion were implicated for a region containing the human immunoglobulin heavy chain variable (IGHV) gene segments, 1.9III/hv3005 (possible allelic variants of IGHV3-30) and hv3019b9 (a possible allelic variant of IGHV3-33). However, very little is known about the ranges of the duplication and the polymorphic region. This is mainly because of the difficulty associated with distinguishing between allelic and paralogous sequences in the IGHV region containing extensive repetitive sequences. Inability to separate the two parental haploid genomes in the subjects is another serious barrier. To address these issues, unique DNA sequence tags evenly distributed within and flanking the duplicated region implicated by the previous studies were selected. The selected tags in single sperm from six unrelated healthy donors were amplified by multiplex PCR followed by microarray detection. In this way, individual haplotypes of different parental origins in the sperm donors could be analyzed separately and precisely. The identified polymorphic region was further analyzed at the nucleotide sequence level using sequences from the three human genomic sequence assemblies in the database. Results: A large polymorphic region was identified using the selected sequence tags. Four of the 12 haplotypes were shown to contain consecutively undetectable tags spanning a variable range. Detailed analysis of sequences from the genomic sequence assemblies revealed two large duplicate sequence blocks of 24,696 bp and 24,387 bp, respectively, and an incomplete copy of 961 bp in this region. It contains up to 13 IGHV gene segments depending on haplotypes. A polymorphic region was found to be located within the duplicated blocks. The variants of this polymorphism unusually diverged at the nucleotide sequence level and in IGHV gene segment number, composition and organization, indicating a limited selection pressure in general. However, the divergence level within the gene segments is significantly different from that in the intergenic regions, indicating that these regions may have been subject to different selection pressures and that the IGHV gene segments in this region are functionally important. Conclusions: Non-reciprocal genetic rearrangements associated with large duplicate sequence blocks could substantially contribute to the IGHV region diversity. Since the resulting polymorphisms may affect the number, composition and organization of the gene segments in this region, they may have a significant impact on the function of the IGHV gene segment repertoire, antibody diversity, and therefore, the immune system. Because one of the gene segments, 3-30 (1.9III), is associated with autoimmune diseases, it could be of diagnostic significance to learn about the variants in the haplotypes by using the multiplex haplotype analysis system used in the present study with DNA sequence tags specific for the variants of all gene segments in this region.
Affiliation(s)
- Sreemanta Pramanik
- Department of Molecular Genetics, Microbiology, and Immunology, University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, NJ 08854, USA
|
239
|
Guo X, Freyer L, Morrow B, Zheng D. Characterization of the past and current duplication activities in the human 22q11.2 region. BMC Genomics 2011; 12:71. [PMID: 21269513 PMCID: PMC3040729 DOI: 10.1186/1471-2164-12-71] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 10/15/2010] [Accepted: 01/26/2011] [Indexed: 12/02/2022] Open
Abstract
Background: Segmental duplications (SDs) on 22q11.2 (LCR22) serve as substrates for meiotic non-allelic homologous recombination (NAHR) events resulting in several clinically significant genomic disorders. Results: To understand the duplication activity leading to the complicated SD structure of this region, we have applied the A-Bruijn graph algorithm to decompose the 22q11.2 SDs into 523 fundamental duplication sequences, termed subunits. Cross-species syntenic analysis of primate genomes demonstrates that many of these LCR22 subunits emerged very recently, especially those implicated in human genomic disorders. Some subunits have expanded more actively than others, and young Alu SINEs are associated much more frequently with duplicated sequences that have undergone active expansion, confirming their role in mediating recombination events. Many copy number variations (CNVs) exist on 22q11.2, some flanked by SDs. Interestingly, two chromosome breakpoints for 13 CNVs (mean length 65 kb) are located in paralogous subunits, providing direct evidence that SD subunits could contribute to CNV formation. Sequence analysis of PACs or BACs identified extra CNVs, specifically, 10 insertions and 18 deletions within 22q11.2; four were more than 10 kb in size and most contained young AluYs at their breakpoints. Conclusions: Our study indicates that AluYs are implicated in the past and current duplication events, and moreover suggests that DNA rearrangements in 22q11.2 genomic disorders perhaps do not occur randomly but involve both actively expanded duplication subunits and Alu elements.
Affiliation(s)
- Xingyi Guo
- Department of Neurology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
|
240
|
van Binsbergen E. Origins and Breakpoint Analyses of Copy Number Variations: Up Close and Personal. Cytogenet Genome Res 2011; 135:271-6. [DOI: 10.1159/000330267] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023] Open
|
241
|
Vogler C, Gschwind L, Röthlisberger B, Huber A, Filges I, Miny P, Auschra B, Stetak A, Demougin P, Vukojevic V, Kolassa IT, Elbert T, de Quervain DJF, Papassotiropoulos A. Microarray-based maps of copy-number variant regions in European and sub-Saharan populations. PLoS One 2010; 5:e15246. [PMID: 21179565 PMCID: PMC3002949 DOI: 10.1371/journal.pone.0015246] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 08/19/2010] [Accepted: 11/16/2010] [Indexed: 02/03/2023] Open
Abstract
The genetic basis of phenotypic variation can be partially explained by the presence of copy-number variations (CNVs). Currently available methods for CNV assessment include high-density single-nucleotide polymorphism (SNP) microarrays that have become an indispensable tool in genome-wide association studies (GWAS). However, insufficient concordance rates between different CNV assessment methods call for cautious interpretation of results from CNV-based genetic association studies. Here we provide a cross-population, microarray-based map of copy-number variant regions (CNVRs) to enable reliable interpretation of CNV association findings. We used the Affymetrix Genome-Wide Human SNP Array 6.0 to scan the genomes of 1167 individuals from two ethnically distinct populations (Europe, N = 717; Rwanda, N = 450). Three different CNV-finding algorithms were tested and compared for sensitivity, specificity, and feasibility. Two algorithms were subsequently used to construct CNVR maps, which were also validated by processing subsamples with additional microarray platforms (Illumina 1M-Duo BeadChip, Nimblegen 385K aCGH array) and by comparing our data with publicly available information. Both algorithms detected a total of 42669 CNVs, 74% of which clustered in 385 CNVRs of a cross-population map. These CNVRs overlap with 862 annotated genes and account for approximately 3.3% of the haploid human genome. We created comprehensive cross-populational CNVR-maps. They represent an extendable framework that can leverage the detection of common CNVs and additionally assist in interpreting CNV-based association studies.
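The abstract does not state how individual CNV calls were clustered into CNVRs. A minimal Python sketch of one common convention, assumed here rather than taken from the paper, is to merge calls on the same chromosome whenever they overlap by at least one base. The function name, the tuple layout, and the example coordinates are all hypothetical.

```python
from collections import defaultdict


def merge_cnvs_into_regions(cnvs):
    """Merge overlapping CNV calls into copy-number variant regions (CNVRs).

    cnvs: iterable of (chromosome, start, end) tuples with start <= end.
    Returns a list of (chromosome, start, end, n_calls) tuples, sorted by
    chromosome name and start coordinate.
    """
    # Group calls by chromosome so that only calls on the same
    # chromosome can be merged with each other.
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        count = 0
        # Sweep the calls in order of start coordinate.
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is not None and start <= cur_end:
                # Assumed rule: any overlap extends the current region.
                cur_end = max(cur_end, end)
                count += 1
            else:
                if cur_start is not None:
                    regions.append((chrom, cur_start, cur_end, count))
                cur_start, cur_end, count = start, end, 1
        if cur_start is not None:
            regions.append((chrom, cur_start, cur_end, count))
    return regions


# Hypothetical calls, for illustration only.
example_calls = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 500, 600),
    ("chr2", 100, 120),
]
example_regions = merge_cnvs_into_regions(example_calls)
```

With these example calls, the two overlapping chr1 calls collapse into one region and the other two calls stay separate. Pipelines that require a minimum reciprocal overlap instead of any overlap would produce different region boundaries.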
Affiliation(s)
- Christian Vogler
- Department of Psychology, University of Basel, and Department of Biomedicine, University Children's Hospital, Basel, Switzerland.
|
242
|
Abstract
Gene duplications represent an important class of evolutionary events that is likely to have contributed to the unique human phenotype in the short evolutionary time since the human-chimpanzee divergence. With the availability of both human and chimpanzee genome drafts in high coverage re-sequencing assemblies and the high annotation quality of most human genes, it should now be possible to identify all human lineage-specific gene duplication events (human inparalogues), and a few pioneering studies have attempted to do that. However, the different levels of coverage in the human and chimpanzee genome assemblies, and the differing levels of gene annotation, have led to problematic assumptions and oversimplifications in the algorithms and the datasets used to detect human lineage-specific gene duplications. In this study, we have developed a set of bioinformatic tools to overcome a number of the conceptual problems that are prevalent in previous studies and have collected a reliable and representative set of human inparalogues.
Affiliation(s)
- Yuval Itan
- Research Department of Genetics, Evolution and Environment, University College London, UK.
|
243
|
Paar V, Glunčić M, Basar I, Rosandić M, Paar P, Cvitković M. Large Tandem, Higher Order Repeats and Regularly Dispersed Repeat Units Contribute Substantially to Divergence Between Human and Chimpanzee Y Chromosomes. J Mol Evol 2010; 72:34-55. [DOI: 10.1007/s00239-010-9401-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/05/2010] [Accepted: 10/25/2010] [Indexed: 10/18/2022]
|
244
|
Holford ME, Khurana E, Cheung KH, Gerstein M. Using semantic web rules to reason on an ontology of pseudogenes. Bioinformatics 2010; 26:i71-8. [PMID: 20529940 PMCID: PMC2881358 DOI: 10.1093/bioinformatics/btq173] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Recent years have seen the development of a wide range of biomedical ontologies. Notable among these is Sequence Ontology (SO), which offers a rich hierarchy of terms and relationships that can be used to annotate genomic data. Well-designed formal ontologies allow data to be reasoned upon in a consistent and logically sound way and can lead to the discovery of new relationships. The Semantic Web Rule Language (SWRL) augments the capabilities of a reasoner by allowing the creation of conditional rules. To date, however, formal reasoning, especially the use of SWRL rules, has not been widely used in biomedicine. RESULTS: We have built a knowledge base of human pseudogenes, extending the existing SO framework to incorporate additional attributes. In particular, we have defined the relationships between pseudogenes and segmental duplications. We then created a series of logical rules using SWRL to answer research questions and to annotate our pseudogenes appropriately. Finally, we were left with a knowledge base which could be queried to discover information about human pseudogene evolution. AVAILABILITY: The fully populated knowledge base described in this document is available for download from http://ontology.pseudogene.org. A SPARQL endpoint from which to query the dataset is also available at this location.
Affiliation(s)
- Matthew E Holford
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.
|
245
|
Parrott AM, Tsai M, Batchu P, Ryan K, Ozer HL, Tian B, Mathews MB. The evolution and expression of the snaR family of small non-coding RNAs. Nucleic Acids Res 2010; 39:1485-500. [PMID: 20935053 PMCID: PMC3045588 DOI: 10.1093/nar/gkq856] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022] Open
Abstract
We recently identified the snaR family of small non-coding RNAs that associate in vivo with the nuclear factor 90 (NF90/ILF3) protein. The major human species, snaR-A, is an RNA polymerase III transcript with restricted tissue distribution and orthologs in chimpanzee but not rhesus macaque or mouse. We report their expression in human tissues and their evolution in primates. snaR genes are exclusively in African Great Apes and some are unique to humans. Two novel families of snaR-related genetic elements were found in primates: CAS (catarrhine ancestor of snaR), limited to Old World Monkeys and apes; and ASR (Alu/snaR-related), present in all monkeys and apes. ASR and CAS appear to have spread by retrotransposition, whereas most snaR genes have spread by segmental duplication. snaR-A and snaR-G2 are differentially expressed in discrete regions of the human brain and other tissues, notably including testis. snaR-A is up-regulated in transformed and immortalized human cells, and is stably bound to ribosomes in HeLa cells. We infer that snaR evolved from the left monomer of the primate-specific Alu SINE family via ASR and CAS in conjunction with major primate speciation events, and suggest that snaRs participate in tissue- and species-specific regulation of cell growth and translation.
Affiliation(s)
- Andrew M Parrott
- Department of Biochemistry and Molecular Biology, New Jersey Medical School, UMDNJ, Newark, New Jersey, USA
|
246
|
Fu W, Zhang F, Wang Y, Gu X, Jin L. Identification of copy number variation hotspots in human populations. Am J Hum Genet 2010; 87:494-504. [PMID: 20920665 DOI: 10.1016/j.ajhg.2010.09.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 05/03/2010] [Revised: 08/09/2010] [Accepted: 09/15/2010] [Indexed: 01/22/2023] Open
Abstract
Copy number variants (CNVs) in the human genome contribute to both Mendelian and complex traits as well as to genomic plasticity in evolution. The investigation of mutational rates of CNVs is critical to understanding genomic instability and the etiology of copy number variation (CNV)-related traits. However, the evaluation of the CNV mutation rate at the genome level poses an insurmountable practical challenge that requires large samples and accurate typing. In this study, we show that an approximate estimation of the CNV mutation rate could be achieved by using the phylogeny information of flanking SNPs. This allows a genome-wide comparison of mutation rates between CNVs using the vast, readily available SNP genotyping data. A total of 4187 CNV regions (CNVRs) previously identified in HapMap populations were investigated in this study. We showed that the mutation rates for the majority of these CNVRs are on the order of 10⁻⁵ per generation, consistent with experimental observations at individual loci. Notably, the mutation rates of 104 (2.5%) CNVRs were estimated to be on the order of 10⁻³ per generation; therefore, they were identified as potential hotspots. Additional analyses revealed that genome architecture at CNV loci has a potential role in inciting mutational hotspots in the human genome. Interestingly, 49 (47%) CNV hotspots include human genes, some of which are known to be functional CNV loci (e.g., CNVs of C4 and β-defensin causing autoimmune diseases and CNVs of HYDIN with implication in control of cerebral cortex size), implicating the important role of CNV in human health and evolution, especially in common and complex diseases.
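The proportions quoted in the abstract can be checked directly from the reported counts. The short Python sketch below only redoes that arithmetic; the helper name and variable names are hypothetical.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts reported in the abstract.
total_cnvrs = 4187       # CNVRs investigated
hotspot_cnvrs = 104      # CNVRs with rates on the order of 1e-3 per generation
hotspots_with_genes = 49 # hotspots that include human genes

# Share of all CNVRs that are potential hotspots.
hotspot_share = percent(hotspot_cnvrs, total_cnvrs)
# Share of hotspots that include human genes.
gene_share = percent(hotspots_with_genes, hotspot_cnvrs)

print(f"hotspots: {hotspot_share:.1f}% of CNVRs")
print(f"hotspots with genes: {gene_share:.0f}% of hotspots")
```

Both values round to the figures given in the abstract (2.5% and 47%).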
|
247
|
Kahn CL, Hristov BH, Raphael BJ. Parsimony and likelihood reconstruction of human segmental duplications. Bioinformatics 2010; 26:i446-52. [PMID: 20823306 PMCID: PMC2935423 DOI: 10.1093/bioinformatics/btq368] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 01/10/2023]
Abstract
MOTIVATION: Segmental duplications > 1 kb in length with ≥ 90% sequence identity between copies comprise nearly 5% of the human genome. They are frequently found in large, contiguous regions known as duplication blocks that can contain mosaic patterns of thousands of segmental duplications. Reconstructing the evolutionary history of these complex genomic regions is a non-trivial, but important task. RESULTS: We introduce parsimony and likelihood techniques to analyze the evolutionary relationships between duplication blocks. Both techniques rely on a generic model of duplication in which long, contiguous substrings are copied and reinserted over large physical distances, allowing for a duplication block to be constructed by aggregating substrings of other blocks. For the likelihood method, we give an efficient dynamic programming algorithm to compute the weighted ensemble of all duplication scenarios that account for the construction of a duplication block. Using this ensemble, we derive the probabilities of various duplication scenarios. We formalize the task of reconstructing the evolutionary history of segmental duplications as an optimization problem on the space of directed acyclic graphs. We use a simulated annealing heuristic to solve the problem for a set of segmental duplications in the human genome in both parsimony and likelihood settings. AVAILABILITY: Supplementary information is available at http://www.cs.brown.edu/people/braphael/supplements/.
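The parsimony idea in the abstract, building a duplication block by aggregating substrings of other blocks, can be illustrated with a small dynamic program. The Python sketch below is a simplification assumed for illustration and is not the authors' algorithm: blocks are modeled as plain strings of segment labels, every copied piece must appear contiguously in a single source block, and the score is the smallest number of copy operations that assemble the target. The function name and the example blocks are hypothetical.

```python
def min_duplication_events(target, sources):
    """Fewest substring copies from `sources` needed to assemble `target`.

    target: string of segment labels for the block being reconstructed.
    sources: list of strings, the candidate source blocks.
    Returns the minimum number of copy operations, or None if the target
    cannot be assembled from substrings of the sources.
    """
    n = len(target)
    inf = float("inf")
    # best[i] = fewest copies needed to build the prefix target[:i].
    best = [inf] * (n + 1)
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] == inf:
                continue  # this prefix cannot be built, so skip it
            piece = target[start:end]
            # A piece counts as one duplication event if it occurs
            # contiguously in at least one source block.
            if any(piece in src for src in sources):
                best[end] = min(best[end], best[start] + 1)
    return None if best[n] == inf else best[n]


# Hypothetical blocks: "ABC" can be copied from the first source and
# "DE" from the second, so two events suffice.
two_events = min_duplication_events("ABCDE", ["XABCY", "DEZ"])
# "Q" occurs in no source, so this target cannot be assembled.
impossible = min_duplication_events("ABQ", ["XABCY", "DEZ"])
```

The published method additionally weighs alternative scenarios by likelihood and searches over directed acyclic graphs of blocks; none of that is modeled here.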
Affiliation(s)
- Crystal L Kahn
- Department of Computer Science, Brown University, Providence, RI 02912, USA.
|
248
|
Dalloul RA, Long JA, Zimin AV, Aslam L, Beal K, Ann Blomberg L, Bouffard P, Burt DW, Crasta O, Crooijmans RPMA, Cooper K, Coulombe RA, De S, Delany ME, Dodgson JB, Dong JJ, Evans C, Frederickson KM, Flicek P, Florea L, Folkerts O, Groenen MAM, Harkins TT, Herrero J, Hoffmann S, Megens HJ, Jiang A, de Jong P, Kaiser P, Kim H, Kim KW, Kim S, Langenberger D, Lee MK, Lee T, Mane S, Marcais G, Marz M, McElroy AP, Modise T, Nefedov M, Notredame C, Paton IR, Payne WS, Pertea G, Prickett D, Puiu D, Qioa D, Raineri E, Ruffier M, Salzberg SL, Schatz MC, Scheuring C, Schmidt CJ, Schroeder S, Searle SMJ, Smith EJ, Smith J, Sonstegard TS, Stadler PF, Tafer H, Tu Z(J), Van Tassell CP, Vilella AJ, Williams KP, Yorke JA, Zhang L, Zhang HB, Zhang X, Zhang Y, Reed KM. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. PLoS Biol 2010; 8:e1000475. [PMID: 20838655 PMCID: PMC2935454 DOI: 10.1371/journal.pbio.1000475] [Citation(s) in RCA: 320] [Impact Index Per Article: 21.3] [Received: 12/21/2009] [Accepted: 07/27/2010] [Indexed: 12/11/2022] Open
Abstract
A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest.
Affiliation(s)
- Rami A. Dalloul
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Julie A. Long
- Animal Biosciences and Biotechnology Laboratory, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
| | - Aleksey V. Zimin
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America
| | - Luqman Aslam
- Animal Breeding and Genomics Centre, Wageningen University, Wageningen, the Netherlands
| | - Kathryn Beal
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Le Ann Blomberg
- Animal Biosciences and Biotechnology Laboratory, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
| | - Pascal Bouffard
- Roche Applied Science, Indianapolis, Indiana, United States of America
| | - David W. Burt
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, United Kingdom
| | - Oswald Crasta
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Chromatin Inc., Champaign, Illinois, United States of America
| | | | - Kristal Cooper
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Roger A. Coulombe
- Department of Veterinary Sciences, Utah State University, Logan, Utah, United States of America
| | - Supriyo De
- Gene Expression and Genomics Unit, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
| | - Mary E. Delany
- Department of Animal Science, University of California, Davis, California, United States of America
| | - Jerry B. Dodgson
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
| | - Jennifer J. Dong
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
| | - Clive Evans
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
| | | | - Paul Flicek
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Liliana Florea
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
| | - Otto Folkerts
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Chromatin Inc., Champaign, Illinois, United States of America
| | - Martien A. M. Groenen
- Animal Breeding and Genomics Centre, Wageningen University, Wageningen, the Netherlands
| | - Tim T. Harkins
- Roche Applied Science, Indianapolis, Indiana, United States of America
| | - Javier Herrero
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Steve Hoffmann
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- LIFE Project, University of Leipzig, Leipzig, Germany
| | - Hendrik-Jan Megens
- Animal Breeding and Genomics Centre, Wageningen University, Wageningen, the Netherlands
| | - Andrew Jiang
- Department of Animal Science, University of California, Davis, California, United States of America
| | - Pieter de Jong
- Children's Hospital and Research Center at Oakland, Oakland, California, United States of America
| | - Pete Kaiser
- Institute for Animal Health, Compton, Berkshire, United Kingdom
| | - Heebal Kim
- Laboratory of Bioinformatics and Population Genetics, Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
| | - Kyu-Won Kim
- Laboratory of Bioinformatics and Population Genetics, Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
| | - Sungwon Kim
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
| | - David Langenberger
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
| | - Mi-Kyung Lee
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
| | - Taeheon Lee
- Laboratory of Bioinformatics and Population Genetics, Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
| | - Shrinivasrao Mane
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Guillaume Marcais
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America
| | - Manja Marz
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Philipps-Universität Marburg, Pharmazeutische Chemie, Marburg, Germany
| | - Audrey P. McElroy
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Thero Modise
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Mikhail Nefedov
- Children's Hospital and Research Center at Oakland, Oakland, California, United States of America
| | - Cédric Notredame
- Comparative Bioinformatics, Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain
| | - Ian R. Paton
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, United Kingdom
| | - William S. Payne
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
| | - Geo Pertea
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
| | - Dennis Prickett
- Institute for Animal Health, Compton, Berkshire, United Kingdom
| | - Daniela Puiu
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
| | - Dan Qioa
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Emanuele Raineri
- Comparative Bioinformatics, Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain
| | - Magali Ruffier
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Steven L. Salzberg
- Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park, Maryland, United States of America
| | - Michael C. Schatz
- Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park, Maryland, United States of America
| | - Chantel Scheuring
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
| | - Carl J. Schmidt
- Department of Animal and Food Sciences, University of Delaware, Newark, Delaware, United States of America
| | - Steven Schroeder
- Bovine Functional Genomics Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
| | - Stephen M. J. Searle
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Edward J. Smith
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Jacqueline Smith
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, United Kingdom
| | - Tad S. Sonstegard
- Bovine Functional Genomics Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
| | - Peter F. Stadler
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Fraunhofer Institut für Zelltherapie und Immunologie, Leipzig, Germany
- Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
- Santa Fe Institute, Santa Fe, New Mexico, United States of America
| | - Hakim Tafer
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
| | - Zhijian (Jake) Tu
- Department of Biochemistry, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Curtis P. Van Tassell
- Bovine Functional Genomics Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
- Animal Improvement Programs Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
| | - Albert J. Vilella
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Kelly P. Williams
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
| | - James A. Yorke
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America
| | - Liqing Zhang
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
| | - Hong-Bin Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
| | - Xiaojun Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
| | - Yang Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
| | - Kent M. Reed
- Department of Veterinary and Biomedical Sciences, College of Veterinary Medicine, University of Minnesota, St. Paul, Minnesota, United States of America
|
249
|
|
250
|
Colobran R, Pedrosa E, Carretero-Iglesia L, Juan M. Copy number variation in chemokine superfamily: the complex scene of CCL3L-CCL4L genes in health and disease. Clin Exp Immunol 2010; 162:41-52. [PMID: 20659124 DOI: 10.1111/j.1365-2249.2010.04224.x] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 01/05/2023]
Abstract
Genome copy number changes (copy number variations: CNVs) include inherited, de novo and somatically acquired deviations from a diploid state within a particular chromosomal segment. CNVs are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. CNVs are distributed widely in the genomes of apparently healthy individuals and thus constitute significant amounts of population-based genomic variation. Human CNV loci are enriched for immune genes and one of the most striking examples of CNV in humans involves a genomic region containing the chemokine genes CCL3L and CCL4L. The CCL3L-CCL4L copy number variable region (CNVR) shows extensive architectural complexity, with smaller CNVs within the larger ones and with interindividual variation in breakpoints. Furthermore, the individual genes embedded in this CNVR account for an additional level of genetic and mRNA complexity: CCL4L1 and CCL4L2 have identical exonic sequences but produce a different pattern of mRNAs. CCL3L2 was considered previously as a CCL3L1 pseudogene, but is actually transcribed. Since 2005, CCL3L-CCL4L CNV has been associated extensively with various human immunodeficiency virus-related outcomes, but some recent studies called these associations into question. This controversy may be due in part to the differences in alternative methods for quantifying gene copy number and differentiating the individual genes. This review summarizes and discusses the current knowledge about CCL3L-CCL4L CNV and points out that elucidating their complete phenotypic impact requires dissecting the combinatorial genomic complexity posed by various proportions of distinct CCL3L and CCL4L genes among individuals.
Affiliation(s)
- R Colobran
- Laboratory of Immunobiology for Research and Application to Diagnosis (LIRAD), Tissue and Blood Bank (BST), Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol (IGTP) Servei d'Immunologia, Centre de Diagnòstic Biomèdic (CDB), Hospital Clínic, IDIBAPS (Institut d'Investigacions Biomèdiques August Pi i Sunyer), Barcelona, Spain
|