1. Dishuck PC, Munson KM, Lewis AP, Dougherty ML, Underwood JG, Harvey WT, Hsieh P, Pastinen T, Eichler EE. Structural variation, selection, and diversification of the NPIP gene family from the human pangenome. bioRxiv 2025:2025.02.04.636496. [PMID: 39975192 PMCID: PMC11838601 DOI: 10.1101/2025.02.04.636496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
The NPIP (nuclear pore interacting protein) gene family has expanded to high copy number in humans and African apes where it has been subject to an excess of amino acid replacement consistent with positive selection (1). Due to the limitations of short-read sequencing, NPIP human genetic diversity has been poorly understood. Using highly accurate assemblies generated from long-read sequencing as part of the human pangenome, we completely characterize 169 human haplotypes (4,665 NPIP paralogs and alleles). Of the 28 NPIP paralogs, just three (NPIPB2, B11, and B14) are fixed at a single copy, and only a single locus, B2, shows no structural variation. Four NPIP paralogs map to large segmental duplication blocks that mediate polymorphic inversions (355 kbp-1.6 Mbp) corresponding to microdeletions associated with developmental delay and autism. Haplotype-based tests of positive selection and selective sweeps identify two paralogs, B9 and B15, within the top percentile for both tests. Using full-length cDNA data from 101 tissue/cell types, we construct paralog-specific gene models and show that 56% (31/55 most abundant isoforms) have not been previously described in RefSeq. We define six distinct translation start sites and other protein structural features that distinguish paralogs, including a variable number tandem repeat that encodes a beta helix of variable size that emerged ~3.1 million years ago in human evolution. Among the 28 NPIP paralogs, we identify distinct tissue and developmental patterns of expression with only a few maintaining the ancestral testis-enriched expression. A subset of paralogs (NPIPA1, A5, A6-9, B3-5, and B12/B13) show increased brain expression. Our results suggest ongoing positive selection in the human population and rapid diversification of NPIP gene models.
Affiliation(s)
- Philip C. Dishuck
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M. Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P. Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Max L. Dougherty
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Present address: Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Jason G. Underwood
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Pacific Biosciences (PacBio) of California, Incorporated, Menlo Park, CA, USA
- William T. Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- PingHsun Hsieh
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Department of Genetics, Cell Biology, and Development, Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Tomi Pastinen
  - Genomic Medicine Center, Department of Pediatrics, Children’s Mercy Kansas City, Kansas City, KS, USA
  - UMKC School of Medicine, University of Missouri, Kansas City, Kansas City, KS, USA
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
2. Guitart X, Porubsky D, Yoo D, Dougherty ML, Dishuck PC, Munson KM, Lewis AP, Hoekzema K, Knuth J, Chang S, Pastinen T, Eichler EE. Independent expansion, selection, and hypervariability of the TBC1D3 gene family in humans. Genome Res 2024; 34:1798-1810. [PMID: 39107043 DOI: 10.1101/gr.279299.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 07/29/2024] [Indexed: 08/09/2024]
Abstract
TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on Chromosome 17. We find that all human copy-number variation maps to two distinct clusters located at Chromosome 17q12 and that humans are highly structurally variable at this locus, differing by as many as 20 copies and ∼1 Mbp in length depending on haplotypes. We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Last, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.
Affiliation(s)
- Xavi Guitart
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- DongAhn Yoo
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Max L Dougherty
  - Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Philip C Dishuck
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Jordan Knuth
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Stephen Chang
  - Department of Biochemistry
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University, Stanford, California 94305, USA
- Tomi Pastinen
  - Department of Pediatrics, Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, Missouri 64108, USA
  - Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, Missouri 64108, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
3. Höps W, Rausch T, Jendrusch M, Korbel JO, Sedlazeck FJ. Impact and characterization of serial structural variations across humans and great apes. Nat Commun 2024; 15:8007. [PMID: 39266513 PMCID: PMC11393467 DOI: 10.1038/s41467-024-52027-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 08/23/2024] [Indexed: 09/14/2024]
Abstract
Modern sequencing technology enables the systematic detection of complex structural variation (SV) across genomes. However, extensive DNA rearrangements arising through a series of mutations, a phenomenon we refer to as serial SV (sSV), remain underexplored, posing a challenge for SV discovery. Here, we present NAHRwhals (https://github.com/WHops/NAHRwhals), a method to infer repeat-mediated series of SVs in long-read genomic assemblies. Applying NAHRwhals to haplotype-resolved human genomes from 28 individuals reveals 37 sSV loci of various length and complexity. These sSVs explain otherwise cryptic variation in medically relevant regions such as the TPSAB1 gene, 8p23.1, 22q11 and Sotos syndrome regions. Comparisons with great ape assemblies indicate that most human sSVs formed recently, after the human-ape split, and involved non-repeat-mediated processes in addition to non-allelic homologous recombination. NAHRwhals reliably discovers and characterizes sSVs at scale and independent of species, uncovering their genomic abundance and suggesting broader implications for disease.
Affiliation(s)
- Wolfram Höps
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstr. 1, 69117, Heidelberg, Germany
- Tobias Rausch
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstr. 1, 69117, Heidelberg, Germany
  - Molecular Medicine Partnership Unit, European Molecular Biology Laboratory, University of Heidelberg, Heidelberg, Germany
- Michael Jendrusch
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstr. 1, 69117, Heidelberg, Germany
- Jan O Korbel
  - European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstr. 1, 69117, Heidelberg, Germany
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Computer Science, Rice University, Houston, TX, USA
4. Guitart X, Porubsky D, Yoo D, Dougherty ML, Dishuck PC, Munson KM, Lewis AP, Hoekzema K, Knuth J, Chang S, Pastinen T, Eichler EE. Independent expansion, selection and hypervariability of the TBC1D3 gene family in humans. bioRxiv 2024:2024.03.12.584650. [PMID: 38654825 PMCID: PMC11037872 DOI: 10.1101/2024.03.12.584650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on chromosome 17. We find that most humans vary along two TBC1D3 clusters where human haplotypes are highly variable in copy number, differing by as many as 20 copies, and structure (structural heterozygosity 90%). We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Lastly, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.
Affiliation(s)
- Xavi Guitart
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- DongAhn Yoo
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Max L. Dougherty
  - Tisch Cancer Institute, Division of Hematology and Medical Oncology, The Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Philip C. Dishuck
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M. Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P. Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Jordan Knuth
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Stephen Chang
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University, Stanford, CA, USA
- Tomi Pastinen
  - Department of Pediatrics, Genomic Medicine Center, Children’s Mercy Kansas City, Kansas City, MO, USA
  - Department of Pediatrics, School of Medicine, University of Missouri Kansas City, Kansas City, MO, USA
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
5. Soto DC, Uribe-Salazar JM, Shew CJ, Sekar A, McGinty S, Dennis MY. Genomic structural variation: A complex but important driver of human evolution. Am J Biol Anthropol 2023; 181 Suppl 76:118-144. [PMID: 36794631 PMCID: PMC10329998 DOI: 10.1002/ajpa.24713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Revised: 01/21/2023] [Accepted: 02/05/2023] [Indexed: 02/17/2023]
Abstract
Structural variants (SVs), including duplications, deletions, and inversions of DNA, can have significant genomic and functional impacts but are technically difficult to identify and assay compared with single-nucleotide variants. With the aid of new genomic technologies, it has become clear that SVs account for significant differences across and within species. This phenomenon is particularly well-documented for humans and other primates due to the wealth of sequence data available. In great apes, SVs affect a larger number of nucleotides than single-nucleotide variants, with many identified SVs exhibiting population and species specificity. In this review, we highlight the importance of SVs in human evolution by discussing (1) how they have shaped great ape genomes, resulting in sensitized regions associated with traits and diseases, (2) their impact on gene functions and regulation, which has subsequently played a role in natural selection, and (3) the role of gene duplications in human brain evolution. We further discuss how to incorporate SVs in research, including the strengths and limitations of various genomic approaches. Finally, we propose future considerations in integrating existing data and biospecimens with the ever-expanding SV compendium propelled by biotechnology advancements.
Affiliation(s)
- Daniela C. Soto
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
  - Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- José M. Uribe-Salazar
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
  - Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Colin J. Shew
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
  - Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Aarthi Sekar
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
  - Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Sean McGinty
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
  - Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
- Megan Y. Dennis
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA, USA
  - Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, USA
6. Damert A. SVA retrotransposons and a low copy repeat in humans and great apes: a mobile connection. Mol Biol Evol 2022; 39:6586216. [PMID: 35574660 PMCID: PMC9132208 DOI: 10.1093/molbev/msac103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Segmental duplications (SDs) constitute a considerable fraction of primate genomes. They contribute to genetic variation and provide raw material for evolution. Groups of SDs are characterized by the presence of shared core duplicons. One of these core duplicons, low copy repeat (lcr)16a, has been shown to be particularly active in the propagation of interspersed SDs in primates. The underlying mechanisms are, however, only partially understood. Alu short interspersed elements (SINEs) are frequently found at breakpoints and have been implicated in the expansion of SDs. Detailed analysis of lcr16a-containing SDs shows that the hominid-specific SVA (SINE-R-VNTR-Alu) retrotransposon is an integral component of the core duplicon in Asian and African great apes. In orang-utan, it provides breakpoints and contributes to both interchromosomal and intrachromosomal lcr16a mobility by inter-element recombination. Furthermore, the data suggest that in hominines (human, chimpanzee, gorilla) SVA recombination-mediated integration of a circular intermediate is the founding event of a lineage-specific lcr16a expansion. One of the hominine lcr16a copies displays large flanking direct repeats, a structural feature shared by other SDs in the human genome. Taken together, the results obtained extend the range of SVAs’ contribution to genome evolution from RNA-mediated transduction to DNA-based recombination. In addition, they provide further support for a role of circular intermediates in SD mobilization.
Affiliation(s)
- Annette Damert
  - Infection Biology Unit and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
7. Giannuzzi G, Logsdon GA, Chatron N, Miller DE, Reversat J, Munson KM, Hoekzema K, Bonnet-Dupeyron MN, Rollat-Farnier PA, Baker CA, Sanlaville D, Eichler EE, Schluth-Bolard C, Reymond A. Alpha Satellite Insertion Close to an Ancestral Centromeric Region. Mol Biol Evol 2021; 38:5576-5587. [PMID: 34464971 PMCID: PMC8662618 DOI: 10.1093/molbev/msab244] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Abstract
Human centromeres are mainly composed of alpha satellite DNA hierarchically organized as higher-order repeats (HORs). Alpha satellite dynamics is shown by sequence homogenization in centromeric arrays and by its transfer to other centromeric locations, for example, during the maturation of new centromeres. We identified during prenatal aneuploidy diagnosis by fluorescent in situ hybridization a de novo insertion of alpha satellite DNA from the centromere of chromosome 18 (D18Z1) into cytoband 15q26. Although bound by CENP-B, this locus did not acquire centromeric functionality as demonstrated by the lack of constriction and the absence of CENP-A binding. The insertion was associated with a 2.8-kbp deletion and likely occurred in the paternal germline. The site was enriched in long terminal repeats and located ∼10 Mbp from the location where a centromere was ancestrally seeded and became inactive in the common ancestor of humans and apes 20-25 million years ago. Long-read mapping to the T2T-CHM13 human genome assembly revealed that the insertion derives from a specific region of chromosome 18 centromeric 12-mer HOR array in which the monomer size follows a regular pattern. The rearrangement did not directly disrupt any gene or predicted regulatory element and did not alter the methylation status of the surrounding region, consistent with the absence of phenotypic consequences in the carrier. This case demonstrates a likely rare but new class of structural variation that we name "alpha satellite insertion." It also expands our knowledge on alphoid DNA dynamics and conveys the possibility that alphoid arrays can relocate near vestigial centromeric sites.
Affiliation(s)
- Giuliana Giannuzzi
  - Department of Biosciences, University of Milan, Milan, Italy
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Institute of Biomedical Technologies, National Research Council, Milan, Italy
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA
- Nicolas Chatron
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Service de Génétique, Hospices Civils de Lyon, Lyon, France
  - Institut NeuroMyoGène, University of Lyon, Lyon, France
- Danny E Miller
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA
  - Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital, Seattle, WA
- Julie Reversat
  - Service de Génétique, Hospices Civils de Lyon, Lyon, France
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA
- Pierre-Antoine Rollat-Farnier
  - Service de Génétique, Hospices Civils de Lyon, Lyon, France
  - Cellule Bioinformatique, Hospices Civils de Lyon, Lyon, France
- Carl A Baker
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA
- Damien Sanlaville
  - Service de Génétique, Hospices Civils de Lyon, Lyon, France
  - Institut NeuroMyoGène, University of Lyon, Lyon, France
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA
- Caroline Schluth-Bolard
  - Service de Génétique, Hospices Civils de Lyon, Lyon, France
  - Institut NeuroMyoGène, University of Lyon, Lyon, France
- Alexandre Reymond
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
8. Vervoort L, Dierckxsens N, Pereboom Z, Capozzi O, Rocchi M, Shaikh TH, Vermeesch JR. 22q11.2 Low Copy Repeats Expanded in the Human Lineage. Front Genet 2021; 12:706641. [PMID: 34335701 PMCID: PMC8320366 DOI: 10.3389/fgene.2021.706641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Accepted: 06/23/2021] [Indexed: 11/13/2022]
Abstract
Segmental duplications or low copy repeats (LCRs) constitute duplicated regions interspersed in the human genome, currently neglected in standard analyses due to their extreme complexity. Recent functional studies have indicated the potential of genes within LCRs in synaptogenesis, neuronal migration, and neocortical expansion in the human lineage. One of the regions with the highest proportion of duplicated sequence is the 22q11.2 locus, carrying eight LCRs (LCR22-A through LCR22-H), and rearrangements between them cause the 22q11.2 deletion syndrome. The LCR22-A block was recently reported to be hypervariable in the human population. It remains unknown whether this variability also exists in non-human primates, since research is strongly hampered by the presence of sequence gaps in the human and non-human primate reference genomes. To chart the LCR22 haplotypes and the associated inter- and intra-species variability, we de novo assembled the region in non-human primates by a combination of optical mapping techniques. A minimal and likely ancient haplotype is present in the chimpanzee, bonobo, and rhesus monkey without intra-species variation. In addition, the optical maps identified assembly errors and closed gaps in the orthologous chromosome 22 reference sequences. These findings indicate the LCR22 expansion to be unique to the human population, which might indicate involvement of the region in human evolution and adaptation. These maps will enable LCR22-specific functional studies and the investigation of potential associations with the phenotypic variability in the 22q11.2 deletion syndrome.
Affiliation(s)
- Zjef Pereboom
  - Centre for Research and Conservation, Royal Zoological Society of Antwerp, Antwerp, Belgium
  - Evolutionary Ecology Group, Department of Biology, Antwerp University, Antwerp, Belgium
- Tamim H. Shaikh
  - Section of Genetics and Metabolism, Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, United States
9.
Abstract
Syntenies are genomic segments of consecutive genes identified by a certain conservation in gene content and order. The notion of conservation may vary from one definition to another, the more constrained requiring identical gene contents and gene orders, while more relaxed definitions just require a certain similarity in gene content, and not necessarily in the same order. Regardless of the way they are identified, the goal is to characterize homologous genomic regions, i.e., regions deriving from a common ancestral region, reflecting a certain gene co-evolution that can shed light on important functional properties. In addition to being able to identify them, it is also necessary to infer the evolutionary history that has led from the ancestral segment to the extant ones. In this field, most algorithmic studies address the problem of inferring rearrangement scenarios explaining the disruption in gene order between segments with the same gene content, some of them extending the evolutionary model to gene insertion and deletion. However, syntenies also evolve through other events modifying their gene content, such as duplications, losses or horizontal gene transfers, i.e., the movement of genes from one species to another. Although the reconciliation approach between a gene tree and a species tree addresses the problem of inferring such events for single-gene families, little effort has been dedicated to the generalization to segmental events and to syntenies. This paper reviews some of the main algorithmic methods for inferring ancestral syntenies and focuses on those integrating both gene orders and gene trees.
10. Chen L, Abel HJ, Das I, Larson DE, Ganel L, Kanchi KL, Regier AA, Young EP, Kang CJ, Scott AJ, Chiang C, Wang X, Lu S, Christ R, Service SK, Chiang CWK, Havulinna AS, Kuusisto J, Boehnke M, Laakso M, Palotie A, Ripatti S, Freimer NB, Locke AE, Stitziel NO, Hall IM. Association of structural variation with cardiometabolic traits in Finns. Am J Hum Genet 2021; 108:583-596. [PMID: 33798444 PMCID: PMC8059371 DOI: 10.1016/j.ajhg.2021.03.008] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 12/12/2020] [Accepted: 03/03/2021] [Indexed: 02/08/2023]
Abstract
The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals. We tested the 64,572 common and low-frequency SVs for association with 116 quantitative traits and tested candidate associations using exome sequencing and array genotype data from an additional 15,205 individuals. We discovered 31 genome-wide significant associations at 15 loci, including 2 loci at which SVs have strong phenotypic effects: (1) a deletion of the ALB promoter that is greatly enriched in the Finnish population and causes decreased serum albumin level in carriers (p = 1.47 × 10⁻⁵⁴) and is also associated with increased levels of total cholesterol (p = 1.22 × 10⁻²⁸) and 14 additional cholesterol-related traits, and (2) a multi-allelic copy number variant (CNV) at PDPR that is strongly associated with pyruvate (p = 4.81 × 10⁻²¹) and alanine (p = 6.14 × 10⁻¹²) levels and resides within a structurally complex genomic region that has accumulated many rearrangements over evolutionary time. We also confirmed six previously reported associations, including five led by stronger signals in single nucleotide variants (SNVs) and one linking recurrent HP gene deletion and cholesterol levels (p = 6.24 × 10⁻¹⁰), which was also found to be strongly associated with increased glycoprotein level (p = 3.53 × 10⁻³⁵). Our study confirms that integrating SVs in trait-mapping studies will expand our knowledge of genetic factors underlying disease risk.
Collapse
Affiliation(s)
- Lei Chen
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA
- Haley J Abel
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Indraniel Das
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
- David E Larson
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Liron Ganel
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Krishna L Kanchi
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
- Allison A Regier
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Erica P Young
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Cardiovascular Division, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Chul Joo Kang
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
- Alexandra J Scott
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Colby Chiang
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Xinxin Wang
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA
- Shuangjia Lu
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA
- Ryan Christ
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
- Susan K Service
- Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Charleston W K Chiang
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Aki S Havulinna
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki 00014, Finland; Finnish Institute for Health and Welfare (THL), Helsinki 00271, Finland
- Johanna Kuusisto
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio 70210, Finland; Department of Medicine, Kuopio University Hospital, Kuopio 70210, Finland
- Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
- Markku Laakso
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio 70210, Finland; Department of Medicine, Kuopio University Hospital, Kuopio 70210, Finland
- Aarno Palotie
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki 00014, Finland; Analytical and Translational Genetics Unit (ATGU), Psychiatric & Neurodevelopmental Genetics Unit, Departments of Psychiatry and Neurology, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Samuli Ripatti
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki 00014, Finland; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Public Health, Faculty of Medicine, University of Helsinki, Helsinki 00014, Finland
- Nelson B Freimer
- Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA 90095, USA
- Adam E Locke
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Nathan O Stitziel
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA.
- Ira M Hall
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA; Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA; Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA.
|
11
|
Dwyer DS. Genomic Chaos Begets Psychiatric Disorder. Complex Psychiatry 2020; 6:20-29. [PMID: 34883501 PMCID: PMC7673594 DOI: 10.1159/000507988] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/16/2019] [Accepted: 04/06/2020] [Indexed: 12/21/2022] Open
Abstract
The processes that created the primordial genome are inextricably linked to current day vulnerability to developing a psychiatric disorder as summarized in this review article. Chaos and dynamic forces including duplication, transposition, and recombination generated the protogenome. To survive early stages of genome evolution, self-organization emerged to curb chaos. Eventually, the human genome evolved through a delicate balance of chaos/instability and organization/stability. However, recombination coldspots, silencing of transposable elements, and other measures to limit chaos also led to retention of variants that increase risk for disease. Moreover, ongoing dynamics in the genome creates various new mutations that determine liability for psychiatric disorders. Homologous recombination, long-range gene regulation, and gene interactions were all guided by spooky action-at-a-distance, which increased variability in the system. A probabilistic system of life was required to deal with a changing environment. This ensured the generation of outliers in the population, which enhanced the probability that some members would survive unfavorable environmental impacts. Some of the outliers produced through this process in man are ill suited to cope with the complex demands of modern life. Genomic chaos and mental distress from the psychological challenges of modern living will inevitably converge to produce psychiatric disorders in man.
Affiliation(s)
- Donard S. Dwyer
- Departments of Psychiatry and Behavioral Medicine and Pharmacology, Toxicology and Neuroscience, LSU Health Shreveport, Shreveport, Louisiana, USA
|
12
|
Cantsilieris S, Sunkin SM, Johnson ME, Anaclerio F, Huddleston J, Baker C, Dougherty ML, Underwood JG, Sulovari A, Hsieh P, Mao Y, Catacchio CR, Malig M, Welch AE, Sorensen M, Munson KM, Jiang W, Girirajan S, Ventura M, Lamb BT, Conlon RA, Eichler EE. An evolutionary driver of interspersed segmental duplications in primates. Genome Biol 2020; 21:202. [PMID: 32778141 PMCID: PMC7419210 DOI: 10.1186/s13059-020-02074-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 12/03/2019] [Accepted: 06/08/2020] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human-ape gene families, nuclear pore interacting protein (NPIP). RESULTS: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis. CONCLUSIONS: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.
Affiliation(s)
- Stuart Cantsilieris
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Present Address: Centre for Eye Research Australia, Department of Surgery (Ophthalmology), University of Melbourne, Royal Victorian Eye and Ear Hospital, East Melbourne, VIC, 3002, Australia
- Matthew E Johnson
- Center for Spatial and Functional Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Fabio Anaclerio
- Department of Biology-Genetics, University of Bari, Bari, Italy
- John Huddleston
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Molecular and Cellular Biology Program, University of Washington, Seattle, WA, 98195, USA
- Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Max L Dougherty
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Jason G Underwood
- Pacific Biosciences (PacBio) of California, Incorporated, Menlo Park, CA, 94025, USA
- Arvis Sulovari
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- PingHsun Hsieh
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Yafei Mao
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Maika Malig
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Present Address: Department of Molecular and Cellular Biology, University of California, Davis, CA, 95616, USA
- Present Address: Integrative Genetics and Genomics Graduate Group, University of California, Davis, CA, 95616, USA
- AnneMarie E Welch
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Present Address: Brain and Mitochondrial Research, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC, Australia
- Melanie Sorensen
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
- Weihong Jiang
- Case Transgenic and Targeting Facility, Department of Genetics and Genome Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, 44106, USA
- Santhosh Girirajan
- Department of Biochemistry and Molecular Biology, Department of Anthropology, Pennsylvania State University, University Park, PA, 16802, USA
- Mario Ventura
- Department of Biology-Genetics, University of Bari, Bari, Italy
- Bruce T Lamb
- Stark Neurosciences Research Institute, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Ronald A Conlon
- Case Transgenic and Targeting Facility, Department of Genetics and Genome Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, 44106, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA.
- Howard Hughes Medical Institute, University of Washington School of Medicine, 3720 15th Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA.
|
13
|
Refining the Phenotype of Recurrent Rearrangements of Chromosome 16. Int J Mol Sci 2019; 20:ijms20051095. [PMID: 30836598 PMCID: PMC6429492 DOI: 10.3390/ijms20051095] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 02/13/2019] [Revised: 02/25/2019] [Accepted: 02/27/2019] [Indexed: 01/08/2023] Open
Abstract
Chromosome 16 is one of the most gene-rich chromosomes of our genome, and 10% of its sequence consists of segmental duplications, which give instability and predisposition to rearrangement by the recurrent mechanism of non-allelic homologous recombination. Microarray technologies have allowed for the analysis of copy number variations (CNVs) that can contribute to the risk of developing complex diseases. By array comparative genomic hybridization (CGH) screening of 1476 patients, we detected 27 cases with CNVs on chromosome 16. We identified four smallest regions of overlapping (SROs): one at 16p13.11 was found in seven patients; one at 16p12.2 was found in four patients; two close SROs at 16p11.2 were found in twelve patients; finally, six patients were found with atypical rearrangements. Although phenotypic variability was observed, we identified a male bias for Childhood Apraxia of Speech associated to 16p11.2 microdeletions. We also reported an elevated frequency of second-site genomic alterations, supporting the model of the second hit to explain the clinical variability associated with CNV syndromes. Our goal was to contribute to the building of a chromosome 16 disease-map based on disease susceptibility regions. The role of the CNVs of chromosome 16 was increasingly made clear in the determination of developmental delay. We also found that in some cases a second-site CNV could explain the phenotypic heterogeneity by a simple additive effect or a pejorative synergistic effect.
|
14
|
Yang Y, Gu Q, Zhang Y, Sasaki T, Crivello J, O'Neill RJ, Gilbert DM, Ma J. Continuous-Trait Probabilistic Model for Comparing Multi-species Functional Genomic Data. Cell Syst 2018; 7:208-218.e11. [PMID: 29936186 PMCID: PMC6107375 DOI: 10.1016/j.cels.2018.05.022] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/13/2018] [Revised: 05/17/2018] [Accepted: 05/29/2018] [Indexed: 01/22/2023]
Abstract
A large amount of multi-species functional genomic data from high-throughput assays are becoming available to help understand the molecular mechanisms for phenotypic diversity across species. However, continuous-trait probabilistic models, which are key to such comparative analysis, remain under-explored. Here we develop a new model, called phylogenetic hidden Markov Gaussian processes (Phylo-HMGP), to simultaneously infer heterogeneous evolutionary states of functional genomic features in a genome-wide manner. Both simulation studies and real data application demonstrate the effectiveness of Phylo-HMGP. Importantly, we applied Phylo-HMGP to analyze a new cross-species DNA replication timing (RT) dataset from the same cell type in five primate species (human, chimpanzee, orangutan, gibbon, and green monkey). We demonstrate that our Phylo-HMGP model enables discovery of genomic regions with distinct evolutionary patterns of RT. Our method provides a generic framework for comparative analysis of multi-species continuous functional genomic signals to help reveal regions with conserved or lineage-specific regulatory roles.
Affiliation(s)
- Yang Yang
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Quanquan Gu
- Department of Computer Science, University of Virginia, Charlottesville, VA 22904, USA
- Yang Zhang
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Takayo Sasaki
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
- Julianna Crivello
- Institute for Systems Genomics, Department of Molecular & Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Rachel J O'Neill
- Institute for Systems Genomics, Department of Molecular & Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- David M Gilbert
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
- Jian Ma
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
|
15
|
Cross-Regulation between Transposable Elements and Host DNA Replication. Viruses 2017; 9:v9030057. [PMID: 28335567 PMCID: PMC5371812 DOI: 10.3390/v9030057] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 02/01/2017] [Revised: 03/13/2017] [Accepted: 03/15/2017] [Indexed: 12/27/2022] Open
Abstract
Transposable elements subvert host cellular functions to ensure their survival. Their interaction with the host DNA replication machinery indicates that selective pressures lead them to develop ancestral and convergent evolutionary adaptations aimed at conserved features of this fundamental process. These interactions can shape the co-evolution of the transposons and their hosts.
|
16
|
Dennis MY, Eichler EE. Human adaptation and evolution by segmental duplication. Curr Opin Genet Dev 2016; 41:44-52. [PMID: 27584858 PMCID: PMC5161654 DOI: 10.1016/j.gde.2016.08.001] [Citation(s) in RCA: 128] [Impact Index Per Article: 14.2] [Received: 05/09/2016] [Revised: 07/02/2016] [Accepted: 08/02/2016] [Indexed: 12/29/2022]
Abstract
Duplications are the primary force by which new gene functions arise and provide a substrate for large-scale structural variation. Analysis of thousands of genomes shows that humans and great apes have more genetic differences in content and structure over recent segmental duplications than any other euchromatic region. Novel human-specific duplicated genes, ARHGAP11B and SRGAP2C, have recently been described with a potential role in neocortical expansion and increased neuronal spine density. Large segmental duplications and the structural variants they promote are also frequently stratified between human populations with a subset being subjected to positive selection. The impact of recent duplications on human evolution and adaptation is only beginning to be realized as new technologies enhance their discovery and accurate genotyping.
Affiliation(s)
- Megan Y Dennis
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA 95616, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
|
17
|
Mohajeri K, Cantsilieris S, Huddleston J, Nelson BJ, Coe BP, Campbell CD, Baker C, Harshman L, Munson KM, Kronenberg ZN, Kremitzki M, Raja A, Catacchio CR, Graves TA, Wilson RK, Ventura M, Eichler EE. Interchromosomal core duplicons drive both evolutionary instability and disease susceptibility of the Chromosome 8p23.1 region. Genome Res 2016; 26:1453-1467. [PMID: 27803192 PMCID: PMC5088589 DOI: 10.1101/gr.211284.116] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 06/11/2016] [Accepted: 09/12/2016] [Indexed: 12/13/2022]
Abstract
Recurrent rearrangements of Chromosome 8p23.1 are associated with congenital heart defects and developmental delay. The complexity of this region has led to inconsistencies in the current reference assembly, confounding studies of genetic variation. Using comparative sequence-based approaches, we generated a high-quality 6.3-Mbp alternate reference assembly of an inverted Chromosome 8p23.1 haplotype. Comparison with nonhuman primates reveals a 746-kbp duplicative transposition and two separate inversion events that arose in the last million years of human evolution. The breakpoints associated with these rearrangements map to an ape-specific interchromosomal core duplicon that clusters at sites of evolutionary inversion (P = 7.8 × 10⁻⁵). Refinement of microdeletion breakpoints identifies a subgroup of patients that map to the same interchromosomal core involved in the evolutionary formation of the duplication blocks. Our results define a higher-order genomic instability element that has shaped the structure of specific chromosomes during primate evolution contributing to rearrangements associated with inversion and disease.
Affiliation(s)
- Kiana Mohajeri
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Stuart Cantsilieris
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- John Huddleston
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Bradley J Nelson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Bradley P Coe
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Catarina D Campbell
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Lana Harshman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Zev N Kronenberg
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Milinn Kremitzki
- The McDonnell Genome Institute at Washington University, Washington University School of Medicine, St. Louis, Missouri 63108, USA
- Archana Raja
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Tina A Graves
- The McDonnell Genome Institute at Washington University, Washington University School of Medicine, St. Louis, Missouri 63108, USA
- Richard K Wilson
- The McDonnell Genome Institute at Washington University, Washington University School of Medicine, St. Louis, Missouri 63108, USA
- Mario Ventura
- Dipartimento di Biologia, Università degli Studi di Bari Aldo Moro, Bari 70125, Italy
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
|
18
|
Dumont BL, Eichler EE. Signals of historical interlocus gene conversion in human segmental duplications. PLoS One 2013; 8:e75949. [PMID: 24124524 PMCID: PMC3790853 DOI: 10.1371/journal.pone.0075949] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 06/11/2013] [Accepted: 08/17/2013] [Indexed: 12/04/2022] Open
Abstract
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.
Affiliation(s)
- Beth L. Dumont
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Howard Hughes Medical Institute, Seattle, Washington, United States of America
|
19
|
Fawcett JA, Innan H. The role of gene conversion in preserving rearrangement hotspots in the human genome. Trends Genet 2013; 29:561-8. [PMID: 23953668 DOI: 10.1016/j.tig.2013.07.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 04/18/2013] [Revised: 06/20/2013] [Accepted: 07/08/2013] [Indexed: 11/27/2022]
Abstract
Hotspots of non-allelic homologous recombination (NAHR) have a crucial role in creating genetic diversity and are also associated with dozens of genomic disorders. Recent studies suggest that many human NAHR hotspots have been preserved throughout the evolution of primates. NAHR hotspots are likely to remain active as long as the segmental duplications (SDs) promoting NAHR retain sufficient similarity. Here, we propose an evolutionary model of SDs that incorporates the effect of gene conversion and compare it with a null model that assumes SDs evolve independently without gene conversion. The gene conversion model predicts a much longer lifespan of NAHR hotspots compared with the null model. We show that the literature on copy number variants (CNVs) and genomic disorders, and also the results of additional analysis of CNVs, are all more consistent with the gene conversion model.
Affiliation(s)
- Jeffrey A Fawcett
- Graduate University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan
|
20
|
Male-specific region of the bovine Y chromosome is gene rich with a high transcriptomic activity in testis development. Proc Natl Acad Sci U S A 2013; 110:12373-8. [PMID: 23842086 DOI: 10.1073/pnas.1221104110] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 01/21/2023] Open
Abstract
The male-specific region of the mammalian Y chromosome (MSY) contains clusters of genes essential for male reproduction. The highly repetitive and degenerative nature of the Y chromosome impedes genomic and transcriptomic characterization. Although the Y chromosome sequence is available for the human, chimpanzee, and macaque, little is known about the annotation and transcriptome of nonprimate MSY. Here, we investigated the transcriptome of the MSY in cattle by direct testis cDNA selection and RNA-seq approaches. The bovine MSY differs radically from the primate Y chromosomes with respect to its structure, gene content, and density. Among the 28 protein-coding genes/families identified on the bovine MSY (12 single- and 16 multicopy genes), 16 are bovid specific. The 1,274 genes identified in this study made the bovine MSY gene density the highest in the genome; in comparison, primate MSYs have only 31-78 genes. Our results, along with the highly transcriptional activities observed from these Y-chromosome genes and 375 additional noncoding RNAs, challenge the widely accepted hypothesis that the MSY is gene poor and transcriptionally inert. The bovine MSY genes are predominantly expressed and are differentially regulated during the testicular development. Synonymous substitution rate analyses of the multicopy MSY genes indicated that two major periods of expansion occurred during the Miocene and Pliocene, contributing to the adaptive radiation of bovids. The massive amplification and vigorous transcription suggest that the MSY serves as a genomic niche regulating male reproduction during bovid expansion.
|
21
|
Sassa T. The Role of Human-Specific Gene Duplications During Brain Development and Evolution. J Neurogenet 2013; 27:86-96. [DOI: 10.3109/01677063.2013.789512] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
|
22
|
Currall BB, Chiang C, Talkowski ME, Morton CC. Mechanisms for Structural Variation in the Human Genome. CURRENT GENETIC MEDICINE REPORTS 2013; 1:81-90. [PMID: 23730541 DOI: 10.1007/s40142-013-0012-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]
Abstract
It has been known for several decades that genetic variation involving changes to chromosomal structure (i.e., structural variants) can contribute to disease; however this relationship has been brought into acute focus in recent years largely based on innovative new genomics approaches and technology. Structural variants (SVs) arise from improperly repaired DNA double-strand breaks (DSB). DSBs are a frequent occurrence in all cells and two major pathways are involved in their repair: homologous recombination and non-homologous end joining. Errors during these repair mechanisms can result in SVs that involve losses, gains and rearrangements ranging from a few nucleotides to entire chromosomal arms. Factors such as rearrangements, hotspots and induced DSBs are implicated in the formation of SVs. While de novo SVs are often associated with disease, some SVs are conserved within human subpopulations and may have had a meaningful influence on primate evolution. As the ability to sequence the whole human genome rapidly evolves, the diversity of SVs is illuminated, including very complex rearrangements involving multiple DSBs in a process recently designated as "chromothripsis". Elucidating mechanisms involved in the etiology of SVs informs disease pathogenesis as well as the dynamic function associated with the biology and evolution of human genomes.
Affiliation(s)
- Benjamin B Currall
- Departments of Obstetrics, Gynecology and Reproductive Biology, Brigham and Women's Hospital and Harvard Medical School, New Research Building, Room 160D, 77 Avenue Louis Pasteur, Boston, MA 02115, USA. Harvard Medical School, Boston, MA, USA
|
23
|
Marotta M, Chen X, Inoshita A, Stephens R, Budd GT, Crowe JP, Lyons J, Kondratova A, Tubbs R, Tanaka H. A common copy-number breakpoint of ERBB2 amplification in breast cancer colocalizes with a complex block of segmental duplications. Breast Cancer Res 2012. [PMID: 23181561 PMCID: PMC4053137 DOI: 10.1186/bcr3362] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 12/13/2022] Open
Abstract
Introduction Segmental duplications (low-copy repeats) are the recently duplicated genomic segments in the human genome that display nearly identical (> 90%) sequences and account for about 5% of euchromatic regions. In germline, duplicated segments mediate nonallelic homologous recombination and thus cause both non-disease-causing copy-number variants and genomic disorders. To what extent duplicated segments play a role in somatic DNA rearrangements in cancer remains elusive. Duplicated segments often cluster and form genomic blocks enriched with both direct and inverted repeats (complex genomic regions). Such complex regions could be fragile and play a mechanistic role in the amplification of the ERBB2 gene in breast tumors, because repeated sequences are known to initiate gene amplification in model systems. Methods We conducted polymerase chain reaction (PCR)-based assays for primary breast tumors and analyzed publicly available array-comparative genomic hybridization data to map a common copy-number breakpoint in ERBB2-amplified primary breast tumors. We further used molecular, bioinformatics, and population-genetics approaches to define duplication contents, structural variants, and haplotypes within the common breakpoint. Results We found a large (> 300-kb) block of duplicated segments that was colocalized with a common copy-number breakpoint for ERBB2 amplification. The breakpoint that potentially initiated ERBB2 amplification localized in a region 1.5 megabases (Mb) on the telomeric side of ERBB2. The region is very complex, with extensive duplications of KRTAP genes, structural variants, and, as a result, a paucity of single-nucleotide polymorphism (SNP) markers. Duplicated segments are varied in size and degree of sequence homology, indicating that duplications have occurred recurrently during genome evolution.
Conclusions Amplification of the ERBB2 gene in breast tumors is potentially initiated by a complex region that has unusual genomic features and thus requires rigorous, labor-intensive investigation. The haplotypes we provide could be useful to identify the potential association between the complex region and ERBB2 amplification.
|
24
|
Mbanefo EC, Chuanxin Y, Kikuchi M, Shuaibu MN, Boamah D, Kirinoki M, Hayashi N, Chigusa Y, Osada Y, Hamano S, Hirayama K. Origin of a novel protein-coding gene family with similar signal sequence in Schistosoma japonicum. BMC Genomics 2012; 13:260. [PMID: 22716200 PMCID: PMC3434034 DOI: 10.1186/1471-2164-13-260] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2012] [Accepted: 06/11/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Evolution of novel protein-coding genes is the bedrock of adaptive evolution. Recently, we identified six protein-coding genes with similar signal sequence from Schistosoma japonicum egg stage mRNA using signal sequence trap (SST). To find the mechanism underlying the origination of these genes with similar core promoter regions and signal sequence, we adopted an integrated approach utilizing whole genome, transcriptome and proteome database BLAST queries, other bioinformatics tools, and molecular analyses. RESULTS Our data, in combination with database analyses showed evidences of expression of these genes both at the mRNA and protein levels exclusively in all developmental stages of S. japonicum. The signal sequence motif was identified in 27 distinct S. japonicum UniGene entries with multiple mRNA transcripts, and in 34 genome contigs distributed within 18 scaffolds with evidence of genome-wide dispersion. No homolog of these genes or similar domain was found in deposited data from any other organism. We observed preponderance of flanking repetitive elements (REs), albeit partial copies, especially of the RTE-like and Perere class at either side of the duplication source locus. The role of REs as major mediators of DNA-level recombination leading to dispersive duplication is discussed with evidence from our analyses. We also identified a stepwise pathway towards functional selection in evolving genes by alternative splicing. Equally, the possible transcription models of some protein-coding representatives of the duplicons are presented with evidence of expression in vitro. CONCLUSION Our findings contribute to the accumulating evidence of the role of REs in the generation of evolutionary novelties in organisms' genomes.
Affiliation(s)
- Evaristus Chibunna Mbanefo
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Department of Parasitology and Entomology, Faculty of Bioscience, Nnamdi Azikiwe University, P.M.B. 5025, Awka, Nigeria
- Yu Chuanxin
- Laboratory on Technology for Parasitic Disease Prevention and Control, Jiangsu Institute of Parasitic Diseases, 117 Yangxiang, Meiyuan, Wuxi, 214064, People's Republic of China
- Mihoko Kikuchi
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Mohammed Nasir Shuaibu
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Daniel Boamah
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Masashi Kirinoki
- Laboratory of Tropical Medicine and Parasitology, Dokkyo Medical University, Tochigi, Japan
- Naoko Hayashi
- Laboratory of Tropical Medicine and Parasitology, Dokkyo Medical University, Tochigi, Japan
- Yuichi Chigusa
- Laboratory of Tropical Medicine and Parasitology, Dokkyo Medical University, Tochigi, Japan
- Yoshio Osada
- Department of Immunology and Parasitology, The University of Occupational and Environmental Health, Kitakyushu, Japan
- Shinjiro Hamano
- Department of Parasitology, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Kenji Hirayama
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
|
25
|
Hollox EJ. The challenges of studying complex and dynamic regions of the human genome. Methods Mol Biol 2012; 838:187-207. [PMID: 22228013 DOI: 10.1007/978-1-61779-507-7_9] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/04/2023]
Abstract
Recent work has emphasised that the human genome is not simple and static, but complex and dynamic. This review focuses on the regions that are particularly hard to dissect and analyse, yet hold clues to how the genome changes during evolution and disease. I begin by summarising recent key advances in the understanding of the variable structure of our genome, and then I discuss a medley of methods that may allow us to analyse this structure in fine detail. In the final part, I describe potential future developments in this field, and make an argument that, just as we routinely genotype single-nucleotide polymorphisms now and will routinely re-sequence genomes in the near future, we should be aiming to physically re-map the individual human genome for each individual we study.
Affiliation(s)
- Edward J Hollox
- Department of Genetics, University of Leicester, Adrian Building, University Road, Leicester, UK.
|
26
|
Salm MPA, Horswell SD, Hutchison CE, Speedy HE, Yang X, Liang L, Schadt EE, Cookson WO, Wierzbicki AS, Naoumova RP, Shoulders CC. The origin, global distribution, and functional impact of the human 8p23 inversion polymorphism. Genome Res 2012; 22:1144-53. [PMID: 22399572 PMCID: PMC3371712 DOI: 10.1101/gr.126037.111] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 01/11/2023]
Abstract
Genomic inversions are an increasingly recognized source of genetic variation. However, a lack of reliable high-throughput genotyping assays for these structures has precluded a full understanding of an inversion's phylogenetic, phenotypic, and population genetic properties. We characterize these properties for one of the largest polymorphic inversions in man (the ∼4.5-Mb 8p23.1 inversion), a structure that encompasses numerous signals of natural selection and disease association. We developed and validated a flexible bioinformatics tool that utilizes SNP data to enable accurate, high-throughput genotyping of the 8p23.1 inversion. This tool was applied retrospectively to diverse genome-wide data sets, revealing significant population stratification that largely follows a clinal “serial founder effect” distribution model. Phylogenetic analyses establish the inversion's ancestral origin within the Homo lineage, indicating that 8p23.1 inversion has occurred independently in the Pan lineage. The human inversion breakpoint was localized to an inverted pair of human endogenous retrovirus elements within the large, flanking low-copy repeats; experimental validation of this breakpoint confirmed these elements as the likely intermediary substrates that sponsored inversion formation. In five data sets, mRNA levels of disease-associated genes were robustly associated with inversion genotype. Moreover, a haplotype associated with systemic lupus erythematosus was restricted to the derived inversion state. We conclude that the 8p23.1 inversion is an evolutionarily dynamic structure that can now be accommodated into the understanding of human genetic and phenotypic diversity.
Affiliation(s)
- Maximilian P A Salm
- Centre for Endocrinology, Barts & the London School of Medicine & Dentistry, Queen Mary University of London, London, United Kingdom.
|
27
|
Kuang SQ, Guo DC, Prakash SK, McDonald MLN, Johnson RJ, Wang M, Regalado ES, Russell L, Cao JM, Kwartler C, Fraivillig K, Coselli JS, Safi HJ, Estrera AL, Leal SM, LeMaire SA, Belmont JW, Milewicz DM. Recurrent chromosome 16p13.1 duplications are a risk factor for aortic dissections. PLoS Genet 2011; 7:e1002118. [PMID: 21698135 PMCID: PMC3116911 DOI: 10.1371/journal.pgen.1002118] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 12/13/2010] [Accepted: 04/19/2011] [Indexed: 11/17/2022] Open
Abstract
Chromosomal deletions or reciprocal duplications of the 16p13.1 region have been implicated in a variety of neuropsychiatric disorders such as autism, schizophrenia, epilepsies, and attention-deficit hyperactivity disorder (ADHD). In this study, we investigated the association of recurrent genomic copy number variants (CNVs) with thoracic aortic aneurysms and dissections (TAAD). By using SNP arrays to screen and comparative genomic hybridization microarrays to validate, we identified 16p13.1 duplications in 8 out of 765 patients of European descent with adult-onset TAAD compared with 4 of 4,569 controls matched for ethnicity (P = 5.0 × 10⁻⁵, OR = 12.2). The findings were replicated in an independent cohort of 467 patients of European descent with TAAD (P = 0.005, OR = 14.7). Patients with 16p13.1 duplications were more likely to harbor a second rare CNV (P = 0.012) and to present with aortic dissections (P = 0.010) than patients without duplications. Duplications of 16p13.1 were identified in 2 of 130 patients with familial TAAD, but the duplications did not segregate with TAAD in the families. MYH11, a gene known to predispose to TAAD, lies in the duplicated region of 16p13.1, and increased MYH11 expression was found in aortic tissues from TAAD patients with 16p13.1 duplications compared with control aortas. These data suggest chromosome 16p13.1 duplications confer a risk for TAAD in addition to the established risk for neuropsychiatric disorders. It also indicates that recurrent CNVs may predispose to disorders involving more than one organ system, an observation critical to the understanding of the role of recurrent CNVs in human disease and a finding that may be common to other recurrent CNVs involving multiple genes. Thoracic aortic aneurysms and acute aortic dissections (TAAD) have ranked as high as the fifteenth leading cause of death in the United States.
TAAD can be inherited in families in an autosomal dominant manner, and mutations in ACTA2 and MYH11, genes encoding two major components of the smooth muscle contractile unit, are responsible for approximately 15% of familial TAAD. However, the majority of patients with TAAD do not have an identified syndrome or family history of aortic disease, and genetic factors predisposing to these sporadic cases have not been identified. To determine whether recurrent genomic copy number variants (CNVs) contribute to TAAD pathogenesis, we screened 765 patients with adult-onset TAAD for CNVs and identified recurrent 16p13.1 duplications in 1% of TAAD cases compared with 0.09% of controls. The 16p13.1 duplication involves 9 genes, including MYH11. This recurrent duplication of 16p13.1 has also been determined to be associated with neuropsychiatric conditions, specifically schizophrenia and attention-deficit hyperactivity disorder. Our study suggests that recurrent duplications of 16p13.1 confer a risk for both neuropsychiatric diseases and TAAD, a finding that may be common to other recurrent CNVs involving multiple genes.
Affiliation(s)
- Shao-Qing Kuang
- Department of Internal Medicine, University of Texas Health Science Center at Houston, Houston, Texas, USA
|
28
|
Lai AG, Denton-Giles M, Mueller-Roeber B, Schippers JHM, Dijkwel PP. Positional information resolves structural variations and uncovers an evolutionarily divergent genetic locus in accessions of Arabidopsis thaliana. Genome Biol Evol 2011; 3:627-40. [PMID: 21622917 PMCID: PMC3157834 DOI: 10.1093/gbe/evr038] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022] Open
Abstract
Genome sequencing of closely related individuals has yielded valuable insights that link genome evolution to phenotypic variations. However, advancement in sequencing technology has also led to an escalation in the number of poor quality–drafted genomes assembled based on reference genomes that can have highly divergent or haplotypic regions. The self-fertilizing nature of Arabidopsis thaliana poses an advantage to sequencing projects because its genome is mostly homozygous. To determine the accuracy of an Arabidopsis drafted genome in less conserved regions, we performed a resequencing experiment on a ∼371-kb genomic interval in the Landsberg erecta (Ler-0) accession. We identified novel structural variations (SVs) between Ler-0 and the reference accession Col-0 using a long-range polymerase chain reaction approach to generate an Illumina data set that has positional information, that is, a data set with reads that map to a known location. Positional information is important for accurate genome assembly and the resolution of SVs particularly in highly duplicated or repetitive regions. Sixty-one regions with misassembly signatures were identified from the Ler-0 draft, suggesting the presence of novel SVs that are not represented in the draft sequence. Sixty of those were resolved by iterative mapping using our data set. Fifteen large indels (>100 bp) identified from this study were found to be located either within protein-coding regions or upstream regulatory regions, suggesting the formation of novel alleles or altered regulation of existing genes in Ler-0. We propose future genome-sequencing experiments to follow a clone-based approach that incorporates positional information to ultimately reveal haplotype-specific differences between accessions.
Affiliation(s)
- Alvina G Lai
- Institute of Molecular BioSciences, Massey University, Private Bag 11-222, Palmerston North 4442, New Zealand
|
29
|
Guo X, Freyer L, Morrow B, Zheng D. Characterization of the past and current duplication activities in the human 22q11.2 region. BMC Genomics 2011; 12:71. [PMID: 21269513 PMCID: PMC3040729 DOI: 10.1186/1471-2164-12-71] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 10/15/2010] [Accepted: 01/26/2011] [Indexed: 12/02/2022] Open
Abstract
Background Segmental duplications (SDs) on 22q11.2 (LCR22) serve as substrates for meiotic non-allelic homologous recombination (NAHR) events resulting in several clinically significant genomic disorders. Results To understand the duplication activity leading to the complicated SD structure of this region, we have applied the A-Bruijn graph algorithm to decompose the 22q11.2 SDs to 523 fundamental duplication sequences, termed subunits. Cross-species syntenic analysis of primate genomes demonstrates that many of these LCR22 subunits emerged very recently, especially those implicated in human genomic disorders. Some subunits have expanded more actively than others, and young Alu SINEs are associated much more frequently with duplicated sequences that have undergone active expansion, confirming their role in mediating recombination events. Many copy number variations (CNVs) exist on 22q11.2, some flanked by SDs. Interestingly, two chromosome breakpoints for 13 CNVs (mean length 65 kb) are located in paralogous subunits, providing direct evidence that SD subunits could contribute to CNV formation. Sequence analysis of PACs or BACs identified extra CNVs, specifically, 10 insertions and 18 deletions within 22q11.2; four were more than 10 kb in size and most contained young AluYs at their breakpoints. Conclusions Our study indicates that AluYs are implicated in the past and current duplication events, and moreover suggests that DNA rearrangements in 22q11.2 genomic disorders perhaps do not occur randomly but involve both actively expanded duplication subunits and Alu elements.
Affiliation(s)
- Xingyi Guo
- Department of Neurology, Albert Einstein College of Medicine, Bronx, NY 10461, USA
|
30
|
Bengesser K, Cooper DN, Steinmann K, Kluwe L, Chuzhanova NA, Wimmer K, Tatagiba M, Tinschert S, Mautner VF, Kehrer-Sawatzki H. A novel third type of recurrent NF1 microdeletion mediated by nonallelic homologous recombination between LRRC37B-containing low-copy repeats in 17q11.2. Hum Mutat 2010; 31:742-51. [PMID: 20506354 DOI: 10.1002/humu.21254] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/10/2022]
Abstract
Large microdeletions encompassing the neurofibromatosis type-1 (NF1) gene and its flanking regions at 17q11.2 belong to the group of genomic disorders caused by aberrant recombination between segmental duplications. The most common NF1 microdeletions (type-1) span 1.4-Mb and have breakpoints located within NF1-REPs A and C, low-copy repeats (LCRs) containing LRRC37-core duplicons. We have identified a novel type of recurrent NF1 deletion mediated by nonallelic homologous recombination (NAHR) between the highly homologous NF1-REPs B and C. The breakpoints of these approximately 1.0-Mb ("type-3") NF1 deletions were characterized at the DNA sequence level in three unrelated patients. Recombination regions, spanning 275, 180, and 109-bp, respectively, were identified within the LRRC37B-P paralogues of NF1-REPs B and C, and were found to contain sequences capable of non-B DNA formation. Both LCRs contain LRRC37-core duplicons, abundant and highly dynamic sequences in the human genome. NAHR between LRRC37-containing LCRs at 17q21.31 is known to have mediated the 970-kb polymorphic inversions of the MAPT-locus that occurred independently in different primate species, but also underlies the syndromes associated with recurrent 17q21.31 microdeletions and reciprocal microduplications. The novel NF1 microdeletions reported here provide further evidence for the unusually high recombinogenic potential of LRRC37-containing LCRs in the human genome.
|
31
|
Girirajan S, Rosenfeld JA, Cooper GM, Antonacci F, Siswara P, Itsara A, Vives L, Walsh T, McCarthy SE, Baker C, Mefford HC, Kidd JM, Browning SR, Browning BL, Dickel DE, Levy DL, Ballif BC, Platky K, Farber DM, Gowans GC, Wetherbee JJ, Asamoah A, Weaver DD, Mark PR, Dickerson J, Garg BP, Ellingwood SA, Smith R, Banks VC, Smith W, McDonald MT, Hoo JJ, French BN, Hudson C, Johnson JP, Ozmore JR, Moeschler JB, Surti U, Escobar LF, El-Kechen D, Gorski JL, Kussman J, Salbert B, Lacassie Y, Biser A, McDonald-McGinn DM, Zackai EH, Deardorff MA, Shaikh TH, Haan E, Friend KL, Fichera M, Romano C, Gécz J, deLisi LE, Sebat J, King MC, Shaffer LG, Eichler EE. A recurrent 16p12.1 microdeletion supports a two-hit model for severe developmental delay. Nat Genet 2010; 42:203-9. [PMID: 20154674 PMCID: PMC2847896 DOI: 10.1038/ng.534] [Citation(s) in RCA: 470] [Impact Index Per Article: 31.3] [Received: 10/21/2009] [Accepted: 01/15/2010] [Indexed: 02/06/2023]
Abstract
We report the identification of a recurrent, 520-kb 16p12.1 microdeletion associated with childhood developmental delay. The microdeletion was detected in 20 of 11,873 cases compared with 2 of 8,540 controls (P = 0.0009, OR = 7.2) and replicated in a second series of 22 of 9,254 cases compared with 6 of 6,299 controls (P = 0.028, OR = 2.5). Most deletions were inherited, with carrier parents likely to manifest neuropsychiatric phenotypes compared to non-carrier parents (P = 0.037, OR = 6). Probands were more likely to carry an additional large copy-number variant when compared to matched controls (10 of 42 cases, P = 5.7 × 10⁻⁵, OR = 6.6). The clinical features of individuals with two mutations were distinct from and/or more severe than those of individuals carrying only the co-occurring mutation. Our data support a two-hit model in which the 16p12.1 microdeletion both predisposes to neuropsychiatric phenotypes as a single event and exacerbates neurodevelopmental phenotypes in association with other large deletions or duplications. Analysis of other microdeletions with variable expressivity indicates that this two-hit model might be more generally applicable to neuropsychiatric disease.
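The odds ratios quoted in this abstract can be reproduced from the reported carrier counts. The following is a minimal sketch in Python, assuming the unadjusted cross-product odds ratio of a 2×2 table; the abstract does not state which estimator the authors used, and the function name `odds_ratio` is illustrative only.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Unadjusted cross-product odds ratio from a 2x2 table (assumed estimator)."""
    a = case_carriers                       # cases with the microdeletion
    b = case_total - case_carriers          # cases without it
    c = control_carriers                    # controls with the microdeletion
    d = control_total - control_carriers    # controls without it
    return (a * d) / (b * c)


# Counts taken from the abstract above.
discovery = odds_ratio(20, 11873, 2, 8540)
replication = odds_ratio(22, 9254, 6, 6299)

print(round(discovery, 1), round(replication, 1))
```

Rounded to one decimal place, both values agree with the odds ratios given in the abstract (7.2 for the first series and 2.5 for the second).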
Affiliation(s)
- Santhosh Girirajan
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Gregory M. Cooper
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Francesca Antonacci
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Priscillia Siswara
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Andy Itsara
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Laura Vives
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Tom Walsh
- Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA
- Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Heather C. Mefford
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Jeffrey M. Kidd
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Sharon R. Browning
- Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
- Brian L. Browning
- Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
- Diane E. Dickel
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Deborah L. Levy
- Psychology Research Laboratory, McLean Hospital, Belmont, MA, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Kathryn Platky
- Division of Child Neurology, Department of Neurology, University of Louisville, School of Medicine, Louisville, KY, USA
- Darren M. Farber
- Division of Child Neurology, Department of Neurology, University of Louisville, School of Medicine, Louisville, KY, USA
- Gordon C. Gowans
- Weisskopf Child Evaluation Center, Department of Pediatrics, University of Louisville, Louisville, KY, USA
- Jessica J. Wetherbee
- Weisskopf Child Evaluation Center, Department of Pediatrics, University of Louisville, Louisville, KY, USA
- Alexander Asamoah
- Weisskopf Child Evaluation Center, Department of Pediatrics, University of Louisville, Louisville, KY, USA
- David D. Weaver
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University, Indianapolis, IN, USA
- Paul R. Mark
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University, Indianapolis, IN, USA
- Jennifer Dickerson
- Department of Neurology, Division of Pediatric Neurology, School of Medicine, Indiana University, Indianapolis, IN, USA
- Bhuwan P. Garg
- Department of Neurology, Division of Pediatric Neurology, School of Medicine, Indiana University, Indianapolis, IN, USA
- Sara A. Ellingwood
- Division of Genetics, Maine Medical Partners Pediatric Specialty Care, Maine Medical Center, Portland, ME, USA
- Rosemarie Smith
- Division of Genetics, Maine Medical Partners Pediatric Specialty Care, Maine Medical Center, Portland, ME, USA
- Valerie C. Banks
- Division of Genetics, Maine Medical Partners Pediatric Specialty Care, Maine Medical Center, Portland, ME, USA
- Wendy Smith
- Division of Genetics, Maine Medical Partners Pediatric Specialty Care, Maine Medical Center, Portland, ME, USA
- Marie T. McDonald
- Division of Medical Genetics, Duke University Medical Center, Durham, NC, USA
- Joe J. Hoo
- Department of Pediatrics, University of Toledo Medical College and NW Ohio Regional Genetics Center, Toledo, OH, USA
- Beatrice N. French
- Department of Pediatrics, University of Toledo Medical College and NW Ohio Regional Genetics Center, Toledo, OH, USA
- Cindy Hudson
- Medical Genetics, Shodair Children's Hospital, Helena, MT, USA
- John P. Johnson
- Medical Genetics, Shodair Children's Hospital, Helena, MT, USA
- Jillian R. Ozmore
- Division of Clinical Genetics, Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA
- John B. Moeschler
- Division of Clinical Genetics, Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA
- Luis F. Escobar
- Medical Genetics and Neurodevelopmental Center, St. Vincent Children's Hospital, Indianapolis, IN, USA
- Dima El-Kechen
- Medical Genetics and Neurodevelopmental Center, St. Vincent Children's Hospital, Indianapolis, IN, USA
- Jerome L. Gorski
- Division of Medical Genetics, University of Missouri, Columbia, MO, USA
- Jennifer Kussman
- Division of Medical Genetics, University of Missouri, Columbia, MO, USA
- Yves Lacassie
- Division of Genetics, Department of Pediatrics, Louisiana State University Health Sciences Center and Children's Hospital, New Orleans, LA, USA
- Alisha Biser
- Department of Pediatrics and Genetics, University of Pennsylvania, and the Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Donna M. McDonald-McGinn
- Department of Pediatrics and Genetics, University of Pennsylvania, and the Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Elaine H. Zackai
- Department of Pediatrics and Genetics, University of Pennsylvania, and the Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Matthew A. Deardorff
- Department of Pediatrics and Genetics, University of Pennsylvania, and the Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Tamim H. Shaikh
- Department of Pediatrics and Genetics, University of Pennsylvania, and the Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Eric Haan
- South Australian Clinical Genetics Service, SA Pathology at Women's and Children's Hospital, Adelaide, Australia
- Department of Paediatrics, The University of Adelaide, Adelaide, Australia
- Kathryn L. Friend
- Genetics and Molecular Pathology, and SA Pathology at Women's and Children's Hospital, Adelaide, Australia
- Marco Fichera
- Oasi Institute for Research and Care in Mental Retardation and Brain Aging, Troina, Italy
- Corrado Romano
- Oasi Institute for Research and Care in Mental Retardation and Brain Aging, Troina, Italy
- Jozef Gécz
- Department of Paediatrics, The University of Adelaide, Adelaide, Australia
- Genetics and Molecular Pathology, and SA Pathology at Women's and Children's Hospital, Adelaide, Australia
- Lynn E. deLisi
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- VA Boston Healthcare System, Brockton, MA, USA
- Jonathan Sebat
- Departments of Psychiatry and Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
- Mary-Claire King
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
|
32
|
Kahn CL, Mozes S, Raphael BJ. Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes. Algorithms Mol Biol 2010; 5:11. [PMID: 20047668 PMCID: PMC2820476 DOI: 10.1186/1748-7188-5-11] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/11/2009] [Accepted: 01/04/2010] [Indexed: 02/06/2023] Open
Abstract
Background Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics composed of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain these mosaic patterns is a model of repeated aggregation and subsequent duplication of genomic sequences. Results We describe a polynomial-time exact algorithm to compute duplication distance, a genomic distance defined as the most parsimonious way to build a target string by repeatedly copying substrings of a fixed source string. This distance models the process of repeated aggregation and duplication. We also describe extensions of this distance to include certain types of substring deletions and inversions. Finally, we provide a description of a sequence of duplication events as a context-free grammar (CFG). Conclusion These new genomic distances will permit more biologically realistic analyses of segmental duplications in genomes.
33
Tandem repeats modify the structure of human genes hosted in segmental duplications. Genome Biol 2009; 10:R137. [PMID: 19954527 PMCID: PMC2812944 DOI: 10.1186/gb-2009-10-12-r137] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 07/23/2009] [Revised: 10/08/2009] [Accepted: 12/02/2009] [Indexed: 12/14/2022]
Abstract
BACKGROUND Recently duplicated genes are often subject to genomic rearrangements that can lead to the development of novel gene structures. Here we specifically investigated the effect of variations in internal tandem repeats (ITRs) on the gene structure of human paralogs located in segmental duplications. RESULTS We found that around 7% of the primate-specific genes located within duplicated regions of the genome contain variable tandem repeats. These genes are members of large groups of recently duplicated paralogs that are often polymorphic in the human population. Half of the identified ITRs occur within coding exons and may be either kept or spliced out from the mature transcript. When ITRs reside within exons, they encode variable amino acid repeats. When located at exon-intron boundaries, ITRs can generate alternative splicing patterns through the formation of novel introns. CONCLUSIONS Our study shows that variation in the number of ITRs impacts on recently duplicated genes by modifying their coding sequence, splicing pattern, and tissue expression. The resulting effect is the production of a variety of primate-specific proteins, which mostly differ in number and sequence of amino acid repeats.
34
Marques-Bonet T, Girirajan S, Eichler EE. The origins and impact of primate segmental duplications. Trends Genet 2009; 25:443-54. [PMID: 19796838 PMCID: PMC2847396 DOI: 10.1016/j.tig.2009.08.002] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.4] [Received: 07/02/2009] [Revised: 08/07/2009] [Accepted: 08/10/2009] [Indexed: 12/25/2022]
Abstract
Duplicated sequences are substrates for the emergence of new genes and are an important source of genetic instability associated with rare and common diseases. Analyses of primate genomes have shown an increase in the proportion of interspersed segmental duplications (SDs) within the genomes of humans and great apes. This contrasts with other mammalian genomes that seem to have their recently duplicated sequences organized in a tandem configuration. In this review, we focus on the mechanistic origin and impact of this difference with respect to evolution, genetic diversity and primate phenotype. Although many genomes will be sequenced in the future, resolution of this aspect of genomic architecture still requires high quality sequences and detailed analyses.
Affiliation(s)
- Tomas Marques-Bonet
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
35
Marques-Bonet T, Eichler EE. The evolution of human segmental duplications and the core duplicon hypothesis. Cold Spring Harb Symp Quant Biol 2009; 74:355-62. [PMID: 19717539 PMCID: PMC4114149 DOI: 10.1101/sqb.2009.74.011] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Indexed: 01/17/2023]
Abstract
Duplicated sequences are important sources of genetic instability and in the evolution of new gene function within species. Hominids have a preponderance of intrachromosomal duplications organized in an interspersed fashion, as opposed to tandem duplications, which are common in other mammalian genomes such as mouse, dog, and cow. Multiple lines of evidence, including sequence divergence, comparative primate genomes, and fluorescence in situ hybridization (FISH) analyses, point to an excess of segmental duplications in the common ancestor of humans and African great apes. We find that much of the interspersed human duplication architecture within chromosomes is focused around common sequence elements referred to as "core duplicons." These cores correspond to the expansion of gene families, some of which show signatures of positive selection and lack orthologs present in other mammalian species. This genomic architecture predisposes apes and humans not only to extensive genetic diversity, but also to large-scale structural diversity mediated by nonallelic homologous recombination. In humans, many de novo large-scale genomic changes mediated by these duplications are associated with neuropsychiatric and neurodevelopmental disease. We propose that the disadvantage of a high rate of new mutations is offset by the selective advantage of newly minted genes within the cores.
Affiliation(s)
- T Marques-Bonet
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
36
37
Abstract
We summarize the progress in whole-genome sequencing and analyses of primate genomes. These emerging genome datasets have broadened our understanding of primate genome evolution revealing unexpected and complex patterns of evolutionary change. This includes the characterization of genome structural variation, episodic changes in the repeat landscape, differences in gene expression, new models regarding speciation, and the ephemeral nature of the recombination landscape. The functional characterization of genomic differences important in primate speciation and adaptation remains a significant challenge. Limited access to biological materials, the lack of detailed phenotypic data and the endangered status of many critical primate species have significantly attenuated research into the genetic basis of primate evolution. Next-generation sequencing technologies promise to greatly expand the number of available primate genome sequences; however, such draft genome sequences will likely miss critical genetic differences within complex genomic regions unless dedicated efforts are put forward to understand the full spectrum of genetic variation.
Affiliation(s)
- Tomas Marques-Bonet
- Department of Genome Sciences, University of Washington and the Howard Hughes Medical Institute, Seattle, Washington 98105, USA.
38
A role for DNA polymerase mu in the emerging DJH rearrangements of the postgastrulation mouse embryo. Mol Cell Biol 2008; 29:1266-75. [PMID: 19103746 DOI: 10.1128/mcb.01518-08] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 12/30/2022]
Abstract
The molecular complexes involved in the nonhomologous end-joining process that resolves recombination-activating gene (RAG)-induced double-strand breaks and results in V(D)J gene rearrangements vary during mammalian ontogeny. In the mouse, the first immunoglobulin gene rearrangements emerge during midgestation periods, but their repertoires have not been analyzed in detail. We decided to study the postgastrulation DJ(H) joints and compare them with those present in later life. The embryo DJ(H) joints differed from those observed in perinatal life by the presence of short stretches of nontemplated (N) nucleotides. Whereas most adult N nucleotides are introduced by terminal deoxynucleotidyl transferase (TdT), the embryo N nucleotides were due to the activity of the homologous DNA polymerase mu (Polmu), which was widely expressed in the early ontogeny, as shown by analysis of Polmu(-/-) embryos. Based on its DNA-dependent polymerization ability, which TdT lacks, Polmu also filled in small sequence gaps at the coding ends and contributed to the ligation of highly processed ends, frequently found in the embryo, by pairing to internal microhomology sites. These findings show that Polmu participates in the repair of early-embryo, RAG-induced double-strand breaks and subsequently may contribute to preserve the genomic stability and cellular homeostasis of lymphohematopoietic precursors during development.
39
Schmieder S, Darré-Toulemonde F, Arguel MJ, Delerue-Audegond A, Christen R, Nahon JL. Primate-specific spliced PMCHL RNAs are non-protein coding in human and macaque tissues. BMC Evol Biol 2008; 8:330. [PMID: 19068116 PMCID: PMC2621205 DOI: 10.1186/1471-2148-8-330] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/08/2008] [Accepted: 12/09/2008] [Indexed: 11/24/2022]
Abstract
Background Brain-expressed genes that were created in the primate lineage represent obvious candidates to investigate molecular mechanisms that contributed to neural reorganization and emergence of new behavioural functions in Homo sapiens. PMCHL1 arose from retroposition of a pro-melanin-concentrating hormone (PMCH) antisense mRNA on the ancestral human chromosome 5p14 when platyrrhines and catarrhines diverged. Mutations before divergence of hylobatidae led to creation of new exons and finally PMCHL1 duplicated in an ancestor of hominids to generate PMCHL2 at the human chromosome 5q13. A complex pattern of spliced and unspliced PMCHL RNAs was found in human brain and testis. Results Several novel spliced PMCHL transcripts have been characterized in human testis and fetal brain, identifying an additional exon and novel splice sites. Sequencing of PMCHL genes in several non-human primates allowed phylogenetic analyses, which revealed that the initial retroposition event took place within an intron of the brain cadherin (CDH12) gene, soon after platyrrhine/catarrhine divergence, i.e. 30–35 Mya, and was concomitant with the insertion of an AluSg element. Sequence analysis of the spliced PMCHL transcripts identified only short ORFs of less than 300 bp, with low (VMCH-p8 and protein variants) or no evolutionary conservation. Western blot analyses of human and macaque tissues expressing PMCHL RNA failed to reveal any protein corresponding to VMCH-p8 and protein variants encoded by spliced transcripts. Conclusion Our present results improve our knowledge of the gene structure and the evolutionary history of the primate-specific chimeric PMCHL genes. These genes produce multiple spliced transcripts, bearing short, non-conserved and apparently non-translated ORFs that may function as mRNA-like non-coding RNAs.
Affiliation(s)
- Sandra Schmieder
- Université de Nice-Sophia Antipolis, CNRS, Institut de Pharmacologie Moléculaire et Cellulaire, Valbonne, France.
40
Perry GH, Yang F, Marques-Bonet T, Murphy C, Fitzgerald T, Lee AS, Hyland C, Stone AC, Hurles ME, Tyler-Smith C, Eichler EE, Carter NP, Lee C, Redon R. Copy number variation and evolution in humans and chimpanzees. Genome Res 2008; 18:1698-710. [PMID: 18775914 PMCID: PMC2577862 DOI: 10.1101/gr.082016.108] [Citation(s) in RCA: 189] [Impact Index Per Article: 11.1] [Received: 06/07/2008] [Accepted: 08/26/2008] [Indexed: 11/24/2022]
Abstract
Copy number variants (CNVs) underlie many aspects of human phenotypic diversity and provide the raw material for gene duplication and gene family expansion. However, our understanding of their evolutionary significance remains limited. We performed comparative genomic hybridization on a single human microarray platform to identify CNVs among the genomes of 30 humans and 30 chimpanzees as well as fixed copy number differences between species. We found that human and chimpanzee CNVs occur in orthologous genomic regions far more often than expected by chance and are strongly associated with the presence of highly homologous intrachromosomal segmental duplications. By adapting population genetic analyses for use with copy number data, we identified functional categories of genes that have likely evolved under purifying or positive selection for copy number changes. In particular, duplications and deletions of genes with inflammatory response and cell proliferation functions may have been fixed by positive selection and involved in the adaptive phenotypic differentiation of humans and chimpanzees.
Affiliation(s)
- George H. Perry
- School of Human Evolution & Social Change, Arizona State University, Tempe, Arizona 85287, USA
- Department of Pathology, Brigham & Women’s Hospital, Boston, Massachusetts 02115, USA
- Fengtang Yang
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Tomas Marques-Bonet
- Department of Genome Sciences, University of Washington School of Medicine and the Howard Hughes Medical Institute, Seattle, Washington 98195, USA
- Carly Murphy
- Department of Pathology, Brigham & Women’s Hospital, Boston, Massachusetts 02115, USA
- Tomas Fitzgerald
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Arthur S. Lee
- Department of Pathology, Brigham & Women’s Hospital, Boston, Massachusetts 02115, USA
- Courtney Hyland
- Department of Pathology, Brigham & Women’s Hospital, Boston, Massachusetts 02115, USA
- Anne C. Stone
- School of Human Evolution & Social Change, Arizona State University, Tempe, Arizona 85287, USA
- Matthew E. Hurles
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Chris Tyler-Smith
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine and the Howard Hughes Medical Institute, Seattle, Washington 98195, USA
- Nigel P. Carter
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Charles Lee
- Department of Pathology, Brigham & Women’s Hospital, Boston, Massachusetts 02115, USA
- Harvard Medical School, Boston, Massachusetts 02115, USA
- Richard Redon
- Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
41
Varki A, Geschwind DH, Eichler EE. Explaining human uniqueness: genome interactions with environment, behaviour and culture. Nat Rev Genet 2008; 9:749-63. [PMID: 18802414 PMCID: PMC2756412 DOI: 10.1038/nrg2428] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.4] [Indexed: 12/14/2022]
Abstract
What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, 'anthropogeny' (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any 'genes versus environment' dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture - perhaps relaxing allowable thresholds for large-scale genomic diversity.
Affiliation(s)
- Ajit Varki
- Center for Academic Research and Training in Anthropogeny, University of California, San Diego, La Jolla, California 92093, USA.
42
Symmons O, Váradi A, Arányi T. How segmental duplications shape our genome: recent evolution of ABCC6 and PKD1 Mendelian disease genes. Mol Biol Evol 2008; 25:2601-13. [PMID: 18791038 DOI: 10.1093/molbev/msn202] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Indexed: 12/24/2022]
Abstract
The completion of the Human Genome Project has brought the understanding that our genome contains an unexpectedly large proportion of segmental duplications. This poses the challenge of elucidating the consequences of recent duplications on physiology. We have conducted an in-depth study of a subset of segmental duplications on chromosome 16. We focused on PKD1 and ABCC6 duplications because mutations affecting these genes are responsible for the Mendelian disorders autosomal dominant polycystic kidney disease and pseudoxanthoma elasticum, respectively. We establish that duplications of PKD1 and ABCC6 are associated to low-copy repeat 16a and show that such duplications have occurred several times independently in different primate species. We demonstrate that partial duplication of PKD1 and ABCC6 has numerous consequences: the pseudogenes give rise to new transcripts and mediate gene conversion, which not only results in disease-causing mutations but also serves as a reservoir for sequence variation. The duplicated segments are also involved in submicroscopic and microscopic genomic rearrangements, contributing to structural variation in human and chromosomal break points in the gibbon. In conclusion, our data shed light on the recent and ongoing evolution of chromosome 16 mediated by segmental duplication and deepen our understanding of the history of two Mendelian disorder genes.
Affiliation(s)
- Orsolya Symmons
- Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
43
Zody MC, Jiang Z, Fung HC, Antonacci F, Hillier LW, Cardone MF, Graves TA, Kidd JM, Cheng Z, Abouelleil A, Chen L, Wallis J, Glasscock J, Wilson RK, Reily AD, Duckworth J, Ventura M, Hardy J, Warren WC, Eichler EE. Evolutionary toggling of the MAPT 17q21.31 inversion region. Nat Genet 2008; 40:1076-83. [PMID: 19165922 PMCID: PMC2684794 DOI: 10.1038/ng.193] [Citation(s) in RCA: 158] [Impact Index Per Article: 9.3] [Indexed: 11/10/2022]
Abstract
Using comparative sequencing approaches, we investigated the evolutionary history of the European-enriched 17q21.31 MAPT inversion polymorphism. We present a detailed, BAC-based sequence assembly of the inverted human H2 haplotype and compare it to the sequence structure and genetic variation of the corresponding 1.5-Mb region for the noninverted H1 human haplotype and that of chimpanzee and orangutan. We found that inversion of the MAPT region is similarly polymorphic in other great ape species, and we present evidence that the inversions occurred independently in chimpanzees and humans. In humans, the inversion breakpoints correspond to core duplications with the LRRC37 gene family. Our analysis favors the H2 configuration and sequence haplotype as the likely great ape and human ancestral state, with inversion recurrences during primate evolution. We show that the H2 architecture has evolved more extensive sequence homology, perhaps explaining its tendency to undergo microdeletion associated with mental retardation in European populations.
Affiliation(s)
- Michael C. Zody
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA, 02142, USA
- Department of Medical Biochemistry and Microbiology, Uppsala University, Box 597, Uppsala, SE-751 24, Sweden
- Zhaoshi Jiang
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Hon-Chung Fung
- Department of Molecular Neuroscience and Reta Lila Weston Laboratories, Institute of Neurology, University College London, London, WC1N 3BG, UK
- Department of Neurology, Chang Gung Memorial Hospital and College of Medicine, Chang Gung University, 199 Tung Hwa North Road, Taipei, 10591, Taiwan
- Francesca Antonacci
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- LaDeana W. Hillier
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO, 63108, USA
- Maria Francesca Cardone
- Department of Genetics and Microbiology, University of Bari, Via Amendola 165/A, Bari, 70126, Italy
- Tina A. Graves
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO, 63108, USA
- Jeffrey M. Kidd
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Ze Cheng
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Amr Abouelleil
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA, 02142, USA
- Lin Chen
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- John Wallis
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO, 63108, USA
- Jarret Glasscock
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO, 63108, USA
- Richard K. Wilson
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO, 63108, USA
- Amy Denise Reily
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO, 63108, USA
- Jaime Duckworth
- Laboratory of Neurogenetics, NIA, NIH, Bethesda, MD 20892, USA
- Mario Ventura
- Department of Genetics and Microbiology, University of Bari, Via Amendola 165/A, Bari, 70126, Italy
- John Hardy
- Department of Molecular Neuroscience and Reta Lila Weston Laboratories, Institute of Neurology, University College London, London, WC1N 3BG, UK
- Wesley C. Warren
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St. Louis, MO, 63108, USA
- Evan E. Eichler
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
44
Marques-Bonet T, Cheng Z, She X, Eichler EE, Navarro A. The genomic distribution of intraspecific and interspecific sequence divergence of human segmental duplications relative to human/chimpanzee chromosomal rearrangements. BMC Genomics 2008; 9:384. [PMID: 18699995 PMCID: PMC2542386 DOI: 10.1186/1471-2164-9-384] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/06/2007] [Accepted: 08/12/2008] [Indexed: 11/20/2022]
Abstract
BACKGROUND It has been suggested that chromosomal rearrangements harbor the molecular footprint of the biological phenomena which they induce, in the form, for instance, of changes in the sequence divergence rates of linked genes. So far, all the studies of these potential associations have focused on the relationship between structural changes and the rates of evolution of single-copy DNA and have tried to exclude segmental duplications (SDs). This is paradoxical, since SDs are one of the primary forces driving the evolution of structure and function in our genomes and have been linked not only with novel genes acquiring new functions, but also with overall higher DNA sequence divergence and major chromosomal rearrangements. RESULTS Here we take the opposite view and focus on SDs. We analyze several of the features of SDs, including the rates of intraspecific divergence between paralogous copies of human SDs and of interspecific divergence between human SDs and chimpanzee DNA. We study how divergence measures relate to chromosomal rearrangements, while considering other factors that affect evolutionary rates in single copy DNA. CONCLUSION We find that interspecific SD divergence behaves similarly to divergence of single-copy DNA. In contrast, old and recent paralogous copies of SDs do present different patterns of intraspecific divergence. Also, we show that some relatively recent SDs accumulate in regions that carry inversions in sister lineages.
Affiliation(s)
- Tomàs Marques-Bonet
- Unitat de Biologia Evolutiva Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Ze Cheng
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Xinwei She
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Arcadi Navarro
- Unitat de Biologia Evolutiva Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Institucio Catalana de Recerca i Estudis Avancats (ICREA) and Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Population Genomics Node (GNV8), National Institute for Bioinformatics (INB) Universitat Pompeu Fabra, Spain
45
Hannes FD, Sharp AJ, Mefford HC, de Ravel T, Ruivenkamp CA, Breuning MH, Fryns JP, Devriendt K, Van Buggenhout G, Vogels A, Stewart H, Hennekam RC, Cooper GM, Regan R, Knight SJL, Eichler EE, Vermeesch JR. Recurrent reciprocal deletions and duplications of 16p13.11: the deletion is a risk factor for MR/MCA while the duplication may be a rare benign variant. J Med Genet 2008; 46:223-32. [PMID: 18550696 PMCID: PMC2658752 DOI: 10.1136/jmg.2007.055202] [Citation(s) in RCA: 212] [Impact Index Per Article: 12.5] [Indexed: 11/03/2022]
Abstract
BACKGROUND Genomic disorders are often caused by non-allelic homologous recombination between segmental duplications. Chromosome 16 is especially rich in a chromosome-specific low copy repeat, termed LCR16. METHODS AND RESULTS A bacterial artificial chromosome (BAC) array comparative genome hybridisation (CGH) screen of 1027 patients with mental retardation and/or multiple congenital anomalies (MR/MCA) was performed. The BAC array CGH screen identified five patients with deletions and five with apparently reciprocal duplications of 16p13 covering 1.65 Mb, including 15 RefSeq genes. In addition, three atypical rearrangements overlapping or flanking this region were found. Fine mapping by high-resolution oligonucleotide arrays suggests that these deletions and duplications result from non-allelic homologous recombination (NAHR) between distinct LCR16 subunits with >99% sequence identity. Deletions and duplications were either de novo or inherited from unaffected parents. To determine whether these imbalances are associated with the MR/MCA phenotype or whether they might be benign variants, a population of 2014 normal controls was screened. The absence of deletions in the control population showed that 16p13.11 deletions are significantly associated with MR/MCA (p = 0.0048). Despite phenotypic variability, common features were identified: three patients with deletions presented with MR, microcephaly and epilepsy (two of these also had short stature), and two other deletion carriers ascertained prenatally presented with cleft lip and midline defects. In contrast to its previous association with autism, the duplication seems to be a common variant in the population (5/1682, 0.29%). CONCLUSION These findings indicate that deletions inherited from clinically normal parents are likely to be causal for the patients' phenotype whereas the role of duplications (de novo or inherited) in the phenotype remains uncertain. This difference in knowledge regarding the clinical relevance of the deletion and the duplication causes a paradigm shift in (cyto)genetic counselling.
Affiliation(s)
- F D Hannes
- Center for Human Genetics, Herestraat 49, 3000 Leuven, Belgium
46
Jiang Z, Hubley R, Smit A, Eichler EE. DupMasker: a tool for annotating primate segmental duplications. Genome Res 2008; 18:1362-8. [PMID: 18502942 DOI: 10.1101/gr.078477.108] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]
Abstract
Segmental duplications (SDs) play an important role in genome rearrangement, evolution, and the copy-number variation (CNV) of primate genomes. Such sequences are difficult to detect, a priori, because they share no defining sequence features that distinguish them from unique portions of the genome. Current sequence annotation of segmental duplications requires computationally intensive, genome-wide self-comparisons that cannot be easily implemented on new data sets. Based on the successful implementation of RepeatMasker, we developed a new genome annotation tool, DupMasker. The program uses a library of nonredundant consensus sequences of human segmental duplications, wherein a majority of the ancestral origins have been determined based on comparisons to mammalian outgroup genomes. Using DupMasker, new human and nonhuman primate (NHP) sequences may be readily queried to provide details on the origin and degree of sequence identity of each duplicon. This program can be applied to delineate the order and orientation of duplicons within complex duplication blocks and used to characterize structural variation differences between sequenced human haplotypes. We predict this tool will be valuable in the annotation of large-insert sequence clones, allowing putative unique and duplicated regions of the genomes to be annotated prior to whole genome assembly comparisons.
Affiliation(s)
- Zhaoshi Jiang
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
47
Kehrer-Sawatzki H, Cooper DN. Molecular mechanisms of chromosomal rearrangement during primate evolution. Chromosome Res 2008; 16:41-56. [PMID: 18293104 DOI: 10.1007/s10577-007-1207-1] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Indexed: 10/22/2022]
Abstract
Breakpoint analysis of the large chromosomal rearrangements which have occurred during primate evolution promises to yield new insights into the underlying mechanisms of mutagenesis. Comparison of these evolutionary breakpoints with those that are disease-associated in humans, and which occur during either meiotic or mitotic cell division, should help to identify basic mechanistic similarities as well as differences. It has recently become clear that segmental duplications (SDs) have had a very significant impact on genome plasticity during primate evolution. In comparisons of the human and chimpanzee genomes, SDs have been found in flanking regions of 70-80% of inversions and approximately 40% of deletions/duplications. A strong spatial association between primate-specific breakpoints and SDs has also become evident from comparisons of human with other mammalian genomes. The lineage-specific hyperexpansion of certain SDs observed in the genomes of human, chimpanzee, gorilla and gibbon is indicative of the intrinsic instability of some SDs in primates. However, since many primate-specific breakpoints map to regions lacking SDs, but containing interspersed high-copy repetitive sequence elements such as SINEs, LINEs, LTRs, alpha-satellites and (AT)n repeats, we may infer that a range of different molecular mechanisms have probably been involved in promoting chromosomal breakage during the evolution of primate genomes.
Collapse
|
48
|
Ji X, Zhao S. DA and Xiao-two giant and composite LTR-retrotransposon-like elements identified in the human genome. Genomics 2008; 91:249-58. [PMID: 18083327 PMCID: PMC3092482 DOI: 10.1016/j.ygeno.2007.10.014] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 08/10/2007] [Revised: 10/22/2007] [Accepted: 10/29/2007] [Indexed: 11/19/2022]
Abstract
We discovered two new complex elements while studying large genomic rearrangements and segmental duplications in the human genome. Both resemble bacterial composite DNA transposon Tn9, consisting of a core flanked by mobile elements, except that the flanking element is not a DNA transposon but instead is long terminal repeat retrotransposon-like with human endogenous retrovirus and satellite sequences. Based on the core size, we named them Xiao (approximately 30 kb) and DA (approximately 280 kb), meaning small and big, respectively, in Chinese. Xiao originated from a 19p region encoding olfactory receptor 7E members after the human/ape divergence from Old World monkeys, while DA likely evolved from a Xiao by inserting approximately 200 kb of chimeric sequence from 16p and 21q into the Xiao core, resulting in a target site duplication of 3.4 kb. DA/Xiao was identified in 30 loci on 12 chromosomes, and only DAs mediated intrachromosomal rearrangements, based on our reconstructed human-mouse-rat ancestral genome and the rhesus macaque genome.
Affiliation(s)
- Xinglai Ji
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA 30602-7229, USA
- Shaying Zhao
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA 30602-7229, USA
|
49
|
|
50
|
Yang S, Arguello JR, Li X, Ding Y, Zhou Q, Chen Y, Zhang Y, Zhao R, Brunet F, Peng L, Long M, Wang W. Repetitive element-mediated recombination as a mechanism for new gene origination in Drosophila. PLoS Genet 2007; 4:e3. [PMID: 18208328 PMCID: PMC2211543 DOI: 10.1371/journal.pgen.0040003] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Received: 08/22/2007] [Accepted: 11/27/2007] [Indexed: 01/05/2023]
Abstract
Previous studies of repetitive elements (REs) have implicated a mechanistic role in generating new chimerical genes. Such examples are consistent with the classic model for exon shuffling, which relies on non-homologous recombination. However, recent data for chromosomal aberrations in model organisms suggest that ectopic homology-dependent recombination may also be important. Lack of a dataset comprising experimentally verified young duplicates has hampered an effective examination of these models as well as an investigation of sequence features that mediate the rearrangements. Here we use ∼7,000 cDNA probes (∼112,000 primary images) to screen eight species within the Drosophila melanogaster subgroup and identify 17 duplicates that were generated through ectopic recombination within the last 12 million years. Most of these are functional and have evolved divergent expression patterns and novel chimeric structures. Examination of their flanking sequences revealed an excess of repetitive sequences, with the majority belonging to the transposable element DNAREP1 family, associated with the new genes. Our dataset strongly suggests an important role for REs in the generation of chimeric genes within these species.
In numerous organisms, many new genes have been found to originate through dispersed gene duplication and exon/domain shuffling. What recombination mechanisms were involved in the duplication and the shuffling processes? Lack of the intermediate products of recombination that share adequate sequence identity between homologous sequences, or the parental sequences from which the new genes were derived, often makes answering these questions difficult. We identified a number of young genes that originated in recently diverged branches in the evolutionary tree of the eight Drosophila melanogaster subgroup species, by using fluorescence in situ hybridization with polytene chromosomes. We analyzed the genomic regions surrounding 17 new dispersed duplicate genes and observed that most of these genes are flanked by repetitive elements (REs), including a large and diverged transposable element family, DNAREP1. Several copies of these REs are kept in both new and parental gene regions, and their degeneration is correlated with the increasing ages of the identified new genes. These data suggest that REs mediate the recombination responsible for the new gene origination.
Affiliation(s)
- Shuang Yang
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Graduate School of Chinese Academy Sciences, Beijing, China
- J. Roman Arguello
- Committee on Evolutionary Biology, The University of Chicago, Chicago, Illinois, United States of America
- Xin Li
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Graduate School of Chinese Academy Sciences, Beijing, China
- Yun Ding
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Graduate School of Chinese Academy Sciences, Beijing, China
- Qi Zhou
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Graduate School of Chinese Academy Sciences, Beijing, China
- Ying Chen
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Yue Zhang
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Ruoping Zhao
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Frédéric Brunet
- Committee on Evolutionary Biology, The University of Chicago, Chicago, Illinois, United States of America
- Lixin Peng
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Manyuan Long
- Committee on Evolutionary Biology, The University of Chicago, Chicago, Illinois, United States of America
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- * To whom correspondence should be addressed. E-mail: (ML); (WW)
- Wen Wang
- Chinese Academy of Sciences (CAS)—Max Planck Junior Research Group, Key Laboratory of Cellular and Molecular Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- * To whom correspondence should be addressed. E-mail: (ML); (WW)
|