1. Buzdin AA. [Functional analysis of retroviral endogenous inserts in the human genome evolution]. Russian Journal of Bioorganic Chemistry 2010; 36:38-46. [PMID: 20386577 DOI: 10.1134/s1068162010010048] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022]
Abstract
Retroelements, mobile elements produced in DNA by reverse transcription, comprise about 40% of the human genome. A small fraction of these elements appeared in the genome quite recently, after the divergence of humans and chimpanzees. Evolutionarily young retroelements are represented by members of four groups: SVA, Alu, L1, and the endogenous HERV-K (HML-2) virus. These retroelements could play a functional role in the molecular evolution of human DNA. We comprehensively studied the contribution of human-specific endogenous viruses (hsERV) to the structural modifications and regulation of the human genome. We found that hsERVs, present in 134 copies, occupy about 330,000 bp of human DNA. They added to genomic sequences copies of 50 functional retroviral genes as well as 134 potential promoters and enhancers, 50% of which are located in regions adjacent to known genes and 22% in gene introns. At least 67% of these elements are human-specific promoters in vivo. hsERVs regulate the activity of known protein-encoding genes by means of RNA interference, function as enhancers, and provide new polyadenylation signals for mRNA.
2. Armengol G, Knuutila S, Lozano JJ, Madrigal I, Caballín MR. Identification of human specific gene duplications relative to other primates by array CGH and quantitative PCR. Genomics 2010; 95:203-9. [PMID: 20153417 DOI: 10.1016/j.ygeno.2010.02.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 09/02/2009] [Revised: 01/13/2010] [Accepted: 02/03/2010] [Indexed: 01/30/2023]
Abstract
In order to identify human lineage specific (HLS) copy number differences (CNDs) compared to other primates, we performed pairwise comparisons (human vs. chimpanzee, gorilla and orangutan) by using cDNA array comparative genomic hybridization (CGH). A set of 23 genes with HLS duplications were identified, as well as other lineage differences in gene copy number specific to chimpanzee, gorilla and orangutan. Each species has gained more copies of specific genes rather than losing gene copies. Eleven of the 23 genes have only been observed to have undergone HLS duplication in Fortna et al. (2004) and in the present study. Then, seven of these 11 genes were analyzed by quantitative PCR in chimpanzee, gorilla and orangutan, as well as in six other primate species (Hylobates lar, Cercopithecus aethiops, Papio hamadryas, Macaca mulatta, Lagothrix lagothricha, and Saimiri sciureus). Six genes confirmed array CGH data, and four of them appeared to have bona fide HLS duplications (ABCB10, E2F6, CDH12, and TDG genes). We propose that these gene duplications have a potential to contribute to specific human phenotypes.
Affiliation(s)
- Gemma Armengol
- Department of Animal Biology, Plant Biology and Ecology, Faculty of Biosciences, Universitat Autònoma de Barcelona (UAB), Barcelona, Spain.
3. Functional analysis of the evolutionarily conserved cis-regulatory elements on the sox17 gene in zebrafish. Dev Biol 2009; 326:456-70. [DOI: 10.1016/j.ydbio.2008.11.010] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 05/30/2008] [Revised: 10/31/2008] [Accepted: 11/11/2008] [Indexed: 11/19/2022]
4. Hallast P, Saarela J, Palotie A, Laan M. High divergence in primate-specific duplicated regions: human and chimpanzee chorionic gonadotropin beta genes. BMC Evol Biol 2008; 8:195. [PMID: 18606016 PMCID: PMC2478647 DOI: 10.1186/1471-2148-8-195] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 08/29/2007] [Accepted: 07/07/2008] [Indexed: 11/17/2022] Open
Abstract
Background: Low nucleotide divergence between human and chimpanzee does not sufficiently explain the species-specific morphological, physiological and behavioral traits. As gene duplication is a major prerequisite for the emergence of new genes and novel biological processes, comparative studies of human and chimpanzee duplicated genes may assist in understanding the mechanisms behind primate evolution. We addressed the divergence between human and chimpanzee duplicated genomic regions by using the Luteinizing Hormone Beta (LHB)/Chorionic Gonadotropin Beta (CGB) gene cluster as a model. The placental CGB genes that are essential for implantation have evolved from an ancestral pituitary LHB gene by duplications in the primate lineage.
Results: We shotgun sequenced and compared the human (45,165 bp) and chimpanzee (39,876 bp) LHB/CGB regions and hereby present evidence for structural variation resulting in a discordant number of CGB genes (6 in human, 5 in chimpanzee). The scenario of species-specific parallel duplications was supported (i) as the most parsimonious solution requiring the fewest rearrangement events to explain the interspecies structural differences; (ii) by the phylogenetic trees constructed with fragments of intergenic regions; (iii) by the sequence similarity calculations. Across the orthologous regions of the LHB/CGB cluster, substitutions and indels contributed approximately equally to the interspecies divergence, and the distribution of nucleotide identity was correlated with the regional repeat content. Intraspecies gene conversion may have shaped the LHB/CGB gene cluster. The substitution divergence (1.8–2.59%) exceeded the estimates for single-copy loci two- to three-fold, and the fraction of transversional mutations was increased compared to the unique sequences (43% versus ~30%). Despite the high sequence identity among LHB/CGB genes, there are signs of functional differentiation among the gene copies. Estimates for dn/ds rate ratio suggested a purifying selection on LHB and CGB8, and a positive evolution of CGB1.
Conclusion: If generalized, our data suggest that in addition to species-specific deletions and duplications, parallel duplication events may have contributed to genetic differences separating humans from their closest relatives. Compared to unique genomic segments, duplicated regions are characterized by high divergence promoted by intraspecies gene conversion and species-specific chromosomal rearrangements, including the alterations in gene copy number.
Affiliation(s)
- Pille Hallast
- Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia.
5. Kehrer-Sawatzki H, Cooper DN. Understanding the recent evolution of the human genome: insights from human-chimpanzee genome comparisons. Hum Mutat 2007; 28:99-130. [PMID: 17024666 DOI: 10.1002/humu.20420] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.5] [Indexed: 12/20/2022]
Abstract
The sequencing of the chimpanzee genome and the comparison with its human counterpart have begun to reveal the spectrum of genetic changes that has accompanied human evolution. In addition to gross karyotypic rearrangements such as the fusion that formed human chromosome 2 and the human-specific pericentric inversions of chromosomes 1 and 18, there is considerable submicroscopic structural variation involving deletions, duplications, and inversions. Lineage-specific segmental duplications, detected by array comparative genomic hybridization and direct sequence comparison, have made a very significant contribution to this structural divergence, which is at least three-fold greater than that due to nucleotide substitutions. Since structural genomic changes may have given rise to irreversible functional differences between the diverging species, their detailed analysis could help to identify the biological processes that have accompanied speciation. To this end, interspecies comparisons have revealed numerous human-specific gains and losses of genes as well as changes in gene expression. The very considerable structural diversity (polymorphism) evident within both lineages has, however, hampered the analysis of the structural divergence between the human and chimpanzee genomes. The concomitant evaluation of genetic divergence and diversity at the nucleotide level has nevertheless served to identify many genes that have evolved under positive selection and may thus have been involved in the development of human lineage-specific traits. Genes that display signs of weak negative selection have also been identified and could represent candidate loci for complex genomic disorders. Here, we review recent progress in comparing the human and chimpanzee genomes and discuss how the differences detected have improved our understanding of the evolution of the human genome.
6. Hahn Y, Bera TK, Pastan IH, Lee B. Duplication and extensive remodeling shaped POTE family genes encoding proteins containing ankyrin repeat and coiled coil domains. Gene 2005; 366:238-45. [PMID: 16364570 DOI: 10.1016/j.gene.2005.07.045] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Received: 03/24/2005] [Revised: 07/12/2005] [Accepted: 07/20/2005] [Indexed: 11/25/2022]
Abstract
The POTE family genes encode a highly homologous group of primate-specific proteins that contain ankyrin repeats and coiled coil domains. At least 13 paralogous POTE family genes are found on 8 human chromosomes (2, 8, 13, 14, 15, 18, 21 and 22), which can be sorted into 3 groups based on sequence similarity. We identified by a database search a group of additional human ankyrin repeat domain proteins, of which ANKRD26 and ANKRD30A are the best characterized; these are more distant homologs of POTE family proteins. A comprehensive comparison of the genomic organization indicates that ANKRD26 has the genomic structure of the possible ancestor of ANKRD30A and all POTE family genes. Extensive remodeling involving segmental loss and internal duplication appears to have reshaped the ANKRD30A and POTE family genes after the primal duplication of the ancestor gene. We also identified a mouse homolog of human ANKRD26, but failed to find a mouse homolog that bears the structural characteristics of any of the POTE family of proteins. The mouse Ankrd26 may serve as a useful model for the study of the function of human ANKRD26, ANKRD30A and POTE family proteins.
Affiliation(s)
- Yoonsoo Hahn
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Building 37, MSC 4264, 37 Convent Drive Room 5120A, Bethesda, MD 20892-4264, USA
7
Abstract
Fibroblast growth factors (FGF) are associated with multiple developmental and metabolic processes in triploblasts, and perhaps also in diploblasts. The evolution of the FGF superfamily has accompanied the major morphological and functional innovations of metazoan species. The study of FGFs throughout species shows that the FGF superfamily can be subdivided in eight families in present-day organisms and has evolved through phases of gene duplications and gene losses. At least two major expansions of the superfamily can be recognized: a first expansion increased the number of FGFs from one or few archeo-FGFs to eight proto-FGFs, prototypic of the eight families. A second expansion, which took place during euchordate evolution, is associated with genome duplications. It increased the number of members in the families. Subsequent losses reduced that number to the present-day figures.
Affiliation(s)
- Cornel Popovici
- Laboratory of Molecular Oncology, Marseille Cancer Institute, UMR599, 27 Bd. Leï Roure, 13009 Marseille, France
8. Horvath JE, Gulden CL, Vallente RU, Eichler MY, Ventura M, McPherson JD, Graves TA, Wilson RK, Schwartz S, Rocchi M, Eichler EE. Punctuated duplication seeding events during the evolution of human chromosome 2p11. Genome Res 2005; 15:914-27. [PMID: 15965031 PMCID: PMC1172035 DOI: 10.1101/gr.3916405] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 01/06/2023]
Abstract
Primate genomic sequence comparisons are becoming increasingly useful for elucidating the evolutionary history and organization of our own genome. Such studies are particularly informative within human pericentromeric regions--areas of particularly rapid change in genomic structure. Here, we present a systematic analysis of the evolutionary history of one approximately 700-kb region of 2p11, including the first autosomal transition from pericentromeric sequence to higher-order alpha-satellite DNA. We show that this region is composed of segmental duplications corresponding to 14 ancestral segments ranging in size from 4 kb to approximately 115 kb. These duplicons show 94%-98.5% sequence identity to their ancestral loci. Comparative FISH and phylogenetic analysis indicate that these duplicons are differentially distributed in human, chimpanzee, and gorilla genomes, whereas baboon has a single putative ancestral locus for all but one of the duplications. Our analysis supports a model where duplicative transposition events occurred during a narrow window of evolution after the separation of the human/ape lineage from the Old World monkeys (10-20 million years ago). Although dramatic secondary dispersal events occurred during the radiation of the human, chimpanzee, and gorilla lineages, duplicative transposition seeding events of new material to this particular pericentromeric region abruptly ceased after this time period. The multiplicity of initial duplicative transpositions prior to the separation of humans and great-apes suggests a punctuated model for the formation of highly duplicated pericentromeric regions within the human genome. The data further indicate that factors other than sequence are important determinants for such bursts of duplicative transposition from the euchromatin to pericentromeric regions.
Affiliation(s)
- Julie E Horvath
- Department of Genetics and Center for Human Genetics, Case Western Reserve University School of Medicine and University Hospitals of Cleveland, Cleveland, Ohio 44106, USA
9. Locke DP, Jiang Z, Pertz LM, Misceo D, Archidiacono N, Eichler EE. Molecular evolution of the human chromosome 15 pericentromeric region. Cytogenet Genome Res 2004; 108:73-82. [PMID: 15545718 DOI: 10.1159/000080804] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.0] [Received: 11/18/2003] [Accepted: 12/09/2003] [Indexed: 11/19/2022] Open
Abstract
We present a detailed molecular evolutionary analysis of 1.2 Mb from the pericentromeric region of human 15q11. Sequence analysis indicates the region has been subject to extensive interchromosomal and intrachromosomal duplications during primate evolution. Comparative FISH analyses among non-human primates show remarkable quantitative and qualitative differences in the organization and duplication history of this region - including lineage-specific deletions and duplication expansions. Phylogenetic and comparative analyses reveal that the region is composed of at least 24 distinct segmental duplications or duplicons that have populated the pericentromeric regions of the human genome over the last 40 million years of human evolution. The value of combining both cytogenetic and experimental data in understanding the complex forces which have shaped these regions is discussed.
Affiliation(s)
- D P Locke
- Department of Genetics, Center for Computational Genomics, Case Western Reserve University School of Medicine and University Hospitals of Cleveland, Cleveland, OH, USA
10. Finch PW, Rubin JS. Keratinocyte growth factor/fibroblast growth factor 7, a homeostatic factor with therapeutic potential for epithelial protection and repair. Adv Cancer Res 2004; 91:69-136. [PMID: 15327889 DOI: 10.1016/s0065-230x(04)91003-2] [Citation(s) in RCA: 170] [Impact Index Per Article: 8.5] [Indexed: 02/02/2023]
Abstract
Keratinocyte growth factor (KGF) is a paracrine-acting, epithelial mitogen produced by cells of mesenchymal origin. It is a member of the fibroblast growth factor (FGF) family, and acts exclusively through a subset of FGF receptor isoforms (FGFR2b) expressed predominantly by epithelial cells. The upregulation of KGF after epithelial injury suggested it had an important role in tissue repair. This hypothesis was reinforced by evidence that intestinal damage was worse and healing impaired in KGF null mice. Preclinical data from several animal models demonstrated that recombinant human KGF could enhance the regenerative capacity of epithelial tissues and protect them from a variety of toxic exposures. These beneficial effects are attributed to multiple mechanisms that collectively act to strengthen the integrity of the epithelial barrier, and include the stimulation of cell proliferation, migration, differentiation, survival, DNA repair, and induction of enzymes involved in the detoxification of reactive oxygen species. KGF is currently being evaluated in clinical trials to test its ability to ameliorate severe oral mucositis (OM) that results from cancer chemoradiotherapy. In a phase 3 trial involving patients who were treated with myeloablative chemoradiotherapy before autologous peripheral blood progenitor cell transplantation for hematologic malignancies, KGF significantly reduced both the incidence and duration of severe OM. Similar investigations are underway in patients being treated for solid tumors. On the basis of its success in ameliorating chemoradiotherapy-induced OM in humans and tissue damage in a variety of animal models, additional clinical applications of KGF are worthy of investigation.
Affiliation(s)
- Paul W Finch
- Laboratory of Cellular and Molecular Biology, National Cancer Institute, Bethesda, Maryland 20892, USA
11. Bailey JA, Eichler EE. Genome-wide detection and analysis of recent segmental duplications within mammalian organisms. Cold Spring Harbor Symposia on Quantitative Biology 2004; 68:115-24. [PMID: 15338609 DOI: 10.1101/sqb.2003.68.115] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022]
Affiliation(s)
- J A Bailey
- Department of Genetics, Center for Computational Genomics, Case Western Reserve University School of Medicine and University Hospitals of Cleveland, Cleveland, Ohio 44106, USA
12. Liu X, Li X, Li M, Acimovic YJ, Li Z, Scherer SW, Estivill X, Tsui LC. Characterization of the segmental duplication LCR7-20 in the human genome. Genomics 2004; 83:262-9. [PMID: 14706455 DOI: 10.1016/j.ygeno.2003.08.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
Abstract
Our previous study described the amplification of a genomic sequence containing exon 9 of CFTR in the human genome. Here we report that this CFTR sequence is part of a large duplicated sequence unit, provisionally named LCR7-20. Through successive screening of two human chromosome 7-specific cosmid libraries to construct a cosmid contig, we assembled two sequenced BAC clones into a single contig containing a prototypic LCR7-20 unit. Subsequent searches of existing human genome sequences identified six additional copies of LCR7-20-like sequences with more than 90% sequence homology. Additional genomic clones containing LCR7-20-like sequences were then isolated from total genomic BAC and PAC libraries. Restriction fragment analysis and limited sequencing data indicated that there could be around 30 copies of LCR7-20-like sequences in the human genome and that the average region of homology could extend over 120 kb. As indicated by fluorescence in situ hybridization analysis, LCR7-20-like sequences are dispersed on different chromosomes, mainly in the centromeric and pericentromeric regions, and some may exist in tandem copies. Our study also indicates that many genomic regions containing LCR7-20s either have been misassembled or are missing in current versions of the human genome sequence.
Affiliation(s)
- Xiangdong Liu
- The Centre for Applied Genomics, Research Institute, Hospital for Sick Children, Toronto, Ontario M5G 1X8, Canada
13. Fortna A, Kim Y, MacLaren E, Marshall K, Hahn G, Meltesen L, Brenton M, Hink R, Burgers S, Hernandez-Boussard T, Karimpour-Fard A, Glueck D, McGavran L, Berry R, Pollack J, Sikela JM. Lineage-specific gene duplication and loss in human and great ape evolution. PLoS Biol 2004; 2:E207. [PMID: 15252450 PMCID: PMC449870 DOI: 10.1371/journal.pbio.0020207] [Citation(s) in RCA: 233] [Impact Index Per Article: 11.7] [Received: 02/12/2004] [Accepted: 05/06/2004] [Indexed: 12/22/2022] Open
Abstract
Given that gene duplication is a major driving force of evolutionary change and the key mechanism underlying the emergence of new genes and biological processes, this study sought to use a novel genome-wide approach to identify genes that have undergone lineage-specific duplications or contractions among several hominoid lineages. Interspecies cDNA array-based comparative genomic hybridization was used to individually compare copy number variation for 39,711 cDNAs, representing 29,619 human genes, across five hominoid species, including human. We identified 1,005 genes, either as isolated genes or in clusters positionally biased toward rearrangement-prone genomic regions, that produced relative hybridization signals unique to one or more of the hominoid lineages. Measured as a function of the evolutionary age of each lineage, genes showing copy number expansions were most pronounced in human (134) and include a number of genes thought to be involved in the structure and function of the brain. This work represents, to our knowledge, the first genome-wide gene-based survey of gene duplication across hominoid species. The genes identified here likely represent a significant majority of the major gene copy number changes that have occurred over the past 15 million years of human and great ape evolution and are likely to underlie some of the key phenotypic characteristics that distinguish these species. This genome-wide analysis reports the major lineage-specific gene copy number changes that have occurred over the past 15 million years of human and great ape evolution.
Affiliation(s)
- Andrew Fortna
- Department of Pharmacology and Human Medical Genetics Program, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Young Kim
- Department of Pathology, Stanford University, Stanford, California, United States of America
- Erik MacLaren
- Department of Pharmacology and Human Medical Genetics Program, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Kriste Marshall
- Department of Pharmacology and Human Medical Genetics Program, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Gretchen Hahn
- Colorado Genetics Laboratory, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Lynne Meltesen
- Colorado Genetics Laboratory, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Matthew Brenton
- Department of Pharmacology and Human Medical Genetics Program, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Raquel Hink
- Department of Pharmacology and Human Medical Genetics Program, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Sonya Burgers
- Department of Pharmacology and Human Medical Genetics Program, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Anis Karimpour-Fard
- Department of Preventive Medicine and Biometrics, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Deborah Glueck
- Department of Preventive Medicine and Biometrics, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Loris McGavran
- Colorado Genetics Laboratory, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Rebecca Berry
- Colorado Genetics Laboratory, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
- Jonathan Pollack
- Department of Pathology, Stanford University, Stanford, California, United States of America
- James M Sikela
- Department of Pharmacology and Human Medical Genetics Program, University of Colorado Health Sciences Center, Denver, Colorado, United States of America
14. Brun ME, Ruault M, Ventura M, Roizès G, De Sario A. Juxtacentromeric region of human chromosome 21: a boundary between centromeric heterochromatin and euchromatic chromosome arms. Gene 2003; 312:41-50. [PMID: 12909339 DOI: 10.1016/s0378-1119(03)00530-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
Abstract
We have analysed the genomic structure and transcriptional activity of a 2.3-Mb genomic sequence in the juxtacentromeric region of human chromosome 21. Our work shows that this region comprises two different chromosome domains. The 1.5-Mb proximal domain: (i) is a patchwork of chromosome duplications; (ii) shares sequence similarity with several chromosomes; (iii) contains several gene fragments (truncated genes having an intron/exon structure) intermingled with retrotransposed pseudogenes; and (iv) harbours two genes (TPTE and BAGE2) that belong to gene families and have a cancer and/or testis expression profile. The TPTE gene family was generated before the branching of Old World monkeys from the great ape lineage, by intra- and interchromosome duplications of the ancestral TPTE gene mapping to phylogenetic chromosome XIII. By contrast, the 0.8-Mb distal domain: (i) is devoid of chromosome duplications; (ii) has a chromosome 21-specific sequence; (iii) contains no gene fragments and only one retrotransposed pseudogene; and (iv) harbours six genes including housekeeping genes. G-rich sequences commonly associated with duplication termini cluster at the boundary between the two chromosome domains. These structural and transcriptional features lead us to suggest that the proximal domain has heterochromatic properties, whereas the distal domain has euchromatic properties.
MESH Headings
- ATP-Binding Cassette Transporters/genetics
- Adaptor Proteins, Vesicular Transport/genetics
- Alternative Splicing
- Animals
- Antigens, Neoplasm/genetics
- Base Composition
- Blotting, Northern
- Cell Line
- Centromere/genetics
- Chromosome Mapping
- Chromosomes, Human, Pair 21/genetics
- DNA, Complementary/chemistry
- DNA, Complementary/genetics
- Databases, Nucleic Acid
- Euchromatin/genetics
- Female
- Gene Duplication
- Gene Expression
- Heterochromatin/genetics
- Humans
- In Situ Hybridization, Fluorescence
- Male
- Membrane Proteins/genetics
- Molecular Sequence Data
- PTEN Phosphohydrolase
- Phosphoric Monoester Hydrolases
- Protein Tyrosine Phosphatases/genetics
- Pseudogenes/genetics
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- RNA-Binding Proteins/genetics
- Repetitive Sequences, Nucleic Acid
- Retroelements/genetics
- Sequence Analysis, DNA
Affiliation(s)
- Marie-Elisabeth Brun
- Institut de Génétique Humaine, CNRS UPR 1142, 141, rue de la Cardonille, 34396 Montpellier, France
15. Bera TK, Zimonjic DB, Popescu NC, Sathyanarayana BK, Kumar V, Lee B, Pastan I. POTE, a highly homologous gene family located on numerous chromosomes and expressed in prostate, ovary, testis, placenta, and prostate cancer. Proc Natl Acad Sci U S A 2002; 99:16975-80. [PMID: 12475935 PMCID: PMC139254 DOI: 10.1073/pnas.262655399] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.1] [Accepted: 10/29/2002] [Indexed: 01/13/2023] Open
Abstract
We have identified a gene located on chromosome 21 that is expressed in normal and neoplastic prostate, and in normal testis, ovary, and placenta. We name this gene POTE (expressed in prostate, ovary, testis, and placenta). The POTE gene has 11 exons and 10 introns and spans approximately 32 kb of the chromosome 21q11.2 region. The 1.83-kb mRNA of POTE encodes a protein of 66 kDa. Ten paralogs of the gene have been found dispersed among eight different chromosomes (2, 8, 13, 14, 15, 18, 21, and 22) with preservation of ORFs and splice junctions. The synonymous:nonsynonymous ratio indicates that the genes were duplicated rather recently but are diverging at a rate faster than the average for other paralogous genes. In prostate and in testis, at least five different paralogs are expressed. In situ hybridization shows that POTE is expressed in basal and terminal cells of normal prostate epithelium. It is also expressed in some prostate cancers and in the LNCaP prostate cancer cell line. The POTE protein contains seven ankyrin repeats between amino acids 140 and 380. Expression of POTE in prostate cancer and its undetectable expression in normal essential tissues make POTE a candidate for the immunotherapy of prostate cancer. The existence of a large number of closely related but rapidly diverging members, their location on multiple chromosomes and their limited expression pattern suggest an important role for the POTE gene family in reproductive processes.
Affiliation(s)
- Tapan K Bera
- Laboratories of Molecular Biology and Experimental Carcinogenesis, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264, USA
16. Fantes JA, Mewborn SK, Lese CM, Hedrick J, Brown RL, Dyomin V, Chaganti RSK, Christian SL, Ledbetter DH. Organisation of the pericentromeric region of chromosome 15: at least four partial gene copies are amplified in patients with a proximal duplication of 15q. J Med Genet 2002; 39:170-7. [PMID: 11897815 PMCID: PMC1735052 DOI: 10.1136/jmg.39.3.170] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.8] [Indexed: 11/04/2022]
Abstract
Clinical cytogenetic laboratories frequently identify an apparent duplication of proximal 15q that does not involve probes within the PWS/AS critical region and is not associated with any consistent phenotype. Previous mapping data placed several pseudogenes, NF1, IgH D/V, and GABRA5, in the pericentromeric region of proximal 15q. Recent studies have shown that these pseudogene sequences have increased copy numbers in subjects with apparent duplications of proximal 15q. To determine the extent of variation in a control population, we analysed NF1 and IgH D pseudogene copy number in interphase nuclei from 20 cytogenetically normal subjects by FISH. Both loci are polymorphic in controls, ranging from 1 to 4 signals for NF1 and 1 to 3 signals for IgH D. Eight subjects with apparent duplications, examined by the same method, showed significantly increased NF1 copy number (5-10 signals). IgH D copy number was also increased in 6/8 of these patients (4-9 signals). We identified a fourth pseudogene, BCL8A, which maps to the pericentromeric region and is coamplified along with the NF1 sequences. Interphase FISH ordering experiments show that IgH D lies closest to the centromere, while BCL8A is the most distal locus in this pseudogene array; the total size of the amplicon is estimated at approximately 1 Mb. The duplicated chromosome was inherited from a parent of either sex, indicating no parent-of-origin effect, and no consistent phenotype was present. FISH analysis with one or more of these probes is therefore useful in discriminating polymorphic amplification of proximal pseudogene sequences from clinically significant duplications of 15q.
Affiliation(s)
- J A Fantes
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
|
17
|
Malaivijitnond S, Takenaka O, Anukulthanakorn K, Cherdshewasart W. The nucleotide sequences of the parathyroid gene in primates (suborder Anthropoidea). Gen Comp Endocrinol 2002; 125:67-78. [PMID: 11825036 DOI: 10.1006/gcen.2001.7735] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
Nucleotide sequences of the parathyroid hormone (PTH) gene of 12 species of primates belonging to suborder Anthropoidea were examined. The PTH gene contains one intron that separates two exons, which encode the prepro sequence and PTH, respectively. The intron of the PTH gene in Cebus apella, Callithrix jacchus, and Saguinus oedipus was 102 bp long, whereas a 103-bp intron was observed in the remaining species. Phylogenetic analysis using the nucleotide sequences of PTH revealed that these 12 species of primates of suborder Anthropoidea could be divided into two groups: the infraorder Platyrrhini (C. apella, C. jacchus, and S. oedipus) and the infraorder Catarrhini (Macaca fascicularis, Macaca fuscata, Cercopithecus aethiops, Papio hamadryas, Presbytis obscura, Hylobates lar, Pongo pygmaeus, Pan troglodytes, and Pan paniscus). The latter infraorder could be further subdivided into two subgroups belonging to the superfamily Cercopithecoidea (M. fascicularis, M. fuscata, C. aethiops, P. hamadryas, and P. obscura) and the superfamily Hominoidea (H. lar, P. pygmaeus, P. troglodytes, and P. paniscus). Comparison of the deduced amino acid sequences of the PTH gene between the 12 species of nonhuman primates and humans revealed no amino acid substitution in mature PTH among orangutans, chimpanzees, and humans. The results indicated that the PTH gene is highly conserved among primates, especially between great apes and humans. The apes are the most suitable animals for studying bone metabolism and applying the knowledge to clinical use in humans.
|
18
|
Crosier M, Viggiano L, Guy J, Misceo D, Stones R, Wei W, Hearn T, Ventura M, Archidiacono N, Rocchi M, Jackson MS. Human paralogs of KIAA0187 were created through independent pericentromeric-directed and chromosome-specific duplication mechanisms. Genome Res 2002; 12:67-80. [PMID: 11779832 PMCID: PMC155266 DOI: 10.1101/gr.213702] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 12/12/2022]
Abstract
KIAA0187 is a gene of unknown function that maps to 10q11 and has been subject to recent duplication events. Here we analyze 18 human paralogs of this gene and show that paralogs of exons 14-23 were formed through satellite-associated pericentromeric-directed duplication, whereas paralogs of exons 1-9 were created via chromosome-specific satellite-independent duplications. In silico, Northern, and RT-PCR analyses indicate that nine paralogs are transcribed, including four in which KIAA0187 exons are spliced onto novel sequences. Despite this, no new genes appear to have been created by these events. The chromosome 10 paralogs map to 10q11, 10q22, 10q23.1, and 10q23.3, forming part of a complex family of chromosome-specific repeats that includes GLUD1, Cathepsin L, and KIAA1099 pseudogenes. Phylogenetic analyses and comparative FISH indicate that the 10q23.1 and 10q23.3 repeats were created in 10q11 and relocated by a paracentric inversion 13 to 27 Myr ago. Furthermore, the most recent duplications, involving the KIAA1099 pseudogenes, have largely been confined to 10q11. These results indicate a simple model for the evolution of this repeat family, involving multiple rounds of centromere-proximal duplication and dispersal through intrachromosomal rearrangement. However, more complex events must be invoked to account for high sequence identity between some paralogs.
Affiliation(s)
- Moira Crosier
- The Institute of Human Genetics, The International Centre for Life, Central Parkway, University of Newcastle Upon Tyne, NE1 3BZ, United Kingdom
|
19
|
Bailey JA, Yavor AM, Viggiano L, Misceo D, Horvath JE, Archidiacono N, Schwartz S, Rocchi M, Eichler EE. Human-specific duplication and mosaic transcripts: the recent paralogous structure of chromosome 22. Am J Hum Genet 2002; 70:83-100. [PMID: 11731936 PMCID: PMC419985 DOI: 10.1086/338458] [Citation(s) in RCA: 133] [Impact Index Per Article: 6.0] [Received: 08/29/2001] [Accepted: 10/31/2001] [Indexed: 11/03/2022]
Abstract
In recent decades, comparative chromosomal banding, chromosome painting, and gene-order studies have shown strong conservation of gross chromosome structure and gene order in mammals. However, findings from the human genome sequence suggest an unprecedented degree of recent (<35 million years ago) segmental duplication. This dynamism of segmental duplications has important implications in disease and evolution. Here we present a chromosome-wide view of the structure and evolution of the most highly homologous duplications (≥1 kb and ≥90%) on chromosome 22. Overall, 10.8% (3.7/33.8 Mb) of chromosome 22 is duplicated, with an average sequence identity of 95.4%. To organize the duplications into tractable units, intron-exon structure and well-defined duplication boundaries were used to define 78 duplicated modules (minimally shared evolutionary segments) with 157 copies on chromosome 22. Analysis of these modules provides evidence for the creation or modification of 11 novel transcripts. Comparative FISH analyses of human, chimpanzee, gorilla, orangutan, and macaque reveal qualitative and quantitative differences in the distribution of these duplications, consistent with their recent origin. Several duplications appear to be human specific, including an approximately 400-kb duplication (99.4%-99.8% sequence identity) that transposed from chromosome 14 to the most proximal pericentromeric region of chromosome 22. Experimental and in silico data further support a pericentromeric gradient of duplications where the most recent duplications transpose adjacent to the centromere. Taken together, these data suggest that segmental duplications have been an ongoing process of primate genome evolution, contributing to recent gene innovation and the dynamic transformation of genome architecture within and among closely related species.
MESH Headings
- Animals
- Centromere/genetics
- Chromosomes, Human, Pair 14/genetics
- Chromosomes, Human, Pair 22/genetics
- Evolution, Molecular
- Exons/genetics
- Gene Dosage
- Gene Duplication
- Genes, Duplicate/genetics
- Humans
- In Situ Hybridization, Fluorescence
- Introns/genetics
- Mosaicism/genetics
- Primates/genetics
- RNA, Messenger/analysis
- RNA, Messenger/genetics
- Species Specificity
- Time Factors
- Transcription, Genetic/genetics
- Translocation, Genetic/genetics
Affiliation(s)
- Jeffrey A Bailey
- Department of Genetics and Center for Human Genetics, Case Western Reserve University School of Medicine and University Hospitals of Cleveland, OH, USA
|
20
|
Abstract
An estimated 5% of the human genome consists of interspersed duplications that have arisen over the past 35 million years of evolution. Two categories of such recently duplicated segments can be distinguished: segmental duplications between nonhomologous chromosomes (transchromosomal duplications) and duplications mainly restricted to a particular chromosome (chromosome-specific duplications). Many of these duplications exhibit an extraordinarily high degree of sequence identity at the nucleotide level (>95%) and span large genomic distances (1-100 kb). Preliminary analyses indicate that these same regions are targets for rapid evolutionary turnover among the genomes of closely related primates. The dynamic nature of these regions because of recurrent chromosomal rearrangement, and their ability to create fusion genes from juxtaposed cassettes suggest that duplicative transposition was an important force in the evolution of our genome.
Affiliation(s)
- E E Eichler
- Dept of Genetics and Center for Human Genetics, Case Western Reserve School of Medicine and University Hospitals of Cleveland, Cleveland, OH 44106, USA.
|
21
|
Emanuel BS, Shaikh TH. Segmental duplications: an 'expanding' role in genomic instability and disease. Nat Rev Genet 2001; 2:791-800. [PMID: 11584295 DOI: 10.1038/35093500] [Citation(s) in RCA: 195] [Impact Index Per Article: 8.5] [Indexed: 12/29/2022]
Abstract
The knowledge that specific genetic diseases are caused by recurrent chromosomal aberrations has indicated that genomic instability might be directly related to the structure of the regions involved. The sequencing of the human genome has directed significant attention towards understanding the molecular basis of such recombination 'hot spots'. Segmental duplications have emerged as a significant factor in the aetiology of disorders that are caused by abnormal gene dosage. These observations bring us closer to understanding the mechanisms and consequences of genomic rearrangement.
Affiliation(s)
- B S Emanuel
- Division of Human Genetics and Molecular Biology, 1002 Abramson Research Center, The Children's Hospital of Philadelphia, 3516 Civic Center Blvd, Philadelphia, Pennsylvania 19104, USA.
|
22
|
Affiliation(s)
- Jeffrey Rogers
- Department of Genetics, Southwest Foundation for Biomedical Research, San Antonio, Texas, USA
|
23
|
Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE. Segmental duplications: organization and impact within the current human genome project assembly. Genome Res 2001; 11:1005-17. [PMID: 11381028 PMCID: PMC311093 DOI: 10.1101/gr.gr-1871r] [Citation(s) in RCA: 513] [Impact Index Per Article: 22.3] [Indexed: 11/24/2022]
Abstract
Segmental duplications play fundamental roles in both genomic disease and gene evolution. To understand their organization within the human genome, we have developed the computational tools and methods necessary to detect identity between long stretches of genomic sequence despite the presence of high copy repeats and large insertion-deletions. Here we present our analysis of the most recent genome assembly (January 2001) in which we focus on the global organization of these segments and the role they play in the whole-genome assembly process. Initially, we considered only large recent duplication events that fell well below levels of draft sequencing error (alignments 90%-98% similar and ≥1 kb in length). Duplications (90%-98%; ≥1 kb) comprise 3.6% of all human sequence. These duplications show clustering and up to 10-fold enrichment within pericentromeric and subtelomeric regions. In terms of assembly, duplicated sequences were found to be over-represented in unordered and unassigned contigs, indicating that duplicated sequences are difficult to assign to their proper position. To assess coverage of these regions within the genome, we selected BACs containing interchromosomal duplications and characterized their duplication pattern by FISH. Only 47% (106/224) of chromosomes positive by FISH had a corresponding chromosomal position by BLAST comparison. We present data that indicate that this is attributable to misassembly, misassignment, and/or decreased sequencing coverage within duplicated regions. Surprisingly, if we consider putative duplications >98% identity, we identify 10.6% (286 Mb) of the current assembly as paralogous. The majority of these alignments, we believe, represent unmerged overlaps within unique regions. Taken together, the above data indicate that segmental duplications represent a significant impediment to accurate human genome assembly, requiring the development of specialized techniques to finish these exceptional regions of the genome. The identification and characterization of these highly duplicated regions represents an important step in the complete sequencing of a human reference genome.
Affiliation(s)
- J A Bailey
- Department of Genetics and Center for Human Genetics, Case Western Reserve School of Medicine and University Hospitals of Cleveland, Cleveland, Ohio 44106, USA
|
24
|
Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE. Segmental Duplications: Organization and Impact Within the Current Human Genome Project Assembly. Genome Res 2001. [DOI: 10.1101/gr.187101] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
Abstract
Segmental duplications play fundamental roles in both genomic disease and gene evolution. To understand their organization within the human genome, we have developed the computational tools and methods necessary to detect identity between long stretches of genomic sequence despite the presence of high copy repeats and large insertion-deletions. Here we present our analysis of the most recent genome assembly (January 2001) in which we focus on the global organization of these segments and the role they play in the whole-genome assembly process. Initially, we considered only large recent duplication events that fell well below levels of draft sequencing error (alignments 90%–98% similar and ≥1 kb in length). Duplications (90%–98%; ≥1 kb) comprise 3.6% of all human sequence. These duplications show clustering and up to 10-fold enrichment within pericentromeric and subtelomeric regions. In terms of assembly, duplicated sequences were found to be over-represented in unordered and unassigned contigs, indicating that duplicated sequences are difficult to assign to their proper position. To assess coverage of these regions within the genome, we selected BACs containing interchromosomal duplications and characterized their duplication pattern by FISH. Only 47% (106/224) of chromosomes positive by FISH had a corresponding chromosomal position by BLAST comparison. We present data that indicate that this is attributable to misassembly, misassignment, and/or decreased sequencing coverage within duplicated regions. Surprisingly, if we consider putative duplications >98% identity, we identify 10.6% (286 Mb) of the current assembly as paralogous. The majority of these alignments, we believe, represent unmerged overlaps within unique regions. Taken together, the above data indicate that segmental duplications represent a significant impediment to accurate human genome assembly, requiring the development of specialized techniques to finish these exceptional regions of the genome. The identification and characterization of these highly duplicated regions represents an important step in the complete sequencing of a human reference genome.
|
25
|
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann Y, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, 
Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowki J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ, Szustakowki J. Initial sequencing and analysis of the human genome. Nature 2001; 409:860-921. [PMID: 11237011 DOI: 10.1038/35057062] [Citation(s) in RCA: 14532] [Impact Index Per Article: 631.8] [Indexed: 12/11/2022]
Abstract
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Affiliation(s)
- E S Lander
- Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, MA 02142, USA.
|
26
|
Abstract
The remarkable similarity among the genomes of humans and the African great apes could warrant their classification together as a single genus. However, whereas there are many similarities in the biology, life history, and behavior of humans and great apes, there are also many striking differences that need to be explained. The complete sequencing of the human genome creates an opportunity to ask which genes are involved in those differences. A logical approach would be to use the chimpanzee genome for comparison and the other great ape genomes for confirmation. Until such a great ape genome project can become reality, the next best approach must be educated guesses of where the genetic differences may lie and a careful analysis of differences that we do know about. Our group recently discovered a human-specific inactivating mutation in the CMP-sialic acid hydroxylase gene, which results in the loss of expression of a common mammalian cell-surface sugar throughout all cells in the human body. We are currently investigating the implications of this difference for a variety of issues relevant to humans, ranging from pathogen susceptibility to brain development. Evaluating the uniqueness of this finding has also led us to explore the existing literature on the broader issue of genetic differences between humans and great apes. The aim of this brief review is to consider a listing of currently known genetic differences between humans and great apes and to suggest avenues for future research. The differences reported between human and great ape genomes include cytogenetic differences, differences in the type and number of repetitive genomic DNA and transposable elements, abundance and distribution of endogenous retroviruses, the presence and extent of allelic polymorphisms, specific gene inactivation events, gene sequence differences, gene duplications, single nucleotide polymorphisms, gene expression differences, and messenger RNA splicing variations. Evaluation of the reported findings in all these categories indicates that the CMP-sialic acid hydroxylase mutation is the only one that has so far been shown to result in a global biochemical and structural difference between humans and great apes. Several of the other known genetic dissimilarities deserve more exploration at the functional level. Among the areas of focus for the future should be genes affecting development, mental maturation, reproductive biology, and other aspects of life history. The approaches taken should include both going from the genome up to the adaptive potential of the organisms and going from novel adaptive regimes down to the relevant repercussions in the genome. Also, as much as we desire a simple genetic explanation for the human phenomenon, it is much more probable that our evolution occurred in multiple genetic steps, many of which must have left detectable footprints in our genomes. Ultimately, we need to know the exact number of genetic steps, the order in which they occurred, and the temporal, spatial, environmental, and cultural contexts that determined their impact on human evolution.
Affiliation(s)
- P Gagneux
- Department of Medicine and Glycobiology Research and Training Center, University of California at San Diego, La Jolla, California 92093-0687, USA
|
27
|
Abstract
Fibroblast growth factors (FGFs) make up a large family of polypeptide growth factors that are found in organisms ranging from nematodes to humans. In vertebrates, the 22 members of the FGF family range in molecular mass from 17 to 34 kDa and share 13-71% amino acid identity. Between vertebrate species, FGFs are highly conserved in both gene structure and amino acid sequence. FGFs have a high affinity for heparan sulfate proteoglycans and require heparan sulfate to activate one of four cell-surface FGF receptors. During embryonic development, FGFs have diverse roles in regulating cell proliferation, migration and differentiation. In the adult organism, FGFs are homeostatic factors and function in tissue repair and response to injury. When inappropriately expressed, some FGFs can contribute to the pathogenesis of cancer. A subset of the FGF family, expressed in adult tissue, is important for neuronal signal transduction in the central and peripheral nervous systems.
Affiliation(s)
- D M Ornitz
- Department of Molecular Biology and Pharmacology, Washington University Medical School, 660 S. Euclid Avenue, St. Louis, MO 63110, USA.
|
28
|
Horvath JE, Schwartz S, Eichler EE. The mosaic structure of human pericentromeric DNA: a strategy for characterizing complex regions of the human genome. Genome Res 2000; 10:839-52. [PMID: 10854415 PMCID: PMC310890 DOI: 10.1101/gr.10.6.839] [Citation(s) in RCA: 89] [Impact Index Per Article: 3.7] [Indexed: 11/25/2022]
Abstract
The pericentromeric regions of human chromosomes pose particular problems for both mapping and sequencing. These difficulties are due, in large part, to the presence of duplicated genomic segments that are distributed among multiple human chromosomes. To ensure contiguity of genomic sequence in these regions, we designed a sequence-based strategy to characterize different pericentromeric regions using a single (162 kb) 2p11 seed sequence as a point of reference. Molecular and cytogenetic techniques were first used to construct a paralogy map that delineated the interchromosomal distribution of duplicated segments throughout the human genome. Monochromosomal hybrid DNAs were PCR amplified by primer pairs designed to the 2p11 reference sequence. The PCR products were directly sequenced and used to develop a catalog of sequence tags for each duplicon for each chromosome. A total of 685 paralogous sequence variants were generated by sequencing 34.7 kb of paralogous pericentromeric sequence. Using PCR products as hybridization probes, we were able to identify 702 human BAC clones, of which a subset, 107 clones, were analyzed at the sequence level. We used diagnostic paralogous sequence variants to assign 65 of these BACs to at least 9 chromosomal pericentromeric regions: 1q12, 2p11, 9p11/q12, 10p11, 14q11, 15q11, 16p11, 17p11, and 22q11. Comparisons with existing sequence and physical maps for the human genome suggest that many of these BACs map to regions of the genome with sequence gaps. Our analysis indicates that large portions of pericentromeric DNA are virtually devoid of unique sequences. Instead, they consist of a mosaic of different genomic segments that have had different propensities for duplication. These biologic properties may be exploited for the rapid characterization not only of pericentromeric DNA but also of other complex paralogous regions of the human genome.
Affiliation(s)
- J E Horvath
- Department of Genetics and Center for Human Genetics, Case Western Reserve School of Medicine and University Hospitals of Cleveland, Cleveland, Ohio 44106 USA
|
29
|
Ji Y, Eichler EE, Schwartz S, Nicholls RD. Structure of chromosomal duplicons and their role in mediating human genomic disorders. Genome Res 2000; 10:597-610. [PMID: 10810082 DOI: 10.1101/gr.10.5.597] [Citation(s) in RCA: 166] [Impact Index Per Article: 6.9] [Indexed: 11/24/2022]
Abstract
Chromosome-specific low-copy repeats, or duplicons, occur in multiple regions of the human genome. Homologous recombination between different duplicon copies leads to chromosomal rearrangements, such as deletions, duplications, inversions, and inverted duplications, depending on the orientation of the recombining duplicons. When such rearrangements cause dosage imbalance of a developmentally important gene(s), genetic diseases now termed genomic disorders result, at a frequency of 0.7-1/1000 births. Duplicons can have simple or very complex structures, with variation in copy number from 2 to >10 repeats, and each varying in size from a few kilobases in length to hundreds of kilobases. Analysis of the different duplicons involved in human genomic disorders identifies features that may predispose to recombination, including large size and high sequence identity between the recombining copies, putative recombination promoting features, and the presence of multiple genes/pseudogenes that may include genes expressed in germ cells. Most of the chromosome rearrangements involve duplicons near pericentromeric regions, which may relate to the propensity of such regions to accumulate duplicons. Detailed analyses of the structure, polymorphic variation, and mechanisms of recombination in genomic disorders, as well as the evolutionary origin of various duplicons will further our understanding of the structure, function, and fluidity of the human genome.
Affiliation(s)
- Y Ji
- Department of Genetics, Case Western Reserve University School of Medicine, and Center for Human Genetics, University Hospitals of Cleveland, Cleveland, Ohio 44106 USA
|
30
|
The Pathological Consequences and Evolutionary Implications of Recent Human Genomic Duplications. COMPARATIVE GENOMICS 2000. [DOI: 10.1007/978-94-011-4309-7_5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/04/2023]
|
31
|
Eichler EE, Archidiacono N, Rocchi M. CAGGG repeats and the pericentromeric duplication of the hominoid genome. Genome Res 1999; 9:1048-58. [PMID: 10568745 DOI: 10.1101/gr.9.11.1048] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.2] [Indexed: 11/24/2022]
Abstract
Gene duplication is one of the primary forces of evolutionary change. We present data from three different pericentromeric regions of human chromosomes, which indicate that such regions of the genome have been sites of recent genomic duplication. This form of duplication has involved the evolutionary movement of segments of genomic material, including both intronic and exonic sequence, from diverse regions of the genome toward the pericentromeric regions. Sequence analyses of the target sites of duplication have identified a novel class of interspersed GC-rich repeats located precisely at the boundaries of duplication. Estimates of the evolutionary age of these duplications indicate that they occurred between 10 and 25 mya. In contrast, comparative analyses confirm that the GC-rich pericentromeric repeats have existed within the pericentromeric regions of primate chromosomes since before the divergence of the cercopithecoid and hominoid lineages (approximately 30 mya). These data provide molecular evidence for considerable interchromosomal duplication of genic segments during the evolution of the hominoid genome and strongly implicate GC-rich repeat elements as playing a direct role in the pericentromeric localization of these events.
Collapse
MESH Headings
- Animals
- Base Sequence
- Centromere
- Chromosomes, Human, Pair 1
- Chromosomes, Human, Pair 10
- Chromosomes, Human, Pair 14
- Chromosomes, Human, Pair 15
- Chromosomes, Human, Pair 16
- Chromosomes, Human, Pair 2
- Chromosomes, Human, Pair 22
- Chromosomes, Human, Pair 9
- Evolution, Molecular
- Gene Duplication
- Hominidae/genetics
- Humans
- In Situ Hybridization, Fluorescence
- Microsatellite Repeats/genetics
- Molecular Sequence Data
- Sequence Alignment
- X Chromosome
Affiliation(s)
- E E Eichler
- Department of Genetics and Center for Human Genetics, Case Western Reserve School of Medicine and University Hospitals of Cleveland, Cleveland, Ohio 44106, USA.
|
32
|
Ruault M, Trichet V, Gimenez S, Boyle S, Gardiner K, Rolland M, Roizès G, De Sario A. Juxta-centromeric region of human chromosome 21 is enriched for pseudogenes and gene fragments. Gene 1999; 239:55-64. [PMID: 10571034 DOI: 10.1016/s0378-1119(99)00381-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Abstract
A physical map including four pseudogenes and 10 gene fragments and spanning 500 kb in the juxta-centromeric region of the long arm of human chromosome 21 is presented. cDNA fragments isolated from a selected cDNA library were characterized and mapped to the 831B6 YAC and to two BAC contigs that cover 250 kb of the region. An 85 kb genomic sequence located in the proximal region of the map was analyzed for putative exons. Four pseudogenes were found, including psiIGSF3, psiEIF3, psiGCT-rel whose functional copies map to chromosome 1p13, chromosome 2 and chromosome 22q11, respectively. The TTLL1 pseudogene corresponds to a new gene whose functional copy maps to chromosome 22q13. Ten gene fragments represent novel sequences that have related sequences on different human chromosomes and show 97-100% nucleotide identity to chromosome 21. These may correspond to pseudogenes on chromosome 21 and to functional genes in other chromosomes. The 85 kb genomic sequence was analyzed also for GC content, CpG islands, and repetitive sequence distribution. A GC-poor L isochore spanning 40 kb from satellite 1 was observed in the most centromeric region, next to a GC-rich H isochore that is a candidate region for the presence of functional genes. The pericentric duplication of a 7.8 kb region that is derived from the 22q13 chromosome band is described. We showed that the juxta-centromeric region of human chromosome 21 is enriched for retrotransposed pseudogenes and gene fragments transferred by interchromosome duplications, but we do not rule out the possibility that the region harbors functional genes also.
Affiliation(s)
- M Ruault
- Séquences Répétées et Centromères Humains, CNRS UPR 1142, Institut de Biologie, Montpellier, France
|
33
|
Loftus BJ, Kim UJ, Sneddon VP, Kalush F, Brandon R, Fuhrmann J, Mason T, Crosby ML, Barnstead M, Cronin L, Deslattes Mays A, Cao Y, Xu RX, Kang HL, Mitchell S, Eichler EE, Harris PC, Venter JC, Adams MD. Genome duplications and other features in 12 Mb of DNA sequence from human chromosome 16p and 16q. Genomics 1999; 60:295-308. [PMID: 10493829 DOI: 10.1006/geno.1999.5927] [Citation(s) in RCA: 105] [Impact Index Per Article: 4.2] [Indexed: 11/22/2022]
Abstract
Several publicly funded large-scale sequencing efforts have been initiated with the goal of completing the first reference human genome sequence by the year 2005. Here we present the results of analysis of 11.8 Mb of genomic sequence from chromosome 16. The apparent gene density varies throughout the region, but the number of genes predicted (84) suggests that this is a gene-poor region. This result may also suggest that the total number of human genes is likely to be at the lower end of published estimates. One of the most interesting aspects of this region of the genome is the presence of highly homologous, recently duplicated tracts of sequence distributed throughout the p-arm. Such duplications have implications for mapping and gene analysis as well as the predisposition to recurrent chromosomal structural rearrangements associated with genetic disease.
Affiliation(s)
- B J Loftus
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA
|
34
|
Eichler EE, Hoffman SM, Adamson AA, Gordon LA, McCready P, Lamerdin JE, Mohrenweiser HW. Complex beta-satellite repeat structures and the expansion of the zinc finger gene cluster in 19p12. Genome Res 1998; 8:791-808. [PMID: 9724325 DOI: 10.1101/gr.8.8.791] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]
Abstract
We investigated the organization, architecture, and evolution of the largest cluster (approximately 4 Mb) of Krüppel-associated box zinc finger (KRAB-ZNF) genes located in cytogenetic band interval 19p12. A highly integrated physical map (approximately 700 kb) of overlapping cosmid and BAC clones was developed between genetic STS markers D19S454 and D19S269. Using ZNF91 exon-specific probes to interrogate a detailed EcoRI restriction map of the region, ZNF genes were found to be distributed in a head-to-tail fashion throughout the region with an average density of one ZNF duplicon every 150-180 kb of genomic distance. Sequence analysis of 208,967 bp of this region indicated the presence of two putative ZNF genes: one consisting of a novel member of this gene family (ZNF208) expressed ubiquitously in all tissues examined and the other representing a nonprocessed pseudogene (ZNF209), located 450 kb proximal to ZNF208. Large blocks (approximately 25 kb) of inverted beta-satellite repeats with a remarkably symmetrical higher order repeat structure were found to bracket the functional ZNF gene. Hybridization analysis using the beta-satellite repeat as a probe indicates that beta-satellite interspersion between ZNF gene cassettes is a general property for 1.5 Mb of the ZNF gene cluster in 19p12. Both molecular clock data as well as a retroposon-mapping molecular fossil approach indicate that this ZNF cluster arose early during primate evolution (approximately 50 million years ago). We propose an evolutionary model in which heteromorphic pericentromeric repeat structures such as the beta satellites have been coopted to accommodate rapid expansion of a large gene family over a short period of evolutionary time. [The sequence data described in this paper have been submitted to GenBank under accession nos. AC003973 and AC004004.]
Affiliation(s)
- E E Eichler
- Human Genome Center, BBRP, L-452, Lawrence Livermore National Laboratory, Livermore, California 94550, USA.
|
35
|
Affiliation(s)
- E E Eichler
- Department of Genetics, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA.
|